All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 00/19] Generic DT bindings for PCI IOMMUs and ARM SMMU
@ 2016-08-23 19:05 Robin Murphy
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` Robin Murphy
  0 siblings, 2 replies; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

Hi all,

At long last I've finished the big SMMUv2 rework, so here's everything
all together for a v5. As a quick breakdown:

Patches 1-3 are the core PCI part, all acked and ready to go. No code
changes from v4.

Patch 4 is merely bugfixed from v4 for simplicity, as I've not yet
managed to take as close a look at Lorenzo's follow-on work as I'd like.

Patches 5-7 (SMMUv3) are mostly unchanged beyond a slight tweak to #5.

Patches 8-17 are the all-new SMMUv2 rework.

Patch 18 goes along with the fix already in 4.8-rc3 to help avoid 64-bit
DMA masks going wrong now that DMA ops will be enabled.

Finally, patch 19 addresses the previous problem of having to choose
between DMA ops or working MSIs. This is currently at the end as
moving it before #17 would require a further interim SMMUv2 patch, and
a 19-patch series is already quite enough...

I've pushed out a branch based on iommu/next to the usual place:

git://linux-arm.org/linux-rm iommu/generic-v5

Thanks,
Robin.
---

Mark Rutland (1):
  Docs: dt: add PCI IOMMU map bindings

Robin Murphy (18):
  of/irq: Break out msi-map lookup (again)
  iommu/of: Handle iommu-map property for PCI
  iommu/of: Introduce iommu_fwspec
  iommu/arm-smmu: Implement of_xlate() for SMMUv3
  iommu/arm-smmu: Support non-PCI devices with SMMUv3
  iommu/arm-smmu: Set PRIVCFG in stage 1 STEs
  iommu/arm-smmu: Handle stream IDs more dynamically
  iommu/arm-smmu: Consolidate stream map entry state
  iommu/arm-smmu: Keep track of S2CR state
  iommu/arm-smmu: Refactor mmu-masters handling
  iommu/arm-smmu: Streamline SMMU data lookups
  iommu/arm-smmu: Add a stream map entry iterator
  iommu/arm-smmu: Intelligent SMR allocation
  iommu/arm-smmu: Convert to iommu_fwspec
  Docs: dt: document ARM SMMU generic binding usage
  iommu/arm-smmu: Wire up generic configuration support
  iommu/arm-smmu: Set domain geometry
  iommu/dma: Add support for mapping MSIs

 .../devicetree/bindings/iommu/arm,smmu.txt         |  63 +-
 .../devicetree/bindings/pci/pci-iommu.txt          | 171 ++++
 drivers/iommu/Kconfig                              |   2 +-
 drivers/iommu/arm-smmu-v3.c                        | 347 ++++----
 drivers/iommu/arm-smmu.c                           | 952 ++++++++++-----------
 drivers/iommu/dma-iommu.c                          | 141 ++-
 drivers/iommu/of_iommu.c                           |  95 +-
 drivers/irqchip/irq-gic-v2m.c                      |   3 +
 drivers/irqchip/irq-gic-v3-its.c                   |   3 +
 drivers/of/irq.c                                   |  78 +-
 drivers/of/of_pci.c                                | 102 +++
 include/linux/dma-iommu.h                          |   9 +
 include/linux/of_iommu.h                           |  15 +
 include/linux/of_pci.h                             |  10 +
 14 files changed, 1208 insertions(+), 783 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/pci/pci-iommu.txt

-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v5 01/19] Docs: dt: add PCI IOMMU map bindings
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-08-23 19:05   ` Robin Murphy
  2016-08-23 19:05   ` [PATCH v5 02/19] of/irq: Break out msi-map lookup (again) Robin Murphy
                     ` (19 subsequent siblings)
  20 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA,
	punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

From: Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org>

The existing IOMMU bindings are able to specify the relationship between
masters and IOMMUs, but they are insufficient for describing the general
case of hotpluggable busses such as PCI where the set of masters is not
known until runtime, and the relationship between masters and IOMMUs is
a property of the integration of the system.

This patch adds a generic binding for mapping PCI devices to IOMMUs,
using a new iommu-map property (specific to PCI*) which may be used to
map devices (identified by their Requester ID) to sideband data for the
IOMMU which they master through.

Acked-by: Rob Herring <robh-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Acked-by: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
Signed-off-by: Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org>
---
 .../devicetree/bindings/pci/pci-iommu.txt          | 171 +++++++++++++++++++++
 1 file changed, 171 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/pci/pci-iommu.txt

diff --git a/Documentation/devicetree/bindings/pci/pci-iommu.txt b/Documentation/devicetree/bindings/pci/pci-iommu.txt
new file mode 100644
index 000000000000..56c829621b9a
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/pci-iommu.txt
@@ -0,0 +1,171 @@
+This document describes the generic device tree binding for describing the
+relationship between PCI(e) devices and IOMMU(s).
+
+Each PCI(e) device under a root complex is uniquely identified by its Requester
+ID (AKA RID). A Requester ID is a triplet of a Bus number, Device number, and
+Function number.
+
+For the purpose of this document, when treated as a numeric value, a RID is
+formatted such that:
+
+* Bits [15:8] are the Bus number.
+* Bits [7:3] are the Device number.
+* Bits [2:0] are the Function number.
+* Any other bits required for padding must be zero.
+
+IOMMUs may distinguish PCI devices through sideband data derived from the
+Requester ID. While a given PCI device can only master through one IOMMU, a
+root complex may split masters across a set of IOMMUs (e.g. with one IOMMU per
+bus).
+
+The generic 'iommus' property is insufficient to describe this relationship,
+and a mechanism is required to map from a PCI device to its IOMMU and sideband
+data.
+
+For generic IOMMU bindings, see
+Documentation/devicetree/bindings/iommu/iommu.txt.
+
+
+PCI root complex
+================
+
+Optional properties
+-------------------
+
+- iommu-map: Maps a Requester ID to an IOMMU and associated iommu-specifier
+  data.
+
+  The property is an arbitrary number of tuples of
+  (rid-base,iommu,iommu-base,length).
+
+  Any RID r in the interval [rid-base, rid-base + length) is associated with
+  the listed IOMMU, with the iommu-specifier (r - rid-base + iommu-base).
+
+- iommu-map-mask: A mask to be applied to each Requester ID prior to being
+  mapped to an iommu-specifier per the iommu-map property.
+
+
+Example (1)
+===========
+
+/ {
+	#address-cells = <1>;
+	#size-cells = <1>;
+
+	iommu: iommu@a {
+		reg = <0xa 0x1>;
+		compatible = "vendor,some-iommu";
+		#iommu-cells = <1>;
+	};
+
+	pci: pci@f {
+		reg = <0xf 0x1>;
+		compatible = "vendor,pcie-root-complex";
+		device_type = "pci";
+
+		/*
+		 * The sideband data provided to the IOMMU is the RID,
+		 * identity-mapped.
+		 */
+		iommu-map = <0x0 &iommu 0x0 0x10000>;
+	};
+};
+
+
+Example (2)
+===========
+
+/ {
+	#address-cells = <1>;
+	#size-cells = <1>;
+
+	iommu: iommu@a {
+		reg = <0xa 0x1>;
+		compatible = "vendor,some-iommu";
+		#iommu-cells = <1>;
+	};
+
+	pci: pci@f {
+		reg = <0xf 0x1>;
+		compatible = "vendor,pcie-root-complex";
+		device_type = "pci";
+
+		/*
+		 * The sideband data provided to the IOMMU is the RID with the
+		 * function bits masked out.
+		 */
+		iommu-map = <0x0 &iommu 0x0 0x10000>;
+		iommu-map-mask = <0xfff8>;
+	};
+};
+
+
+Example (3)
+===========
+
+/ {
+	#address-cells = <1>;
+	#size-cells = <1>;
+
+	iommu: iommu@a {
+		reg = <0xa 0x1>;
+		compatible = "vendor,some-iommu";
+		#iommu-cells = <1>;
+	};
+
+	pci: pci@f {
+		reg = <0xf 0x1>;
+		compatible = "vendor,pcie-root-complex";
+		device_type = "pci";
+
+		/*
+		 * The sideband data provided to the IOMMU is the RID,
+		 * but the high bits of the bus number are flipped.
+		 */
+		iommu-map = <0x0000 &iommu 0x8000 0x8000>,
+			    <0x8000 &iommu 0x0000 0x8000>;
+	};
+};
+
+
+Example (4)
+===========
+
+/ {
+	#address-cells = <1>;
+	#size-cells = <1>;
+
+	iommu_a: iommu@a {
+		reg = <0xa 0x1>;
+		compatible = "vendor,some-iommu";
+		#iommu-cells = <1>;
+	};
+
+	iommu_b: iommu@b {
+		reg = <0xb 0x1>;
+		compatible = "vendor,some-iommu";
+		#iommu-cells = <1>;
+	};
+
+	iommu_c: iommu@c {
+		reg = <0xc 0x1>;
+		compatible = "vendor,some-iommu";
+		#iommu-cells = <1>;
+	};
+
+	pci: pci@f {
+		reg = <0xf 0x1>;
+		compatible = "vendor,pcie-root-complex";
+		device_type = "pci";
+
+		/*
+		 * Devices with bus number 0-127 are mastered via IOMMU
+		 * a, with sideband data being RID[14:0].
+		 * Devices with bus number 128-255 are mastered via
+		 * IOMMU b, with sideband data being RID[14:0].
+		 * No devices master via IOMMU c.
+		 */
+		iommu-map = <0x0000 &iommu_a 0x0000 0x8000>,
+			    <0x8000 &iommu_b 0x0000 0x8000>;
+	};
+};
-- 
2.8.1.dirty

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 02/19] of/irq: Break out msi-map lookup (again)
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 01/19] Docs: dt: add PCI IOMMU map bindings Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
  2016-08-23 19:05   ` [PATCH v5 03/19] iommu/of: Handle iommu-map property for PCI Robin Murphy
                     ` (18 subsequent siblings)
  20 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

The PCI msi-map code is already doing double-duty translating IDs and
retrieving MSI parents, which unsurprisingly is the same functionality
we need for the identically-formatted PCI iommu-map property. Drag the
core parsing routine up yet another layer into the general OF-PCI code,
and further generalise it for either kind of lookup in either flavour
of map property.

Acked-by: Rob Herring <robh+dt-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Acked-by: Marc Zyngier <marc.zyngier-5wv7dgnIgG8@public.gmane.org>
Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---

v5: +Rob's ack

 drivers/of/irq.c       |  78 ++-----------------------------------
 drivers/of/of_pci.c    | 102 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/of_pci.h |  10 +++++
 3 files changed, 116 insertions(+), 74 deletions(-)

diff --git a/drivers/of/irq.c b/drivers/of/irq.c
index a2e68f740eda..393fea85eb4e 100644
--- a/drivers/of/irq.c
+++ b/drivers/of/irq.c
@@ -26,6 +26,7 @@
 #include <linux/module.h>
 #include <linux/of.h>
 #include <linux/of_irq.h>
+#include <linux/of_pci.h>
 #include <linux/string.h>
 #include <linux/slab.h>
 
@@ -592,87 +593,16 @@ static u32 __of_msi_map_rid(struct device *dev, struct device_node **np,
 			    u32 rid_in)
 {
 	struct device *parent_dev;
-	struct device_node *msi_controller_node;
-	struct device_node *msi_np = *np;
-	u32 map_mask, masked_rid, rid_base, msi_base, rid_len, phandle;
-	int msi_map_len;
-	bool matched;
 	u32 rid_out = rid_in;
-	const __be32 *msi_map = NULL;
 
 	/*
 	 * Walk up the device parent links looking for one with a
 	 * "msi-map" property.
 	 */
-	for (parent_dev = dev; parent_dev; parent_dev = parent_dev->parent) {
-		if (!parent_dev->of_node)
-			continue;
-
-		msi_map = of_get_property(parent_dev->of_node,
-					  "msi-map", &msi_map_len);
-		if (!msi_map)
-			continue;
-
-		if (msi_map_len % (4 * sizeof(__be32))) {
-			dev_err(parent_dev, "Error: Bad msi-map length: %d\n",
-				msi_map_len);
-			return rid_out;
-		}
-		/* We have a good parent_dev and msi_map, let's use them. */
-		break;
-	}
-	if (!msi_map)
-		return rid_out;
-
-	/* The default is to select all bits. */
-	map_mask = 0xffffffff;
-
-	/*
-	 * Can be overridden by "msi-map-mask" property.  If
-	 * of_property_read_u32() fails, the default is used.
-	 */
-	of_property_read_u32(parent_dev->of_node, "msi-map-mask", &map_mask);
-
-	masked_rid = map_mask & rid_in;
-	matched = false;
-	while (!matched && msi_map_len >= 4 * sizeof(__be32)) {
-		rid_base = be32_to_cpup(msi_map + 0);
-		phandle = be32_to_cpup(msi_map + 1);
-		msi_base = be32_to_cpup(msi_map + 2);
-		rid_len = be32_to_cpup(msi_map + 3);
-
-		if (rid_base & ~map_mask) {
-			dev_err(parent_dev,
-				"Invalid msi-map translation - msi-map-mask (0x%x) ignores rid-base (0x%x)\n",
-				map_mask, rid_base);
-			return rid_out;
-		}
-
-		msi_controller_node = of_find_node_by_phandle(phandle);
-
-		matched = (masked_rid >= rid_base &&
-			   masked_rid < rid_base + rid_len);
-		if (msi_np)
-			matched &= msi_np == msi_controller_node;
-
-		if (matched && !msi_np) {
-			*np = msi_np = msi_controller_node;
+	for (parent_dev = dev; parent_dev; parent_dev = parent_dev->parent)
+		if (!of_pci_map_rid(parent_dev->of_node, rid_in, "msi-map",
+				    "msi-map-mask", np, &rid_out))
 			break;
-		}
-
-		of_node_put(msi_controller_node);
-		msi_map_len -= 4 * sizeof(__be32);
-		msi_map += 4;
-	}
-	if (!matched)
-		return rid_out;
-
-	rid_out = masked_rid - rid_base + msi_base;
-	dev_dbg(dev,
-		"msi-map at: %s, using mask %08x, rid-base: %08x, msi-base: %08x, length: %08x, rid: %08x -> %08x\n",
-		dev_name(parent_dev), map_mask, rid_base, msi_base,
-		rid_len, rid_in, rid_out);
-
 	return rid_out;
 }
 
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 589b30c68e14..b58be12ab277 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -308,3 +308,105 @@ struct msi_controller *of_pci_find_msi_chip_by_node(struct device_node *of_node)
 EXPORT_SYMBOL_GPL(of_pci_find_msi_chip_by_node);
 
 #endif /* CONFIG_PCI_MSI */
+
+/**
+ * of_pci_map_rid - Translate a requester ID through a downstream mapping.
+ * @np: root complex device node.
+ * @rid: PCI requester ID to map.
+ * @map_name: property name of the map to use.
+ * @map_mask_name: optional property name of the mask to use.
+ * @target: optional pointer to a target device node.
+ * @id_out: optional pointer to receive the translated ID.
+ *
+ * Given a PCI requester ID, look up the appropriate implementation-defined
+ * platform ID and/or the target device which receives transactions on that
+ * ID, as per the "iommu-map" and "msi-map" bindings. Either of @target or
+ * @id_out may be NULL if only the other is required. If @target points to
+ * a non-NULL device node pointer, only entries targeting that node will be
+ * matched; if it points to a NULL value, it will receive the device node of
+ * the first matching target phandle, with a reference held.
+ *
+ * Return: 0 on success or a standard error code on failure.
+ */
+int of_pci_map_rid(struct device_node *np, u32 rid,
+		   const char *map_name, const char *map_mask_name,
+		   struct device_node **target, u32 *id_out)
+{
+	u32 map_mask, masked_rid;
+	int map_len;
+	const __be32 *map = NULL;
+
+	if (!np || !map_name || (!target && !id_out))
+		return -EINVAL;
+
+	map = of_get_property(np, map_name, &map_len);
+	if (!map) {
+		if (target)
+			return -ENODEV;
+		/* Otherwise, no map implies no translation */
+		*id_out = rid;
+		return 0;
+	}
+
+	if (!map_len || map_len % (4 * sizeof(*map))) {
+		pr_err("%s: Error: Bad %s length: %d\n", np->full_name,
+			map_name, map_len);
+		return -EINVAL;
+	}
+
+	/* The default is to select all bits. */
+	map_mask = 0xffffffff;
+
+	/*
+	 * Can be overridden by "{iommu,msi}-map-mask" property.
+	 * If of_property_read_u32() fails, the default is used.
+	 */
+	if (map_mask_name)
+		of_property_read_u32(np, map_mask_name, &map_mask);
+
+	masked_rid = map_mask & rid;
+	for ( ; map_len > 0; map_len -= 4 * sizeof(*map), map += 4) {
+		struct device_node *phandle_node;
+		u32 rid_base = be32_to_cpup(map + 0);
+		u32 phandle = be32_to_cpup(map + 1);
+		u32 out_base = be32_to_cpup(map + 2);
+		u32 rid_len = be32_to_cpup(map + 3);
+
+		if (rid_base & ~map_mask) {
+			pr_err("%s: Invalid %s translation - %s-mask (0x%x) ignores rid-base (0x%x)\n",
+				np->full_name, map_name, map_name,
+				map_mask, rid_base);
+			return -EFAULT;
+		}
+
+		if (masked_rid < rid_base || masked_rid >= rid_base + rid_len)
+			continue;
+
+		phandle_node = of_find_node_by_phandle(phandle);
+		if (!phandle_node)
+			return -ENODEV;
+
+		if (target) {
+			if (*target)
+				of_node_put(phandle_node);
+			else
+				*target = phandle_node;
+
+			if (*target != phandle_node)
+				continue;
+		}
+
+		if (id_out)
+			*id_out = masked_rid - rid_base + out_base;
+
+		pr_debug("%s: %s, using mask %08x, rid-base: %08x, out-base: %08x, length: %08x, rid: %08x -> %08x\n",
+			np->full_name, map_name, map_mask, rid_base, out_base,
+			rid_len, rid, *id_out);
+		return 0;
+	}
+
+	pr_err("%s: Invalid %s translation - no match for rid 0x%x on %s\n",
+		np->full_name, map_name, rid,
+		target && *target ? (*target)->full_name : "any target");
+	return -EFAULT;
+}
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index b969e9443962..7fd5cfce9140 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -17,6 +17,9 @@ int of_irq_parse_and_map_pci(const struct pci_dev *dev, u8 slot, u8 pin);
 int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
 int of_get_pci_domain_nr(struct device_node *node);
 void of_pci_check_probe_only(void);
+int of_pci_map_rid(struct device_node *np, u32 rid,
+		   const char *map_name, const char *map_mask_name,
+		   struct device_node **target, u32 *id_out);
 #else
 static inline int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *out_irq)
 {
@@ -52,6 +55,13 @@ of_get_pci_domain_nr(struct device_node *node)
 	return -1;
 }
 
+static inline int of_pci_map_rid(struct device_node *np, u32 rid,
+			const char *map_name, const char *map_mask_name,
+			struct device_node **target, u32 *id_out)
+{
+	return -EINVAL;
+}
+
 static inline void of_pci_check_probe_only(void) { }
 #endif
 
-- 
2.8.1.dirty

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 03/19] iommu/of: Handle iommu-map property for PCI
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 01/19] Docs: dt: add PCI IOMMU map bindings Robin Murphy
  2016-08-23 19:05   ` [PATCH v5 02/19] of/irq: Break out msi-map lookup (again) Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
       [not found]     ` <93909648835867008b21cb688a1d7db238d3641a.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 04/19] iommu/of: Introduce iommu_fwspec Robin Murphy
                     ` (17 subsequent siblings)
  20 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

Now that we have a way to pick up the RID translation and target IOMMU,
hook up of_iommu_configure() to bring PCI devices into the of_xlate
mechanism and allow them IOMMU-backed DMA ops without the need for
driver-specific handling.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/of_iommu.c | 43 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 36 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 57f23eaaa2f9..1a65cc806898 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -22,6 +22,7 @@
 #include <linux/limits.h>
 #include <linux/of.h>
 #include <linux/of_iommu.h>
+#include <linux/of_pci.h>
 #include <linux/slab.h>
 
 static const struct of_device_id __iommu_of_table_sentinel
@@ -134,20 +135,48 @@ const struct iommu_ops *of_iommu_get_ops(struct device_node *np)
 	return ops;
 }
 
+static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
+{
+	struct of_phandle_args *iommu_spec = data;
+
+	iommu_spec->args[0] = alias;
+	return iommu_spec->np == pdev->bus->dev.of_node;
+}
+
 const struct iommu_ops *of_iommu_configure(struct device *dev,
 					   struct device_node *master_np)
 {
 	struct of_phandle_args iommu_spec;
-	struct device_node *np;
+	struct device_node *np = NULL;
 	const struct iommu_ops *ops = NULL;
 	int idx = 0;
 
-	/*
-	 * We can't do much for PCI devices without knowing how
-	 * device IDs are wired up from the PCI bus to the IOMMU.
-	 */
-	if (dev_is_pci(dev))
-		return NULL;
+	if (dev_is_pci(dev)) {
+		/*
+		 * Start by tracing the RID alias down the PCI topology as
+		 * far as the host bridge whose OF node we have...
+		 */
+		iommu_spec.np = master_np;
+		pci_for_each_dma_alias(to_pci_dev(dev), __get_pci_rid,
+				       &iommu_spec);
+		/*
+		 * ...then find out what that becomes once it escapes the PCI
+		 * bus into the system beyond, and which IOMMU it ends up at.
+		 */
+		if (of_pci_map_rid(master_np, iommu_spec.args[0], "iommu-map",
+				    "iommu-map-mask", &np, iommu_spec.args))
+			return NULL;
+
+		/* We're not attempting to handle multi-alias devices yet */
+		iommu_spec.np = np;
+		iommu_spec.args_count = 1;
+		ops = of_iommu_get_ops(np);
+		if (!ops || !ops->of_xlate || ops->of_xlate(dev, &iommu_spec))
+			ops = NULL;
+
+		of_node_put(np);
+		return ops;
+	}
 
 	/*
 	 * We don't currently walk up the tree looking for a parent IOMMU.
-- 
2.8.1.dirty

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 04/19] iommu/of: Introduce iommu_fwspec
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (2 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 03/19] iommu/of: Handle iommu-map property for PCI Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
       [not found]     ` <3e8eaf4fd65833fecc62828214aee81f6ca6c190.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 05/19] iommu/arm-smmu: Implement of_xlate() for SMMUv3 Robin Murphy
                     ` (16 subsequent siblings)
  20 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

Introduce a common structure to hold the per-device firmware data that
non-architectural IOMMU drivers generally need to keep track of.
Initially this is DT-specific to complement the existing of_iommu
support code, but will generalise further once other firmware methods
(e.g. ACPI IORT) come along.

Ultimately the aim is to promote the fwspec to a first-class member of
struct device, and handle the init/free automatically in the firmware
code. That way we can have API calls look for dev->fwspec->iommu_ops
before falling back to dev->bus->iommu_ops, and thus gracefully handle
those troublesome multi-IOMMU systems which we currently cannot. To
start with, though, make use of the existing archdata field and delegate
the init/free to drivers to allow an incremental conversion rather than
the impractical pain of trying to attempt everything in one go.

Suggested-by: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---

v5: Fix shocking num_ids oversight.

 drivers/iommu/of_iommu.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/of_iommu.h | 15 ++++++++++++++
 2 files changed, 67 insertions(+)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 1a65cc806898..bec51eb47b0d 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -219,3 +219,55 @@ static int __init of_iommu_init(void)
 	return 0;
 }
 postcore_initcall_sync(of_iommu_init);
+
+int iommu_fwspec_init(struct device *dev, struct device_node *iommu_np)
+{
+	struct iommu_fwspec *fwspec = dev->archdata.iommu;
+
+	if (fwspec)
+		return 0;
+
+	fwspec = kzalloc(sizeof(*fwspec), GFP_KERNEL);
+	if (!fwspec)
+		return -ENOMEM;
+
+	fwspec->iommu_np = of_node_get(iommu_np);
+	fwspec->iommu_ops = of_iommu_get_ops(iommu_np);
+	dev->archdata.iommu = fwspec;
+	return 0;
+}
+
+void iommu_fwspec_free(struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev->archdata.iommu;
+
+	if (fwspec) {
+		of_node_put(fwspec->iommu_np);
+		kfree(fwspec);
+	}
+}
+
+int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids)
+{
+	struct iommu_fwspec *fwspec = dev->archdata.iommu;
+	size_t size;
+
+	if (!fwspec)
+		return -EINVAL;
+
+	size = offsetof(struct iommu_fwspec, ids[fwspec->num_ids + num_ids]);
+	fwspec = krealloc(dev->archdata.iommu, size, GFP_KERNEL);
+	if (!fwspec)
+		return -ENOMEM;
+
+	while (num_ids--)
+		fwspec->ids[fwspec->num_ids++] = *ids++;
+
+	dev->archdata.iommu = fwspec;
+	return 0;
+}
+
+inline struct iommu_fwspec *dev_iommu_fwspec(struct device *dev)
+{
+	return dev->archdata.iommu;
+}
diff --git a/include/linux/of_iommu.h b/include/linux/of_iommu.h
index e80b9c762a03..accdc0525f28 100644
--- a/include/linux/of_iommu.h
+++ b/include/linux/of_iommu.h
@@ -14,6 +14,14 @@ extern int of_get_dma_window(struct device_node *dn, const char *prefix,
 extern const struct iommu_ops *of_iommu_configure(struct device *dev,
 					struct device_node *master_np);
 
+struct iommu_fwspec {
+	const struct iommu_ops	*iommu_ops;
+	struct device_node	*iommu_np;
+	void			*iommu_priv;
+	unsigned int		num_ids;
+	u32			ids[];
+};
+
 #else
 
 static inline int of_get_dma_window(struct device_node *dn, const char *prefix,
@@ -29,8 +37,15 @@ static inline const struct iommu_ops *of_iommu_configure(struct device *dev,
 	return NULL;
 }
 
+struct iommu_fwspec;
+
 #endif	/* CONFIG_OF_IOMMU */
 
+int iommu_fwspec_init(struct device *dev, struct device_node *iommu_np);
+void iommu_fwspec_free(struct device *dev);
+int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids);
+struct iommu_fwspec *dev_iommu_fwspec(struct device *dev);
+
 void of_iommu_set_ops(struct device_node *np, const struct iommu_ops *ops);
 const struct iommu_ops *of_iommu_get_ops(struct device_node *np);
 
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 05/19] iommu/arm-smmu: Implement of_xlate() for SMMUv3
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (3 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 04/19] iommu/of: Introduce iommu_fwspec Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
       [not found]     ` <6088007f60a24b36a3bf965b62521f99cd908019.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 06/19] iommu/arm-smmu: Support non-PCI devices with SMMUv3 Robin Murphy
                     ` (15 subsequent siblings)
  20 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

Now that we can properly describe the mapping between PCI RIDs and
stream IDs via "iommu-map", and have it fed it to the driver
automatically via of_xlate(), rework the SMMUv3 driver to benefit from
that, and get rid of the current misuse of the "iommus" binding.

Since having of_xlate wired up means that masters will now be given the
appropriate DMA ops, we also need to make sure that default domains work
properly. This necessitates dispensing with the "whole group at a time"
notion for attaching to a domain, as devices which share a group get
attached to the group's default domain one by one as they are initially
probed.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---

v5: Simplify init code, use firmware-agnostic (and more efficient)
    driver-based instance lookup

 drivers/iommu/arm-smmu-v3.c | 312 ++++++++++++++++++++------------------------
 1 file changed, 139 insertions(+), 173 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 641e88761319..094babff64a6 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -30,6 +30,7 @@
 #include <linux/msi.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
+#include <linux/of_iommu.h>
 #include <linux/of_platform.h>
 #include <linux/pci.h>
 #include <linux/platform_device.h>
@@ -606,12 +607,9 @@ struct arm_smmu_device {
 	struct arm_smmu_strtab_cfg	strtab_cfg;
 };
 
-/* SMMU private data for an IOMMU group */
-struct arm_smmu_group {
+/* SMMU private data for each master */
+struct arm_smmu_master_data {
 	struct arm_smmu_device		*smmu;
-	struct arm_smmu_domain		*domain;
-	int				num_sids;
-	u32				*sids;
 	struct arm_smmu_strtab_ent	ste;
 };
 
@@ -1578,20 +1576,6 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
 	return ret;
 }
 
-static struct arm_smmu_group *arm_smmu_group_get(struct device *dev)
-{
-	struct iommu_group *group;
-	struct arm_smmu_group *smmu_group;
-
-	group = iommu_group_get(dev);
-	if (!group)
-		return NULL;
-
-	smmu_group = iommu_group_get_iommudata(group);
-	iommu_group_put(group);
-	return smmu_group;
-}
-
 static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
 {
 	__le64 *step;
@@ -1614,27 +1598,17 @@ static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
 	return step;
 }
 
-static int arm_smmu_install_ste_for_group(struct arm_smmu_group *smmu_group)
+static int arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec)
 {
 	int i;
-	struct arm_smmu_domain *smmu_domain = smmu_group->domain;
-	struct arm_smmu_strtab_ent *ste = &smmu_group->ste;
-	struct arm_smmu_device *smmu = smmu_group->smmu;
+	struct arm_smmu_master_data *master = fwspec->iommu_priv;
+	struct arm_smmu_device *smmu = master->smmu;
 
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
-		ste->s1_cfg = &smmu_domain->s1_cfg;
-		ste->s2_cfg = NULL;
-		arm_smmu_write_ctx_desc(smmu, ste->s1_cfg);
-	} else {
-		ste->s1_cfg = NULL;
-		ste->s2_cfg = &smmu_domain->s2_cfg;
-	}
-
-	for (i = 0; i < smmu_group->num_sids; ++i) {
-		u32 sid = smmu_group->sids[i];
+	for (i = 0; i < fwspec->num_ids; ++i) {
+		u32 sid = fwspec->ids[i];
 		__le64 *step = arm_smmu_get_step_for_sid(smmu, sid);
 
-		arm_smmu_write_strtab_ent(smmu, sid, step, ste);
+		arm_smmu_write_strtab_ent(smmu, sid, step, &master->ste);
 	}
 
 	return 0;
@@ -1642,13 +1616,12 @@ static int arm_smmu_install_ste_for_group(struct arm_smmu_group *smmu_group)
 
 static void arm_smmu_detach_dev(struct device *dev)
 {
-	struct arm_smmu_group *smmu_group = arm_smmu_group_get(dev);
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec(dev);
+	struct arm_smmu_master_data *master = fwspec->iommu_priv;
 
-	smmu_group->ste.bypass = true;
-	if (arm_smmu_install_ste_for_group(smmu_group) < 0)
+	master->ste.bypass = true;
+	if (arm_smmu_install_ste_for_dev(fwspec) < 0)
 		dev_warn(dev, "failed to install bypass STE\n");
-
-	smmu_group->domain = NULL;
 }
 
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
@@ -1656,16 +1629,21 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	int ret = 0;
 	struct arm_smmu_device *smmu;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-	struct arm_smmu_group *smmu_group = arm_smmu_group_get(dev);
+	struct arm_smmu_master_data *master;
+	struct arm_smmu_strtab_ent *ste;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec(dev);
 
-	if (!smmu_group)
+	if (!fwspec)
 		return -ENOENT;
 
+	master = fwspec->iommu_priv;
+	smmu = master->smmu;
+	ste = &master->ste;
+
 	/* Already attached to a different domain? */
-	if (smmu_group->domain && smmu_group->domain != smmu_domain)
+	if (!ste->bypass)
 		arm_smmu_detach_dev(dev);
 
-	smmu = smmu_group->smmu;
 	mutex_lock(&smmu_domain->init_mutex);
 
 	if (!smmu_domain->smmu) {
@@ -1684,21 +1662,21 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 		goto out_unlock;
 	}
 
-	/* Group already attached to this domain? */
-	if (smmu_group->domain)
-		goto out_unlock;
+	ste->bypass = false;
+	ste->valid = true;
 
-	smmu_group->domain	= smmu_domain;
+	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) {
+		ste->s1_cfg = &smmu_domain->s1_cfg;
+		ste->s2_cfg = NULL;
+		arm_smmu_write_ctx_desc(smmu, ste->s1_cfg);
+	} else {
+		ste->s1_cfg = NULL;
+		ste->s2_cfg = &smmu_domain->s2_cfg;
+	}
 
-	/*
-	 * FIXME: This should always be "false" once we have IOMMU-backed
-	 * DMA ops for all devices behind the SMMU.
-	 */
-	smmu_group->ste.bypass	= domain->type == IOMMU_DOMAIN_DMA;
-
-	ret = arm_smmu_install_ste_for_group(smmu_group);
+	ret = arm_smmu_install_ste_for_dev(fwspec);
 	if (ret < 0)
-		smmu_group->domain = NULL;
+		ste->valid = false;
 
 out_unlock:
 	mutex_unlock(&smmu_domain->init_mutex);
@@ -1757,40 +1735,19 @@ arm_smmu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
 	return ret;
 }
 
-static int __arm_smmu_get_pci_sid(struct pci_dev *pdev, u16 alias, void *sidp)
+static struct platform_driver arm_smmu_driver;
+
+static int arm_smmu_match_node(struct device *dev, void *data)
 {
-	*(u32 *)sidp = alias;
-	return 0; /* Continue walking */
+	return dev->of_node == data;
 }
 
-static void __arm_smmu_release_pci_iommudata(void *data)
+static struct arm_smmu_device *arm_smmu_get_by_node(struct device_node *np)
 {
-	kfree(data);
-}
-
-static struct arm_smmu_device *arm_smmu_get_for_pci_dev(struct pci_dev *pdev)
-{
-	struct device_node *of_node;
-	struct platform_device *smmu_pdev;
-	struct arm_smmu_device *smmu = NULL;
-	struct pci_bus *bus = pdev->bus;
-
-	/* Walk up to the root bus */
-	while (!pci_is_root_bus(bus))
-		bus = bus->parent;
-
-	/* Follow the "iommus" phandle from the host controller */
-	of_node = of_parse_phandle(bus->bridge->parent->of_node, "iommus", 0);
-	if (!of_node)
-		return NULL;
-
-	/* See if we can find an SMMU corresponding to the phandle */
-	smmu_pdev = of_find_device_by_node(of_node);
-	if (smmu_pdev)
-		smmu = platform_get_drvdata(smmu_pdev);
-
-	of_node_put(of_node);
-	return smmu;
+	struct device *dev = driver_find_device(&arm_smmu_driver.driver, NULL,
+						np, arm_smmu_match_node);
+	put_device(dev);
+	return dev ? dev_get_drvdata(dev) : NULL;
 }
 
 static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid)
@@ -1803,94 +1760,74 @@ static bool arm_smmu_sid_in_range(struct arm_smmu_device *smmu, u32 sid)
 	return sid < limit;
 }
 
+static struct iommu_ops arm_smmu_ops;
+
 static int arm_smmu_add_device(struct device *dev)
 {
 	int i, ret;
-	u32 sid, *sids;
-	struct pci_dev *pdev;
-	struct iommu_group *group;
-	struct arm_smmu_group *smmu_group;
 	struct arm_smmu_device *smmu;
+	struct arm_smmu_master_data *master;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec(dev);
+	struct iommu_group *group;
 
-	/* We only support PCI, for now */
-	if (!dev_is_pci(dev))
+	if (!fwspec || fwspec->iommu_ops != &arm_smmu_ops)
 		return -ENODEV;
-
-	pdev = to_pci_dev(dev);
-	group = iommu_group_get_for_dev(dev);
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-
-	smmu_group = iommu_group_get_iommudata(group);
-	if (!smmu_group) {
-		smmu = arm_smmu_get_for_pci_dev(pdev);
-		if (!smmu) {
-			ret = -ENOENT;
-			goto out_remove_dev;
-		}
-
-		smmu_group = kzalloc(sizeof(*smmu_group), GFP_KERNEL);
-		if (!smmu_group) {
-			ret = -ENOMEM;
-			goto out_remove_dev;
-		}
-
-		smmu_group->ste.valid	= true;
-		smmu_group->smmu	= smmu;
-		iommu_group_set_iommudata(group, smmu_group,
-					  __arm_smmu_release_pci_iommudata);
+	/*
+	 * We _can_ actually withstand dodgy bus code re-calling add_device()
+	 * without an intervening remove_device()/of_xlate() sequence, but
+	 * we're not going to do so quietly...
+	 */
+	if (WARN_ON_ONCE(fwspec->iommu_priv)) {
+		master = fwspec->iommu_priv;
+		smmu = master->smmu;
 	} else {
-		smmu = smmu_group->smmu;
+		smmu = arm_smmu_get_by_node(fwspec->iommu_np);
+		if (!smmu)
+			return -ENODEV;
+		master = kzalloc(sizeof(*master), GFP_KERNEL);
+		if (!master)
+			return -ENOMEM;
+
+		master->smmu = smmu;
+		fwspec->iommu_priv = master;
 	}
 
-	/* Assume SID == RID until firmware tells us otherwise */
-	pci_for_each_dma_alias(pdev, __arm_smmu_get_pci_sid, &sid);
-	for (i = 0; i < smmu_group->num_sids; ++i) {
-		/* If we already know about this SID, then we're done */
-		if (smmu_group->sids[i] == sid)
-			goto out_put_group;
+	/* Check the SIDs are in range of the SMMU and our stream table */
+	for (i = 0; i < fwspec->num_ids; i++) {
+		u32 sid = fwspec->ids[i];
+
+		if (!arm_smmu_sid_in_range(smmu, sid))
+			return -ERANGE;
+
+		/* Ensure l2 strtab is initialised */
+		if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
+			ret = arm_smmu_init_l2_strtab(smmu, sid);
+			if (ret)
+				return ret;
+		}
 	}
 
-	/* Check the SID is in range of the SMMU and our stream table */
-	if (!arm_smmu_sid_in_range(smmu, sid)) {
-		ret = -ERANGE;
-		goto out_remove_dev;
-	}
+	group = iommu_group_get_for_dev(dev);
+	if (!IS_ERR(group))
+		iommu_group_put(group);
 
-	/* Ensure l2 strtab is initialised */
-	if (smmu->features & ARM_SMMU_FEAT_2_LVL_STRTAB) {
-		ret = arm_smmu_init_l2_strtab(smmu, sid);
-		if (ret)
-			goto out_remove_dev;
-	}
-
-	/* Resize the SID array for the group */
-	smmu_group->num_sids++;
-	sids = krealloc(smmu_group->sids, smmu_group->num_sids * sizeof(*sids),
-			GFP_KERNEL);
-	if (!sids) {
-		smmu_group->num_sids--;
-		ret = -ENOMEM;
-		goto out_remove_dev;
-	}
-
-	/* Add the new SID */
-	sids[smmu_group->num_sids - 1] = sid;
-	smmu_group->sids = sids;
-
-out_put_group:
-	iommu_group_put(group);
-	return 0;
-
-out_remove_dev:
-	iommu_group_remove_device(dev);
-	iommu_group_put(group);
-	return ret;
+	return PTR_ERR_OR_ZERO(group);
 }
 
 static void arm_smmu_remove_device(struct device *dev)
 {
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec(dev);
+	struct arm_smmu_master_data *master;
+
+	if (!fwspec || fwspec->iommu_ops != &arm_smmu_ops)
+		return;
+
+	master = fwspec->iommu_priv;
+	if (master && master->ste.valid)
+		arm_smmu_detach_dev(dev);
 	iommu_group_remove_device(dev);
+	kfree(master);
+	iommu_fwspec_free(dev);
 }
 
 static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
@@ -1937,6 +1874,21 @@ out_unlock:
 	return ret;
 }
 
+static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
+{
+	int ret;
+
+	/* We only support PCI, for now */
+	if (!dev_is_pci(dev))
+		return -ENODEV;
+
+	ret = iommu_fwspec_init(dev, args->np);
+	if (!ret)
+		ret = iommu_fwspec_add_ids(dev, &args->args[0], 1);
+
+	return ret;
+}
+
 static struct iommu_ops arm_smmu_ops = {
 	.capable		= arm_smmu_capable,
 	.domain_alloc		= arm_smmu_domain_alloc,
@@ -1951,6 +1903,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.device_group		= pci_device_group,
 	.domain_get_attr	= arm_smmu_domain_get_attr,
 	.domain_set_attr	= arm_smmu_domain_set_attr,
+	.of_xlate		= arm_smmu_of_xlate,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 };
 
@@ -2649,7 +2602,14 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	platform_set_drvdata(pdev, smmu);
 
 	/* Reset the device */
-	return arm_smmu_device_reset(smmu);
+	ret = arm_smmu_device_reset(smmu);
+	if (ret)
+		return ret;
+
+	/* And we're up. Go go go! */
+	of_iommu_set_ops(dev->of_node, &arm_smmu_ops);
+	pci_request_acs();
+	return bus_set_iommu(&pci_bus_type, &arm_smmu_ops);
 }
 
 static int arm_smmu_device_remove(struct platform_device *pdev)
@@ -2677,22 +2637,14 @@ static struct platform_driver arm_smmu_driver = {
 
 static int __init arm_smmu_init(void)
 {
-	struct device_node *np;
-	int ret;
+	static bool registered;
+	int ret = 0;
 
-	np = of_find_matching_node(NULL, arm_smmu_of_match);
-	if (!np)
-		return 0;
-
-	of_node_put(np);
-
-	ret = platform_driver_register(&arm_smmu_driver);
-	if (ret)
-		return ret;
-
-	pci_request_acs();
-
-	return bus_set_iommu(&pci_bus_type, &arm_smmu_ops);
+	if (!registered) {
+		ret = platform_driver_register(&arm_smmu_driver);
+		registered = !ret;
+	}
+	return ret;
 }
 
 static void __exit arm_smmu_exit(void)
@@ -2703,6 +2655,20 @@ static void __exit arm_smmu_exit(void)
 subsys_initcall(arm_smmu_init);
 module_exit(arm_smmu_exit);
 
+static int __init arm_smmu_of_init(struct device_node *np)
+{
+	int ret = arm_smmu_init();
+
+	if (ret)
+		return ret;
+
+	if (!of_platform_device_create(np, NULL, platform_bus_type.dev_root))
+		return -ENODEV;
+
+	return 0;
+}
+IOMMU_OF_DECLARE(arm_smmuv3, "arm,smmu-v3", arm_smmu_of_init);
+
 MODULE_DESCRIPTION("IOMMU API for ARM architected SMMUv3 implementations");
 MODULE_AUTHOR("Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>");
 MODULE_LICENSE("GPL v2");
-- 
2.8.1.dirty

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 06/19] iommu/arm-smmu: Support non-PCI devices with SMMUv3
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (4 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 05/19] iommu/arm-smmu: Implement of_xlate() for SMMUv3 Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
       [not found]     ` <207d0ae38c5b01b7cf7e48231a4d01bac453b57c.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 07/19] iommu/arm-smmu: Set PRIVCFG in stage 1 STEs Robin Murphy
                     ` (14 subsequent siblings)
  20 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

With the device <-> stream ID relationship suitably abstracted and
of_xlate() hooked up, the PCI dependency now looks, and is, entirely
arbitrary. Any bus using the of_dma_configure() mechanism will work,
so extend support to the platform and AMBA buses which do just that.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/Kconfig       |  2 +-
 drivers/iommu/arm-smmu-v3.c | 40 ++++++++++++++++++++++++++++++++--------
 2 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d432ca828472..8ee54d71c7eb 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -309,7 +309,7 @@ config ARM_SMMU
 
 config ARM_SMMU_V3
 	bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
-	depends on ARM64 && PCI
+	depends on ARM64
 	select IOMMU_API
 	select IOMMU_IO_PGTABLE_LPAE
 	select GENERIC_MSI_IRQ_DOMAIN
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 094babff64a6..e0384f7afb03 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -35,6 +35,8 @@
 #include <linux/pci.h>
 #include <linux/platform_device.h>
 
+#include <linux/amba/bus.h>
+
 #include "io-pgtable.h"
 
 /* MMIO registers */
@@ -1830,6 +1832,23 @@ static void arm_smmu_remove_device(struct device *dev)
 	iommu_fwspec_free(dev);
 }
 
+static struct iommu_group *arm_smmu_device_group(struct device *dev)
+{
+	struct iommu_group *group;
+
+	/*
+	 * We don't support devices sharing stream IDs other than PCI RID
+	 * aliases, since the necessary ID-to-device lookup becomes rather
+	 * impractical given a potential sparse 32-bit stream ID space.
+	 */
+	if (dev_is_pci(dev))
+		group = pci_device_group(dev);
+	else
+		group = generic_device_group(dev);
+
+	return group;
+}
+
 static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 				    enum iommu_attr attr, void *data)
 {
@@ -1876,13 +1895,8 @@ out_unlock:
 
 static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 {
-	int ret;
+	int ret = iommu_fwspec_init(dev, args->np);
 
-	/* We only support PCI, for now */
-	if (!dev_is_pci(dev))
-		return -ENODEV;
-
-	ret = iommu_fwspec_init(dev, args->np);
 	if (!ret)
 		ret = iommu_fwspec_add_ids(dev, &args->args[0], 1);
 
@@ -1900,7 +1914,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.iova_to_phys		= arm_smmu_iova_to_phys,
 	.add_device		= arm_smmu_add_device,
 	.remove_device		= arm_smmu_remove_device,
-	.device_group		= pci_device_group,
+	.device_group		= arm_smmu_device_group,
 	.domain_get_attr	= arm_smmu_domain_get_attr,
 	.domain_set_attr	= arm_smmu_domain_set_attr,
 	.of_xlate		= arm_smmu_of_xlate,
@@ -2608,8 +2622,18 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 
 	/* And we're up. Go go go! */
 	of_iommu_set_ops(dev->of_node, &arm_smmu_ops);
+#ifdef CONFIG_PCI
 	pci_request_acs();
-	return bus_set_iommu(&pci_bus_type, &arm_smmu_ops);
+	ret = bus_set_iommu(&pci_bus_type, &arm_smmu_ops);
+	if (ret)
+		return ret;
+#endif
+#ifdef CONFIG_ARM_AMBA
+	ret = bus_set_iommu(&amba_bustype, &arm_smmu_ops);
+	if (ret)
+		return ret;
+#endif
+	return bus_set_iommu(&platform_bus_type, &arm_smmu_ops);
 }
 
 static int arm_smmu_device_remove(struct platform_device *pdev)
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 07/19] iommu/arm-smmu: Set PRIVCFG in stage 1 STEs
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (5 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 06/19] iommu/arm-smmu: Support non-PCI devices with SMMUv3 Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
       [not found]     ` <1cda9861ce3ede6c2de9c6c4f2294549808b421b.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 08/19] iommu/arm-smmu: Handle stream IDs more dynamically Robin Murphy
                     ` (13 subsequent siblings)
  20 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

Implement the SMMUv3 equivalent of d346180e70b9 ("iommu/arm-smmu: Treat
all device transactions as unprivileged"), so that once again those
pesky DMA controllers with their privileged instruction fetches don't
unexpectedly fault in stage 1 domains due to VMSAv8 rules.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/arm-smmu-v3.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index e0384f7afb03..72b996aa7460 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -263,6 +263,9 @@
 #define STRTAB_STE_1_SHCFG_INCOMING	1UL
 #define STRTAB_STE_1_SHCFG_SHIFT	44
 
+#define STRTAB_STE_1_PRIVCFG_UNPRIV	2UL
+#define STRTAB_STE_1_PRIVCFG_SHIFT	48
+
 #define STRTAB_STE_2_S2VMID_SHIFT	0
 #define STRTAB_STE_2_S2VMID_MASK	0xffffUL
 #define STRTAB_STE_2_VTCR_SHIFT		32
@@ -1073,7 +1076,9 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
 #ifdef CONFIG_PCI_ATS
 			 STRTAB_STE_1_EATS_TRANS << STRTAB_STE_1_EATS_SHIFT |
 #endif
-			 STRTAB_STE_1_STRW_NSEL1 << STRTAB_STE_1_STRW_SHIFT);
+			 STRTAB_STE_1_STRW_NSEL1 << STRTAB_STE_1_STRW_SHIFT |
+			 STRTAB_STE_1_PRIVCFG_UNPRIV <<
+			 STRTAB_STE_1_PRIVCFG_SHIFT);
 
 		if (smmu->features & ARM_SMMU_FEAT_STALLS)
 			dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD);
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 08/19] iommu/arm-smmu: Handle stream IDs more dynamically
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (6 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 07/19] iommu/arm-smmu: Set PRIVCFG in stage 1 STEs Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
       [not found]     ` <36f71a07fbc6037ca664bdcc540650f893081dd1.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 09/19] iommu/arm-smmu: Consolidate stream map entry state Robin Murphy
                     ` (12 subsequent siblings)
  20 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

Rather than assuming fixed worst-case values for stream IDs and SMR
masks, keep track of whatever implemented bits the hardware actually
reports. This also obviates the slightly questionable validation of SMR
fields in isolation - rather than aborting the whole SMMU probe for a
hardware configuration which is still architecturally valid, we can
simply refuse masters later if they try to claim an unrepresentable ID
or mask (which almost certainly implies a DT error anyway).

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/arm-smmu.c | 43 ++++++++++++++++++++++---------------------
 1 file changed, 22 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 2db74ebc3240..3357afdd6865 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -165,9 +165,7 @@
 #define ARM_SMMU_GR0_SMR(n)		(0x800 + ((n) << 2))
 #define SMR_VALID			(1 << 31)
 #define SMR_MASK_SHIFT			16
-#define SMR_MASK_MASK			0x7fff
 #define SMR_ID_SHIFT			0
-#define SMR_ID_MASK			0x7fff
 
 #define ARM_SMMU_GR0_S2CR(n)		(0xc00 + ((n) << 2))
 #define S2CR_CBNDX_SHIFT		0
@@ -346,6 +344,8 @@ struct arm_smmu_device {
 	atomic_t			irptndx;
 
 	u32				num_mapping_groups;
+	u16				streamid_mask;
+	u16				smr_mask_mask;
 	DECLARE_BITMAP(smr_map, ARM_SMMU_MAX_SMRS);
 
 	unsigned long			va_size;
@@ -1690,39 +1690,40 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 		dev_notice(smmu->dev,
 			   "\t(IDR0.CTTW overridden by dma-coherent property)\n");
 
+	/* Max. number of entries we have for stream matching/indexing */
+	size = 1 << ((id >> ID0_NUMSIDB_SHIFT) & ID0_NUMSIDB_MASK);
+	smmu->streamid_mask = size - 1;
 	if (id & ID0_SMS) {
-		u32 smr, sid, mask;
+		u32 smr;
 
 		smmu->features |= ARM_SMMU_FEAT_STREAM_MATCH;
-		smmu->num_mapping_groups = (id >> ID0_NUMSMRG_SHIFT) &
-					   ID0_NUMSMRG_MASK;
-		if (smmu->num_mapping_groups == 0) {
+		size = (id >> ID0_NUMSMRG_SHIFT) & ID0_NUMSMRG_MASK;
+		if (size == 0) {
 			dev_err(smmu->dev,
 				"stream-matching supported, but no SMRs present!\n");
 			return -ENODEV;
 		}
 
-		smr = SMR_MASK_MASK << SMR_MASK_SHIFT;
-		smr |= (SMR_ID_MASK << SMR_ID_SHIFT);
+		/*
+		 * SMR.ID bits may not be preserved if the corresponding MASK
+		 * bits are set, so check each one separately. We can reject
+		 * masters later if they try to claim IDs outside these masks.
+		 */
+		smr = smmu->streamid_mask << SMR_ID_SHIFT;
 		writel_relaxed(smr, gr0_base + ARM_SMMU_GR0_SMR(0));
 		smr = readl_relaxed(gr0_base + ARM_SMMU_GR0_SMR(0));
+		smmu->streamid_mask = smr >> SMR_ID_SHIFT;
 
-		mask = (smr >> SMR_MASK_SHIFT) & SMR_MASK_MASK;
-		sid = (smr >> SMR_ID_SHIFT) & SMR_ID_MASK;
-		if ((mask & sid) != sid) {
-			dev_err(smmu->dev,
-				"SMR mask bits (0x%x) insufficient for ID field (0x%x)\n",
-				mask, sid);
-			return -ENODEV;
-		}
+		smr = smmu->streamid_mask << SMR_MASK_SHIFT;
+		writel_relaxed(smr, gr0_base + ARM_SMMU_GR0_SMR(0));
+		smr = readl_relaxed(gr0_base + ARM_SMMU_GR0_SMR(0));
+		smmu->smr_mask_mask = smr >> SMR_MASK_SHIFT;
 
 		dev_notice(smmu->dev,
-			   "\tstream matching with %u register groups, mask 0x%x",
-			   smmu->num_mapping_groups, mask);
-	} else {
-		smmu->num_mapping_groups = (id >> ID0_NUMSIDB_SHIFT) &
-					   ID0_NUMSIDB_MASK;
+			   "\tstream matching with %lu register groups, mask 0x%x",
+			   size, smmu->smr_mask_mask);
 	}
+	smmu->num_mapping_groups = size;
 
 	if (smmu->version < ARM_SMMU_V2 || !(id & ID0_PTFS_NO_AARCH32)) {
 		smmu->features |= ARM_SMMU_FEAT_FMT_AARCH32_L;
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 09/19] iommu/arm-smmu: Consolidate stream map entry state
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (7 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 08/19] iommu/arm-smmu: Handle stream IDs more dynamically Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
       [not found]     ` <26fcf7d3138816b9546a3dcc2bbbc2f229f34c91.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 10/19] iommu/arm-smmu: Keep track of S2CR state Robin Murphy
                     ` (11 subsequent siblings)
  20 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

In order to consider SMR masking, we really want to be able to validate
ID/mask pairs against existing SMR contents to prevent stream match
conflicts, which at best would cause transactions to fault unexpectedly,
and at worst lead to silent unpredictable behaviour. With our SMMU
instance data holding only an allocator bitmap, and the SMR values
themselves scattered across master configs hanging off devices which we
may have no way of finding, there's essentially no way short of digging
everything back out of the hardware. Similarly, the thought of power
management ops to support suspend/resume faces the exact same problem.

By massaging the software state into a closer shape to the underlying
hardware, everything comes together quite nicely; the allocator and the
high-level view of the data become a single centralised state which we
can easily keep track of, and to which any updates can be validated in
full before being synchronised to the hardware itself.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/arm-smmu.c | 147 +++++++++++++++++++++++++++--------------------
 1 file changed, 86 insertions(+), 61 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 3357afdd6865..401af10683a2 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -28,6 +28,7 @@
 
 #define pr_fmt(fmt) "arm-smmu: " fmt
 
+#include <linux/atomic.h>
 #include <linux/delay.h>
 #include <linux/dma-iommu.h>
 #include <linux/dma-mapping.h>
@@ -55,9 +56,6 @@
 /* Maximum number of context banks per SMMU */
 #define ARM_SMMU_MAX_CBS		128
 
-/* Maximum number of mapping groups per SMMU */
-#define ARM_SMMU_MAX_SMRS		128
-
 /* SMMU global address space */
 #define ARM_SMMU_GR0(smmu)		((smmu)->base)
 #define ARM_SMMU_GR1(smmu)		((smmu)->base + (1 << (smmu)->pgshift))
@@ -295,16 +293,17 @@ enum arm_smmu_implementation {
 };
 
 struct arm_smmu_smr {
-	u8				idx;
 	u16				mask;
 	u16				id;
+	bool				valid;
 };
 
 struct arm_smmu_master_cfg {
 	int				num_streamids;
 	u16				streamids[MAX_MASTER_STREAMIDS];
-	struct arm_smmu_smr		*smrs;
+	s16				smendx[MAX_MASTER_STREAMIDS];
 };
+#define INVALID_SMENDX			-1
 
 struct arm_smmu_master {
 	struct device_node		*of_node;
@@ -346,7 +345,7 @@ struct arm_smmu_device {
 	u32				num_mapping_groups;
 	u16				streamid_mask;
 	u16				smr_mask_mask;
-	DECLARE_BITMAP(smr_map, ARM_SMMU_MAX_SMRS);
+	struct arm_smmu_smr		*smrs;
 
 	unsigned long			va_size;
 	unsigned long			ipa_size;
@@ -550,6 +549,7 @@ static int register_smmu_master(struct arm_smmu_device *smmu,
 			return -ERANGE;
 		}
 		master->cfg.streamids[i] = streamid;
+		master->cfg.smendx[i] = INVALID_SMENDX;
 	}
 	return insert_smmu_master(smmu, master);
 }
@@ -1055,79 +1055,91 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
 	kfree(smmu_domain);
 }
 
-static int arm_smmu_master_configure_smrs(struct arm_smmu_device *smmu,
-					  struct arm_smmu_master_cfg *cfg)
+static int arm_smmu_alloc_smr(struct arm_smmu_device *smmu)
 {
 	int i;
-	struct arm_smmu_smr *smrs;
-	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
 
-	if (!(smmu->features & ARM_SMMU_FEAT_STREAM_MATCH))
-		return 0;
+	for (i = 0; i < smmu->num_mapping_groups; i++)
+		if (!cmpxchg(&smmu->smrs[i].valid, false, true))
+			return i;
 
-	if (cfg->smrs)
-		return -EEXIST;
+	return INVALID_SMENDX;
+}
 
-	smrs = kmalloc_array(cfg->num_streamids, sizeof(*smrs), GFP_KERNEL);
-	if (!smrs) {
-		dev_err(smmu->dev, "failed to allocate %d SMRs\n",
-			cfg->num_streamids);
-		return -ENOMEM;
-	}
+static void arm_smmu_free_smr(struct arm_smmu_device *smmu, int idx)
+{
+	writel_relaxed(~SMR_VALID, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_SMR(idx));
+	WRITE_ONCE(smmu->smrs[idx].valid, false);
+}
+
+static void arm_smmu_write_smr(struct arm_smmu_device *smmu, int idx)
+{
+	struct arm_smmu_smr *smr = smmu->smrs + idx;
+	u32 reg = (smr->id & smmu->streamid_mask) << SMR_ID_SHIFT |
+		  (smr->mask & smmu->smr_mask_mask) << SMR_MASK_SHIFT;
+
+	if (smr->valid)
+		reg |= SMR_VALID;
+	writel_relaxed(reg, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_SMR(idx));
+}
+
+static int arm_smmu_master_alloc_smes(struct arm_smmu_device *smmu,
+				      struct arm_smmu_master_cfg *cfg)
+{
+	struct arm_smmu_smr *smrs = smmu->smrs;
+	int i, idx;
 
 	/* Allocate the SMRs on the SMMU */
 	for (i = 0; i < cfg->num_streamids; ++i) {
-		int idx = __arm_smmu_alloc_bitmap(smmu->smr_map, 0,
-						  smmu->num_mapping_groups);
+		if (cfg->smendx[i] >= 0)
+			return -EEXIST;
+
+		/* ...except on stream indexing hardware, of course */
+		if (!smrs) {
+			cfg->smendx[i] = cfg->streamids[i];
+			continue;
+		}
+
+		idx = arm_smmu_alloc_smr(smmu);
 		if (idx < 0) {
 			dev_err(smmu->dev, "failed to allocate free SMR\n");
 			goto err_free_smrs;
 		}
+		cfg->smendx[i] = idx;
 
-		smrs[i] = (struct arm_smmu_smr) {
-			.idx	= idx,
-			.mask	= 0, /* We don't currently share SMRs */
-			.id	= cfg->streamids[i],
-		};
+		smrs[idx].id = cfg->streamids[i];
+		smrs[idx].mask = 0; /* We don't currently share SMRs */
 	}
 
+	if (!smrs)
+		return 0;
+
 	/* It worked! Now, poke the actual hardware */
-	for (i = 0; i < cfg->num_streamids; ++i) {
-		u32 reg = SMR_VALID | smrs[i].id << SMR_ID_SHIFT |
-			  smrs[i].mask << SMR_MASK_SHIFT;
-		writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_SMR(smrs[i].idx));
-	}
+	for (i = 0; i < cfg->num_streamids; ++i)
+		arm_smmu_write_smr(smmu, cfg->smendx[i]);
 
-	cfg->smrs = smrs;
 	return 0;
 
 err_free_smrs:
-	while (--i >= 0)
-		__arm_smmu_free_bitmap(smmu->smr_map, smrs[i].idx);
-	kfree(smrs);
+	while (i--) {
+		arm_smmu_free_smr(smmu, cfg->smendx[i]);
+		cfg->smendx[i] = INVALID_SMENDX;
+	}
 	return -ENOSPC;
 }
 
-static void arm_smmu_master_free_smrs(struct arm_smmu_device *smmu,
+static void arm_smmu_master_free_smes(struct arm_smmu_device *smmu,
 				      struct arm_smmu_master_cfg *cfg)
 {
 	int i;
-	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
-	struct arm_smmu_smr *smrs = cfg->smrs;
-
-	if (!smrs)
-		return;
 
 	/* Invalidate the SMRs before freeing back to the allocator */
 	for (i = 0; i < cfg->num_streamids; ++i) {
-		u8 idx = smrs[i].idx;
+		if (smmu->smrs)
+			arm_smmu_free_smr(smmu, cfg->smendx[i]);
 
-		writel_relaxed(~SMR_VALID, gr0_base + ARM_SMMU_GR0_SMR(idx));
-		__arm_smmu_free_bitmap(smmu->smr_map, idx);
+		cfg->smendx[i] = INVALID_SMENDX;
 	}
-
-	cfg->smrs = NULL;
-	kfree(smrs);
 }
 
 static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
@@ -1147,14 +1159,14 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 		return 0;
 
 	/* Devices in an IOMMU group may already be configured */
-	ret = arm_smmu_master_configure_smrs(smmu, cfg);
+	ret = arm_smmu_master_alloc_smes(smmu, cfg);
 	if (ret)
 		return ret == -EEXIST ? 0 : ret;
 
 	for (i = 0; i < cfg->num_streamids; ++i) {
 		u32 idx, s2cr;
 
-		idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i];
+		idx = cfg->smendx[i];
 		s2cr = S2CR_TYPE_TRANS | S2CR_PRIVCFG_UNPRIV |
 		       (smmu_domain->cfg.cbndx << S2CR_CBNDX_SHIFT);
 		writel_relaxed(s2cr, gr0_base + ARM_SMMU_GR0_S2CR(idx));
@@ -1170,22 +1182,22 @@ static void arm_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain,
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
 
-	/* An IOMMU group is torn down by the first device to be removed */
-	if ((smmu->features & ARM_SMMU_FEAT_STREAM_MATCH) && !cfg->smrs)
-		return;
-
 	/*
 	 * We *must* clear the S2CR first, because freeing the SMR means
 	 * that it can be re-allocated immediately.
 	 */
 	for (i = 0; i < cfg->num_streamids; ++i) {
-		u32 idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i];
+		int idx = cfg->smendx[i];
 		u32 reg = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS;
 
+		/* An IOMMU group is torn down by the first device to be removed */
+		if (idx < 0)
+			return;
+
 		writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_S2CR(idx));
 	}
 
-	arm_smmu_master_free_smrs(smmu, cfg);
+	arm_smmu_master_free_smes(smmu, cfg);
 }
 
 static void arm_smmu_detach_dev(struct device *dev,
@@ -1399,8 +1411,11 @@ static int arm_smmu_init_pci_device(struct pci_dev *pdev,
 			break;
 
 	/* Avoid duplicate SIDs, as this can lead to SMR conflicts */
-	if (i == cfg->num_streamids)
-		cfg->streamids[cfg->num_streamids++] = sid;
+	if (i == cfg->num_streamids) {
+		cfg->streamids[i] = sid;
+		cfg->smendx[i] = INVALID_SMENDX;
+		cfg->num_streamids++;
+	}
 
 	return 0;
 }
@@ -1531,17 +1546,21 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
 {
 	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
 	void __iomem *cb_base;
-	int i = 0;
+	int i;
 	u32 reg, major;
 
 	/* clear global FSR */
 	reg = readl_relaxed(ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sGFSR);
 	writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sGFSR);
 
-	/* Mark all SMRn as invalid and all S2CRn as bypass unless overridden */
+	/*
+	 * Reset stream mapping groups: Initial values mark all SMRn as
+	 * invalid and all S2CRn as bypass unless overridden.
+	 */
 	reg = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS;
 	for (i = 0; i < smmu->num_mapping_groups; ++i) {
-		writel_relaxed(0, gr0_base + ARM_SMMU_GR0_SMR(i));
+		if (smmu->smrs)
+			arm_smmu_write_smr(smmu, i);
 		writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_S2CR(i));
 	}
 
@@ -1719,6 +1738,12 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 		smr = readl_relaxed(gr0_base + ARM_SMMU_GR0_SMR(0));
 		smmu->smr_mask_mask = smr >> SMR_MASK_SHIFT;
 
+		/* Zero-initialised to mark as invalid */
+		smmu->smrs = devm_kcalloc(smmu->dev, size, sizeof(*smmu->smrs),
+					  GFP_KERNEL);
+		if (!smmu->smrs)
+			return -ENOMEM;
+
 		dev_notice(smmu->dev,
 			   "\tstream matching with %lu register groups, mask 0x%x",
 			   size, smmu->smr_mask_mask);
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 10/19] iommu/arm-smmu: Keep track of S2CR state
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (8 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 09/19] iommu/arm-smmu: Consolidate stream map entry state Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
       [not found]     ` <e086741acfd0959671d184203ef758c824c8d7da.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 11/19] iommu/arm-smmu: Refactor mmu-masters handling Robin Murphy
                     ` (10 subsequent siblings)
  20 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

Making S2CRs first-class citizens within the driver with a high-level
representation of their state offers a neat solution to a few problems:

Firstly, the information about which context a device's stream IDs are
associated with is already present by necessity in the S2CR. With that
state easily accessible we can refer directly to it and obviate the need
to track an IOMMU domain in each device's archdata (its earlier purpose
of enforcing correct attachment of multi-device groups now being handled
by the IOMMU core itself).

Secondly, the core API now deprecates explicit domain detach and expects
domain attach to move devices smoothly from one domain to another; for
SMMUv2, this notion maps directly to simply rewriting the S2CRs assigned
to the device. By giving the driver a suitable abstraction of those
S2CRs to work with, we can massively reduce the overhead of the current
heavy-handed "detach, free resources, reallocate resources, attach"
approach.

Thirdly, making the software state hardware-shaped and attached to the
SMMU instance once again makes suspend/resume of this register group
that much simpler to implement in future.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/arm-smmu.c | 159 +++++++++++++++++++++++++++--------------------
 1 file changed, 93 insertions(+), 66 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 401af10683a2..22c093030322 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -170,12 +170,20 @@
 #define S2CR_CBNDX_MASK			0xff
 #define S2CR_TYPE_SHIFT			16
 #define S2CR_TYPE_MASK			0x3
-#define S2CR_TYPE_TRANS			(0 << S2CR_TYPE_SHIFT)
-#define S2CR_TYPE_BYPASS		(1 << S2CR_TYPE_SHIFT)
-#define S2CR_TYPE_FAULT			(2 << S2CR_TYPE_SHIFT)
+enum arm_smmu_s2cr_type {
+	S2CR_TYPE_TRANS,
+	S2CR_TYPE_BYPASS,
+	S2CR_TYPE_FAULT,
+};
 
 #define S2CR_PRIVCFG_SHIFT		24
-#define S2CR_PRIVCFG_UNPRIV		(2 << S2CR_PRIVCFG_SHIFT)
+#define S2CR_PRIVCFG_MASK		0x3
+enum arm_smmu_s2cr_privcfg {
+	S2CR_PRIVCFG_DEFAULT,
+	S2CR_PRIVCFG_DIPAN,
+	S2CR_PRIVCFG_UNPRIV,
+	S2CR_PRIVCFG_PRIV,
+};
 
 /* Context bank attribute registers */
 #define ARM_SMMU_GR1_CBAR(n)		(0x0 + ((n) << 2))
@@ -292,6 +300,16 @@ enum arm_smmu_implementation {
 	CAVIUM_SMMUV2,
 };
 
+struct arm_smmu_s2cr {
+	enum arm_smmu_s2cr_type		type;
+	enum arm_smmu_s2cr_privcfg	privcfg;
+	u8				cbndx;
+};
+
+#define s2cr_init_val (struct arm_smmu_s2cr){				\
+	.type = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS,	\
+}
+
 struct arm_smmu_smr {
 	u16				mask;
 	u16				id;
@@ -346,6 +364,7 @@ struct arm_smmu_device {
 	u16				streamid_mask;
 	u16				smr_mask_mask;
 	struct arm_smmu_smr		*smrs;
+	struct arm_smmu_s2cr		*s2crs;
 
 	unsigned long			va_size;
 	unsigned long			ipa_size;
@@ -1083,6 +1102,23 @@ static void arm_smmu_write_smr(struct arm_smmu_device *smmu, int idx)
 	writel_relaxed(reg, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_SMR(idx));
 }
 
+static void arm_smmu_write_s2cr(struct arm_smmu_device *smmu, int idx)
+{
+	struct arm_smmu_s2cr *s2cr = smmu->s2crs + idx;
+	u32 reg = (s2cr->type & S2CR_TYPE_MASK) << S2CR_TYPE_SHIFT |
+		  (s2cr->cbndx & S2CR_CBNDX_MASK) << S2CR_CBNDX_SHIFT |
+		  (s2cr->privcfg & S2CR_PRIVCFG_MASK) << S2CR_PRIVCFG_SHIFT;
+
+	writel_relaxed(reg, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_S2CR(idx));
+}
+
+static void arm_smmu_write_sme(struct arm_smmu_device *smmu, int idx)
+{
+	arm_smmu_write_s2cr(smmu, idx);
+	if (smmu->smrs)
+		arm_smmu_write_smr(smmu, idx);
+}
+
 static int arm_smmu_master_alloc_smes(struct arm_smmu_device *smmu,
 				      struct arm_smmu_master_cfg *cfg)
 {
@@ -1133,6 +1169,23 @@ static void arm_smmu_master_free_smes(struct arm_smmu_device *smmu,
 {
 	int i;
 
+	/*
+	 * We *must* clear the S2CR first, because freeing the SMR means
+	 * that it can be re-allocated immediately.
+	 */
+	for (i = 0; i < cfg->num_streamids; ++i) {
+		int idx = cfg->smendx[i];
+
+		/* An IOMMU group is torn down by the first device to be removed */
+		if (idx < 0)
+			return;
+
+		smmu->s2crs[idx] = s2cr_init_val;
+		arm_smmu_write_s2cr(smmu, idx);
+	}
+	/* Sync S2CR updates before touching anything else */
+	__iowmb();
+
 	/* Invalidate the SMRs before freeing back to the allocator */
 	for (i = 0; i < cfg->num_streamids; ++i) {
 		if (smmu->smrs)
@@ -1145,9 +1198,16 @@ static void arm_smmu_master_free_smes(struct arm_smmu_device *smmu,
 static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 				      struct arm_smmu_master_cfg *cfg)
 {
-	int i, ret;
+	int i, ret = 0;
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
-	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
+	struct arm_smmu_s2cr *s2cr = smmu->s2crs;
+	enum arm_smmu_s2cr_type type = S2CR_TYPE_TRANS;
+	u8 cbndx = smmu_domain->cfg.cbndx;
+
+	if (cfg->smendx[0] < 0)
+		ret = arm_smmu_master_alloc_smes(smmu, cfg);
+	if (ret)
+		return ret;
 
 	/*
 	 * FIXME: This won't be needed once we have IOMMU-backed DMA ops
@@ -1156,58 +1216,21 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 	 * and a PCI device (i.e. a PCI host controller)
 	 */
 	if (smmu_domain->domain.type == IOMMU_DOMAIN_DMA)
-		return 0;
+		type = S2CR_TYPE_BYPASS;
 
-	/* Devices in an IOMMU group may already be configured */
-	ret = arm_smmu_master_alloc_smes(smmu, cfg);
-	if (ret)
-		return ret == -EEXIST ? 0 : ret;
-
-	for (i = 0; i < cfg->num_streamids; ++i) {
-		u32 idx, s2cr;
-
-		idx = cfg->smendx[i];
-		s2cr = S2CR_TYPE_TRANS | S2CR_PRIVCFG_UNPRIV |
-		       (smmu_domain->cfg.cbndx << S2CR_CBNDX_SHIFT);
-		writel_relaxed(s2cr, gr0_base + ARM_SMMU_GR0_S2CR(idx));
-	}
-
-	return 0;
-}
-
-static void arm_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain,
-					  struct arm_smmu_master_cfg *cfg)
-{
-	int i;
-	struct arm_smmu_device *smmu = smmu_domain->smmu;
-	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
-
-	/*
-	 * We *must* clear the S2CR first, because freeing the SMR means
-	 * that it can be re-allocated immediately.
-	 */
 	for (i = 0; i < cfg->num_streamids; ++i) {
 		int idx = cfg->smendx[i];
-		u32 reg = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS;
 
-		/* An IOMMU group is torn down by the first device to be removed */
-		if (idx < 0)
-			return;
+		/* Devices in an IOMMU group may already be configured */
+		if (type == s2cr[idx].type && cbndx == s2cr[idx].cbndx)
+			break;
 
-		writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_S2CR(idx));
+		s2cr[idx].type = type;
+		s2cr[idx].privcfg = S2CR_PRIVCFG_UNPRIV;
+		s2cr[idx].cbndx = cbndx;
+		arm_smmu_write_s2cr(smmu, idx);
 	}
-
-	arm_smmu_master_free_smes(smmu, cfg);
-}
-
-static void arm_smmu_detach_dev(struct device *dev,
-				struct arm_smmu_master_cfg *cfg)
-{
-	struct iommu_domain *domain = dev->archdata.iommu;
-	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-
-	dev->archdata.iommu = NULL;
-	arm_smmu_domain_remove_master(smmu_domain, cfg);
+	return 0;
 }
 
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
@@ -1244,14 +1267,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	if (!cfg)
 		return -ENODEV;
 
-	/* Detach the dev from its current domain */
-	if (dev->archdata.iommu)
-		arm_smmu_detach_dev(dev, cfg);
-
-	ret = arm_smmu_domain_add_master(smmu_domain, cfg);
-	if (!ret)
-		dev->archdata.iommu = domain;
-	return ret;
+	return arm_smmu_domain_add_master(smmu_domain, cfg);
 }
 
 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
@@ -1452,6 +1468,12 @@ static int arm_smmu_add_device(struct device *dev)
 
 static void arm_smmu_remove_device(struct device *dev)
 {
+	struct arm_smmu_device *smmu = find_smmu_for_device(dev);
+	struct arm_smmu_master_cfg *cfg = find_smmu_master_cfg(dev);
+
+	if (smmu && cfg)
+		arm_smmu_master_free_smes(smmu, cfg);
+
 	iommu_group_remove_device(dev);
 }
 
@@ -1557,12 +1579,8 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu)
 	 * Reset stream mapping groups: Initial values mark all SMRn as
 	 * invalid and all S2CRn as bypass unless overridden.
 	 */
-	reg = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS;
-	for (i = 0; i < smmu->num_mapping_groups; ++i) {
-		if (smmu->smrs)
-			arm_smmu_write_smr(smmu, i);
-		writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_S2CR(i));
-	}
+	for (i = 0; i < smmu->num_mapping_groups; ++i)
+		arm_smmu_write_sme(smmu, i);
 
 	/*
 	 * Before clearing ARM_MMU500_ACTLR_CPRE, need to
@@ -1651,6 +1669,7 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
 	u32 id;
 	bool cttw_dt, cttw_reg;
+	int i;
 
 	dev_notice(smmu->dev, "probing hardware configuration...\n");
 	dev_notice(smmu->dev, "SMMUv%d with:\n",
@@ -1748,6 +1767,14 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 			   "\tstream matching with %lu register groups, mask 0x%x",
 			   size, smmu->smr_mask_mask);
 	}
+	/* s2cr->type == 0 means translation, so initialise explicitly */
+	smmu->s2crs = devm_kmalloc_array(smmu->dev, size, sizeof(*smmu->s2crs),
+					 GFP_KERNEL);
+	if (!smmu->s2crs)
+		return -ENOMEM;
+	for (i = 0; i < size; i++)
+		smmu->s2crs[i] = s2cr_init_val;
+
 	smmu->num_mapping_groups = size;
 
 	if (smmu->version < ARM_SMMU_V2 || !(id & ID0_PTFS_NO_AARCH32)) {
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 11/19] iommu/arm-smmu: Refactor mmu-masters handling
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (9 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 10/19] iommu/arm-smmu: Keep track of S2CR state Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
       [not found]     ` <00301aa60323bb94588d078f2962feea0cb45c72.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 12/19] iommu/arm-smmu: Streamline SMMU data lookups Robin Murphy
                     ` (9 subsequent siblings)
  20 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

To be able to support the generic bindings and handle of_xlate() calls,
we need to be able to associate SMMUs and stream IDs directly with
devices *before* allocating IOMMU groups. Furthermore, to support real
default domains with multi-device groups we also have to handle domain
attach on a per-device basis, as the "whole group at a time" assumption
fails to properly handle subsequent devices added to a group after the
first has already triggered default domain creation and attachment.

To that end, use the now-vacant dev->archdata.iommu field for easy
config and SMMU instance lookup, and unify config management by chopping
down the platform-device-specific tree and probing the "mmu-masters"
property on-demand instead. This may add a bit of one-off overhead to
initially adding a new device, but we're about to deprecate that binding
in favour of the inherently-more-efficient generic ones anyway.

For the sake of simplicity, this patch does temporarily regress the case
of aliasing PCI devices by losing the duplicate stream ID detection that
the previous per-group config had. Stay tuned, because we'll be back to
fix that in a better and more general way momentarily...

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/arm-smmu.c | 382 +++++++++++++----------------------------------
 1 file changed, 107 insertions(+), 275 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 22c093030322..9066fd1399d4 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -317,18 +317,13 @@ struct arm_smmu_smr {
 };
 
 struct arm_smmu_master_cfg {
+	struct arm_smmu_device		*smmu;
 	int				num_streamids;
 	u16				streamids[MAX_MASTER_STREAMIDS];
 	s16				smendx[MAX_MASTER_STREAMIDS];
 };
 #define INVALID_SMENDX			-1
 
-struct arm_smmu_master {
-	struct device_node		*of_node;
-	struct rb_node			node;
-	struct arm_smmu_master_cfg	cfg;
-};
-
 struct arm_smmu_device {
 	struct device			*dev;
 
@@ -376,7 +371,6 @@ struct arm_smmu_device {
 	unsigned int			*irqs;
 
 	struct list_head		list;
-	struct rb_root			masters;
 
 	u32				cavium_id_base; /* Specific to Cavium */
 };
@@ -415,12 +409,6 @@ struct arm_smmu_domain {
 	struct iommu_domain		domain;
 };
 
-struct arm_smmu_phandle_args {
-	struct device_node *np;
-	int args_count;
-	uint32_t args[MAX_MASTER_STREAMIDS];
-};
-
 static DEFINE_SPINLOCK(arm_smmu_devices_lock);
 static LIST_HEAD(arm_smmu_devices);
 
@@ -462,132 +450,89 @@ static struct device_node *dev_get_dev_node(struct device *dev)
 
 		while (!pci_is_root_bus(bus))
 			bus = bus->parent;
-		return bus->bridge->parent->of_node;
+		return of_node_get(bus->bridge->parent->of_node);
 	}
 
-	return dev->of_node;
+	return of_node_get(dev->of_node);
 }
 
-static struct arm_smmu_master *find_smmu_master(struct arm_smmu_device *smmu,
-						struct device_node *dev_node)
+static int __arm_smmu_get_pci_sid(struct pci_dev *pdev, u16 alias, void *data)
 {
-	struct rb_node *node = smmu->masters.rb_node;
-
-	while (node) {
-		struct arm_smmu_master *master;
-
-		master = container_of(node, struct arm_smmu_master, node);
-
-		if (dev_node < master->of_node)
-			node = node->rb_left;
-		else if (dev_node > master->of_node)
-			node = node->rb_right;
-		else
-			return master;
-	}
-
-	return NULL;
+	*((__be32 *)data) = cpu_to_be32(alias);
+	return 0; /* Continue walking */
 }
 
-static struct arm_smmu_master_cfg *
-find_smmu_master_cfg(struct device *dev)
+static int __find_legacy_master_phandle(struct device *dev, void *data)
 {
-	struct arm_smmu_master_cfg *cfg = NULL;
-	struct iommu_group *group = iommu_group_get(dev);
+	struct of_phandle_iterator *it = *(void **)data;
+	struct device_node *np = it->node;
+	int err;
 
-	if (group) {
-		cfg = iommu_group_get_iommudata(group);
-		iommu_group_put(group);
-	}
-
-	return cfg;
-}
-
-static int insert_smmu_master(struct arm_smmu_device *smmu,
-			      struct arm_smmu_master *master)
-{
-	struct rb_node **new, *parent;
-
-	new = &smmu->masters.rb_node;
-	parent = NULL;
-	while (*new) {
-		struct arm_smmu_master *this
-			= container_of(*new, struct arm_smmu_master, node);
-
-		parent = *new;
-		if (master->of_node < this->of_node)
-			new = &((*new)->rb_left);
-		else if (master->of_node > this->of_node)
-			new = &((*new)->rb_right);
-		else
-			return -EEXIST;
-	}
-
-	rb_link_node(&master->node, parent, new);
-	rb_insert_color(&master->node, &smmu->masters);
-	return 0;
-}
-
-static int register_smmu_master(struct arm_smmu_device *smmu,
-				struct device *dev,
-				struct arm_smmu_phandle_args *masterspec)
-{
-	int i;
-	struct arm_smmu_master *master;
-
-	master = find_smmu_master(smmu, masterspec->np);
-	if (master) {
-		dev_err(dev,
-			"rejecting multiple registrations for master device %s\n",
-			masterspec->np->name);
-		return -EBUSY;
-	}
-
-	if (masterspec->args_count > MAX_MASTER_STREAMIDS) {
-		dev_err(dev,
-			"reached maximum number (%d) of stream IDs for master device %s\n",
-			MAX_MASTER_STREAMIDS, masterspec->np->name);
-		return -ENOSPC;
-	}
-
-	master = devm_kzalloc(dev, sizeof(*master), GFP_KERNEL);
-	if (!master)
-		return -ENOMEM;
-
-	master->of_node			= masterspec->np;
-	master->cfg.num_streamids	= masterspec->args_count;
-
-	for (i = 0; i < master->cfg.num_streamids; ++i) {
-		u16 streamid = masterspec->args[i];
-
-		if (!(smmu->features & ARM_SMMU_FEAT_STREAM_MATCH) &&
-		     (streamid >= smmu->num_mapping_groups)) {
-			dev_err(dev,
-				"stream ID for master device %s greater than maximum allowed (%d)\n",
-				masterspec->np->name, smmu->num_mapping_groups);
-			return -ERANGE;
+	of_for_each_phandle(it, err, dev->of_node, "mmu-masters",
+			    "#stream-id-cells", 0)
+		if (it->node == np) {
+			*(void **)data = dev;
+			return 1;
 		}
-		master->cfg.streamids[i] = streamid;
-		master->cfg.smendx[i] = INVALID_SMENDX;
-	}
-	return insert_smmu_master(smmu, master);
+	it->node = np;
+	return err;
 }
 
-static struct arm_smmu_device *find_smmu_for_device(struct device *dev)
+static int arm_smmu_register_legacy_master(struct device *dev)
 {
 	struct arm_smmu_device *smmu;
-	struct arm_smmu_master *master = NULL;
-	struct device_node *dev_node = dev_get_dev_node(dev);
+	struct arm_smmu_master_cfg *cfg;
+	struct device_node *np;
+	struct of_phandle_iterator it;
+	void *data = &it;
+	__be32 pci_sid;
+	int err;
 
+	np = dev_get_dev_node(dev);
+	if (!np || !of_find_property(np, "#stream-id-cells", NULL)) {
+		of_node_put(np);
+		return -ENODEV;
+	}
+
+	it.node = np;
 	spin_lock(&arm_smmu_devices_lock);
 	list_for_each_entry(smmu, &arm_smmu_devices, list) {
-		master = find_smmu_master(smmu, dev_node);
-		if (master)
+		err = __find_legacy_master_phandle(smmu->dev, &data);
+		if (err)
 			break;
 	}
 	spin_unlock(&arm_smmu_devices_lock);
+	of_node_put(np);
+	if (err == 0)
+		return -ENODEV;
+	if (err < 0)
+		return err;
 
-	return master ? smmu : NULL;
+	if (it.cur_count > MAX_MASTER_STREAMIDS) {
+		dev_err(smmu->dev,
+			"reached maximum number (%d) of stream IDs for master device %s\n",
+			MAX_MASTER_STREAMIDS, dev_name(dev));
+		return -ENOSPC;
+	}
+	if (dev_is_pci(dev)) {
+		/* "mmu-masters" assumes Stream ID == Requester ID */
+		pci_for_each_dma_alias(to_pci_dev(dev), __arm_smmu_get_pci_sid,
+				       &pci_sid);
+		it.cur = &pci_sid;
+		it.cur_count = 1;
+	}
+
+	cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
+	if (!cfg)
+		return -ENOMEM;
+
+	cfg->smmu = smmu;
+	dev->archdata.iommu = cfg;
+
+	while (it.cur_count--)
+		cfg->streamids[cfg->num_streamids++] = be32_to_cpup(it.cur++);
+
+	return 0;
 }
 
 static int __arm_smmu_alloc_bitmap(unsigned long *map, int start, int end)
@@ -1094,8 +1039,7 @@ static void arm_smmu_free_smr(struct arm_smmu_device *smmu, int idx)
 static void arm_smmu_write_smr(struct arm_smmu_device *smmu, int idx)
 {
 	struct arm_smmu_smr *smr = smmu->smrs + idx;
-	u32 reg = (smr->id & smmu->streamid_mask) << SMR_ID_SHIFT |
-		  (smr->mask & smmu->smr_mask_mask) << SMR_MASK_SHIFT;
+	u32 reg = smr->id << SMR_ID_SHIFT | smr->mask << SMR_MASK_SHIFT;
 
 	if (smr->valid)
 		reg |= SMR_VALID;
@@ -1164,9 +1108,9 @@ err_free_smrs:
 	return -ENOSPC;
 }
 
-static void arm_smmu_master_free_smes(struct arm_smmu_device *smmu,
-				      struct arm_smmu_master_cfg *cfg)
+static void arm_smmu_master_free_smes(struct arm_smmu_master_cfg *cfg)
 {
+	struct arm_smmu_device *smmu = cfg->smmu;
 	int i;
 
 	/*
@@ -1237,17 +1181,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-	struct arm_smmu_device *smmu;
-	struct arm_smmu_master_cfg *cfg;
+	struct arm_smmu_master_cfg *cfg = dev->archdata.iommu;
 
-	smmu = find_smmu_for_device(dev);
-	if (!smmu) {
+	if (!cfg) {
 		dev_err(dev, "cannot attach to SMMU, is it on the same bus?\n");
 		return -ENXIO;
 	}
 
 	/* Ensure that the domain is finalised */
-	ret = arm_smmu_init_domain_context(domain, smmu);
+	ret = arm_smmu_init_domain_context(domain, cfg->smmu);
 	if (ret < 0)
 		return ret;
 
@@ -1255,18 +1197,14 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	 * Sanity check the domain. We don't support domains across
 	 * different SMMUs.
 	 */
-	if (smmu_domain->smmu != smmu) {
+	if (smmu_domain->smmu != cfg->smmu) {
 		dev_err(dev,
 			"cannot attach to SMMU %s whilst already attached to domain on SMMU %s\n",
-			dev_name(smmu_domain->smmu->dev), dev_name(smmu->dev));
+			dev_name(smmu_domain->smmu->dev), dev_name(cfg->smmu->dev));
 		return -EINVAL;
 	}
 
 	/* Looks ok, so add the device to the domain */
-	cfg = find_smmu_master_cfg(dev);
-	if (!cfg)
-		return -ENODEV;
-
 	return arm_smmu_domain_add_master(smmu_domain, cfg);
 }
 
@@ -1386,120 +1324,65 @@ static bool arm_smmu_capable(enum iommu_cap cap)
 	}
 }
 
-static int __arm_smmu_get_pci_sid(struct pci_dev *pdev, u16 alias, void *data)
-{
-	*((u16 *)data) = alias;
-	return 0; /* Continue walking */
-}
-
-static void __arm_smmu_release_pci_iommudata(void *data)
-{
-	kfree(data);
-}
-
-static int arm_smmu_init_pci_device(struct pci_dev *pdev,
-				    struct iommu_group *group)
-{
-	struct arm_smmu_master_cfg *cfg;
-	u16 sid;
-	int i;
-
-	cfg = iommu_group_get_iommudata(group);
-	if (!cfg) {
-		cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
-		if (!cfg)
-			return -ENOMEM;
-
-		iommu_group_set_iommudata(group, cfg,
-					  __arm_smmu_release_pci_iommudata);
-	}
-
-	if (cfg->num_streamids >= MAX_MASTER_STREAMIDS)
-		return -ENOSPC;
-
-	/*
-	 * Assume Stream ID == Requester ID for now.
-	 * We need a way to describe the ID mappings in FDT.
-	 */
-	pci_for_each_dma_alias(pdev, __arm_smmu_get_pci_sid, &sid);
-	for (i = 0; i < cfg->num_streamids; ++i)
-		if (cfg->streamids[i] == sid)
-			break;
-
-	/* Avoid duplicate SIDs, as this can lead to SMR conflicts */
-	if (i == cfg->num_streamids) {
-		cfg->streamids[i] = sid;
-		cfg->smendx[i] = INVALID_SMENDX;
-		cfg->num_streamids++;
-	}
-
-	return 0;
-}
-
-static int arm_smmu_init_platform_device(struct device *dev,
-					 struct iommu_group *group)
-{
-	struct arm_smmu_device *smmu = find_smmu_for_device(dev);
-	struct arm_smmu_master *master;
-
-	if (!smmu)
-		return -ENODEV;
-
-	master = find_smmu_master(smmu, dev->of_node);
-	if (!master)
-		return -ENODEV;
-
-	iommu_group_set_iommudata(group, &master->cfg, NULL);
-
-	return 0;
-}
-
 static int arm_smmu_add_device(struct device *dev)
 {
+	struct arm_smmu_master_cfg *cfg;
 	struct iommu_group *group;
+	int i, ret;
+
+	ret = arm_smmu_register_legacy_master(dev);
+	cfg = dev->archdata.iommu;
+	if (ret)
+		goto out_free;
+
+	ret = -EINVAL;
+	for (i = 0; i < cfg->num_streamids; i++) {
+		u16 sid = cfg->streamids[i];
+
+		if (sid & ~cfg->smmu->streamid_mask) {
+			dev_err(dev, "stream ID 0x%x out of range for SMMU (0x%x)\n",
+				sid, cfg->smmu->streamid_mask);
+			goto out_free;
+		}
+		cfg->smendx[i] = INVALID_SMENDX;
+	}
 
 	group = iommu_group_get_for_dev(dev);
-	if (IS_ERR(group))
-		return PTR_ERR(group);
-
+	if (IS_ERR(group)) {
+		ret = PTR_ERR(group);
+		goto out_free;
+	}
 	iommu_group_put(group);
 	return 0;
+
+out_free:
+	kfree(cfg);
+	dev->archdata.iommu = NULL;
+	return ret;
 }
 
 static void arm_smmu_remove_device(struct device *dev)
 {
-	struct arm_smmu_device *smmu = find_smmu_for_device(dev);
-	struct arm_smmu_master_cfg *cfg = find_smmu_master_cfg(dev);
+	struct arm_smmu_master_cfg *cfg = dev->archdata.iommu;
 
-	if (smmu && cfg)
-		arm_smmu_master_free_smes(smmu, cfg);
+	if (!cfg)
+		return;
 
+	arm_smmu_master_free_smes(cfg);
 	iommu_group_remove_device(dev);
+	kfree(cfg);
+	dev->archdata.iommu = NULL;
 }
 
 static struct iommu_group *arm_smmu_device_group(struct device *dev)
 {
 	struct iommu_group *group;
-	int ret;
 
 	if (dev_is_pci(dev))
 		group = pci_device_group(dev);
 	else
 		group = generic_device_group(dev);
 
-	if (IS_ERR(group))
-		return group;
-
-	if (dev_is_pci(dev))
-		ret = arm_smmu_init_pci_device(to_pci_dev(dev), group);
-	else
-		ret = arm_smmu_init_platform_device(dev, group);
-
-	if (ret) {
-		iommu_group_put(group);
-		group = ERR_PTR(ret);
-	}
-
 	return group;
 }
 
@@ -1913,9 +1796,6 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	struct resource *res;
 	struct arm_smmu_device *smmu;
 	struct device *dev = &pdev->dev;
-	struct rb_node *node;
-	struct of_phandle_iterator it;
-	struct arm_smmu_phandle_args *masterspec;
 	int num_irqs, i, err;
 
 	smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
@@ -1976,37 +1856,6 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	if (err)
 		return err;
 
-	i = 0;
-	smmu->masters = RB_ROOT;
-
-	err = -ENOMEM;
-	/* No need to zero the memory for masterspec */
-	masterspec = kmalloc(sizeof(*masterspec), GFP_KERNEL);
-	if (!masterspec)
-		goto out_put_masters;
-
-	of_for_each_phandle(&it, err, dev->of_node,
-			    "mmu-masters", "#stream-id-cells", 0) {
-		int count = of_phandle_iterator_args(&it, masterspec->args,
-						     MAX_MASTER_STREAMIDS);
-		masterspec->np		= of_node_get(it.node);
-		masterspec->args_count	= count;
-
-		err = register_smmu_master(smmu, dev, masterspec);
-		if (err) {
-			dev_err(dev, "failed to add master %s\n",
-				masterspec->np->name);
-			kfree(masterspec);
-			goto out_put_masters;
-		}
-
-		i++;
-	}
-
-	dev_notice(dev, "registered %d master devices\n", i);
-
-	kfree(masterspec);
-
 	parse_driver_options(smmu);
 
 	if (smmu->version == ARM_SMMU_V2 &&
@@ -2014,8 +1863,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 		dev_err(dev,
 			"found only %d context interrupt(s) but %d required\n",
 			smmu->num_context_irqs, smmu->num_context_banks);
-		err = -ENODEV;
-		goto out_put_masters;
+		return -ENODEV;
 	}
 
 	for (i = 0; i < smmu->num_global_irqs; ++i) {
@@ -2027,7 +1875,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 		if (err) {
 			dev_err(dev, "failed to request global IRQ %d (%u)\n",
 				i, smmu->irqs[i]);
-			goto out_put_masters;
+			return err;
 		}
 	}
 
@@ -2038,15 +1886,6 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 
 	arm_smmu_device_reset(smmu);
 	return 0;
-
-out_put_masters:
-	for (node = rb_first(&smmu->masters); node; node = rb_next(node)) {
-		struct arm_smmu_master *master
-			= container_of(node, struct arm_smmu_master, node);
-		of_node_put(master->of_node);
-	}
-
-	return err;
 }
 
 static int arm_smmu_device_remove(struct platform_device *pdev)
@@ -2054,7 +1893,6 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 	int i;
 	struct device *dev = &pdev->dev;
 	struct arm_smmu_device *curr, *smmu = NULL;
-	struct rb_node *node;
 
 	spin_lock(&arm_smmu_devices_lock);
 	list_for_each_entry(curr, &arm_smmu_devices, list) {
@@ -2069,12 +1907,6 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 	if (!smmu)
 		return -ENODEV;
 
-	for (node = rb_first(&smmu->masters); node; node = rb_next(node)) {
-		struct arm_smmu_master *master
-			= container_of(node, struct arm_smmu_master, node);
-		of_node_put(master->of_node);
-	}
-
 	if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
 		dev_err(dev, "removing device with active domains!\n");
 
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 12/19] iommu/arm-smmu: Streamline SMMU data lookups
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (10 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 11/19] iommu/arm-smmu: Refactor mmu-masters handling Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
  2016-08-23 19:05   ` [PATCH v5 13/19] iommu/arm-smmu: Add a stream map entry iterator Robin Murphy
                     ` (8 subsequent siblings)
  20 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

Simplify things somewhat by stashing our arm_smmu_device instance in
drvdata, so that it's readily available to our driver model callbacks.
Then we can excise the private list entirely, since the driver core
already has a perfectly good list of SMMU devices we can use in the one
instance we actually need to. Finally, make a further modest code saving
with the relatively new of_device_get_match_data() helper.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/arm-smmu.c | 44 +++++++++++---------------------------------
 1 file changed, 11 insertions(+), 33 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 9066fd1399d4..bd6f8bdc7086 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -41,6 +41,7 @@
 #include <linux/module.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
+#include <linux/of_device.h>
 #include <linux/pci.h>
 #include <linux/platform_device.h>
 #include <linux/slab.h>
@@ -370,8 +371,6 @@ struct arm_smmu_device {
 	u32				num_context_irqs;
 	unsigned int			*irqs;
 
-	struct list_head		list;
-
 	u32				cavium_id_base; /* Specific to Cavium */
 };
 
@@ -409,9 +408,6 @@ struct arm_smmu_domain {
 	struct iommu_domain		domain;
 };
 
-static DEFINE_SPINLOCK(arm_smmu_devices_lock);
-static LIST_HEAD(arm_smmu_devices);
-
 struct arm_smmu_option_prop {
 	u32 opt;
 	const char *prop;
@@ -478,6 +474,8 @@ static int __find_legacy_master_phandle(struct device *dev, void *data)
 	return err;
 }
 
+static struct platform_driver arm_smmu_driver;
+
 static int arm_smmu_register_legacy_master(struct device *dev)
 {
 	struct arm_smmu_device *smmu;
@@ -495,19 +493,16 @@ static int arm_smmu_register_legacy_master(struct device *dev)
 	}
 
 	it.node = np;
-	spin_lock(&arm_smmu_devices_lock);
-	list_for_each_entry(smmu, &arm_smmu_devices, list) {
-		err = __find_legacy_master_phandle(smmu->dev, &data);
-		if (err)
-			break;
-	}
-	spin_unlock(&arm_smmu_devices_lock);
+	err = driver_for_each_device(&arm_smmu_driver.driver, NULL, &data,
+				     __find_legacy_master_phandle);
 	of_node_put(np);
 	if (err == 0)
 		return -ENODEV;
 	if (err < 0)
 		return err;
 
+	smmu = dev_get_drvdata(data);
+
 	if (it.cur_count > MAX_MASTER_STREAMIDS) {
 		dev_err(smmu->dev,
 			"reached maximum number (%d) of stream IDs for master device %s\n",
@@ -1791,7 +1786,6 @@ MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
 
 static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 {
-	const struct of_device_id *of_id;
 	const struct arm_smmu_match_data *data;
 	struct resource *res;
 	struct arm_smmu_device *smmu;
@@ -1805,8 +1799,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 	}
 	smmu->dev = dev;
 
-	of_id = of_match_node(arm_smmu_of_match, dev->of_node);
-	data = of_id->data;
+	data = of_device_get_match_data(dev);
 	smmu->version = data->version;
 	smmu->model = data->model;
 
@@ -1879,36 +1872,21 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 		}
 	}
 
-	INIT_LIST_HEAD(&smmu->list);
-	spin_lock(&arm_smmu_devices_lock);
-	list_add(&smmu->list, &arm_smmu_devices);
-	spin_unlock(&arm_smmu_devices_lock);
-
+	platform_set_drvdata(pdev, smmu);
 	arm_smmu_device_reset(smmu);
 	return 0;
 }
 
 static int arm_smmu_device_remove(struct platform_device *pdev)
 {
+	struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
 	int i;
-	struct device *dev = &pdev->dev;
-	struct arm_smmu_device *curr, *smmu = NULL;
-
-	spin_lock(&arm_smmu_devices_lock);
-	list_for_each_entry(curr, &arm_smmu_devices, list) {
-		if (curr->dev == dev) {
-			smmu = curr;
-			list_del(&smmu->list);
-			break;
-		}
-	}
-	spin_unlock(&arm_smmu_devices_lock);
 
 	if (!smmu)
 		return -ENODEV;
 
 	if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
-		dev_err(dev, "removing device with active domains!\n");
+		dev_err(&pdev->dev, "removing device with active domains!\n");
 
 	for (i = 0; i < smmu->num_global_irqs; ++i)
 		devm_free_irq(smmu->dev, smmu->irqs[i], smmu);
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 13/19] iommu/arm-smmu: Add a stream map entry iterator
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (11 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 12/19] iommu/arm-smmu: Streamline SMMU data lookups Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
  2016-08-23 19:05   ` [PATCH v5 14/19] iommu/arm-smmu: Intelligent SMR allocation Robin Murphy
                     ` (7 subsequent siblings)
  20 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

We iterate over the SMEs associated with a master config quite a lot in
various places, and are about to do so even more. Let's wrap the idiom
in a handy iterator macro before the repetition gets out of hand.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/arm-smmu.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index bd6f8bdc7086..17bf871030c6 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -324,6 +324,8 @@ struct arm_smmu_master_cfg {
 	s16				smendx[MAX_MASTER_STREAMIDS];
 };
 #define INVALID_SMENDX			-1
+#define for_each_cfg_sme(cfg, i, idx) \
+	for (i = 0; idx = cfg->smendx[i], i < cfg->num_streamids; ++i)
 
 struct arm_smmu_device {
 	struct device			*dev;
@@ -1065,8 +1067,8 @@ static int arm_smmu_master_alloc_smes(struct arm_smmu_device *smmu,
 	int i, idx;
 
 	/* Allocate the SMRs on the SMMU */
-	for (i = 0; i < cfg->num_streamids; ++i) {
-		if (cfg->smendx[i] >= 0)
+	for_each_cfg_sme(cfg, i, idx) {
+		if (idx >= 0)
 			return -EEXIST;
 
 		/* ...except on stream indexing hardware, of course */
@@ -1090,8 +1092,8 @@ static int arm_smmu_master_alloc_smes(struct arm_smmu_device *smmu,
 		return 0;
 
 	/* It worked! Now, poke the actual hardware */
-	for (i = 0; i < cfg->num_streamids; ++i)
-		arm_smmu_write_smr(smmu, cfg->smendx[i]);
+	for_each_cfg_sme(cfg, i, idx)
+		arm_smmu_write_smr(smmu, idx);
 
 	return 0;
 
@@ -1106,15 +1108,13 @@ err_free_smrs:
 static void arm_smmu_master_free_smes(struct arm_smmu_master_cfg *cfg)
 {
 	struct arm_smmu_device *smmu = cfg->smmu;
-	int i;
+	int i, idx;
 
 	/*
 	 * We *must* clear the S2CR first, because freeing the SMR means
 	 * that it can be re-allocated immediately.
 	 */
-	for (i = 0; i < cfg->num_streamids; ++i) {
-		int idx = cfg->smendx[i];
-
+	for_each_cfg_sme(cfg, i, idx) {
 		/* An IOMMU group is torn down by the first device to be removed */
 		if (idx < 0)
 			return;
@@ -1126,9 +1126,9 @@ static void arm_smmu_master_free_smes(struct arm_smmu_master_cfg *cfg)
 	__iowmb();
 
 	/* Invalidate the SMRs before freeing back to the allocator */
-	for (i = 0; i < cfg->num_streamids; ++i) {
+	for_each_cfg_sme(cfg, i, idx) {
 		if (smmu->smrs)
-			arm_smmu_free_smr(smmu, cfg->smendx[i]);
+			arm_smmu_free_smr(smmu, idx);
 
 		cfg->smendx[i] = INVALID_SMENDX;
 	}
@@ -1137,7 +1137,7 @@ static void arm_smmu_master_free_smes(struct arm_smmu_master_cfg *cfg)
 static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 				      struct arm_smmu_master_cfg *cfg)
 {
-	int i, ret = 0;
+	int i, idx, ret = 0;
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_s2cr *s2cr = smmu->s2crs;
 	enum arm_smmu_s2cr_type type = S2CR_TYPE_TRANS;
@@ -1157,9 +1157,7 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 	if (smmu_domain->domain.type == IOMMU_DOMAIN_DMA)
 		type = S2CR_TYPE_BYPASS;
 
-	for (i = 0; i < cfg->num_streamids; ++i) {
-		int idx = cfg->smendx[i];
-
+	for_each_cfg_sme(cfg, i, idx) {
 		/* Devices in an IOMMU group may already be configured */
 		if (type == s2cr[idx].type && cbndx == s2cr[idx].cbndx)
 			break;
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 14/19] iommu/arm-smmu: Intelligent SMR allocation
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (12 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 13/19] iommu/arm-smmu: Add a stream map entry iterator Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
       [not found]     ` <693b7fdd58be254297eb43ac8f5e035beb5226b2.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 15/19] iommu/arm-smmu: Convert to iommu_fwspec Robin Murphy
                     ` (6 subsequent siblings)
  20 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

Stream Match Registers are one of the more awkward parts of the SMMUv2
architecture; there are typically never enough to assign one to each
stream ID in the system, and configuring them such that a single ID
matches multiple entries is catastrophically bad - at best, every
transaction raises a global fault; at worst, they go *somewhere*.

To address the former issue, we can mask ID bits such that a single
register may be used to match multiple IDs belonging to the same device
or group, but doing so also heightens the risk of the latter problem
(which can be nasty to debug).

Tackle both problems at once by replacing the simple bitmap allocator
with something much cleverer. Now that we have convenient in-memory
representations of the stream mapping table, it becomes straightforward
to properly validate new SMR entries against the current state, opening
the door to arbitrary masking and SMR sharing.

Another feature which falls out of this is that with IDs shared by
separate devices being automatically accounted for, simply associating a
group pointer with the S2CR offers appropriate group allocation almost
for free, so hook that up in the process.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/arm-smmu.c | 192 ++++++++++++++++++++++++++++-------------------
 1 file changed, 114 insertions(+), 78 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 17bf871030c6..88f82eb8d1fe 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -302,6 +302,8 @@ enum arm_smmu_implementation {
 };
 
 struct arm_smmu_s2cr {
+	struct iommu_group		*group;
+	int				count;
 	enum arm_smmu_s2cr_type		type;
 	enum arm_smmu_s2cr_privcfg	privcfg;
 	u8				cbndx;
@@ -363,6 +365,7 @@ struct arm_smmu_device {
 	u16				smr_mask_mask;
 	struct arm_smmu_smr		*smrs;
 	struct arm_smmu_s2cr		*s2crs;
+	struct mutex			stream_map_mutex;
 
 	unsigned long			va_size;
 	unsigned long			ipa_size;
@@ -1016,23 +1019,6 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
 	kfree(smmu_domain);
 }
 
-static int arm_smmu_alloc_smr(struct arm_smmu_device *smmu)
-{
-	int i;
-
-	for (i = 0; i < smmu->num_mapping_groups; i++)
-		if (!cmpxchg(&smmu->smrs[i].valid, false, true))
-			return i;
-
-	return INVALID_SMENDX;
-}
-
-static void arm_smmu_free_smr(struct arm_smmu_device *smmu, int idx)
-{
-	writel_relaxed(~SMR_VALID, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_SMR(idx));
-	WRITE_ONCE(smmu->smrs[idx].valid, false);
-}
-
 static void arm_smmu_write_smr(struct arm_smmu_device *smmu, int idx)
 {
 	struct arm_smmu_smr *smr = smmu->smrs + idx;
@@ -1060,49 +1046,110 @@ static void arm_smmu_write_sme(struct arm_smmu_device *smmu, int idx)
 		arm_smmu_write_smr(smmu, idx);
 }
 
-static int arm_smmu_master_alloc_smes(struct arm_smmu_device *smmu,
-				      struct arm_smmu_master_cfg *cfg)
+static int arm_smmu_find_sme(struct arm_smmu_device *smmu, u16 id, u16 mask)
 {
 	struct arm_smmu_smr *smrs = smmu->smrs;
-	int i, idx;
+	int i, idx = -ENOSPC;
 
-	/* Allocate the SMRs on the SMMU */
-	for_each_cfg_sme(cfg, i, idx) {
-		if (idx >= 0)
-			return -EEXIST;
+	/* Stream indexing is blissfully easy */
+	if (!smrs)
+		return id;
 
-		/* ...except on stream indexing hardware, of course */
-		if (!smrs) {
-			cfg->smendx[i] = cfg->streamids[i];
+	/* Validating SMRs is... less so */
+	for (i = 0; i < smmu->num_mapping_groups; ++i) {
+		if (!smrs[i].valid) {
+			if (idx < 0)
+				idx = i;
 			continue;
 		}
 
-		idx = arm_smmu_alloc_smr(smmu);
-		if (idx < 0) {
-			dev_err(smmu->dev, "failed to allocate free SMR\n");
-			goto err_free_smrs;
-		}
-		cfg->smendx[i] = idx;
+		/* Exact matches are good */
+		if (mask == smrs[i].mask && id == smrs[i].id)
+			return i;
 
-		smrs[idx].id = cfg->streamids[i];
-		smrs[idx].mask = 0; /* We don't currently share SMRs */
+		/* New unmasked IDs matching existing masks we can cope with */
+		if (!mask && !((smrs[i].id ^ id) & ~smrs[i].mask))
+			return i;
+
+		/* Overlapping masks are right out */
+		if (mask & smrs[i].mask)
+			return -EINVAL;
+
+		/* Distinct masks must match unambiguous ranges */
+		if (mask && !((smrs[i].id ^ id) & ~(smrs[i].mask | mask)))
+			return -EINVAL;
 	}
 
-	if (!smrs)
-		return 0;
+	return idx;
+}
+
+static bool arm_smmu_free_sme(struct arm_smmu_device *smmu, int idx)
+{
+	if (--smmu->s2crs[idx].count)
+		return false;
+
+	smmu->s2crs[idx] = s2cr_init_val;
+	if (smmu->smrs)
+		smmu->smrs[idx].valid = false;
+
+	return true;
+}
+
+static int arm_smmu_master_alloc_smes(struct device *dev)
+{
+	struct arm_smmu_master_cfg *cfg = dev->archdata.iommu;
+	struct arm_smmu_device *smmu = cfg->smmu;
+	struct arm_smmu_smr *smrs = smmu->smrs;
+	struct iommu_group *group;
+	int i, idx, ret;
+
+	mutex_lock(&smmu->stream_map_mutex);
+	/* Figure out a viable stream map entry allocation */
+	for_each_cfg_sme(cfg, i, idx) {
+		if (idx >= 0) {
+			ret = -EEXIST;
+			goto out_err;
+		}
+
+		ret = arm_smmu_find_sme(smmu, cfg->streamids[i], 0);
+		if (ret < 0)
+			goto out_err;
+
+		idx = ret;
+		if (smrs && smmu->s2crs[idx].count == 0) {
+			smrs[idx].id = cfg->streamids[i];
+			smrs[idx].mask = 0; /* We don't currently share SMRs */
+			smrs[idx].valid = true;
+		}
+		smmu->s2crs[idx].count++;
+		cfg->smendx[i] = (s16)idx;
+	}
+
+	group = iommu_group_get_for_dev(dev);
+	if (!group)
+		group = ERR_PTR(-ENOMEM);
+	if (IS_ERR(group)) {
+		ret = PTR_ERR(group);
+		goto out_err;
+	}
+	iommu_group_put(group);
 
 	/* It worked! Now, poke the actual hardware */
-	for_each_cfg_sme(cfg, i, idx)
-		arm_smmu_write_smr(smmu, idx);
+	for_each_cfg_sme(cfg, i, idx) {
+		arm_smmu_write_sme(smmu, idx);
+		smmu->s2crs[idx].group = group;
+	}
 
+	mutex_unlock(&smmu->stream_map_mutex);
 	return 0;
 
-err_free_smrs:
+out_err:
 	while (i--) {
-		arm_smmu_free_smr(smmu, cfg->smendx[i]);
+		arm_smmu_free_sme(smmu, cfg->smendx[i]);
 		cfg->smendx[i] = INVALID_SMENDX;
 	}
-	return -ENOSPC;
+	mutex_unlock(&smmu->stream_map_mutex);
+	return ret;
 }
 
 static void arm_smmu_master_free_smes(struct arm_smmu_master_cfg *cfg)
@@ -1110,43 +1157,23 @@ static void arm_smmu_master_free_smes(struct arm_smmu_master_cfg *cfg)
 	struct arm_smmu_device *smmu = cfg->smmu;
 	int i, idx;
 
-	/*
-	 * We *must* clear the S2CR first, because freeing the SMR means
-	 * that it can be re-allocated immediately.
-	 */
+	mutex_lock(&smmu->stream_map_mutex);
 	for_each_cfg_sme(cfg, i, idx) {
-		/* An IOMMU group is torn down by the first device to be removed */
-		if (idx < 0)
-			return;
-
-		smmu->s2crs[idx] = s2cr_init_val;
-		arm_smmu_write_s2cr(smmu, idx);
-	}
-	/* Sync S2CR updates before touching anything else */
-	__iowmb();
-
-	/* Invalidate the SMRs before freeing back to the allocator */
-	for_each_cfg_sme(cfg, i, idx) {
-		if (smmu->smrs)
-			arm_smmu_free_smr(smmu, idx);
-
+		if (arm_smmu_free_sme(smmu, idx))
+			arm_smmu_write_sme(smmu, idx);
 		cfg->smendx[i] = INVALID_SMENDX;
 	}
+	mutex_unlock(&smmu->stream_map_mutex);
 }
 
 static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 				      struct arm_smmu_master_cfg *cfg)
 {
-	int i, idx, ret = 0;
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_s2cr *s2cr = smmu->s2crs;
 	enum arm_smmu_s2cr_type type = S2CR_TYPE_TRANS;
 	u8 cbndx = smmu_domain->cfg.cbndx;
-
-	if (cfg->smendx[0] < 0)
-		ret = arm_smmu_master_alloc_smes(smmu, cfg);
-	if (ret)
-		return ret;
+	int i, idx;
 
 	/*
 	 * FIXME: This won't be needed once we have IOMMU-backed DMA ops
@@ -1158,9 +1185,8 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 		type = S2CR_TYPE_BYPASS;
 
 	for_each_cfg_sme(cfg, i, idx) {
-		/* Devices in an IOMMU group may already be configured */
 		if (type == s2cr[idx].type && cbndx == s2cr[idx].cbndx)
-			break;
+			continue;
 
 		s2cr[idx].type = type;
 		s2cr[idx].privcfg = S2CR_PRIVCFG_UNPRIV;
@@ -1320,7 +1346,6 @@ static bool arm_smmu_capable(enum iommu_cap cap)
 static int arm_smmu_add_device(struct device *dev)
 {
 	struct arm_smmu_master_cfg *cfg;
-	struct iommu_group *group;
 	int i, ret;
 
 	ret = arm_smmu_register_legacy_master(dev);
@@ -1340,13 +1365,9 @@ static int arm_smmu_add_device(struct device *dev)
 		cfg->smendx[i] = INVALID_SMENDX;
 	}
 
-	group = iommu_group_get_for_dev(dev);
-	if (IS_ERR(group)) {
-		ret = PTR_ERR(group);
-		goto out_free;
-	}
-	iommu_group_put(group);
-	return 0;
+	ret = arm_smmu_master_alloc_smes(dev);
+	if (!ret)
+		return ret;
 
 out_free:
 	kfree(cfg);
@@ -1369,7 +1390,21 @@ static void arm_smmu_remove_device(struct device *dev)
 
 static struct iommu_group *arm_smmu_device_group(struct device *dev)
 {
-	struct iommu_group *group;
+	struct arm_smmu_master_cfg *cfg = dev->archdata.iommu;
+	struct arm_smmu_device *smmu = cfg->smmu;
+	struct iommu_group *group = NULL;
+	int i, idx;
+
+	for_each_cfg_sme(cfg, i, idx) {
+		if (group && smmu->s2crs[idx].group &&
+		    group != smmu->s2crs[idx].group)
+			return ERR_PTR(-EINVAL);
+
+		group = smmu->s2crs[idx].group;
+	}
+
+	if (group)
+		return group;
 
 	if (dev_is_pci(dev))
 		group = pci_device_group(dev);
@@ -1652,6 +1687,7 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 		smmu->s2crs[i] = s2cr_init_val;
 
 	smmu->num_mapping_groups = size;
+	mutex_init(&smmu->stream_map_mutex);
 
 	if (smmu->version < ARM_SMMU_V2 || !(id & ID0_PTFS_NO_AARCH32)) {
 		smmu->features |= ARM_SMMU_FEAT_FMT_AARCH32_L;
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 15/19] iommu/arm-smmu: Convert to iommu_fwspec
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (13 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 14/19] iommu/arm-smmu: Intelligent SMR allocation Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
       [not found]     ` <221f668d606abdfb4d6ee6da2c5f568c57ceccdd.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 16/19] Docs: dt: document ARM SMMU generic binding usage Robin Murphy
                     ` (5 subsequent siblings)
  20 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

In the final step of preparation for full generic configuration support,
swap our fixed-size master_cfg for the generic iommu_fwspec. For the
legacy DT bindings, the driver simply gets to act as its own 'firmware'.
Farewell, arbitrary MAX_MASTER_STREAMIDS!

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/arm-smmu.c | 139 ++++++++++++++++++++++++++---------------------
 1 file changed, 77 insertions(+), 62 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 88f82eb8d1fe..ea22beb58b59 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -42,6 +42,7 @@
 #include <linux/of.h>
 #include <linux/of_address.h>
 #include <linux/of_device.h>
+#include <linux/of_iommu.h>
 #include <linux/pci.h>
 #include <linux/platform_device.h>
 #include <linux/slab.h>
@@ -51,9 +52,6 @@
 
 #include "io-pgtable.h"
 
-/* Maximum number of stream IDs assigned to a single device */
-#define MAX_MASTER_STREAMIDS		128
-
 /* Maximum number of context banks per SMMU */
 #define ARM_SMMU_MAX_CBS		128
 
@@ -321,13 +319,13 @@ struct arm_smmu_smr {
 
 struct arm_smmu_master_cfg {
 	struct arm_smmu_device		*smmu;
-	int				num_streamids;
-	u16				streamids[MAX_MASTER_STREAMIDS];
-	s16				smendx[MAX_MASTER_STREAMIDS];
+	s16				smendx[];
 };
 #define INVALID_SMENDX			-1
-#define for_each_cfg_sme(cfg, i, idx) \
-	for (i = 0; idx = cfg->smendx[i], i < cfg->num_streamids; ++i)
+#define __fwspec_cfg(fw) ((struct arm_smmu_master_cfg *)fw->iommu_priv)
+#define fwspec_smmu(fw)  (__fwspec_cfg(fw)->smmu)
+#define for_each_cfg_sme(fw, i, idx) \
+	for (i = 0; idx = __fwspec_cfg(fw)->smendx[i], i < fw->num_ids; ++i)
 
 struct arm_smmu_device {
 	struct device			*dev;
@@ -481,13 +479,14 @@ static int __find_legacy_master_phandle(struct device *dev, void *data)
 
 static struct platform_driver arm_smmu_driver;
 
-static int arm_smmu_register_legacy_master(struct device *dev)
+static int arm_smmu_register_legacy_master(struct device *dev,
+					   struct arm_smmu_device **smmu)
 {
-	struct arm_smmu_device *smmu;
-	struct arm_smmu_master_cfg *cfg;
+	struct device *smmu_dev;
 	struct device_node *np;
 	struct of_phandle_iterator it;
 	void *data = &it;
+	u32 *sids;
 	__be32 pci_sid;
 	int err;
 
@@ -500,20 +499,13 @@ static int arm_smmu_register_legacy_master(struct device *dev)
 	it.node = np;
 	err = driver_for_each_device(&arm_smmu_driver.driver, NULL, &data,
 				     __find_legacy_master_phandle);
+	smmu_dev = data;
 	of_node_put(np);
 	if (err == 0)
 		return -ENODEV;
 	if (err < 0)
 		return err;
 
-	smmu = dev_get_drvdata(data);
-
-	if (it.cur_count > MAX_MASTER_STREAMIDS) {
-		dev_err(smmu->dev,
-			"reached maximum number (%d) of stream IDs for master device %s\n",
-			MAX_MASTER_STREAMIDS, dev_name(dev));
-		return -ENOSPC;
-	}
 	if (dev_is_pci(dev)) {
 		/* "mmu-masters" assumes Stream ID == Requester ID */
 		pci_for_each_dma_alias(to_pci_dev(dev), __arm_smmu_get_pci_sid,
@@ -522,17 +514,19 @@ static int arm_smmu_register_legacy_master(struct device *dev)
 		it.cur_count = 1;
 	}
 
-	cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
-	if (!cfg)
+	err = iommu_fwspec_init(dev, smmu_dev->of_node);
+	if (err)
+		return err;
+
+	sids = kcalloc(it.cur_count, sizeof(*sids), GFP_KERNEL);
+	if (!sids)
 		return -ENOMEM;
 
-	cfg->smmu = smmu;
-	dev->archdata.iommu = cfg;
-
-	while (it.cur_count--)
-		cfg->streamids[cfg->num_streamids++] = be32_to_cpup(it.cur++);
-
-	return 0;
+	*smmu = dev_get_drvdata(smmu_dev);
+	of_phandle_iterator_args(&it, sids, it.cur_count);
+	err = iommu_fwspec_add_ids(dev, sids, it.cur_count);
+	kfree(sids);
+	return err;
 }
 
 static int __arm_smmu_alloc_bitmap(unsigned long *map, int start, int end)
@@ -1097,7 +1091,8 @@ static bool arm_smmu_free_sme(struct arm_smmu_device *smmu, int idx)
 
 static int arm_smmu_master_alloc_smes(struct device *dev)
 {
-	struct arm_smmu_master_cfg *cfg = dev->archdata.iommu;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec(dev);
+	struct arm_smmu_master_cfg *cfg = fwspec->iommu_priv;
 	struct arm_smmu_device *smmu = cfg->smmu;
 	struct arm_smmu_smr *smrs = smmu->smrs;
 	struct iommu_group *group;
@@ -1105,19 +1100,19 @@ static int arm_smmu_master_alloc_smes(struct device *dev)
 
 	mutex_lock(&smmu->stream_map_mutex);
 	/* Figure out a viable stream map entry allocation */
-	for_each_cfg_sme(cfg, i, idx) {
+	for_each_cfg_sme(fwspec, i, idx) {
 		if (idx >= 0) {
 			ret = -EEXIST;
 			goto out_err;
 		}
 
-		ret = arm_smmu_find_sme(smmu, cfg->streamids[i], 0);
+		ret = arm_smmu_find_sme(smmu, fwspec->ids[i], 0);
 		if (ret < 0)
 			goto out_err;
 
 		idx = ret;
 		if (smrs && smmu->s2crs[idx].count == 0) {
-			smrs[idx].id = cfg->streamids[i];
+			smrs[idx].id = fwspec->ids[i];
 			smrs[idx].mask = 0; /* We don't currently share SMRs */
 			smrs[idx].valid = true;
 		}
@@ -1135,7 +1130,7 @@ static int arm_smmu_master_alloc_smes(struct device *dev)
 	iommu_group_put(group);
 
 	/* It worked! Now, poke the actual hardware */
-	for_each_cfg_sme(cfg, i, idx) {
+	for_each_cfg_sme(fwspec, i, idx) {
 		arm_smmu_write_sme(smmu, idx);
 		smmu->s2crs[idx].group = group;
 	}
@@ -1152,13 +1147,14 @@ out_err:
 	return ret;
 }
 
-static void arm_smmu_master_free_smes(struct arm_smmu_master_cfg *cfg)
+static void arm_smmu_master_free_smes(struct iommu_fwspec *fwspec)
 {
-	struct arm_smmu_device *smmu = cfg->smmu;
+	struct arm_smmu_device *smmu = fwspec_smmu(fwspec);
+	struct arm_smmu_master_cfg *cfg = fwspec->iommu_priv;
 	int i, idx;
 
 	mutex_lock(&smmu->stream_map_mutex);
-	for_each_cfg_sme(cfg, i, idx) {
+	for_each_cfg_sme(fwspec, i, idx) {
 		if (arm_smmu_free_sme(smmu, idx))
 			arm_smmu_write_sme(smmu, idx);
 		cfg->smendx[i] = INVALID_SMENDX;
@@ -1167,7 +1163,7 @@ static void arm_smmu_master_free_smes(struct arm_smmu_master_cfg *cfg)
 }
 
 static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
-				      struct arm_smmu_master_cfg *cfg)
+				      struct iommu_fwspec *fwspec)
 {
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_s2cr *s2cr = smmu->s2crs;
@@ -1184,7 +1180,7 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 	if (smmu_domain->domain.type == IOMMU_DOMAIN_DMA)
 		type = S2CR_TYPE_BYPASS;
 
-	for_each_cfg_sme(cfg, i, idx) {
+	for_each_cfg_sme(fwspec, i, idx) {
 		if (type == s2cr[idx].type && cbndx == s2cr[idx].cbndx)
 			continue;
 
@@ -1196,19 +1192,22 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 	return 0;
 }
 
+static struct iommu_ops arm_smmu_ops;
+
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec(dev);
+	struct arm_smmu_device *smmu = fwspec_smmu(fwspec);
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-	struct arm_smmu_master_cfg *cfg = dev->archdata.iommu;
 
-	if (!cfg) {
+	if (!fwspec || fwspec->iommu_ops != &arm_smmu_ops) {
 		dev_err(dev, "cannot attach to SMMU, is it on the same bus?\n");
 		return -ENXIO;
 	}
 
 	/* Ensure that the domain is finalised */
-	ret = arm_smmu_init_domain_context(domain, cfg->smmu);
+	ret = arm_smmu_init_domain_context(domain, smmu);
 	if (ret < 0)
 		return ret;
 
@@ -1216,15 +1215,15 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	 * Sanity check the domain. We don't support domains across
 	 * different SMMUs.
 	 */
-	if (smmu_domain->smmu != cfg->smmu) {
+	if (smmu_domain->smmu != smmu) {
 		dev_err(dev,
 			"cannot attach to SMMU %s whilst already attached to domain on SMMU %s\n",
-			dev_name(smmu_domain->smmu->dev), dev_name(cfg->smmu->dev));
+			dev_name(smmu_domain->smmu->dev), dev_name(smmu->dev));
 		return -EINVAL;
 	}
 
 	/* Looks ok, so add the device to the domain */
-	return arm_smmu_domain_add_master(smmu_domain, cfg);
+	return arm_smmu_domain_add_master(smmu_domain, fwspec);
 }
 
 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
@@ -1345,57 +1344,72 @@ static bool arm_smmu_capable(enum iommu_cap cap)
 
 static int arm_smmu_add_device(struct device *dev)
 {
+	struct arm_smmu_device *smmu;
 	struct arm_smmu_master_cfg *cfg;
+	struct iommu_fwspec *fwspec;
 	int i, ret;
 
-	ret = arm_smmu_register_legacy_master(dev);
-	cfg = dev->archdata.iommu;
+	ret = arm_smmu_register_legacy_master(dev, &smmu);
+	fwspec = dev_iommu_fwspec(dev);
 	if (ret)
 		goto out_free;
 
 	ret = -EINVAL;
-	for (i = 0; i < cfg->num_streamids; i++) {
-		u16 sid = cfg->streamids[i];
+	for (i = 0; i < fwspec->num_ids; i++) {
+		u16 sid = fwspec->ids[i];
 
-		if (sid & ~cfg->smmu->streamid_mask) {
+		if (sid & ~smmu->streamid_mask) {
 			dev_err(dev, "stream ID 0x%x out of range for SMMU (0x%x)\n",
 				sid, cfg->smmu->streamid_mask);
 			goto out_free;
 		}
-		cfg->smendx[i] = INVALID_SMENDX;
 	}
 
+	ret = -ENOMEM;
+	cfg = kzalloc(offsetof(struct arm_smmu_master_cfg, smendx[i]),
+		      GFP_KERNEL);
+	if (!cfg)
+		goto out_free;
+
+	cfg->smmu = smmu;
+	fwspec->iommu_priv = cfg;
+	while (i--)
+		cfg->smendx[i] = INVALID_SMENDX;
+
 	ret = arm_smmu_master_alloc_smes(dev);
-	if (!ret)
-		return ret;
+	if (ret)
+		goto out_free;
+
+	return 0;
 
 out_free:
-	kfree(cfg);
-	dev->archdata.iommu = NULL;
+	if (fwspec)
+		kfree(fwspec->iommu_priv);
+	iommu_fwspec_free(dev);
 	return ret;
 }
 
 static void arm_smmu_remove_device(struct device *dev)
 {
-	struct arm_smmu_master_cfg *cfg = dev->archdata.iommu;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec(dev);
 
-	if (!cfg)
+	if (!fwspec || fwspec->iommu_ops != &arm_smmu_ops)
 		return;
 
-	arm_smmu_master_free_smes(cfg);
+	arm_smmu_master_free_smes(fwspec);
 	iommu_group_remove_device(dev);
-	kfree(cfg);
-	dev->archdata.iommu = NULL;
+	kfree(fwspec->iommu_priv);
+	iommu_fwspec_free(dev);
 }
 
 static struct iommu_group *arm_smmu_device_group(struct device *dev)
 {
-	struct arm_smmu_master_cfg *cfg = dev->archdata.iommu;
-	struct arm_smmu_device *smmu = cfg->smmu;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec(dev);
+	struct arm_smmu_device *smmu = fwspec_smmu(fwspec);
 	struct iommu_group *group = NULL;
 	int i, idx;
 
-	for_each_cfg_sme(cfg, i, idx) {
+	for_each_cfg_sme(fwspec, i, idx) {
 		if (group && smmu->s2crs[idx].group &&
 		    group != smmu->s2crs[idx].group)
 			return ERR_PTR(-EINVAL);
@@ -1906,6 +1920,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 		}
 	}
 
+	of_iommu_set_ops(dev->of_node, &arm_smmu_ops);
 	platform_set_drvdata(pdev, smmu);
 	arm_smmu_device_reset(smmu);
 	return 0;
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 16/19] Docs: dt: document ARM SMMU generic binding usage
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (14 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 15/19] iommu/arm-smmu: Convert to iommu_fwspec Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
       [not found]     ` <b4f0eca93ac944c3430297b97c703e1bc54846d7.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 17/19] iommu/arm-smmu: Wire up generic configuration support Robin Murphy
                     ` (4 subsequent siblings)
  20 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA, Rob Herring, Mark Rutland

Document how the generic "iommus" binding should be used to describe ARM
SMMU stream IDs instead of the old "mmu-masters" binding.

CC: Rob Herring <robh+dt-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
CC: Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org>
Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 .../devicetree/bindings/iommu/arm,smmu.txt         | 63 ++++++++++++++++------
 1 file changed, 48 insertions(+), 15 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index 19fe6f2c83f6..e9d447cf3a76 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -35,12 +35,16 @@ conditions.
                   interrupt per context bank. In the case of a single,
                   combined interrupt, it must be listed multiple times.
 
-- mmu-masters   : A list of phandles to device nodes representing bus
-                  masters for which the SMMU can provide a translation
-                  and their corresponding StreamIDs (see example below).
-                  Each device node linked from this list must have a
-                  "#stream-id-cells" property, indicating the number of
-                  StreamIDs associated with it.
+- #iommu-cells  : See Documentation/devicetree/bindings/iommu/iommu.txt
+                  for details. With a value of 1, each "iommus" entry
+                  represents a distinct stream ID emitted by that device
+                  into the relevant SMMU.
+
+                  SMMUs with stream matching support and complex masters
+                  may use a value of 2, where the second cell represents
+                  an SMR mask to combine with the ID in the first cell.
+                  Care must be taken to ensure the set of matched IDs
+                  does not result in conflicts.
 
 ** System MMU optional properties:
 
@@ -56,9 +60,20 @@ conditions.
                   aliases of secure registers have to be used during
                   SMMU configuration.
 
-Example:
+** Deprecated properties:
 
-        smmu {
+- mmu-masters (deprecated in favour of the generic "iommus" binding) :
+                  A list of phandles to device nodes representing bus
+                  masters for which the SMMU can provide a translation
+                  and their corresponding StreamIDs (see example below).
+                  Each device node linked from this list must have a
+                  "#stream-id-cells" property, indicating the number of
+                  StreamIDs associated with it.
+
+** Examples:
+
+        /* SMMU with stream matching or stream indexing */
+        smmu1: iommu {
                 compatible = "arm,smmu-v1";
                 reg = <0xba5e0000 0x10000>;
                 #global-interrupts = <2>;
@@ -68,11 +83,29 @@ Example:
                              <0 35 4>,
                              <0 36 4>,
                              <0 37 4>;
-
-                /*
-                 * Two DMA controllers, the first with two StreamIDs (0xd01d
-                 * and 0xd01e) and the second with only one (0xd11c).
-                 */
-                mmu-masters = <&dma0 0xd01d 0xd01e>,
-                              <&dma1 0xd11c>;
+                #iommu-cells = <1>;
+        };
+
+        /* device with two stream IDs, 0 and 7 */
+        master1 {
+                iommus = <&smmu1 0>,
+                         <&smmu1 7>;
+        };
+
+
+        /* SMMU with stream matching */
+        smmu2: iommu {
+                ...
+                #iommu-cells = <2>;
+        };
+
+        /* device with stream IDs 0 and 7 */
+        master2 {
+                iommus = <&smmu2 0 0>,
+                         <&smmu2 7 0>;
+        };
+
+        /* device with stream IDs 1, 17, 33 and 49 */
+        master3 {
+                iommus = <&smmu2 1 0x30>;
         };
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 17/19] iommu/arm-smmu: Wire up generic configuration support
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (15 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 16/19] Docs: dt: document ARM SMMU generic binding usage Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
       [not found]     ` <4439250e01ac071bae8f03a5ccf107ed7ddc0b49.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:05   ` [PATCH v5 18/19] iommu/arm-smmu: Set domain geometry Robin Murphy
                     ` (3 subsequent siblings)
  20 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

With everything else now in place, fill in an of_xlate callback and the
appropriate registration to plumb into the generic configuration
machinery, and watch everything just work.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/arm-smmu.c | 168 ++++++++++++++++++++++++++++++-----------------
 1 file changed, 107 insertions(+), 61 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index ea22beb58b59..85bc74d8fca0 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -43,6 +43,7 @@
 #include <linux/of_address.h>
 #include <linux/of_device.h>
 #include <linux/of_iommu.h>
+#include <linux/of_platform.h>
 #include <linux/pci.h>
 #include <linux/platform_device.h>
 #include <linux/slab.h>
@@ -418,6 +419,8 @@ struct arm_smmu_option_prop {
 
 static atomic_t cavium_smmu_context_count = ATOMIC_INIT(0);
 
+static bool legacy_binding_used;
+
 static struct arm_smmu_option_prop arm_smmu_options[] = {
 	{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
 	{ 0, NULL},
@@ -799,12 +802,6 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	if (smmu_domain->smmu)
 		goto out_unlock;
 
-	/* We're bypassing these SIDs, so don't allocate an actual context */
-	if (domain->type == IOMMU_DOMAIN_DMA) {
-		smmu_domain->smmu = smmu;
-		goto out_unlock;
-	}
-
 	/*
 	 * Mapping the requested stage onto what we support is surprisingly
 	 * complicated, mainly because the spec allows S1+S2 SMMUs without
@@ -954,7 +951,7 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 	void __iomem *cb_base;
 	int irq;
 
-	if (!smmu || domain->type == IOMMU_DOMAIN_DMA)
+	if (!smmu)
 		return;
 
 	/*
@@ -988,8 +985,8 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 	if (!smmu_domain)
 		return NULL;
 
-	if (type == IOMMU_DOMAIN_DMA &&
-	    iommu_get_dma_cookie(&smmu_domain->domain)) {
+	if (type == IOMMU_DOMAIN_DMA && (legacy_binding_used ||
+	    iommu_get_dma_cookie(&smmu_domain->domain))) {
 		kfree(smmu_domain);
 		return NULL;
 	}
@@ -1101,19 +1098,22 @@ static int arm_smmu_master_alloc_smes(struct device *dev)
 	mutex_lock(&smmu->stream_map_mutex);
 	/* Figure out a viable stream map entry allocation */
 	for_each_cfg_sme(fwspec, i, idx) {
+		u16 sid = fwspec->ids[i];
+		u16 mask = fwspec->ids[i] >> SMR_MASK_SHIFT;
+
 		if (idx >= 0) {
 			ret = -EEXIST;
 			goto out_err;
 		}
 
-		ret = arm_smmu_find_sme(smmu, fwspec->ids[i], 0);
+		ret = arm_smmu_find_sme(smmu, sid, mask);
 		if (ret < 0)
 			goto out_err;
 
 		idx = ret;
 		if (smrs && smmu->s2crs[idx].count == 0) {
-			smrs[idx].id = fwspec->ids[i];
-			smrs[idx].mask = 0; /* We don't currently share SMRs */
+			smrs[idx].id = sid;
+			smrs[idx].mask = mask;
 			smrs[idx].valid = true;
 		}
 		smmu->s2crs[idx].count++;
@@ -1171,15 +1171,6 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 	u8 cbndx = smmu_domain->cfg.cbndx;
 	int i, idx;
 
-	/*
-	 * FIXME: This won't be needed once we have IOMMU-backed DMA ops
-	 * for all devices behind the SMMU. Note that we need to take
-	 * care configuring SMRs for devices both a platform_device and
-	 * and a PCI device (i.e. a PCI host controller)
-	 */
-	if (smmu_domain->domain.type == IOMMU_DOMAIN_DMA)
-		type = S2CR_TYPE_BYPASS;
-
 	for_each_cfg_sme(fwspec, i, idx) {
 		if (type == s2cr[idx].type && cbndx == s2cr[idx].cbndx)
 			continue;
@@ -1342,25 +1333,50 @@ static bool arm_smmu_capable(enum iommu_cap cap)
 	}
 }
 
+static int arm_smmu_match_node(struct device *dev, void *data)
+{
+	return dev->of_node == data;
+}
+
+static struct arm_smmu_device *arm_smmu_get_by_node(struct device_node *np)
+{
+	struct device *dev = driver_find_device(&arm_smmu_driver.driver, NULL,
+						np, arm_smmu_match_node);
+	put_device(dev);
+	return dev ? dev_get_drvdata(dev) : NULL;
+}
+
 static int arm_smmu_add_device(struct device *dev)
 {
 	struct arm_smmu_device *smmu;
 	struct arm_smmu_master_cfg *cfg;
-	struct iommu_fwspec *fwspec;
-	int i, ret;
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec(dev);
+	int i, ret = 0;
 
-	ret = arm_smmu_register_legacy_master(dev, &smmu);
-	fwspec = dev_iommu_fwspec(dev);
-	if (ret)
-		goto out_free;
+	if (fwspec) {
+		smmu = arm_smmu_get_by_node(fwspec->iommu_np);
+	} else {
+		if (!legacy_binding_used)
+			return -ENODEV;
+		ret = arm_smmu_register_legacy_master(dev, &smmu);
+		fwspec = dev_iommu_fwspec(dev);
+		if (ret)
+			goto out_free;
+	}
 
 	ret = -EINVAL;
 	for (i = 0; i < fwspec->num_ids; i++) {
 		u16 sid = fwspec->ids[i];
+		u16 mask = fwspec->ids[i] >> SMR_MASK_SHIFT;
 
 		if (sid & ~smmu->streamid_mask) {
 			dev_err(dev, "stream ID 0x%x out of range for SMMU (0x%x)\n",
-				sid, cfg->smmu->streamid_mask);
+				sid, smmu->streamid_mask);
+			goto out_free;
+		}
+		if (mask & ~smmu->smr_mask_mask) {
+			dev_err(dev, "SMR mask 0x%x out of range for SMMU (0x%x)\n",
+				sid, smmu->smr_mask_mask);
 			goto out_free;
 		}
 	}
@@ -1472,6 +1488,23 @@ out_unlock:
 	return ret;
 }
 
+static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
+{
+	u32 fwid = 0;
+	int ret = iommu_fwspec_init(dev, args->np);
+
+	if (ret)
+		return ret;
+
+	if (args->args_count > 0)
+		fwid |= (u16)args->args[0];
+
+	if (args->args_count > 1)
+		fwid |= (u16)args->args[1] << SMR_MASK_SHIFT;
+
+	return iommu_fwspec_add_ids(dev, &fwid, 1);
+}
+
 static struct iommu_ops arm_smmu_ops = {
 	.capable		= arm_smmu_capable,
 	.domain_alloc		= arm_smmu_domain_alloc,
@@ -1486,6 +1519,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.device_group		= arm_smmu_device_group,
 	.domain_get_attr	= arm_smmu_domain_get_attr,
 	.domain_set_attr	= arm_smmu_domain_set_attr,
+	.of_xlate		= arm_smmu_of_xlate,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 };
 
@@ -1920,9 +1954,29 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 		}
 	}
 
+	if (!legacy_binding_used &&
+	    of_find_property(dev->of_node, "mmu-masters", NULL)) {
+		pr_notice("Deprecated \"mmu-masters\" property in use; DMA API support unavailable.\n");
+		legacy_binding_used = true;
+	}
+
 	of_iommu_set_ops(dev->of_node, &arm_smmu_ops);
 	platform_set_drvdata(pdev, smmu);
 	arm_smmu_device_reset(smmu);
+
+	/* Oh, for a proper bus abstraction */
+	if (!iommu_present(&platform_bus_type))
+		bus_set_iommu(&platform_bus_type, &arm_smmu_ops);
+#ifdef CONFIG_ARM_AMBA
+	if (!iommu_present(&amba_bustype))
+		bus_set_iommu(&amba_bustype, &arm_smmu_ops);
+#endif
+#ifdef CONFIG_PCI
+	if (!iommu_present(&pci_bus_type)) {
+		pci_request_acs();
+		bus_set_iommu(&pci_bus_type, &arm_smmu_ops);
+	}
+#endif
 	return 0;
 }
 
@@ -1956,41 +2010,14 @@ static struct platform_driver arm_smmu_driver = {
 
 static int __init arm_smmu_init(void)
 {
-	struct device_node *np;
-	int ret;
+	static bool registered;
+	int ret = 0;
 
-	/*
-	 * Play nice with systems that don't have an ARM SMMU by checking that
-	 * an ARM SMMU exists in the system before proceeding with the driver
-	 * and IOMMU bus operation registration.
-	 */
-	np = of_find_matching_node(NULL, arm_smmu_of_match);
-	if (!np)
-		return 0;
-
-	of_node_put(np);
-
-	ret = platform_driver_register(&arm_smmu_driver);
-	if (ret)
-		return ret;
-
-	/* Oh, for a proper bus abstraction */
-	if (!iommu_present(&platform_bus_type))
-		bus_set_iommu(&platform_bus_type, &arm_smmu_ops);
-
-#ifdef CONFIG_ARM_AMBA
-	if (!iommu_present(&amba_bustype))
-		bus_set_iommu(&amba_bustype, &arm_smmu_ops);
-#endif
-
-#ifdef CONFIG_PCI
-	if (!iommu_present(&pci_bus_type)) {
-		pci_request_acs();
-		bus_set_iommu(&pci_bus_type, &arm_smmu_ops);
+	if (!registered) {
+		ret = platform_driver_register(&arm_smmu_driver);
+		registered = !ret;
 	}
-#endif
-
-	return 0;
+	return ret;
 }
 
 static void __exit arm_smmu_exit(void)
@@ -2001,6 +2028,25 @@ static void __exit arm_smmu_exit(void)
 subsys_initcall(arm_smmu_init);
 module_exit(arm_smmu_exit);
 
+static int __init arm_smmu_of_init(struct device_node *np)
+{
+	int ret = arm_smmu_init();
+
+	if (ret)
+		return ret;
+
+	if (!of_platform_device_create(np, NULL, platform_bus_type.dev_root))
+		return -ENODEV;
+
+	return 0;
+}
+IOMMU_OF_DECLARE(arm_smmuv1, "arm,smmu-v1", arm_smmu_of_init);
+IOMMU_OF_DECLARE(arm_smmuv2, "arm,smmu-v2", arm_smmu_of_init);
+IOMMU_OF_DECLARE(arm_mmu400, "arm,mmu-400", arm_smmu_of_init);
+IOMMU_OF_DECLARE(arm_mmu401, "arm,mmu-401", arm_smmu_of_init);
+IOMMU_OF_DECLARE(arm_mmu500, "arm,mmu-500", arm_smmu_of_init);
+IOMMU_OF_DECLARE(cavium_smmuv2, "cavium,smmu-v2", arm_smmu_of_init);
+
 MODULE_DESCRIPTION("IOMMU API for ARM architected SMMU implementations");
 MODULE_AUTHOR("Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>");
 MODULE_LICENSE("GPL v2");
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 18/19] iommu/arm-smmu: Set domain geometry
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (16 preceding siblings ...)
  2016-08-23 19:05   ` [PATCH v5 17/19] iommu/arm-smmu: Wire up generic configuration support Robin Murphy
@ 2016-08-23 19:05   ` Robin Murphy
       [not found]     ` <d6cedec16fe96a081ea2f9f27378dd1a6f406c72.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
  2016-08-23 19:15     ` Robin Murphy
                     ` (2 subsequent siblings)
  20 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

For non-aperture-based IOMMUs, the domain geometry seems to have become
the de-facto way of indicating the input address space size. That is
quite a useful thing from the users' perspective, so let's do the same.

Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/arm-smmu-v3.c | 2 ++
 drivers/iommu/arm-smmu.c    | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 72b996aa7460..9c56bd194dc2 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1574,6 +1574,8 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
 		return -ENOMEM;
 
 	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	domain->geometry.aperture_end = (1UL << ias) - 1;
+	domain->geometry.force_aperture = true;
 	smmu_domain->pgtbl_ops = pgtbl_ops;
 
 	ret = finalise_stage_fn(smmu_domain, &pgtbl_cfg);
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 85bc74d8fca0..112918d787eb 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -913,6 +913,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 
 	/* Update the domain's page sizes to reflect the page table format */
 	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	domain->geometry.aperture_end = (1UL << ias) - 1;
+	domain->geometry.force_aperture = true;
 
 	/* Initialise the context bank with our page table cfg */
 	arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 19/19] iommu/dma: Add support for mapping MSIs
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-08-23 19:05   ` Robin Murphy
  2016-08-23 19:05   ` [PATCH v5 02/19] of/irq: Break out msi-map lookup (again) Robin Murphy
                     ` (19 subsequent siblings)
  20 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro, will.deacon, iommu
  Cc: devicetree, lorenzo.pieralisi, jean-philippe.brucker,
	punit.agrawal, thunder.leizhen, eric.auger, Thomas Gleixner,
	Jason Cooper, Marc Zyngier, linux-kernel

When an MSI doorbell is located downstream of an IOMMU, attaching
devices to a DMA ops domain and switching on translation leads to a rude
shock when their attempt to write to the physical address returned by
the irqchip driver faults (or worse, writes into some already-mapped
buffer) and no interrupt is forthcoming.

Address this by adding a hook for relevant irqchip drivers to call from
their compose_msi_msg() callback, to swizzle the physical address with
an appropriatly-mapped IOVA for any device attached to one of our DMA
ops domains.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Jason Cooper <jason@lakedaemon.net>
CC: Marc Zyngier <marc.zyngier@arm.com>
CC: linux-kernel@vger.kernel.org
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/dma-iommu.c        | 141 ++++++++++++++++++++++++++++++++++-----
 drivers/irqchip/irq-gic-v2m.c    |   3 +
 drivers/irqchip/irq-gic-v3-its.c |   3 +
 include/linux/dma-iommu.h        |   9 +++
 4 files changed, 141 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 00c8a08d56e7..330cce60cad9 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -25,10 +25,29 @@
 #include <linux/huge_mm.h>
 #include <linux/iommu.h>
 #include <linux/iova.h>
+#include <linux/irq.h>
 #include <linux/mm.h>
 #include <linux/scatterlist.h>
 #include <linux/vmalloc.h>
 
+struct iommu_dma_msi_page {
+	struct list_head	list;
+	dma_addr_t		iova;
+	u32			phys_lo;
+	u32			phys_hi;
+};
+
+struct iommu_dma_cookie {
+	struct iova_domain	iovad;
+	struct list_head	msi_page_list;
+	spinlock_t		msi_lock;
+};
+
+static inline struct iova_domain *cookie_iovad(struct iommu_domain *domain)
+{
+	return &((struct iommu_dma_cookie *)domain->iova_cookie)->iovad;
+}
+
 int iommu_dma_init(void)
 {
 	return iova_cache_get();
@@ -43,15 +62,19 @@ int iommu_dma_init(void)
  */
 int iommu_get_dma_cookie(struct iommu_domain *domain)
 {
-	struct iova_domain *iovad;
+	struct iommu_dma_cookie *cookie;
 
 	if (domain->iova_cookie)
 		return -EEXIST;
 
-	iovad = kzalloc(sizeof(*iovad), GFP_KERNEL);
-	domain->iova_cookie = iovad;
+	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
+	if (!cookie)
+		return -ENOMEM;
 
-	return iovad ? 0 : -ENOMEM;
+	spin_lock_init(&cookie->msi_lock);
+	INIT_LIST_HEAD(&cookie->msi_page_list);
+	domain->iova_cookie = cookie;
+	return 0;
 }
 EXPORT_SYMBOL(iommu_get_dma_cookie);
 
@@ -63,14 +86,20 @@ EXPORT_SYMBOL(iommu_get_dma_cookie);
  */
 void iommu_put_dma_cookie(struct iommu_domain *domain)
 {
-	struct iova_domain *iovad = domain->iova_cookie;
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iommu_dma_msi_page *msi, *tmp;
 
-	if (!iovad)
+	if (!cookie)
 		return;
 
-	if (iovad->granule)
-		put_iova_domain(iovad);
-	kfree(iovad);
+	if (cookie->iovad.granule)
+		put_iova_domain(&cookie->iovad);
+
+	list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list, list) {
+		list_del(&msi->list);
+		kfree(msi);
+	}
+	kfree(cookie);
 	domain->iova_cookie = NULL;
 }
 EXPORT_SYMBOL(iommu_put_dma_cookie);
@@ -88,7 +117,7 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
  */
 int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 size)
 {
-	struct iova_domain *iovad = domain->iova_cookie;
+	struct iova_domain *iovad = cookie_iovad(domain);
 	unsigned long order, base_pfn, end_pfn;
 
 	if (!iovad)
@@ -155,7 +184,7 @@ int dma_direction_to_prot(enum dma_data_direction dir, bool coherent)
 static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
 		dma_addr_t dma_limit)
 {
-	struct iova_domain *iovad = domain->iova_cookie;
+	struct iova_domain *iovad = cookie_iovad(domain);
 	unsigned long shift = iova_shift(iovad);
 	unsigned long length = iova_align(iovad, size) >> shift;
 
@@ -171,7 +200,7 @@ static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
 /* The IOVA allocator knows what we mapped, so just unmap whatever that was */
 static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr)
 {
-	struct iova_domain *iovad = domain->iova_cookie;
+	struct iova_domain *iovad = cookie_iovad(domain);
 	unsigned long shift = iova_shift(iovad);
 	unsigned long pfn = dma_addr >> shift;
 	struct iova *iova = find_iova(iovad, pfn);
@@ -294,7 +323,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 		void (*flush_page)(struct device *, const void *, phys_addr_t))
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
-	struct iova_domain *iovad = domain->iova_cookie;
+	struct iova_domain *iovad = cookie_iovad(domain);
 	struct iova *iova;
 	struct page **pages;
 	struct sg_table sgt;
@@ -386,7 +415,7 @@ dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 {
 	dma_addr_t dma_addr;
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
-	struct iova_domain *iovad = domain->iova_cookie;
+	struct iova_domain *iovad = cookie_iovad(domain);
 	phys_addr_t phys = page_to_phys(page) + offset;
 	size_t iova_off = iova_offset(iovad, phys);
 	size_t len = iova_align(iovad, size + iova_off);
@@ -495,7 +524,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 		int nents, int prot)
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
-	struct iova_domain *iovad = domain->iova_cookie;
+	struct iova_domain *iovad = cookie_iovad(domain);
 	struct iova *iova;
 	struct scatterlist *s, *prev = NULL;
 	dma_addr_t dma_addr;
@@ -587,3 +616,85 @@ int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 {
 	return dma_addr == DMA_ERROR_CODE;
 }
+
+static int __iommu_dma_map_msi_page(struct device *dev, struct msi_msg *msg,
+		struct iommu_domain *domain, struct iommu_dma_msi_page **ppage)
+{
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iommu_dma_msi_page *msi_page;
+	struct iova_domain *iovad = &cookie->iovad;
+	struct iova *iova;
+	phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
+	int ret, prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+
+	msi_page = kzalloc(sizeof(*msi_page), GFP_ATOMIC);
+	if (!msi_page)
+		return -ENOMEM;
+
+	iova = __alloc_iova(domain, iovad->granule, dma_get_mask(dev));
+	if (!iova) {
+		ret = -ENOSPC;
+		goto out_free_page;
+	}
+
+	msi_page->iova = iova_dma_addr(iovad, iova);
+	ret = iommu_map(domain, msi_page->iova, msi_addr & ~iova_mask(iovad),
+			iovad->granule, prot);
+	if (ret)
+		goto out_free_iova;
+
+	msi_page->phys_hi = msg->address_hi;
+	msi_page->phys_lo = msg->address_lo;
+	INIT_LIST_HEAD(&msi_page->list);
+	list_add(&msi_page->list, &cookie->msi_page_list);
+	*ppage = msi_page;
+	return 0;
+
+out_free_iova:
+	__free_iova(iovad, iova);
+out_free_page:
+	kfree(msi_page);
+	return ret;
+}
+
+void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+{
+	struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
+	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+	struct iova_domain *iovad;
+	struct iommu_dma_cookie *cookie;
+	struct iommu_dma_msi_page *msi_page;
+	int ret = 0;
+
+	if (!domain || !domain->iova_cookie)
+		return;
+
+	cookie = domain->iova_cookie;
+	iovad = &cookie->iovad;
+
+	spin_lock(&cookie->msi_lock);
+	list_for_each_entry(msi_page, &cookie->msi_page_list, list)
+		if (msi_page->phys_hi == msg->address_hi &&
+		    msi_page->phys_lo - msg->address_lo < iovad->granule)
+			goto unlock;
+
+	ret = __iommu_dma_map_msi_page(dev, msg, domain, &msi_page);
+unlock:
+	spin_unlock(&cookie->msi_lock);
+
+	if (!ret) {
+		msg->address_hi = upper_32_bits(msi_page->iova);
+		msg->address_lo &= iova_mask(iovad);
+		msg->address_lo += lower_32_bits(msi_page->iova);
+	} else {
+		/*
+		 * We're called from a void callback, so the best we can do is
+		 * 'fail' by filling the message with obviously bogus values.
+		 * Since we got this far due to an IOMMU being present, it's
+		 * not like the existing address would have worked anyway...
+		 */
+		msg->address_hi = ~0U;
+		msg->address_lo = ~0U;
+		msg->data = ~0U;
+	}
+}
diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
index 35eb7ac5d21f..863e073c6f7f 100644
--- a/drivers/irqchip/irq-gic-v2m.c
+++ b/drivers/irqchip/irq-gic-v2m.c
@@ -16,6 +16,7 @@
 #define pr_fmt(fmt) "GICv2m: " fmt
 
 #include <linux/acpi.h>
+#include <linux/dma-iommu.h>
 #include <linux/irq.h>
 #include <linux/irqdomain.h>
 #include <linux/kernel.h>
@@ -108,6 +109,8 @@ static void gicv2m_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 
 	if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
 		msg->data -= v2m->spi_offset;
+
+	iommu_dma_map_msi_msg(data->irq, msg);
 }
 
 static struct irq_chip gicv2m_irq_chip = {
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 7ceaba81efb4..73f4f10dc204 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -18,6 +18,7 @@
 #include <linux/bitmap.h>
 #include <linux/cpu.h>
 #include <linux/delay.h>
+#include <linux/dma-iommu.h>
 #include <linux/interrupt.h>
 #include <linux/log2.h>
 #include <linux/mm.h>
@@ -655,6 +656,8 @@ static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
 	msg->address_lo		= addr & ((1UL << 32) - 1);
 	msg->address_hi		= addr >> 32;
 	msg->data		= its_get_event_id(d);
+
+	iommu_dma_map_msi_msg(d->irq, msg);
 }
 
 static struct irq_chip its_irq_chip = {
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 81c5c8d167ad..5ee806e41b5c 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -21,6 +21,7 @@
 
 #ifdef CONFIG_IOMMU_DMA
 #include <linux/iommu.h>
+#include <linux/msi.h>
 
 int iommu_dma_init(void);
 
@@ -62,9 +63,13 @@ void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
 int iommu_dma_supported(struct device *dev, u64 mask);
 int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr);
 
+/* The DMA API isn't _quite_ the whole story, though... */
+void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
+
 #else
 
 struct iommu_domain;
+struct msi_msg;
 
 static inline int iommu_dma_init(void)
 {
@@ -80,6 +85,10 @@ static inline void iommu_put_dma_cookie(struct iommu_domain *domain)
 {
 }
 
+static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+{
+}
+
 #endif	/* CONFIG_IOMMU_DMA */
 #endif	/* __KERNEL__ */
 #endif	/* __DMA_IOMMU_H */
-- 
2.8.1.dirty

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v5 19/19] iommu/dma: Add support for mapping MSIs
@ 2016-08-23 19:05   ` Robin Murphy
  0 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:05 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA, Thomas Gleixner, Jason Cooper,
	Marc Zyngier, linux-kernel-u79uwXL29TY76Z2rM5mHXA

When an MSI doorbell is located downstream of an IOMMU, attaching
devices to a DMA ops domain and switching on translation leads to a rude
shock when their attempt to write to the physical address returned by
the irqchip driver faults (or worse, writes into some already-mapped
buffer) and no interrupt is forthcoming.

Address this by adding a hook for relevant irqchip drivers to call from
their compose_msi_msg() callback, to swizzle the physical address with
an appropriatly-mapped IOVA for any device attached to one of our DMA
ops domains.

CC: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
CC: Jason Cooper <jason-NLaQJdtUoK4Be96aLqz0jA@public.gmane.org>
CC: Marc Zyngier <marc.zyngier-5wv7dgnIgG8@public.gmane.org>
CC: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
---
 drivers/iommu/dma-iommu.c        | 141 ++++++++++++++++++++++++++++++++++-----
 drivers/irqchip/irq-gic-v2m.c    |   3 +
 drivers/irqchip/irq-gic-v3-its.c |   3 +
 include/linux/dma-iommu.h        |   9 +++
 4 files changed, 141 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 00c8a08d56e7..330cce60cad9 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -25,10 +25,29 @@
 #include <linux/huge_mm.h>
 #include <linux/iommu.h>
 #include <linux/iova.h>
+#include <linux/irq.h>
 #include <linux/mm.h>
 #include <linux/scatterlist.h>
 #include <linux/vmalloc.h>
 
+struct iommu_dma_msi_page {
+	struct list_head	list;
+	dma_addr_t		iova;
+	u32			phys_lo;
+	u32			phys_hi;
+};
+
+struct iommu_dma_cookie {
+	struct iova_domain	iovad;
+	struct list_head	msi_page_list;
+	spinlock_t		msi_lock;
+};
+
+static inline struct iova_domain *cookie_iovad(struct iommu_domain *domain)
+{
+	return &((struct iommu_dma_cookie *)domain->iova_cookie)->iovad;
+}
+
 int iommu_dma_init(void)
 {
 	return iova_cache_get();
@@ -43,15 +62,19 @@ int iommu_dma_init(void)
  */
 int iommu_get_dma_cookie(struct iommu_domain *domain)
 {
-	struct iova_domain *iovad;
+	struct iommu_dma_cookie *cookie;
 
 	if (domain->iova_cookie)
 		return -EEXIST;
 
-	iovad = kzalloc(sizeof(*iovad), GFP_KERNEL);
-	domain->iova_cookie = iovad;
+	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
+	if (!cookie)
+		return -ENOMEM;
 
-	return iovad ? 0 : -ENOMEM;
+	spin_lock_init(&cookie->msi_lock);
+	INIT_LIST_HEAD(&cookie->msi_page_list);
+	domain->iova_cookie = cookie;
+	return 0;
 }
 EXPORT_SYMBOL(iommu_get_dma_cookie);
 
@@ -63,14 +86,20 @@ EXPORT_SYMBOL(iommu_get_dma_cookie);
  */
 void iommu_put_dma_cookie(struct iommu_domain *domain)
 {
-	struct iova_domain *iovad = domain->iova_cookie;
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iommu_dma_msi_page *msi, *tmp;
 
-	if (!iovad)
+	if (!cookie)
 		return;
 
-	if (iovad->granule)
-		put_iova_domain(iovad);
-	kfree(iovad);
+	if (cookie->iovad.granule)
+		put_iova_domain(&cookie->iovad);
+
+	list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list, list) {
+		list_del(&msi->list);
+		kfree(msi);
+	}
+	kfree(cookie);
 	domain->iova_cookie = NULL;
 }
 EXPORT_SYMBOL(iommu_put_dma_cookie);
@@ -88,7 +117,7 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
  */
 int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 size)
 {
-	struct iova_domain *iovad = domain->iova_cookie;
+	struct iova_domain *iovad = cookie_iovad(domain);
 	unsigned long order, base_pfn, end_pfn;
 
 	if (!iovad)
@@ -155,7 +184,7 @@ int dma_direction_to_prot(enum dma_data_direction dir, bool coherent)
 static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
 		dma_addr_t dma_limit)
 {
-	struct iova_domain *iovad = domain->iova_cookie;
+	struct iova_domain *iovad = cookie_iovad(domain);
 	unsigned long shift = iova_shift(iovad);
 	unsigned long length = iova_align(iovad, size) >> shift;
 
@@ -171,7 +200,7 @@ static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
 /* The IOVA allocator knows what we mapped, so just unmap whatever that was */
 static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr)
 {
-	struct iova_domain *iovad = domain->iova_cookie;
+	struct iova_domain *iovad = cookie_iovad(domain);
 	unsigned long shift = iova_shift(iovad);
 	unsigned long pfn = dma_addr >> shift;
 	struct iova *iova = find_iova(iovad, pfn);
@@ -294,7 +323,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 		void (*flush_page)(struct device *, const void *, phys_addr_t))
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
-	struct iova_domain *iovad = domain->iova_cookie;
+	struct iova_domain *iovad = cookie_iovad(domain);
 	struct iova *iova;
 	struct page **pages;
 	struct sg_table sgt;
@@ -386,7 +415,7 @@ dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 {
 	dma_addr_t dma_addr;
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
-	struct iova_domain *iovad = domain->iova_cookie;
+	struct iova_domain *iovad = cookie_iovad(domain);
 	phys_addr_t phys = page_to_phys(page) + offset;
 	size_t iova_off = iova_offset(iovad, phys);
 	size_t len = iova_align(iovad, size + iova_off);
@@ -495,7 +524,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 		int nents, int prot)
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
-	struct iova_domain *iovad = domain->iova_cookie;
+	struct iova_domain *iovad = cookie_iovad(domain);
 	struct iova *iova;
 	struct scatterlist *s, *prev = NULL;
 	dma_addr_t dma_addr;
@@ -587,3 +616,85 @@ int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 {
 	return dma_addr == DMA_ERROR_CODE;
 }
+
+static int __iommu_dma_map_msi_page(struct device *dev, struct msi_msg *msg,
+		struct iommu_domain *domain, struct iommu_dma_msi_page **ppage)
+{
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iommu_dma_msi_page *msi_page;
+	struct iova_domain *iovad = &cookie->iovad;
+	struct iova *iova;
+	phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
+	int ret, prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+
+	msi_page = kzalloc(sizeof(*msi_page), GFP_ATOMIC);
+	if (!msi_page)
+		return -ENOMEM;
+
+	iova = __alloc_iova(domain, iovad->granule, dma_get_mask(dev));
+	if (!iova) {
+		ret = -ENOSPC;
+		goto out_free_page;
+	}
+
+	msi_page->iova = iova_dma_addr(iovad, iova);
+	ret = iommu_map(domain, msi_page->iova, msi_addr & ~iova_mask(iovad),
+			iovad->granule, prot);
+	if (ret)
+		goto out_free_iova;
+
+	msi_page->phys_hi = msg->address_hi;
+	msi_page->phys_lo = msg->address_lo;
+	INIT_LIST_HEAD(&msi_page->list);
+	list_add(&msi_page->list, &cookie->msi_page_list);
+	*ppage = msi_page;
+	return 0;
+
+out_free_iova:
+	__free_iova(iovad, iova);
+out_free_page:
+	kfree(msi_page);
+	return ret;
+}
+
+void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+{
+	struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
+	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
+	struct iova_domain *iovad;
+	struct iommu_dma_cookie *cookie;
+	struct iommu_dma_msi_page *msi_page;
+	int ret = 0;
+
+	if (!domain || !domain->iova_cookie)
+		return;
+
+	cookie = domain->iova_cookie;
+	iovad = &cookie->iovad;
+
+	spin_lock(&cookie->msi_lock);
+	list_for_each_entry(msi_page, &cookie->msi_page_list, list)
+		if (msi_page->phys_hi == msg->address_hi &&
+		    msi_page->phys_lo - msg->address_lo < iovad->granule)
+			goto unlock;
+
+	ret = __iommu_dma_map_msi_page(dev, msg, domain, &msi_page);
+unlock:
+	spin_unlock(&cookie->msi_lock);
+
+	if (!ret) {
+		msg->address_hi = upper_32_bits(msi_page->iova);
+		msg->address_lo &= iova_mask(iovad);
+		msg->address_lo += lower_32_bits(msi_page->iova);
+	} else {
+		/*
+		 * We're called from a void callback, so the best we can do is
+		 * 'fail' by filling the message with obviously bogus values.
+		 * Since we got this far due to an IOMMU being present, it's
+		 * not like the existing address would have worked anyway...
+		 */
+		msg->address_hi = ~0U;
+		msg->address_lo = ~0U;
+		msg->data = ~0U;
+	}
+}
diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
index 35eb7ac5d21f..863e073c6f7f 100644
--- a/drivers/irqchip/irq-gic-v2m.c
+++ b/drivers/irqchip/irq-gic-v2m.c
@@ -16,6 +16,7 @@
 #define pr_fmt(fmt) "GICv2m: " fmt
 
 #include <linux/acpi.h>
+#include <linux/dma-iommu.h>
 #include <linux/irq.h>
 #include <linux/irqdomain.h>
 #include <linux/kernel.h>
@@ -108,6 +109,8 @@ static void gicv2m_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 
 	if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
 		msg->data -= v2m->spi_offset;
+
+	iommu_dma_map_msi_msg(data->irq, msg);
 }
 
 static struct irq_chip gicv2m_irq_chip = {
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 7ceaba81efb4..73f4f10dc204 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -18,6 +18,7 @@
 #include <linux/bitmap.h>
 #include <linux/cpu.h>
 #include <linux/delay.h>
+#include <linux/dma-iommu.h>
 #include <linux/interrupt.h>
 #include <linux/log2.h>
 #include <linux/mm.h>
@@ -655,6 +656,8 @@ static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
 	msg->address_lo		= addr & ((1UL << 32) - 1);
 	msg->address_hi		= addr >> 32;
 	msg->data		= its_get_event_id(d);
+
+	iommu_dma_map_msi_msg(d->irq, msg);
 }
 
 static struct irq_chip its_irq_chip = {
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 81c5c8d167ad..5ee806e41b5c 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -21,6 +21,7 @@
 
 #ifdef CONFIG_IOMMU_DMA
 #include <linux/iommu.h>
+#include <linux/msi.h>
 
 int iommu_dma_init(void);
 
@@ -62,9 +63,13 @@ void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
 int iommu_dma_supported(struct device *dev, u64 mask);
 int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr);
 
+/* The DMA API isn't _quite_ the whole story, though... */
+void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
+
 #else
 
 struct iommu_domain;
+struct msi_msg;
 
 static inline int iommu_dma_init(void)
 {
@@ -80,6 +85,10 @@ static inline void iommu_put_dma_cookie(struct iommu_domain *domain)
 {
 }
 
+static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
+{
+}
+
 #endif	/* CONFIG_IOMMU_DMA */
 #endif	/* __KERNEL__ */
 #endif	/* __DMA_IOMMU_H */
-- 
2.8.1.dirty

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 00/19] Generic DT bindings for PCI IOMMUs and ARM SMMU
  2016-08-23 19:05 [PATCH v5 00/19] Generic DT bindings for PCI IOMMUs and ARM SMMU Robin Murphy
@ 2016-08-23 19:15     ` Robin Murphy
  2016-08-23 19:05   ` Robin Murphy
  1 sibling, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:15 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
  Cc: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On 23/08/16 20:05, Robin Murphy wrote:
> Hi all,

Oh bums, looks like I managed to miss LAKML off the CC list. If anyone
there is interested, it's over here:

https://lists.linuxfoundation.org/pipermail/iommu/2016-August/018230.html

Robin.

> 
> At long last I've finished the big SMMUv2 rework, so here's everything
> all together for a v5. As a quick breakdown:
> 
> Patches 1-3 are the core PCI part, all acked and ready to go. No code
> changes from v4.
> 
> Patch 4 is merely bugfixed from v4 for simplicity, as I've not yet
> managed to take as close a look at Lorenzo's follow-on work as I'd like.
> 
> Patches 5-7 (SMMUv3) are mostly unchanged beyond a slight tweak to #5.
> 
> Patches 8-17 are the all-new SMMUv2 rework.
> 
> Patch 18 goes along with the fix already in 4.8-rc3 to help avoid 64-bit
> DMA masks going wrong now that DMA ops will be enabled.
> 
> Finally, patch 19 addresses the previous problem of having to choose
> between DMA ops or working MSIs. This is currently at the end as
> moving it before #17 would require a further interim SMMUv2 patch, and
> a 19-patch series is already quite enough...
> 
> I've pushed out a branch based on iommu/next to the usual place:
> 
> git://linux-arm.org/linux-rm iommu/generic-v5
> 
> Thanks,
> Robin.
> ---
> 
> Mark Rutland (1):
>   Docs: dt: add PCI IOMMU map bindings
> 
> Robin Murphy (18):
>   of/irq: Break out msi-map lookup (again)
>   iommu/of: Handle iommu-map property for PCI
>   iommu/of: Introduce iommu_fwspec
>   iommu/arm-smmu: Implement of_xlate() for SMMUv3
>   iommu/arm-smmu: Support non-PCI devices with SMMUv3
>   iommu/arm-smmu: Set PRIVCFG in stage 1 STEs
>   iommu/arm-smmu: Handle stream IDs more dynamically
>   iommu/arm-smmu: Consolidate stream map entry state
>   iommu/arm-smmu: Keep track of S2CR state
>   iommu/arm-smmu: Refactor mmu-masters handling
>   iommu/arm-smmu: Streamline SMMU data lookups
>   iommu/arm-smmu: Add a stream map entry iterator
>   iommu/arm-smmu: Intelligent SMR allocation
>   iommu/arm-smmu: Convert to iommu_fwspec
>   Docs: dt: document ARM SMMU generic binding usage
>   iommu/arm-smmu: Wire up generic configuration support
>   iommu/arm-smmu: Set domain geometry
>   iommu/dma: Add support for mapping MSIs
> 
>  .../devicetree/bindings/iommu/arm,smmu.txt         |  63 +-
>  .../devicetree/bindings/pci/pci-iommu.txt          | 171 ++++
>  drivers/iommu/Kconfig                              |   2 +-
>  drivers/iommu/arm-smmu-v3.c                        | 347 ++++----
>  drivers/iommu/arm-smmu.c                           | 952 ++++++++++-----------
>  drivers/iommu/dma-iommu.c                          | 141 ++-
>  drivers/iommu/of_iommu.c                           |  95 +-
>  drivers/irqchip/irq-gic-v2m.c                      |   3 +
>  drivers/irqchip/irq-gic-v3-its.c                   |   3 +
>  drivers/of/irq.c                                   |  78 +-
>  drivers/of/of_pci.c                                | 102 +++
>  include/linux/dma-iommu.h                          |   9 +
>  include/linux/of_iommu.h                           |  15 +
>  include/linux/of_pci.h                             |  10 +
>  14 files changed, 1208 insertions(+), 783 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/pci/pci-iommu.txt
> 

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v5 00/19] Generic DT bindings for PCI IOMMUs and ARM SMMU
@ 2016-08-23 19:15     ` Robin Murphy
  0 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-08-23 19:15 UTC (permalink / raw)
  To: linux-arm-kernel

On 23/08/16 20:05, Robin Murphy wrote:
> Hi all,

Oh bums, looks like I managed to miss LAKML off the CC list. If anyone
there is interested, it's over here:

https://lists.linuxfoundation.org/pipermail/iommu/2016-August/018230.html

Robin.

> 
> At long last I've finished the big SMMUv2 rework, so here's everything
> all together for a v5. As a quick breakdown:
> 
> Patches 1-3 are the core PCI part, all acked and ready to go. No code
> changes from v4.
> 
> Patch 4 is merely bugfixed from v4 for simplicity, as I've not yet
> managed to take as close a look at Lorenzo's follow-on work as I'd like.
> 
> Patches 5-7 (SMMUv3) are mostly unchanged beyond a slight tweak to #5.
> 
> Patches 8-17 are the all-new SMMUv2 rework.
> 
> Patch 18 goes along with the fix already in 4.8-rc3 to help avoid 64-bit
> DMA masks going wrong now that DMA ops will be enabled.
> 
> Finally, patch 19 addresses the previous problem of having to choose
> between DMA ops or working MSIs. This is currently at the end as
> moving it before #17 would require a further interim SMMUv2 patch, and
> a 19-patch series is already quite enough...
> 
> I've pushed out a branch based on iommu/next to the usual place:
> 
> git://linux-arm.org/linux-rm iommu/generic-v5
> 
> Thanks,
> Robin.
> ---
> 
> Mark Rutland (1):
>   Docs: dt: add PCI IOMMU map bindings
> 
> Robin Murphy (18):
>   of/irq: Break out msi-map lookup (again)
>   iommu/of: Handle iommu-map property for PCI
>   iommu/of: Introduce iommu_fwspec
>   iommu/arm-smmu: Implement of_xlate() for SMMUv3
>   iommu/arm-smmu: Support non-PCI devices with SMMUv3
>   iommu/arm-smmu: Set PRIVCFG in stage 1 STEs
>   iommu/arm-smmu: Handle stream IDs more dynamically
>   iommu/arm-smmu: Consolidate stream map entry state
>   iommu/arm-smmu: Keep track of S2CR state
>   iommu/arm-smmu: Refactor mmu-masters handling
>   iommu/arm-smmu: Streamline SMMU data lookups
>   iommu/arm-smmu: Add a stream map entry iterator
>   iommu/arm-smmu: Intelligent SMR allocation
>   iommu/arm-smmu: Convert to iommu_fwspec
>   Docs: dt: document ARM SMMU generic binding usage
>   iommu/arm-smmu: Wire up generic configuration support
>   iommu/arm-smmu: Set domain geometry
>   iommu/dma: Add support for mapping MSIs
> 
>  .../devicetree/bindings/iommu/arm,smmu.txt         |  63 +-
>  .../devicetree/bindings/pci/pci-iommu.txt          | 171 ++++
>  drivers/iommu/Kconfig                              |   2 +-
>  drivers/iommu/arm-smmu-v3.c                        | 347 ++++----
>  drivers/iommu/arm-smmu.c                           | 952 ++++++++++-----------
>  drivers/iommu/dma-iommu.c                          | 141 ++-
>  drivers/iommu/of_iommu.c                           |  95 +-
>  drivers/irqchip/irq-gic-v2m.c                      |   3 +
>  drivers/irqchip/irq-gic-v3-its.c                   |   3 +
>  drivers/of/irq.c                                   |  78 +-
>  drivers/of/of_pci.c                                | 102 +++
>  include/linux/dma-iommu.h                          |   9 +
>  include/linux/of_iommu.h                           |  15 +
>  include/linux/of_pci.h                             |  10 +
>  14 files changed, 1208 insertions(+), 783 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/pci/pci-iommu.txt
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 19/19] iommu/dma: Add support for mapping MSIs
@ 2016-08-24  8:16     ` Thomas Gleixner
  0 siblings, 0 replies; 61+ messages in thread
From: Thomas Gleixner @ 2016-08-24  8:16 UTC (permalink / raw)
  To: Robin Murphy
  Cc: joro, will.deacon, iommu, devicetree, lorenzo.pieralisi,
	jean-philippe.brucker, punit.agrawal, thunder.leizhen,
	eric.auger, Jason Cooper, Marc Zyngier, linux-kernel

On Tue, 23 Aug 2016, Robin Murphy wrote:
> +	cookie = domain->iova_cookie;
> +	iovad = &cookie->iovad;
> +
> +	spin_lock(&cookie->msi_lock);
> +	list_for_each_entry(msi_page, &cookie->msi_page_list, list)
> +		if (msi_page->phys_hi == msg->address_hi &&
> +		    msi_page->phys_lo - msg->address_lo < iovad->granule)
> +			goto unlock;
> +
> +	ret = __iommu_dma_map_msi_page(dev, msg, domain, &msi_page);
> +unlock:
> +	spin_unlock(&cookie->msi_lock);
> +
> +	if (!ret) {
> +		msg->address_hi = upper_32_bits(msi_page->iova);
> +		msg->address_lo &= iova_mask(iovad);
> +		msg->address_lo += lower_32_bits(msi_page->iova);
> +	} else {
> +		/*
> +		 * We're called from a void callback, so the best we can do is
> +		 * 'fail' by filling the message with obviously bogus values.
> +		 * Since we got this far due to an IOMMU being present, it's
> +		 * not like the existing address would have worked anyway...
> +		 */
> +		msg->address_hi = ~0U;
> +		msg->address_lo = ~0U;
> +		msg->data = ~0U;
> +	}

The above is really horrible to parse. I had to read it five times to
understand the logic.

static struct iommu_dma_msi_page *
find_or_map_msi_page(struct iommu_dma_cookie *cookie, struct msi_msg *msg)
{
	struct iova_domain *iovad = &cookie->iovad;
       	struct iommu_dma_msi_page *page;

	list_for_each_entry(*page, &cookie->msi_page_list, list) {
		if (page->phys_hi == msg->address_hi &&
		    page->phys_lo - msg->address_lo < iovad->granule)
		    	return page;
	}

	/*
	 * FIXME: __iommu_dma_map_msi_page() should return a page or NULL.
	 * The integer return value is pretty pointless. If seperate error
	 * codes are required that's what ERR_PTR() is for ....
	 */
	ret = __iommu_dma_map_msi_page(dev, msg, domain, &page);
	return ret ? ERR_PTR(ret) : page;
}

So now the code in iommu_dma_map_msi_msg() becomes:

	spin_lock(&cookie->msi_lock);
	msi_page = find_or_map_msi_page(cookie, msg);
	spin_unlock(&cookie->msi_lock);

	if (!IS_ERR_OR_NULL(msi_page)) {
		msg->address_hi = upper_32_bits(msi_page->iova);
		msg->address_lo &= iova_mask(iovad);
		msg->address_lo += lower_32_bits(msi_page->iova);
	} else {
		/*
		 * We're called from a void callback, so the best we can do is
		 * 'fail' by filling the message with obviously bogus values.
		 * Since we got this far due to an IOMMU being present, it's
		 * not like the existing address would have worked anyway...
		 */
		msg->address_hi = ~0U;
		msg->address_lo = ~0U;
		msg->data = ~0U;
	}

Hmm? 

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 19/19] iommu/dma: Add support for mapping MSIs
@ 2016-08-24  8:16     ` Thomas Gleixner
  0 siblings, 0 replies; 61+ messages in thread
From: Thomas Gleixner @ 2016-08-24  8:16 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, Jason Cooper,
	punit.agrawal-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA, Marc Zyngier

On Tue, 23 Aug 2016, Robin Murphy wrote:
> +	cookie = domain->iova_cookie;
> +	iovad = &cookie->iovad;
> +
> +	spin_lock(&cookie->msi_lock);
> +	list_for_each_entry(msi_page, &cookie->msi_page_list, list)
> +		if (msi_page->phys_hi == msg->address_hi &&
> +		    msi_page->phys_lo - msg->address_lo < iovad->granule)
> +			goto unlock;
> +
> +	ret = __iommu_dma_map_msi_page(dev, msg, domain, &msi_page);
> +unlock:
> +	spin_unlock(&cookie->msi_lock);
> +
> +	if (!ret) {
> +		msg->address_hi = upper_32_bits(msi_page->iova);
> +		msg->address_lo &= iova_mask(iovad);
> +		msg->address_lo += lower_32_bits(msi_page->iova);
> +	} else {
> +		/*
> +		 * We're called from a void callback, so the best we can do is
> +		 * 'fail' by filling the message with obviously bogus values.
> +		 * Since we got this far due to an IOMMU being present, it's
> +		 * not like the existing address would have worked anyway...
> +		 */
> +		msg->address_hi = ~0U;
> +		msg->address_lo = ~0U;
> +		msg->data = ~0U;
> +	}

The above is really horrible to parse. I had to read it five times to
understand the logic.

static struct iommu_dma_msi_page *
find_or_map_msi_page(struct iommu_dma_cookie *cookie, struct msi_msg *msg)
{
	struct iova_domain *iovad = &cookie->iovad;
       	struct iommu_dma_msi_page *page;

	list_for_each_entry(*page, &cookie->msi_page_list, list) {
		if (page->phys_hi == msg->address_hi &&
		    page->phys_lo - msg->address_lo < iovad->granule)
		    	return page;
	}

	/*
	 * FIXME: __iommu_dma_map_msi_page() should return a page or NULL.
	 * The integer return value is pretty pointless. If seperate error
	 * codes are required that's what ERR_PTR() is for ....
	 */
	ret = __iommu_dma_map_msi_page(dev, msg, domain, &page);
	return ret ? ERR_PTR(ret) : page;
}

So now the code in iommu_dma_map_msi_msg() becomes:

	spin_lock(&cookie->msi_lock);
	msi_page = find_or_map_msi_page(cookie, msg);
	spin_unlock(&cookie->msi_lock);

	if (!IS_ERR_OR_NULL(msi_page)) {
		msg->address_hi = upper_32_bits(msi_page->iova);
		msg->address_lo &= iova_mask(iovad);
		msg->address_lo += lower_32_bits(msi_page->iova);
	} else {
		/*
		 * We're called from a void callback, so the best we can do is
		 * 'fail' by filling the message with obviously bogus values.
		 * Since we got this far due to an IOMMU being present, it's
		 * not like the existing address would have worked anyway...
		 */
		msg->address_hi = ~0U;
		msg->address_lo = ~0U;
		msg->data = ~0U;
	}

Hmm? 

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 19/19] iommu/dma: Add support for mapping MSIs
  2016-08-24  8:16     ` Thomas Gleixner
@ 2016-08-24 10:06       ` Robin Murphy
  -1 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-08-24 10:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: joro, will.deacon, iommu, devicetree, lorenzo.pieralisi,
	jean-philippe.brucker, punit.agrawal, thunder.leizhen,
	eric.auger, Jason Cooper, Marc Zyngier, linux-kernel

On 24/08/16 09:16, Thomas Gleixner wrote:
> On Tue, 23 Aug 2016, Robin Murphy wrote:
>> +	cookie = domain->iova_cookie;
>> +	iovad = &cookie->iovad;
>> +
>> +	spin_lock(&cookie->msi_lock);
>> +	list_for_each_entry(msi_page, &cookie->msi_page_list, list)
>> +		if (msi_page->phys_hi == msg->address_hi &&
>> +		    msi_page->phys_lo - msg->address_lo < iovad->granule)
>> +			goto unlock;
>> +
>> +	ret = __iommu_dma_map_msi_page(dev, msg, domain, &msi_page);
>> +unlock:
>> +	spin_unlock(&cookie->msi_lock);
>> +
>> +	if (!ret) {
>> +		msg->address_hi = upper_32_bits(msi_page->iova);
>> +		msg->address_lo &= iova_mask(iovad);
>> +		msg->address_lo += lower_32_bits(msi_page->iova);
>> +	} else {
>> +		/*
>> +		 * We're called from a void callback, so the best we can do is
>> +		 * 'fail' by filling the message with obviously bogus values.
>> +		 * Since we got this far due to an IOMMU being present, it's
>> +		 * not like the existing address would have worked anyway...
>> +		 */
>> +		msg->address_hi = ~0U;
>> +		msg->address_lo = ~0U;
>> +		msg->data = ~0U;
>> +	}
> 
> The above is really horrible to parse. I had to read it five times to
> understand the logic.

Yeah, on reflection it is needlessly hideous. I think we should take
this as a clear lesson that whenever you find yourself thinking "Man, I
wish I had Python's for...else construct here", you're doing it wrong ;)

> static struct iommu_dma_msi_page *
> find_or_map_msi_page(struct iommu_dma_cookie *cookie, struct msi_msg *msg)
> {
> 	struct iova_domain *iovad = &cookie->iovad;
>        	struct iommu_dma_msi_page *page;
> 
> 	list_for_each_entry(*page, &cookie->msi_page_list, list) {
> 		if (page->phys_hi == msg->address_hi &&
> 		    page->phys_lo - msg->address_lo < iovad->granule)
> 		    	return page;
> 	}
> 
> 	/*
> 	 * FIXME: __iommu_dma_map_msi_page() should return a page or NULL.
> 	 * The integer return value is pretty pointless. If seperate error
> 	 * codes are required that's what ERR_PTR() is for ....
> 	 */
> 	ret = __iommu_dma_map_msi_page(dev, msg, domain, &page);
> 	return ret ? ERR_PTR(ret) : page;
> }
> 
> So now the code in iommu_dma_map_msi_msg() becomes:
> 
> 	spin_lock(&cookie->msi_lock);
> 	msi_page = find_or_map_msi_page(cookie, msg);
> 	spin_unlock(&cookie->msi_lock);
> 
> 	if (!IS_ERR_OR_NULL(msi_page)) {
> 		msg->address_hi = upper_32_bits(msi_page->iova);
> 		msg->address_lo &= iova_mask(iovad);
> 		msg->address_lo += lower_32_bits(msi_page->iova);
> 	} else {
> 		/*
> 		 * We're called from a void callback, so the best we can do is
> 		 * 'fail' by filling the message with obviously bogus values.
> 		 * Since we got this far due to an IOMMU being present, it's
> 		 * not like the existing address would have worked anyway...
> 		 */
> 		msg->address_hi = ~0U;
> 		msg->address_lo = ~0U;
> 		msg->data = ~0U;
> 	}
> 
> Hmm? 

OK, I've turned map_msi_page into get_msi_page (returning a page) and
just hoisted the list lookup into that, which leads to knock-on
simplifications throughout and is _much_ nicer. I now can't imagine why
I didn't get that far in the first place - thanks for the reality check!

Robin.

> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 19/19] iommu/dma: Add support for mapping MSIs
@ 2016-08-24 10:06       ` Robin Murphy
  0 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-08-24 10:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, Jason Cooper,
	punit.agrawal-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA, Marc Zyngier

On 24/08/16 09:16, Thomas Gleixner wrote:
> On Tue, 23 Aug 2016, Robin Murphy wrote:
>> +	cookie = domain->iova_cookie;
>> +	iovad = &cookie->iovad;
>> +
>> +	spin_lock(&cookie->msi_lock);
>> +	list_for_each_entry(msi_page, &cookie->msi_page_list, list)
>> +		if (msi_page->phys_hi == msg->address_hi &&
>> +		    msi_page->phys_lo - msg->address_lo < iovad->granule)
>> +			goto unlock;
>> +
>> +	ret = __iommu_dma_map_msi_page(dev, msg, domain, &msi_page);
>> +unlock:
>> +	spin_unlock(&cookie->msi_lock);
>> +
>> +	if (!ret) {
>> +		msg->address_hi = upper_32_bits(msi_page->iova);
>> +		msg->address_lo &= iova_mask(iovad);
>> +		msg->address_lo += lower_32_bits(msi_page->iova);
>> +	} else {
>> +		/*
>> +		 * We're called from a void callback, so the best we can do is
>> +		 * 'fail' by filling the message with obviously bogus values.
>> +		 * Since we got this far due to an IOMMU being present, it's
>> +		 * not like the existing address would have worked anyway...
>> +		 */
>> +		msg->address_hi = ~0U;
>> +		msg->address_lo = ~0U;
>> +		msg->data = ~0U;
>> +	}
> 
> The above is really horrible to parse. I had to read it five times to
> understand the logic.

Yeah, on reflection it is needlessly hideous. I think we should take
this as a clear lesson that whenever you find yourself thinking "Man, I
wish I had Python's for...else construct here", you're doing it wrong ;)

> static struct iommu_dma_msi_page *
> find_or_map_msi_page(struct iommu_dma_cookie *cookie, struct msi_msg *msg)
> {
> 	struct iova_domain *iovad = &cookie->iovad;
>        	struct iommu_dma_msi_page *page;
> 
> 	list_for_each_entry(*page, &cookie->msi_page_list, list) {
> 		if (page->phys_hi == msg->address_hi &&
> 		    page->phys_lo - msg->address_lo < iovad->granule)
> 		    	return page;
> 	}
> 
> 	/*
> 	 * FIXME: __iommu_dma_map_msi_page() should return a page or NULL.
> 	 * The integer return value is pretty pointless. If seperate error
> 	 * codes are required that's what ERR_PTR() is for ....
> 	 */
> 	ret = __iommu_dma_map_msi_page(dev, msg, domain, &page);
> 	return ret ? ERR_PTR(ret) : page;
> }
> 
> So now the code in iommu_dma_map_msi_msg() becomes:
> 
> 	spin_lock(&cookie->msi_lock);
> 	msi_page = find_or_map_msi_page(cookie, msg);
> 	spin_unlock(&cookie->msi_lock);
> 
> 	if (!IS_ERR_OR_NULL(msi_page)) {
> 		msg->address_hi = upper_32_bits(msi_page->iova);
> 		msg->address_lo &= iova_mask(iovad);
> 		msg->address_lo += lower_32_bits(msi_page->iova);
> 	} else {
> 		/*
> 		 * We're called from a void callback, so the best we can do is
> 		 * 'fail' by filling the message with obviously bogus values.
> 		 * Since we got this far due to an IOMMU being present, it's
> 		 * not like the existing address would have worked anyway...
> 		 */
> 		msg->address_hi = ~0U;
> 		msg->address_lo = ~0U;
> 		msg->data = ~0U;
> 	}
> 
> Hmm? 

OK, I've turned map_msi_page into get_msi_page (returning a page) and
just hoisted the list lookup into that, which leads to knock-on
simplifications throughout and is _much_ nicer. I now can't imagine why
I didn't get that far in the first place - thanks for the reality check!

Robin.

> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 19/19] iommu/dma: Add support for mapping MSIs
@ 2016-08-25 22:25     ` Auger Eric
  0 siblings, 0 replies; 61+ messages in thread
From: Auger Eric @ 2016-08-25 22:25 UTC (permalink / raw)
  To: Robin Murphy, joro, will.deacon, iommu
  Cc: devicetree, lorenzo.pieralisi, jean-philippe.brucker,
	punit.agrawal, thunder.leizhen, Thomas Gleixner, Jason Cooper,
	Marc Zyngier, linux-kernel

Hi Robin,

On 23/08/2016 21:05, Robin Murphy wrote:
> When an MSI doorbell is located downstream of an IOMMU, attaching
> devices to a DMA ops domain and switching on translation leads to a rude
> shock when their attempt to write to the physical address returned by
> the irqchip driver faults (or worse, writes into some already-mapped
> buffer) and no interrupt is forthcoming.
> 
> Address this by adding a hook for relevant irqchip drivers to call from
> their compose_msi_msg() callback, to swizzle the physical address with
> an appropriatly-mapped IOVA for any device attached to one of our DMA
> ops domains.
> 
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Jason Cooper <jason@lakedaemon.net>
> CC: Marc Zyngier <marc.zyngier@arm.com>
> CC: linux-kernel@vger.kernel.org
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  drivers/iommu/dma-iommu.c        | 141 ++++++++++++++++++++++++++++++++++-----
>  drivers/irqchip/irq-gic-v2m.c    |   3 +
>  drivers/irqchip/irq-gic-v3-its.c |   3 +
>  include/linux/dma-iommu.h        |   9 +++
>  4 files changed, 141 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 00c8a08d56e7..330cce60cad9 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -25,10 +25,29 @@
>  #include <linux/huge_mm.h>
>  #include <linux/iommu.h>
>  #include <linux/iova.h>
> +#include <linux/irq.h>
>  #include <linux/mm.h>
>  #include <linux/scatterlist.h>
>  #include <linux/vmalloc.h>
>  
> +struct iommu_dma_msi_page {
> +	struct list_head	list;
> +	dma_addr_t		iova;
> +	u32			phys_lo;
> +	u32			phys_hi;
> +};
> +
> +struct iommu_dma_cookie {
> +	struct iova_domain	iovad;
> +	struct list_head	msi_page_list;
> +	spinlock_t		msi_lock;
> +};
> +
> +static inline struct iova_domain *cookie_iovad(struct iommu_domain *domain)
> +{
> +	return &((struct iommu_dma_cookie *)domain->iova_cookie)->iovad;
> +}
> +
>  int iommu_dma_init(void)
>  {
>  	return iova_cache_get();
> @@ -43,15 +62,19 @@ int iommu_dma_init(void)
>   */
>  int iommu_get_dma_cookie(struct iommu_domain *domain)
>  {
> -	struct iova_domain *iovad;
> +	struct iommu_dma_cookie *cookie;
>  
>  	if (domain->iova_cookie)
>  		return -EEXIST;
>  
> -	iovad = kzalloc(sizeof(*iovad), GFP_KERNEL);
> -	domain->iova_cookie = iovad;
> +	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
> +	if (!cookie)
> +		return -ENOMEM;
>  
> -	return iovad ? 0 : -ENOMEM;
> +	spin_lock_init(&cookie->msi_lock);
> +	INIT_LIST_HEAD(&cookie->msi_page_list);
> +	domain->iova_cookie = cookie;
> +	return 0;
>  }
>  EXPORT_SYMBOL(iommu_get_dma_cookie);
>  
> @@ -63,14 +86,20 @@ EXPORT_SYMBOL(iommu_get_dma_cookie);
>   */
>  void iommu_put_dma_cookie(struct iommu_domain *domain)
>  {
> -	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
> +	struct iommu_dma_msi_page *msi, *tmp;
>  
> -	if (!iovad)
> +	if (!cookie)
>  		return;
>  
> -	if (iovad->granule)
> -		put_iova_domain(iovad);
> -	kfree(iovad);
> +	if (cookie->iovad.granule)
> +		put_iova_domain(&cookie->iovad);
> +
> +	list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list, list) {
> +		list_del(&msi->list);
> +		kfree(msi);
> +	}
> +	kfree(cookie);
>  	domain->iova_cookie = NULL;
>  }
>  EXPORT_SYMBOL(iommu_put_dma_cookie);
> @@ -88,7 +117,7 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
>   */
>  int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 size)
>  {
> -	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iova_domain *iovad = cookie_iovad(domain);
>  	unsigned long order, base_pfn, end_pfn;
>  
>  	if (!iovad)
> @@ -155,7 +184,7 @@ int dma_direction_to_prot(enum dma_data_direction dir, bool coherent)
>  static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
>  		dma_addr_t dma_limit)
>  {
> -	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iova_domain *iovad = cookie_iovad(domain);
>  	unsigned long shift = iova_shift(iovad);
>  	unsigned long length = iova_align(iovad, size) >> shift;
>  
> @@ -171,7 +200,7 @@ static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
>  /* The IOVA allocator knows what we mapped, so just unmap whatever that was */
>  static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr)
>  {
> -	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iova_domain *iovad = cookie_iovad(domain);
>  	unsigned long shift = iova_shift(iovad);
>  	unsigned long pfn = dma_addr >> shift;
>  	struct iova *iova = find_iova(iovad, pfn);
> @@ -294,7 +323,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
>  		void (*flush_page)(struct device *, const void *, phys_addr_t))
>  {
>  	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> -	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iova_domain *iovad = cookie_iovad(domain);
>  	struct iova *iova;
>  	struct page **pages;
>  	struct sg_table sgt;
> @@ -386,7 +415,7 @@ dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
>  {
>  	dma_addr_t dma_addr;
>  	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> -	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iova_domain *iovad = cookie_iovad(domain);
>  	phys_addr_t phys = page_to_phys(page) + offset;
>  	size_t iova_off = iova_offset(iovad, phys);
>  	size_t len = iova_align(iovad, size + iova_off);
> @@ -495,7 +524,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
>  		int nents, int prot)
>  {
>  	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> -	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iova_domain *iovad = cookie_iovad(domain);
>  	struct iova *iova;
>  	struct scatterlist *s, *prev = NULL;
>  	dma_addr_t dma_addr;
> @@ -587,3 +616,85 @@ int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
>  {
>  	return dma_addr == DMA_ERROR_CODE;
>  }
> +
> +static int __iommu_dma_map_msi_page(struct device *dev, struct msi_msg *msg,
> +		struct iommu_domain *domain, struct iommu_dma_msi_page **ppage)
> +{
> +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
> +	struct iommu_dma_msi_page *msi_page;
> +	struct iova_domain *iovad = &cookie->iovad;
> +	struct iova *iova;
> +	phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
> +	int ret, prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
In my series I ended up putting the memory attributes as a property of
the doorbell, advised to do so by Marc. Here we hard freeze them. Do you
foresee all the doorbells ill have the same attributes?
> +
> +	msi_page = kzalloc(sizeof(*msi_page), GFP_ATOMIC);
> +	if (!msi_page)
> +		return -ENOMEM;
> +
> +	iova = __alloc_iova(domain, iovad->granule, dma_get_mask(dev));
> +	if (!iova) {
> +		ret = -ENOSPC;
> +		goto out_free_page;
> +	}
> +
> +	msi_page->iova = iova_dma_addr(iovad, iova);
> +	ret = iommu_map(domain, msi_page->iova, msi_addr & ~iova_mask(iovad),
> +			iovad->granule, prot);
> +	if (ret)
> +		goto out_free_iova;
> +
> +	msi_page->phys_hi = msg->address_hi;
> +	msi_page->phys_lo = msg->address_lo;
> +	INIT_LIST_HEAD(&msi_page->list);
> +	list_add(&msi_page->list, &cookie->msi_page_list);
> +	*ppage = msi_page;
> +	return 0;
> +
> +out_free_iova:
> +	__free_iova(iovad, iova);
> +out_free_page:
> +	kfree(msi_page);
> +	return ret;
> +}
> +
> +void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
Marc told in the past it was reasonable to consider adding a size
parameter to the allocate function. Obviously you don't have the same
concern as I had in the passthrough series where the window aperture is
set by the userspace but well that is just for checking.

> +{
> +	struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
> +	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> +	struct iova_domain *iovad;
> +	struct iommu_dma_cookie *cookie;
> +	struct iommu_dma_msi_page *msi_page;
> +	int ret = 0;
> +
> +	if (!domain || !domain->iova_cookie)
> +		return;
> +
> +	cookie = domain->iova_cookie;
> +	iovad = &cookie->iovad;
> +
> +	spin_lock(&cookie->msi_lock);
> +	list_for_each_entry(msi_page, &cookie->msi_page_list, list)
> +		if (msi_page->phys_hi == msg->address_hi &&
> +		    msi_page->phys_lo - msg->address_lo < iovad->granule)
> +			goto unlock;
> +
> +	ret = __iommu_dma_map_msi_page(dev, msg, domain, &msi_page);
> +unlock:
> +	spin_unlock(&cookie->msi_lock);
> +
> +	if (!ret) {
> +		msg->address_hi = upper_32_bits(msi_page->iova);
> +		msg->address_lo &= iova_mask(iovad);
> +		msg->address_lo += lower_32_bits(msi_page->iova);
> +	} else {
> +		/*
> +		 * We're called from a void callback, so the best we can do is
> +		 * 'fail' by filling the message with obviously bogus values.
> +		 * Since we got this far due to an IOMMU being present, it's
> +		 * not like the existing address would have worked anyway...
> +		 */
> +		msg->address_hi = ~0U;
> +		msg->address_lo = ~0U;
> +		msg->data = ~0U;
> +	}
> +}
> diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
> index 35eb7ac5d21f..863e073c6f7f 100644
> --- a/drivers/irqchip/irq-gic-v2m.c
> +++ b/drivers/irqchip/irq-gic-v2m.c
> @@ -16,6 +16,7 @@
>  #define pr_fmt(fmt) "GICv2m: " fmt
>  
>  #include <linux/acpi.h>
> +#include <linux/dma-iommu.h>
>  #include <linux/irq.h>
>  #include <linux/irqdomain.h>
>  #include <linux/kernel.h>
> @@ -108,6 +109,8 @@ static void gicv2m_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
>  
>  	if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
>  		msg->data -= v2m->spi_offset;
> +
> +	iommu_dma_map_msi_msg(data->irq, msg);
In the past we identified that msi_compose was not authorized to sleep
(https://lkml.org/lkml/2016/3/10/216) since potentialy called in atomic
context.

This is why in my passthrough series I was forced to move the mapping in
msi_domain_alloc, which also has the benefit to happen earlier and is
able to fail whereas the compose cannot due, to the subsequent BUG_ON.
Has things changed since this notice which now allow to do the mapping here?

>  }
>  
>  static struct irq_chip gicv2m_irq_chip = {
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 7ceaba81efb4..73f4f10dc204 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -18,6 +18,7 @@
>  #include <linux/bitmap.h>
>  #include <linux/cpu.h>
>  #include <linux/delay.h>
> +#include <linux/dma-iommu.h>
>  #include <linux/interrupt.h>
>  #include <linux/log2.h>
>  #include <linux/mm.h>
> @@ -655,6 +656,8 @@ static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
>  	msg->address_lo		= addr & ((1UL << 32) - 1);
>  	msg->address_hi		= addr >> 32;
>  	msg->data		= its_get_event_id(d);
> +
> +	iommu_dma_map_msi_msg(d->irq, msg);
>  }
>  
>  static struct irq_chip its_irq_chip = {
> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> index 81c5c8d167ad..5ee806e41b5c 100644
> --- a/include/linux/dma-iommu.h
> +++ b/include/linux/dma-iommu.h
> @@ -21,6 +21,7 @@
>  
>  #ifdef CONFIG_IOMMU_DMA
>  #include <linux/iommu.h>
> +#include <linux/msi.h>
>  
>  int iommu_dma_init(void);
>  
> @@ -62,9 +63,13 @@ void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
>  int iommu_dma_supported(struct device *dev, u64 mask);
>  int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr);
>  
> +/* The DMA API isn't _quite_ the whole story, though... */
> +void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
So I understand the patch currently addresses dma-mapping use case. What
about the passthrough use case? Here you obviously propose a simpler
version but it also looks to me it skips some comments we collected in
the past which resulted in the direction taken before:

- generic API to allocate msi iovas
- msi_geometry semantic recommended by Alex
- the handling of the size parameter as recommended by Marc
- separation of allocation/enumeration for msi_domain_allocate_irqs
/msi_compose separation.

For passthrough we also have to care about the safety issue, the window
size computation. Please can we collaborate to converge on a unified
solution?

Best Regards

Eric

> +
>  #else
>  
>  struct iommu_domain;
> +struct msi_msg;
>  
>  static inline int iommu_dma_init(void)
>  {
> @@ -80,6 +85,10 @@ static inline void iommu_put_dma_cookie(struct iommu_domain *domain)
>  {
>  }
>  
> +static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
> +{
> +}
> +
>  #endif	/* CONFIG_IOMMU_DMA */
>  #endif	/* __KERNEL__ */
>  #endif	/* __DMA_IOMMU_H */
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 19/19] iommu/dma: Add support for mapping MSIs
@ 2016-08-25 22:25     ` Auger Eric
  0 siblings, 0 replies; 61+ messages in thread
From: Auger Eric @ 2016-08-25 22:25 UTC (permalink / raw)
  To: Robin Murphy, joro-zLv9SwRftAIdnm+yROfE0A,
	will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA, Thomas Gleixner,
	Jason Cooper, Marc Zyngier, linux-kernel-u79uwXL29TY76Z2rM5mHXA

Hi Robin,

On 23/08/2016 21:05, Robin Murphy wrote:
> When an MSI doorbell is located downstream of an IOMMU, attaching
> devices to a DMA ops domain and switching on translation leads to a rude
> shock when their attempt to write to the physical address returned by
> the irqchip driver faults (or worse, writes into some already-mapped
> buffer) and no interrupt is forthcoming.
> 
> Address this by adding a hook for relevant irqchip drivers to call from
> their compose_msi_msg() callback, to swizzle the physical address with
> an appropriatly-mapped IOVA for any device attached to one of our DMA
> ops domains.
> 
> CC: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
> CC: Jason Cooper <jason-NLaQJdtUoK4Be96aLqz0jA@public.gmane.org>
> CC: Marc Zyngier <marc.zyngier-5wv7dgnIgG8@public.gmane.org>
> CC: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/dma-iommu.c        | 141 ++++++++++++++++++++++++++++++++++-----
>  drivers/irqchip/irq-gic-v2m.c    |   3 +
>  drivers/irqchip/irq-gic-v3-its.c |   3 +
>  include/linux/dma-iommu.h        |   9 +++
>  4 files changed, 141 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 00c8a08d56e7..330cce60cad9 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -25,10 +25,29 @@
>  #include <linux/huge_mm.h>
>  #include <linux/iommu.h>
>  #include <linux/iova.h>
> +#include <linux/irq.h>
>  #include <linux/mm.h>
>  #include <linux/scatterlist.h>
>  #include <linux/vmalloc.h>
>  
> +struct iommu_dma_msi_page {
> +	struct list_head	list;
> +	dma_addr_t		iova;
> +	u32			phys_lo;
> +	u32			phys_hi;
> +};
> +
> +struct iommu_dma_cookie {
> +	struct iova_domain	iovad;
> +	struct list_head	msi_page_list;
> +	spinlock_t		msi_lock;
> +};
> +
> +static inline struct iova_domain *cookie_iovad(struct iommu_domain *domain)
> +{
> +	return &((struct iommu_dma_cookie *)domain->iova_cookie)->iovad;
> +}
> +
>  int iommu_dma_init(void)
>  {
>  	return iova_cache_get();
> @@ -43,15 +62,19 @@ int iommu_dma_init(void)
>   */
>  int iommu_get_dma_cookie(struct iommu_domain *domain)
>  {
> -	struct iova_domain *iovad;
> +	struct iommu_dma_cookie *cookie;
>  
>  	if (domain->iova_cookie)
>  		return -EEXIST;
>  
> -	iovad = kzalloc(sizeof(*iovad), GFP_KERNEL);
> -	domain->iova_cookie = iovad;
> +	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
> +	if (!cookie)
> +		return -ENOMEM;
>  
> -	return iovad ? 0 : -ENOMEM;
> +	spin_lock_init(&cookie->msi_lock);
> +	INIT_LIST_HEAD(&cookie->msi_page_list);
> +	domain->iova_cookie = cookie;
> +	return 0;
>  }
>  EXPORT_SYMBOL(iommu_get_dma_cookie);
>  
> @@ -63,14 +86,20 @@ EXPORT_SYMBOL(iommu_get_dma_cookie);
>   */
>  void iommu_put_dma_cookie(struct iommu_domain *domain)
>  {
> -	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
> +	struct iommu_dma_msi_page *msi, *tmp;
>  
> -	if (!iovad)
> +	if (!cookie)
>  		return;
>  
> -	if (iovad->granule)
> -		put_iova_domain(iovad);
> -	kfree(iovad);
> +	if (cookie->iovad.granule)
> +		put_iova_domain(&cookie->iovad);
> +
> +	list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list, list) {
> +		list_del(&msi->list);
> +		kfree(msi);
> +	}
> +	kfree(cookie);
>  	domain->iova_cookie = NULL;
>  }
>  EXPORT_SYMBOL(iommu_put_dma_cookie);
> @@ -88,7 +117,7 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
>   */
>  int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, u64 size)
>  {
> -	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iova_domain *iovad = cookie_iovad(domain);
>  	unsigned long order, base_pfn, end_pfn;
>  
>  	if (!iovad)
> @@ -155,7 +184,7 @@ int dma_direction_to_prot(enum dma_data_direction dir, bool coherent)
>  static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
>  		dma_addr_t dma_limit)
>  {
> -	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iova_domain *iovad = cookie_iovad(domain);
>  	unsigned long shift = iova_shift(iovad);
>  	unsigned long length = iova_align(iovad, size) >> shift;
>  
> @@ -171,7 +200,7 @@ static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
>  /* The IOVA allocator knows what we mapped, so just unmap whatever that was */
>  static void __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr)
>  {
> -	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iova_domain *iovad = cookie_iovad(domain);
>  	unsigned long shift = iova_shift(iovad);
>  	unsigned long pfn = dma_addr >> shift;
>  	struct iova *iova = find_iova(iovad, pfn);
> @@ -294,7 +323,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
>  		void (*flush_page)(struct device *, const void *, phys_addr_t))
>  {
>  	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> -	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iova_domain *iovad = cookie_iovad(domain);
>  	struct iova *iova;
>  	struct page **pages;
>  	struct sg_table sgt;
> @@ -386,7 +415,7 @@ dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
>  {
>  	dma_addr_t dma_addr;
>  	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> -	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iova_domain *iovad = cookie_iovad(domain);
>  	phys_addr_t phys = page_to_phys(page) + offset;
>  	size_t iova_off = iova_offset(iovad, phys);
>  	size_t len = iova_align(iovad, size + iova_off);
> @@ -495,7 +524,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
>  		int nents, int prot)
>  {
>  	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> -	struct iova_domain *iovad = domain->iova_cookie;
> +	struct iova_domain *iovad = cookie_iovad(domain);
>  	struct iova *iova;
>  	struct scatterlist *s, *prev = NULL;
>  	dma_addr_t dma_addr;
> @@ -587,3 +616,85 @@ int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
>  {
>  	return dma_addr == DMA_ERROR_CODE;
>  }
> +
> +static int __iommu_dma_map_msi_page(struct device *dev, struct msi_msg *msg,
> +		struct iommu_domain *domain, struct iommu_dma_msi_page **ppage)
> +{
> +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
> +	struct iommu_dma_msi_page *msi_page;
> +	struct iova_domain *iovad = &cookie->iovad;
> +	struct iova *iova;
> +	phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
> +	int ret, prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
In my series I ended up putting the memory attributes as a property of
the doorbell, advised to do so by Marc. Here we hard freeze them. Do you
foresee all the doorbells ill have the same attributes?
> +
> +	msi_page = kzalloc(sizeof(*msi_page), GFP_ATOMIC);
> +	if (!msi_page)
> +		return -ENOMEM;
> +
> +	iova = __alloc_iova(domain, iovad->granule, dma_get_mask(dev));
> +	if (!iova) {
> +		ret = -ENOSPC;
> +		goto out_free_page;
> +	}
> +
> +	msi_page->iova = iova_dma_addr(iovad, iova);
> +	ret = iommu_map(domain, msi_page->iova, msi_addr & ~iova_mask(iovad),
> +			iovad->granule, prot);
> +	if (ret)
> +		goto out_free_iova;
> +
> +	msi_page->phys_hi = msg->address_hi;
> +	msi_page->phys_lo = msg->address_lo;
> +	INIT_LIST_HEAD(&msi_page->list);
> +	list_add(&msi_page->list, &cookie->msi_page_list);
> +	*ppage = msi_page;
> +	return 0;
> +
> +out_free_iova:
> +	__free_iova(iovad, iova);
> +out_free_page:
> +	kfree(msi_page);
> +	return ret;
> +}
> +
> +void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
Marc told in the past it was reasonable to consider adding a size
parameter to the allocate function. Obviously you don't have the same
concern as I had in the passthrough series where the window aperture is
set by the userspace but well that is just for checking.

> +{
> +	struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
> +	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> +	struct iova_domain *iovad;
> +	struct iommu_dma_cookie *cookie;
> +	struct iommu_dma_msi_page *msi_page;
> +	int ret = 0;
> +
> +	if (!domain || !domain->iova_cookie)
> +		return;
> +
> +	cookie = domain->iova_cookie;
> +	iovad = &cookie->iovad;
> +
> +	spin_lock(&cookie->msi_lock);
> +	list_for_each_entry(msi_page, &cookie->msi_page_list, list)
> +		if (msi_page->phys_hi == msg->address_hi &&
> +		    msi_page->phys_lo - msg->address_lo < iovad->granule)
> +			goto unlock;
> +
> +	ret = __iommu_dma_map_msi_page(dev, msg, domain, &msi_page);
> +unlock:
> +	spin_unlock(&cookie->msi_lock);
> +
> +	if (!ret) {
> +		msg->address_hi = upper_32_bits(msi_page->iova);
> +		msg->address_lo &= iova_mask(iovad);
> +		msg->address_lo += lower_32_bits(msi_page->iova);
> +	} else {
> +		/*
> +		 * We're called from a void callback, so the best we can do is
> +		 * 'fail' by filling the message with obviously bogus values.
> +		 * Since we got this far due to an IOMMU being present, it's
> +		 * not like the existing address would have worked anyway...
> +		 */
> +		msg->address_hi = ~0U;
> +		msg->address_lo = ~0U;
> +		msg->data = ~0U;
> +	}
> +}
> diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
> index 35eb7ac5d21f..863e073c6f7f 100644
> --- a/drivers/irqchip/irq-gic-v2m.c
> +++ b/drivers/irqchip/irq-gic-v2m.c
> @@ -16,6 +16,7 @@
>  #define pr_fmt(fmt) "GICv2m: " fmt
>  
>  #include <linux/acpi.h>
> +#include <linux/dma-iommu.h>
>  #include <linux/irq.h>
>  #include <linux/irqdomain.h>
>  #include <linux/kernel.h>
> @@ -108,6 +109,8 @@ static void gicv2m_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
>  
>  	if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
>  		msg->data -= v2m->spi_offset;
> +
> +	iommu_dma_map_msi_msg(data->irq, msg);
In the past we identified that msi_compose was not authorized to sleep
(https://lkml.org/lkml/2016/3/10/216) since potentialy called in atomic
context.

This is why in my passthrough series I was forced to move the mapping in
msi_domain_alloc, which also has the benefit to happen earlier and is
able to fail whereas the compose cannot due, to the subsequent BUG_ON.
Has things changed since this notice which now allow to do the mapping here?

>  }
>  
>  static struct irq_chip gicv2m_irq_chip = {
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 7ceaba81efb4..73f4f10dc204 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -18,6 +18,7 @@
>  #include <linux/bitmap.h>
>  #include <linux/cpu.h>
>  #include <linux/delay.h>
> +#include <linux/dma-iommu.h>
>  #include <linux/interrupt.h>
>  #include <linux/log2.h>
>  #include <linux/mm.h>
> @@ -655,6 +656,8 @@ static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
>  	msg->address_lo		= addr & ((1UL << 32) - 1);
>  	msg->address_hi		= addr >> 32;
>  	msg->data		= its_get_event_id(d);
> +
> +	iommu_dma_map_msi_msg(d->irq, msg);
>  }
>  
>  static struct irq_chip its_irq_chip = {
> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> index 81c5c8d167ad..5ee806e41b5c 100644
> --- a/include/linux/dma-iommu.h
> +++ b/include/linux/dma-iommu.h
> @@ -21,6 +21,7 @@
>  
>  #ifdef CONFIG_IOMMU_DMA
>  #include <linux/iommu.h>
> +#include <linux/msi.h>
>  
>  int iommu_dma_init(void);
>  
> @@ -62,9 +63,13 @@ void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
>  int iommu_dma_supported(struct device *dev, u64 mask);
>  int iommu_dma_mapping_error(struct device *dev, dma_addr_t dma_addr);
>  
> +/* The DMA API isn't _quite_ the whole story, though... */
> +void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
So I understand the patch currently addresses dma-mapping use case. What
about the passthrough use case? Here you obviously propose a simpler
version but it also looks to me it skips some comments we collected in
the past which resulted in the direction taken before:

- generic API to allocate msi iovas
- msi_geometry semantic recommended by Alex
- the handling of the size parameter as recommended by Marc
- separation of allocation/enumeration for msi_domain_allocate_irqs
/msi_compose separation.

For passthrough we also have to care about the safety issue, the window
size computation. Please can we collaborate to converge on a unified
solution?

Best Regards

Eric

> +
>  #else
>  
>  struct iommu_domain;
> +struct msi_msg;
>  
>  static inline int iommu_dma_init(void)
>  {
> @@ -80,6 +85,10 @@ static inline void iommu_put_dma_cookie(struct iommu_domain *domain)
>  {
>  }
>  
> +static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
> +{
> +}
> +
>  #endif	/* CONFIG_IOMMU_DMA */
>  #endif	/* __KERNEL__ */
>  #endif	/* __DMA_IOMMU_H */
> 
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 19/19] iommu/dma: Add support for mapping MSIs
@ 2016-08-26  1:17       ` Robin Murphy
  0 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-08-26  1:17 UTC (permalink / raw)
  To: Auger Eric
  Cc: joro, will.deacon, iommu, devicetree, lorenzo.pieralisi,
	jean-philippe.brucker, punit.agrawal, thunder.leizhen,
	Thomas Gleixner, Jason Cooper, Marc Zyngier, linux-kernel, nd

Hi Eric,

On Fri, 26 Aug 2016 00:25:34 +0200
Auger Eric <eric.auger@redhat.com> wrote:

> Hi Robin,
> 
> On 23/08/2016 21:05, Robin Murphy wrote:
> > When an MSI doorbell is located downstream of an IOMMU, attaching
> > devices to a DMA ops domain and switching on translation leads to a
> > rude shock when their attempt to write to the physical address
> > returned by the irqchip driver faults (or worse, writes into some
> > already-mapped buffer) and no interrupt is forthcoming.
> > 
> > Address this by adding a hook for relevant irqchip drivers to call
> > from their compose_msi_msg() callback, to swizzle the physical
> > address with an appropriatly-mapped IOVA for any device attached to
> > one of our DMA ops domains.
> > 
> > CC: Thomas Gleixner <tglx@linutronix.de>
> > CC: Jason Cooper <jason@lakedaemon.net>
> > CC: Marc Zyngier <marc.zyngier@arm.com>
> > CC: linux-kernel@vger.kernel.org
> > Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> > ---
> >  drivers/iommu/dma-iommu.c        | 141
> > ++++++++++++++++++++++++++++++++++-----
> > drivers/irqchip/irq-gic-v2m.c    |   3 +
> > drivers/irqchip/irq-gic-v3-its.c |   3 +
> > include/linux/dma-iommu.h        |   9 +++ 4 files changed, 141
> > insertions(+), 15 deletions(-)
> > 
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index 00c8a08d56e7..330cce60cad9 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -25,10 +25,29 @@
> >  #include <linux/huge_mm.h>
> >  #include <linux/iommu.h>
> >  #include <linux/iova.h>
> > +#include <linux/irq.h>
> >  #include <linux/mm.h>
> >  #include <linux/scatterlist.h>
> >  #include <linux/vmalloc.h>
> >  
> > +struct iommu_dma_msi_page {
> > +	struct list_head	list;
> > +	dma_addr_t		iova;
> > +	u32			phys_lo;
> > +	u32			phys_hi;
> > +};
> > +
> > +struct iommu_dma_cookie {
> > +	struct iova_domain	iovad;
> > +	struct list_head	msi_page_list;
> > +	spinlock_t		msi_lock;
> > +};
> > +
> > +static inline struct iova_domain *cookie_iovad(struct iommu_domain
> > *domain) +{
> > +	return &((struct iommu_dma_cookie
> > *)domain->iova_cookie)->iovad; +}
> > +
> >  int iommu_dma_init(void)
> >  {
> >  	return iova_cache_get();
> > @@ -43,15 +62,19 @@ int iommu_dma_init(void)
> >   */
> >  int iommu_get_dma_cookie(struct iommu_domain *domain)
> >  {
> > -	struct iova_domain *iovad;
> > +	struct iommu_dma_cookie *cookie;
> >  
> >  	if (domain->iova_cookie)
> >  		return -EEXIST;
> >  
> > -	iovad = kzalloc(sizeof(*iovad), GFP_KERNEL);
> > -	domain->iova_cookie = iovad;
> > +	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
> > +	if (!cookie)
> > +		return -ENOMEM;
> >  
> > -	return iovad ? 0 : -ENOMEM;
> > +	spin_lock_init(&cookie->msi_lock);
> > +	INIT_LIST_HEAD(&cookie->msi_page_list);
> > +	domain->iova_cookie = cookie;
> > +	return 0;
> >  }
> >  EXPORT_SYMBOL(iommu_get_dma_cookie);
> >  
> > @@ -63,14 +86,20 @@ EXPORT_SYMBOL(iommu_get_dma_cookie);
> >   */
> >  void iommu_put_dma_cookie(struct iommu_domain *domain)
> >  {
> > -	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
> > +	struct iommu_dma_msi_page *msi, *tmp;
> >  
> > -	if (!iovad)
> > +	if (!cookie)
> >  		return;
> >  
> > -	if (iovad->granule)
> > -		put_iova_domain(iovad);
> > -	kfree(iovad);
> > +	if (cookie->iovad.granule)
> > +		put_iova_domain(&cookie->iovad);
> > +
> > +	list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list,
> > list) {
> > +		list_del(&msi->list);
> > +		kfree(msi);
> > +	}
> > +	kfree(cookie);
> >  	domain->iova_cookie = NULL;
> >  }
> >  EXPORT_SYMBOL(iommu_put_dma_cookie);
> > @@ -88,7 +117,7 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
> >   */
> >  int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t
> > base, u64 size) {
> > -	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iova_domain *iovad = cookie_iovad(domain);
> >  	unsigned long order, base_pfn, end_pfn;
> >  
> >  	if (!iovad)
> > @@ -155,7 +184,7 @@ int dma_direction_to_prot(enum
> > dma_data_direction dir, bool coherent) static struct iova
> > *__alloc_iova(struct iommu_domain *domain, size_t size, dma_addr_t
> > dma_limit) {
> > -	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iova_domain *iovad = cookie_iovad(domain);
> >  	unsigned long shift = iova_shift(iovad);
> >  	unsigned long length = iova_align(iovad, size) >> shift;
> >  
> > @@ -171,7 +200,7 @@ static struct iova *__alloc_iova(struct
> > iommu_domain *domain, size_t size, /* The IOVA allocator knows what
> > we mapped, so just unmap whatever that was */ static void
> > __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr)
> > {
> > -	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iova_domain *iovad = cookie_iovad(domain);
> >  	unsigned long shift = iova_shift(iovad);
> >  	unsigned long pfn = dma_addr >> shift;
> >  	struct iova *iova = find_iova(iovad, pfn);
> > @@ -294,7 +323,7 @@ struct page **iommu_dma_alloc(struct device
> > *dev, size_t size, gfp_t gfp, void (*flush_page)(struct device *,
> > const void *, phys_addr_t)) {
> >  	struct iommu_domain *domain =
> > iommu_get_domain_for_dev(dev);
> > -	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iova_domain *iovad = cookie_iovad(domain);
> >  	struct iova *iova;
> >  	struct page **pages;
> >  	struct sg_table sgt;
> > @@ -386,7 +415,7 @@ dma_addr_t iommu_dma_map_page(struct device
> > *dev, struct page *page, {
> >  	dma_addr_t dma_addr;
> >  	struct iommu_domain *domain =
> > iommu_get_domain_for_dev(dev);
> > -	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iova_domain *iovad = cookie_iovad(domain);
> >  	phys_addr_t phys = page_to_phys(page) + offset;
> >  	size_t iova_off = iova_offset(iovad, phys);
> >  	size_t len = iova_align(iovad, size + iova_off);
> > @@ -495,7 +524,7 @@ int iommu_dma_map_sg(struct device *dev, struct
> > scatterlist *sg, int nents, int prot)
> >  {
> >  	struct iommu_domain *domain =
> > iommu_get_domain_for_dev(dev);
> > -	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iova_domain *iovad = cookie_iovad(domain);
> >  	struct iova *iova;
> >  	struct scatterlist *s, *prev = NULL;
> >  	dma_addr_t dma_addr;
> > @@ -587,3 +616,85 @@ int iommu_dma_mapping_error(struct device
> > *dev, dma_addr_t dma_addr) {
> >  	return dma_addr == DMA_ERROR_CODE;
> >  }
> > +
> > +static int __iommu_dma_map_msi_page(struct device *dev, struct
> > msi_msg *msg,
> > +		struct iommu_domain *domain, struct
> > iommu_dma_msi_page **ppage) +{
> > +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
> > +	struct iommu_dma_msi_page *msi_page;
> > +	struct iova_domain *iovad = &cookie->iovad;
> > +	struct iova *iova;
> > +	phys_addr_t msi_addr = (u64)msg->address_hi << 32 |
> > msg->address_lo;
> > +	int ret, prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;  
> In my series I ended up putting the memory attributes as a property of
> the doorbell, advised to do so by Marc. Here we hard freeze them. Do
> you foresee all the doorbells ill have the same attributes?

I can't think of any good reason any device would want to read from an
ITS or GICv2m doorbell, execute it, or allow the interconnect to reorder
or cache its MSI writes. If some crazy hardware comes along to prove my
assumption wrong then we'll revisit it, but right now I just want the
simplest possible solution for PCI DMA ops to not break MSIs.

> > +
> > +	msi_page = kzalloc(sizeof(*msi_page), GFP_ATOMIC);
> > +	if (!msi_page)
> > +		return -ENOMEM;
> > +
> > +	iova = __alloc_iova(domain, iovad->granule,
> > dma_get_mask(dev));
> > +	if (!iova) {
> > +		ret = -ENOSPC;
> > +		goto out_free_page;
> > +	}
> > +
> > +	msi_page->iova = iova_dma_addr(iovad, iova);
> > +	ret = iommu_map(domain, msi_page->iova, msi_addr &
> > ~iova_mask(iovad),
> > +			iovad->granule, prot);
> > +	if (ret)
> > +		goto out_free_iova;
> > +
> > +	msi_page->phys_hi = msg->address_hi;
> > +	msi_page->phys_lo = msg->address_lo;
> > +	INIT_LIST_HEAD(&msi_page->list);
> > +	list_add(&msi_page->list, &cookie->msi_page_list);
> > +	*ppage = msi_page;
> > +	return 0;
> > +
> > +out_free_iova:
> > +	__free_iova(iovad, iova);
> > +out_free_page:
> > +	kfree(msi_page);
> > +	return ret;
> > +}
> > +
> > +void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)  
> Marc told in the past it was reasonable to consider adding a size
> parameter to the allocate function. Obviously you don't have the same
> concern as I had in the passthrough series where the window aperture
> is set by the userspace but well that is just for checking.

The beauty is that at this level we really don't care. It's simply
"here's an address from the irqchip driver, is there a mapping in this
domain which covers it?" The only possible use of knowing a size
here would be if it happens to correspond to a larger page size we
could use for the mapping, but that would entail a bunch of complication
for what seems like a highly tenuous benefit, and right now I just want
the simplest possible solution for PCI DMA ops to not break MSIs.

> 
> > +{
> > +	struct device *dev =
> > msi_desc_to_dev(irq_get_msi_desc(irq));
> > +	struct iommu_domain *domain =
> > iommu_get_domain_for_dev(dev);
> > +	struct iova_domain *iovad;
> > +	struct iommu_dma_cookie *cookie;
> > +	struct iommu_dma_msi_page *msi_page;
> > +	int ret = 0;
> > +
> > +	if (!domain || !domain->iova_cookie)
> > +		return;
> > +
> > +	cookie = domain->iova_cookie;
> > +	iovad = &cookie->iovad;
> > +
> > +	spin_lock(&cookie->msi_lock);
> > +	list_for_each_entry(msi_page, &cookie->msi_page_list, list)
> > +		if (msi_page->phys_hi == msg->address_hi &&
> > +		    msi_page->phys_lo - msg->address_lo <
> > iovad->granule)
> > +			goto unlock;
> > +
> > +	ret = __iommu_dma_map_msi_page(dev, msg, domain,
> > &msi_page); +unlock:
> > +	spin_unlock(&cookie->msi_lock);
> > +
> > +	if (!ret) {
> > +		msg->address_hi = upper_32_bits(msi_page->iova);
> > +		msg->address_lo &= iova_mask(iovad);
> > +		msg->address_lo += lower_32_bits(msi_page->iova);
> > +	} else {
> > +		/*
> > +		 * We're called from a void callback, so the best
> > we can do is
> > +		 * 'fail' by filling the message with obviously
> > bogus values.
> > +		 * Since we got this far due to an IOMMU being
> > present, it's
> > +		 * not like the existing address would have worked
> > anyway...
> > +		 */
> > +		msg->address_hi = ~0U;
> > +		msg->address_lo = ~0U;
> > +		msg->data = ~0U;
> > +	}
> > +}
> > diff --git a/drivers/irqchip/irq-gic-v2m.c
> > b/drivers/irqchip/irq-gic-v2m.c index 35eb7ac5d21f..863e073c6f7f
> > 100644 --- a/drivers/irqchip/irq-gic-v2m.c
> > +++ b/drivers/irqchip/irq-gic-v2m.c
> > @@ -16,6 +16,7 @@
> >  #define pr_fmt(fmt) "GICv2m: " fmt
> >  
> >  #include <linux/acpi.h>
> > +#include <linux/dma-iommu.h>
> >  #include <linux/irq.h>
> >  #include <linux/irqdomain.h>
> >  #include <linux/kernel.h>
> > @@ -108,6 +109,8 @@ static void gicv2m_compose_msi_msg(struct
> > irq_data *data, struct msi_msg *msg) 
> >  	if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
> >  		msg->data -= v2m->spi_offset;
> > +
> > +	iommu_dma_map_msi_msg(data->irq, msg);  
> In the past we identified that msi_compose was not authorized to sleep
> (https://lkml.org/lkml/2016/3/10/216) since potentialy called in
> atomic context.
> 
> This is why in my passthrough series I was forced to move the mapping
> in msi_domain_alloc, which also has the benefit to happen earlier and
> is able to fail whereas the compose cannot due, to the subsequent
> BUG_ON. Has things changed since this notice which now allow to do
> the mapping here?

We've got a non-sleeping spinlock covering the lookup, and new pages are
allocated with GFP_ATOMIC; what have I missed? Any IOMMU driver whose
iommu_map() implementation might sleep already can't use this layer, as
DMA API calls may come from atomic context as well.

The "oh well, at least we tried" failure case (I've added a WARN_ON
now for good measure) is largely mitigated by the fact that 99% of the
time in practice we'll just be mapping one page once per domain and
hitting it on subsequent lookups. If someone's already consumed all the
available IOVA space before that page is mapped, or the IOMMU page
tables are corrupted, then the device won't be able to do DMA anyway, so
the fact that it can't raise its MSI is likely moot.

> 
> >  }
> >  
> >  static struct irq_chip gicv2m_irq_chip = {
> > diff --git a/drivers/irqchip/irq-gic-v3-its.c
> > b/drivers/irqchip/irq-gic-v3-its.c index 7ceaba81efb4..73f4f10dc204
> > 100644 --- a/drivers/irqchip/irq-gic-v3-its.c
> > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > @@ -18,6 +18,7 @@
> >  #include <linux/bitmap.h>
> >  #include <linux/cpu.h>
> >  #include <linux/delay.h>
> > +#include <linux/dma-iommu.h>
> >  #include <linux/interrupt.h>
> >  #include <linux/log2.h>
> >  #include <linux/mm.h>
> > @@ -655,6 +656,8 @@ static void its_irq_compose_msi_msg(struct
> > irq_data *d, struct msi_msg *msg) msg->address_lo		=
> > addr & ((1UL << 32) - 1); msg->address_hi		= addr >>
> > 32; msg->data		= its_get_event_id(d);
> > +
> > +	iommu_dma_map_msi_msg(d->irq, msg);
> >  }
> >  
> >  static struct irq_chip its_irq_chip = {
> > diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> > index 81c5c8d167ad..5ee806e41b5c 100644
> > --- a/include/linux/dma-iommu.h
> > +++ b/include/linux/dma-iommu.h
> > @@ -21,6 +21,7 @@
> >  
> >  #ifdef CONFIG_IOMMU_DMA
> >  #include <linux/iommu.h>
> > +#include <linux/msi.h>
> >  
> >  int iommu_dma_init(void);
> >  
> > @@ -62,9 +63,13 @@ void iommu_dma_unmap_sg(struct device *dev,
> > struct scatterlist *sg, int nents, int iommu_dma_supported(struct
> > device *dev, u64 mask); int iommu_dma_mapping_error(struct device
> > *dev, dma_addr_t dma_addr); 
> > +/* The DMA API isn't _quite_ the whole story, though... */
> > +void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);  
> So I understand the patch currently addresses dma-mapping use case.
> What about the passthrough use case? Here you obviously propose a
> simpler version but it also looks to me it skips some comments we
> collected in the past which resulted in the direction taken before:

This is only intended to be the lowest-level remapping machinery -
clearly the guest MSI case has a lot more complexity on top, but that
doesn't apply on the host side, and right now I just want the simplest
possible solution for PCI DMA ops to not break MSIs.

> - generic API to allocate msi iovas
> - msi_geometry semantic recommended by Alex
> - the handling of the size parameter as recommended by Marc
> - separation of allocation/enumeration for msi_domain_allocate_irqs
> /msi_compose separation.

I have also thrown together an illustrative patch for plugging
this into VFIO domains [1] following some internal discussion, but
that's about as far as I was planning to go myself - AFAICS all your
MSI geometry and VFIO bits remain valid, I just looked at the
msi_cookie stuff and found it really didn't tie in with DMA ops at all
well.
 
> 
> For passthrough we also have to care about the safety issue, the
> window size computation. Please can we collaborate to converge on a
> unified solution?

I remain adamant that the safety thing is a property of the irqchip and
the irqchip alone (I've also come to realise that iommu_capable() is
fundamentally unworkable altogether, but that's another story). Thus
as I see it, getting the low-level remapping aspect out of the way in a
common manner leaves the rest of the guest MSI problem firmly between
VFIO and the MSI layer, now that we've got a much clearer view of
it thanks to your efforts. What do you think?

After all, right now... well, y'know ;)

Robin.

[1]:http://linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=991effe0712750ce24f6a0b2e2e3f8f57d4a9910

> 
> Best Regards
> 
> Eric
> 
> > +
> >  #else
> >  
> >  struct iommu_domain;
> > +struct msi_msg;
> >  
> >  static inline int iommu_dma_init(void)
> >  {
> > @@ -80,6 +85,10 @@ static inline void iommu_put_dma_cookie(struct
> > iommu_domain *domain) {
> >  }
> >  
> > +static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg
> > *msg) +{
> > +}
> > +
> >  #endif	/* CONFIG_IOMMU_DMA */
> >  #endif	/* __KERNEL__ */
> >  #endif	/* __DMA_IOMMU_H */
> >   
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 19/19] iommu/dma: Add support for mapping MSIs
@ 2016-08-26  1:17       ` Robin Murphy
  0 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-08-26  1:17 UTC (permalink / raw)
  To: Auger Eric
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, Jason Cooper,
	punit.agrawal-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Marc Zyngier,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA, Thomas Gleixner,
	nd-5wv7dgnIgG8

Hi Eric,

On Fri, 26 Aug 2016 00:25:34 +0200
Auger Eric <eric.auger-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:

> Hi Robin,
> 
> On 23/08/2016 21:05, Robin Murphy wrote:
> > When an MSI doorbell is located downstream of an IOMMU, attaching
> > devices to a DMA ops domain and switching on translation leads to a
> > rude shock when their attempt to write to the physical address
> > returned by the irqchip driver faults (or worse, writes into some
> > already-mapped buffer) and no interrupt is forthcoming.
> > 
> > Address this by adding a hook for relevant irqchip drivers to call
> > from their compose_msi_msg() callback, to swizzle the physical
> > address with an appropriatly-mapped IOVA for any device attached to
> > one of our DMA ops domains.
> > 
> > CC: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
> > CC: Jason Cooper <jason-NLaQJdtUoK4Be96aLqz0jA@public.gmane.org>
> > CC: Marc Zyngier <marc.zyngier-5wv7dgnIgG8@public.gmane.org>
> > CC: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> > ---
> >  drivers/iommu/dma-iommu.c        | 141
> > ++++++++++++++++++++++++++++++++++-----
> > drivers/irqchip/irq-gic-v2m.c    |   3 +
> > drivers/irqchip/irq-gic-v3-its.c |   3 +
> > include/linux/dma-iommu.h        |   9 +++ 4 files changed, 141
> > insertions(+), 15 deletions(-)
> > 
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index 00c8a08d56e7..330cce60cad9 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -25,10 +25,29 @@
> >  #include <linux/huge_mm.h>
> >  #include <linux/iommu.h>
> >  #include <linux/iova.h>
> > +#include <linux/irq.h>
> >  #include <linux/mm.h>
> >  #include <linux/scatterlist.h>
> >  #include <linux/vmalloc.h>
> >  
> > +struct iommu_dma_msi_page {
> > +	struct list_head	list;
> > +	dma_addr_t		iova;
> > +	u32			phys_lo;
> > +	u32			phys_hi;
> > +};
> > +
> > +struct iommu_dma_cookie {
> > +	struct iova_domain	iovad;
> > +	struct list_head	msi_page_list;
> > +	spinlock_t		msi_lock;
> > +};
> > +
> > +static inline struct iova_domain *cookie_iovad(struct iommu_domain
> > *domain) +{
> > +	return &((struct iommu_dma_cookie
> > *)domain->iova_cookie)->iovad; +}
> > +
> >  int iommu_dma_init(void)
> >  {
> >  	return iova_cache_get();
> > @@ -43,15 +62,19 @@ int iommu_dma_init(void)
> >   */
> >  int iommu_get_dma_cookie(struct iommu_domain *domain)
> >  {
> > -	struct iova_domain *iovad;
> > +	struct iommu_dma_cookie *cookie;
> >  
> >  	if (domain->iova_cookie)
> >  		return -EEXIST;
> >  
> > -	iovad = kzalloc(sizeof(*iovad), GFP_KERNEL);
> > -	domain->iova_cookie = iovad;
> > +	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
> > +	if (!cookie)
> > +		return -ENOMEM;
> >  
> > -	return iovad ? 0 : -ENOMEM;
> > +	spin_lock_init(&cookie->msi_lock);
> > +	INIT_LIST_HEAD(&cookie->msi_page_list);
> > +	domain->iova_cookie = cookie;
> > +	return 0;
> >  }
> >  EXPORT_SYMBOL(iommu_get_dma_cookie);
> >  
> > @@ -63,14 +86,20 @@ EXPORT_SYMBOL(iommu_get_dma_cookie);
> >   */
> >  void iommu_put_dma_cookie(struct iommu_domain *domain)
> >  {
> > -	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
> > +	struct iommu_dma_msi_page *msi, *tmp;
> >  
> > -	if (!iovad)
> > +	if (!cookie)
> >  		return;
> >  
> > -	if (iovad->granule)
> > -		put_iova_domain(iovad);
> > -	kfree(iovad);
> > +	if (cookie->iovad.granule)
> > +		put_iova_domain(&cookie->iovad);
> > +
> > +	list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list,
> > list) {
> > +		list_del(&msi->list);
> > +		kfree(msi);
> > +	}
> > +	kfree(cookie);
> >  	domain->iova_cookie = NULL;
> >  }
> >  EXPORT_SYMBOL(iommu_put_dma_cookie);
> > @@ -88,7 +117,7 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
> >   */
> >  int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t
> > base, u64 size) {
> > -	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iova_domain *iovad = cookie_iovad(domain);
> >  	unsigned long order, base_pfn, end_pfn;
> >  
> >  	if (!iovad)
> > @@ -155,7 +184,7 @@ int dma_direction_to_prot(enum
> > dma_data_direction dir, bool coherent) static struct iova
> > *__alloc_iova(struct iommu_domain *domain, size_t size, dma_addr_t
> > dma_limit) {
> > -	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iova_domain *iovad = cookie_iovad(domain);
> >  	unsigned long shift = iova_shift(iovad);
> >  	unsigned long length = iova_align(iovad, size) >> shift;
> >  
> > @@ -171,7 +200,7 @@ static struct iova *__alloc_iova(struct
> > iommu_domain *domain, size_t size, /* The IOVA allocator knows what
> > we mapped, so just unmap whatever that was */ static void
> > __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr)
> > {
> > -	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iova_domain *iovad = cookie_iovad(domain);
> >  	unsigned long shift = iova_shift(iovad);
> >  	unsigned long pfn = dma_addr >> shift;
> >  	struct iova *iova = find_iova(iovad, pfn);
> > @@ -294,7 +323,7 @@ struct page **iommu_dma_alloc(struct device
> > *dev, size_t size, gfp_t gfp, void (*flush_page)(struct device *,
> > const void *, phys_addr_t)) {
> >  	struct iommu_domain *domain =
> > iommu_get_domain_for_dev(dev);
> > -	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iova_domain *iovad = cookie_iovad(domain);
> >  	struct iova *iova;
> >  	struct page **pages;
> >  	struct sg_table sgt;
> > @@ -386,7 +415,7 @@ dma_addr_t iommu_dma_map_page(struct device
> > *dev, struct page *page, {
> >  	dma_addr_t dma_addr;
> >  	struct iommu_domain *domain =
> > iommu_get_domain_for_dev(dev);
> > -	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iova_domain *iovad = cookie_iovad(domain);
> >  	phys_addr_t phys = page_to_phys(page) + offset;
> >  	size_t iova_off = iova_offset(iovad, phys);
> >  	size_t len = iova_align(iovad, size + iova_off);
> > @@ -495,7 +524,7 @@ int iommu_dma_map_sg(struct device *dev, struct
> > scatterlist *sg, int nents, int prot)
> >  {
> >  	struct iommu_domain *domain =
> > iommu_get_domain_for_dev(dev);
> > -	struct iova_domain *iovad = domain->iova_cookie;
> > +	struct iova_domain *iovad = cookie_iovad(domain);
> >  	struct iova *iova;
> >  	struct scatterlist *s, *prev = NULL;
> >  	dma_addr_t dma_addr;
> > @@ -587,3 +616,85 @@ int iommu_dma_mapping_error(struct device
> > *dev, dma_addr_t dma_addr) {
> >  	return dma_addr == DMA_ERROR_CODE;
> >  }
> > +
> > +static int __iommu_dma_map_msi_page(struct device *dev, struct
> > msi_msg *msg,
> > +		struct iommu_domain *domain, struct
> > iommu_dma_msi_page **ppage) +{
> > +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
> > +	struct iommu_dma_msi_page *msi_page;
> > +	struct iova_domain *iovad = &cookie->iovad;
> > +	struct iova *iova;
> > +	phys_addr_t msi_addr = (u64)msg->address_hi << 32 |
> > msg->address_lo;
> > +	int ret, prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;  
> In my series I ended up putting the memory attributes as a property of
> the doorbell, advised to do so by Marc. Here we hard freeze them. Do
> you foresee all the doorbells ill have the same attributes?

I can't think of any good reason any device would want to read from an
ITS or GICv2m doorbell, execute it, or allow the interconnect to reorder
or cache its MSI writes. If some crazy hardware comes along to prove my
assumption wrong then we'll revisit it, but right now I just want the
simplest possible solution for PCI DMA ops to not break MSIs.

> > +
> > +	msi_page = kzalloc(sizeof(*msi_page), GFP_ATOMIC);
> > +	if (!msi_page)
> > +		return -ENOMEM;
> > +
> > +	iova = __alloc_iova(domain, iovad->granule,
> > dma_get_mask(dev));
> > +	if (!iova) {
> > +		ret = -ENOSPC;
> > +		goto out_free_page;
> > +	}
> > +
> > +	msi_page->iova = iova_dma_addr(iovad, iova);
> > +	ret = iommu_map(domain, msi_page->iova, msi_addr &
> > ~iova_mask(iovad),
> > +			iovad->granule, prot);
> > +	if (ret)
> > +		goto out_free_iova;
> > +
> > +	msi_page->phys_hi = msg->address_hi;
> > +	msi_page->phys_lo = msg->address_lo;
> > +	INIT_LIST_HEAD(&msi_page->list);
> > +	list_add(&msi_page->list, &cookie->msi_page_list);
> > +	*ppage = msi_page;
> > +	return 0;
> > +
> > +out_free_iova:
> > +	__free_iova(iovad, iova);
> > +out_free_page:
> > +	kfree(msi_page);
> > +	return ret;
> > +}
> > +
> > +void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)  
> Marc told in the past it was reasonable to consider adding a size
> parameter to the allocate function. Obviously you don't have the same
> concern as I had in the passthrough series where the window aperture
> is set by the userspace but well that is just for checking.

The beauty is that at this level we really don't care. It's simply
"here's an address from the irqchip driver, is there a mapping in this
domain which covers it?" The only possible use of knowing a size
here would be if it happens to correspond to a larger page size we
could use for the mapping, but that would entail a bunch of complication
for what seems like a highly tenuous benefit, and right now I just want
the simplest possible solution for PCI DMA ops to not break MSIs.

> 
> > +{
> > +	struct device *dev =
> > msi_desc_to_dev(irq_get_msi_desc(irq));
> > +	struct iommu_domain *domain =
> > iommu_get_domain_for_dev(dev);
> > +	struct iova_domain *iovad;
> > +	struct iommu_dma_cookie *cookie;
> > +	struct iommu_dma_msi_page *msi_page;
> > +	int ret = 0;
> > +
> > +	if (!domain || !domain->iova_cookie)
> > +		return;
> > +
> > +	cookie = domain->iova_cookie;
> > +	iovad = &cookie->iovad;
> > +
> > +	spin_lock(&cookie->msi_lock);
> > +	list_for_each_entry(msi_page, &cookie->msi_page_list, list)
> > +		if (msi_page->phys_hi == msg->address_hi &&
> > +		    msi_page->phys_lo - msg->address_lo <
> > iovad->granule)
> > +			goto unlock;
> > +
> > +	ret = __iommu_dma_map_msi_page(dev, msg, domain,
> > &msi_page); +unlock:
> > +	spin_unlock(&cookie->msi_lock);
> > +
> > +	if (!ret) {
> > +		msg->address_hi = upper_32_bits(msi_page->iova);
> > +		msg->address_lo &= iova_mask(iovad);
> > +		msg->address_lo += lower_32_bits(msi_page->iova);
> > +	} else {
> > +		/*
> > +		 * We're called from a void callback, so the best
> > we can do is
> > +		 * 'fail' by filling the message with obviously
> > bogus values.
> > +		 * Since we got this far due to an IOMMU being
> > present, it's
> > +		 * not like the existing address would have worked
> > anyway...
> > +		 */
> > +		msg->address_hi = ~0U;
> > +		msg->address_lo = ~0U;
> > +		msg->data = ~0U;
> > +	}
> > +}
> > diff --git a/drivers/irqchip/irq-gic-v2m.c
> > b/drivers/irqchip/irq-gic-v2m.c index 35eb7ac5d21f..863e073c6f7f
> > 100644 --- a/drivers/irqchip/irq-gic-v2m.c
> > +++ b/drivers/irqchip/irq-gic-v2m.c
> > @@ -16,6 +16,7 @@
> >  #define pr_fmt(fmt) "GICv2m: " fmt
> >  
> >  #include <linux/acpi.h>
> > +#include <linux/dma-iommu.h>
> >  #include <linux/irq.h>
> >  #include <linux/irqdomain.h>
> >  #include <linux/kernel.h>
> > @@ -108,6 +109,8 @@ static void gicv2m_compose_msi_msg(struct
> > irq_data *data, struct msi_msg *msg) 
> >  	if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
> >  		msg->data -= v2m->spi_offset;
> > +
> > +	iommu_dma_map_msi_msg(data->irq, msg);  
> In the past we identified that msi_compose was not authorized to sleep
> (https://lkml.org/lkml/2016/3/10/216) since potentialy called in
> atomic context.
> 
> This is why in my passthrough series I was forced to move the mapping
> in msi_domain_alloc, which also has the benefit to happen earlier and
> is able to fail whereas the compose cannot due, to the subsequent
> BUG_ON. Has things changed since this notice which now allow to do
> the mapping here?

We've got a non-sleeping spinlock covering the lookup, and new pages are
allocated with GFP_ATOMIC; what have I missed? Any IOMMU driver whose
iommu_map() implementation might sleep already can't use this layer, as
DMA API calls may come from atomic context as well.

The "oh well, at least we tried" failure case (I've added a WARN_ON
now for good measure) is largely mitigated by the fact that 99% of the
time in practice we'll just be mapping one page once per domain and
hitting it on subsequent lookups. If someone's already consumed all the
available IOVA space before that page is mapped, or the IOMMU page
tables are corrupted, then the device won't be able to do DMA anyway, so
the fact that it can't raise its MSI is likely moot.

> 
> >  }
> >  
> >  static struct irq_chip gicv2m_irq_chip = {
> > diff --git a/drivers/irqchip/irq-gic-v3-its.c
> > b/drivers/irqchip/irq-gic-v3-its.c index 7ceaba81efb4..73f4f10dc204
> > 100644 --- a/drivers/irqchip/irq-gic-v3-its.c
> > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > @@ -18,6 +18,7 @@
> >  #include <linux/bitmap.h>
> >  #include <linux/cpu.h>
> >  #include <linux/delay.h>
> > +#include <linux/dma-iommu.h>
> >  #include <linux/interrupt.h>
> >  #include <linux/log2.h>
> >  #include <linux/mm.h>
> > @@ -655,6 +656,8 @@ static void its_irq_compose_msi_msg(struct
> > irq_data *d, struct msi_msg *msg) msg->address_lo		=
> > addr & ((1UL << 32) - 1); msg->address_hi		= addr >>
> > 32; msg->data		= its_get_event_id(d);
> > +
> > +	iommu_dma_map_msi_msg(d->irq, msg);
> >  }
> >  
> >  static struct irq_chip its_irq_chip = {
> > diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> > index 81c5c8d167ad..5ee806e41b5c 100644
> > --- a/include/linux/dma-iommu.h
> > +++ b/include/linux/dma-iommu.h
> > @@ -21,6 +21,7 @@
> >  
> >  #ifdef CONFIG_IOMMU_DMA
> >  #include <linux/iommu.h>
> > +#include <linux/msi.h>
> >  
> >  int iommu_dma_init(void);
> >  
> > @@ -62,9 +63,13 @@ void iommu_dma_unmap_sg(struct device *dev,
> > struct scatterlist *sg, int nents, int iommu_dma_supported(struct
> > device *dev, u64 mask); int iommu_dma_mapping_error(struct device
> > *dev, dma_addr_t dma_addr); 
> > +/* The DMA API isn't _quite_ the whole story, though... */
> > +void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);  
> So I understand the patch currently addresses dma-mapping use case.
> What about the passthrough use case? Here you obviously propose a
> simpler version but it also looks to me it skips some comments we
> collected in the past which resulted in the direction taken before:

This is only intended to be the lowest-level remapping machinery -
clearly the guest MSI case has a lot more complexity on top, but that
doesn't apply on the host side, and right now I just want the simplest
possible solution for PCI DMA ops to not break MSIs.

> - generic API to allocate msi iovas
> - msi_geometry semantic recommended by Alex
> - the handling of the size parameter as recommended by Marc
> - separation of allocation/enumeration for msi_domain_allocate_irqs
> /msi_compose separation.

I have also thrown together an illustrative patch for plugging
this into VFIO domains [1] following some internal discussion, but
that's about as far as I was planning to go myself - AFAICS all your
MSI geometry and VFIO bits remain valid, I just looked at the
msi_cookie stuff and found it really didn't tie in with DMA ops at all
well.
 
> 
> For passthrough we also have to care about the safety issue, the
> window size computation. Please can we collaborate to converge on a
> unified solution?

I remain adamant that the safety thing is a property of the irqchip and
the irqchip alone (I've also come to realise that iommu_capable() is
fundamentally unworkable altogether, but that's another story). Thus
as I see it, getting the low-level remapping aspect out of the way in a
common manner leaves the rest of the guest MSI problem firmly between
VFIO and the MSI layer, now that we've got a much clearer view of
it thanks to your efforts. What do you think?

After all, right now... well, y'know ;)

Robin.

[1]:http://linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=991effe0712750ce24f6a0b2e2e3f8f57d4a9910

> 
> Best Regards
> 
> Eric
> 
> > +
> >  #else
> >  
> >  struct iommu_domain;
> > +struct msi_msg;
> >  
> >  static inline int iommu_dma_init(void)
> >  {
> > @@ -80,6 +85,10 @@ static inline void iommu_put_dma_cookie(struct
> > iommu_domain *domain) {
> >  }
> >  
> > +static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg
> > *msg) +{
> > +}
> > +
> >  #endif	/* CONFIG_IOMMU_DMA */
> >  #endif	/* __KERNEL__ */
> >  #endif	/* __DMA_IOMMU_H */
> >   
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 16/19] Docs: dt: document ARM SMMU generic binding usage
       [not found]     ` <b4f0eca93ac944c3430297b97c703e1bc54846d7.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-08-29 15:44       ` Rob Herring
  0 siblings, 0 replies; 61+ messages in thread
From: Rob Herring @ 2016-08-29 15:44 UTC (permalink / raw)
  To: Robin Murphy
  Cc: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA, Mark Rutland

On Tue, Aug 23, 2016 at 08:05:27PM +0100, Robin Murphy wrote:
> Document how the generic "iommus" binding should be used to describe ARM
> SMMU stream IDs instead of the old "mmu-masters" binding.
> 
> CC: Rob Herring <robh+dt-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> CC: Mark Rutland <mark.rutland-5wv7dgnIgG8@public.gmane.org>
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  .../devicetree/bindings/iommu/arm,smmu.txt         | 63 ++++++++++++++++------
>  1 file changed, 48 insertions(+), 15 deletions(-)

Acked-by: Rob Herring <robh-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 03/19] iommu/of: Handle iommu-map property for PCI
       [not found]     ` <93909648835867008b21cb688a1d7db238d3641a.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-08-31 15:43       ` Will Deacon
  0 siblings, 0 replies; 61+ messages in thread
From: Will Deacon @ 2016-08-31 15:43 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On Tue, Aug 23, 2016 at 08:05:14PM +0100, Robin Murphy wrote:
> Now that we have a way to pick up the RID translation and target IOMMU,
> hook up of_iommu_configure() to bring PCI devices into the of_xlate
> mechanism and allow them IOMMU-backed DMA ops without the need for
> driver-specific handling.
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/of_iommu.c | 43 ++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 36 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index 57f23eaaa2f9..1a65cc806898 100644
> --- a/drivers/iommu/of_iommu.c
> +++ b/drivers/iommu/of_iommu.c
> @@ -22,6 +22,7 @@
>  #include <linux/limits.h>
>  #include <linux/of.h>
>  #include <linux/of_iommu.h>
> +#include <linux/of_pci.h>
>  #include <linux/slab.h>
>  
>  static const struct of_device_id __iommu_of_table_sentinel
> @@ -134,20 +135,48 @@ const struct iommu_ops *of_iommu_get_ops(struct device_node *np)
>  	return ops;
>  }
>  
> +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
> +{
> +	struct of_phandle_args *iommu_spec = data;
> +
> +	iommu_spec->args[0] = alias;
> +	return iommu_spec->np == pdev->bus->dev.of_node;
> +}
> +
>  const struct iommu_ops *of_iommu_configure(struct device *dev,
>  					   struct device_node *master_np)
>  {
>  	struct of_phandle_args iommu_spec;
> -	struct device_node *np;
> +	struct device_node *np = NULL;
>  	const struct iommu_ops *ops = NULL;
>  	int idx = 0;
>  
> -	/*
> -	 * We can't do much for PCI devices without knowing how
> -	 * device IDs are wired up from the PCI bus to the IOMMU.
> -	 */
> -	if (dev_is_pci(dev))
> -		return NULL;
> +	if (dev_is_pci(dev)) {
> +		/*
> +		 * Start by tracing the RID alias down the PCI topology as
> +		 * far as the host bridge whose OF node we have...
> +		 */
> +		iommu_spec.np = master_np;
> +		pci_for_each_dma_alias(to_pci_dev(dev), __get_pci_rid,
> +				       &iommu_spec);
> +		/*
> +		 * ...then find out what that becomes once it escapes the PCI
> +		 * bus into the system beyond, and which IOMMU it ends up at.
> +		 */
> +		if (of_pci_map_rid(master_np, iommu_spec.args[0], "iommu-map",
> +				    "iommu-map-mask", &np, iommu_spec.args))
> +			return NULL;
> +
> +		/* We're not attempting to handle multi-alias devices yet */
> +		iommu_spec.np = np;
> +		iommu_spec.args_count = 1;
> +		ops = of_iommu_get_ops(np);
> +		if (!ops || !ops->of_xlate || ops->of_xlate(dev, &iommu_spec))
> +			ops = NULL;
> +
> +		of_node_put(np);
> +		return ops;
> +	}

I think you should stick this in a separate function, rather than inline
it into of_iommu_configure, otherwise the control flow is pretty whacky
until you realise that the PCI path and the platform device path are
mutually exclusive.

With that:

Reviewed-by: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>

Will

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 04/19] iommu/of: Introduce iommu_fwspec
       [not found]     ` <3e8eaf4fd65833fecc62828214aee81f6ca6c190.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-08-31 17:28       ` Will Deacon
       [not found]         ` <20160831172856.GI29505-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 61+ messages in thread
From: Will Deacon @ 2016-08-31 17:28 UTC (permalink / raw)
  To: Robin Murphy
  Cc: joro-zLv9SwRftAIdnm+yROfE0A,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

On Tue, Aug 23, 2016 at 08:05:15PM +0100, Robin Murphy wrote:
> Introduce a common structure to hold the per-device firmware data that
> non-architectural IOMMU drivers generally need to keep track of.
> Initially this is DT-specific to complement the existing of_iommu
> support code, but will generalise further once other firmware methods
> (e.g. ACPI IORT) come along.
> 
> Ultimately the aim is to promote the fwspec to a first-class member of
> struct device, and handle the init/free automatically in the firmware
> code. That way we can have API calls look for dev->fwspec->iommu_ops
> before falling back to dev->bus->iommu_ops, and thus gracefully handle
> those troublesome multi-IOMMU systems which we currently cannot. To
> start with, though, make use of the existing archdata field and delegate
> the init/free to drivers to allow an incremental conversion rather than
> the impractical pain of trying to attempt everything in one go.
> 
> Suggested-by: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
> 
> v5: Fix shocking num_ids oversight.
> 
>  drivers/iommu/of_iommu.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/of_iommu.h | 15 ++++++++++++++
>  2 files changed, 67 insertions(+)
> 
> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index 1a65cc806898..bec51eb47b0d 100644
> --- a/drivers/iommu/of_iommu.c
> +++ b/drivers/iommu/of_iommu.c
> @@ -219,3 +219,55 @@ static int __init of_iommu_init(void)
>  	return 0;
>  }
>  postcore_initcall_sync(of_iommu_init);
> +
> +int iommu_fwspec_init(struct device *dev, struct device_node *iommu_np)
> +{
> +	struct iommu_fwspec *fwspec = dev->archdata.iommu;
> +
> +	if (fwspec)
> +		return 0;
> +
> +	fwspec = kzalloc(sizeof(*fwspec), GFP_KERNEL);
> +	if (!fwspec)
> +		return -ENOMEM;
> +
> +	fwspec->iommu_np = of_node_get(iommu_np);
> +	fwspec->iommu_ops = of_iommu_get_ops(iommu_np);
> +	dev->archdata.iommu = fwspec;
> +	return 0;
> +}
> +
> +void iommu_fwspec_free(struct device *dev)
> +{
> +	struct iommu_fwspec *fwspec = dev->archdata.iommu;
> +
> +	if (fwspec) {
> +		of_node_put(fwspec->iommu_np);
> +		kfree(fwspec);
> +	}
> +}
> +
> +int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids)
> +{
> +	struct iommu_fwspec *fwspec = dev->archdata.iommu;
> +	size_t size;
> +
> +	if (!fwspec)
> +		return -EINVAL;
> +
> +	size = offsetof(struct iommu_fwspec, ids[fwspec->num_ids + num_ids]);
> +	fwspec = krealloc(dev->archdata.iommu, size, GFP_KERNEL);
> +	if (!fwspec)
> +		return -ENOMEM;
> +
> +	while (num_ids--)
> +		fwspec->ids[fwspec->num_ids++] = *ids++;
> +
> +	dev->archdata.iommu = fwspec;

It might just be me, but I find this really fiddly to read. The fact
that you realloc the whole fwspec, rather than just the array isn't
helping, but I also think that while loop would be much better off as
a for loop, using the index as, well, an index into the ids array and
fwspec->ids array.

Will
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 19/19] iommu/dma: Add support for mapping MSIs
@ 2016-08-31 20:51         ` Auger Eric
  0 siblings, 0 replies; 61+ messages in thread
From: Auger Eric @ 2016-08-31 20:51 UTC (permalink / raw)
  To: Robin Murphy
  Cc: joro, will.deacon, iommu, devicetree, lorenzo.pieralisi,
	jean-philippe.brucker, punit.agrawal, thunder.leizhen,
	Thomas Gleixner, Jason Cooper, Marc Zyngier, linux-kernel, nd

Hi Robin,

On 26/08/2016 03:17, Robin Murphy wrote:
> Hi Eric,
> 
> On Fri, 26 Aug 2016 00:25:34 +0200
> Auger Eric <eric.auger@redhat.com> wrote:
> 
>> Hi Robin,
>>
>> On 23/08/2016 21:05, Robin Murphy wrote:
>>> When an MSI doorbell is located downstream of an IOMMU, attaching
>>> devices to a DMA ops domain and switching on translation leads to a
>>> rude shock when their attempt to write to the physical address
>>> returned by the irqchip driver faults (or worse, writes into some
>>> already-mapped buffer) and no interrupt is forthcoming.
>>>
>>> Address this by adding a hook for relevant irqchip drivers to call
>>> from their compose_msi_msg() callback, to swizzle the physical
>>> address with an appropriatly-mapped IOVA for any device attached to
>>> one of our DMA ops domains.
>>>
>>> CC: Thomas Gleixner <tglx@linutronix.de>
>>> CC: Jason Cooper <jason@lakedaemon.net>
>>> CC: Marc Zyngier <marc.zyngier@arm.com>
>>> CC: linux-kernel@vger.kernel.org
>>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>>> ---
>>>  drivers/iommu/dma-iommu.c        | 141
>>> ++++++++++++++++++++++++++++++++++-----
>>> drivers/irqchip/irq-gic-v2m.c    |   3 +
>>> drivers/irqchip/irq-gic-v3-its.c |   3 +
>>> include/linux/dma-iommu.h        |   9 +++ 4 files changed, 141
>>> insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>>> index 00c8a08d56e7..330cce60cad9 100644
>>> --- a/drivers/iommu/dma-iommu.c
>>> +++ b/drivers/iommu/dma-iommu.c
>>> @@ -25,10 +25,29 @@
>>>  #include <linux/huge_mm.h>
>>>  #include <linux/iommu.h>
>>>  #include <linux/iova.h>
>>> +#include <linux/irq.h>
>>>  #include <linux/mm.h>
>>>  #include <linux/scatterlist.h>
>>>  #include <linux/vmalloc.h>
>>>  
>>> +struct iommu_dma_msi_page {
>>> +	struct list_head	list;
>>> +	dma_addr_t		iova;
The iova address here corresponds to the page iova address and not to
the iova address mapped onto the phys_hi/phys_lo address. Might be worth
a comment since it is not obvious or populate with the right iova?
>>> +	u32			phys_lo;
>>> +	u32			phys_hi;
>>> +};
>>> +
>>> +struct iommu_dma_cookie {
>>> +	struct iova_domain	iovad;
>>> +	struct list_head	msi_page_list;
>>> +	spinlock_t		msi_lock;
>>> +};
>>> +
>>> +static inline struct iova_domain *cookie_iovad(struct iommu_domain
>>> *domain) +{
>>> +	return &((struct iommu_dma_cookie
>>> *)domain->iova_cookie)->iovad; +}
>>> +
>>>  int iommu_dma_init(void)
>>>  {
>>>  	return iova_cache_get();
>>> @@ -43,15 +62,19 @@ int iommu_dma_init(void)
>>>   */
>>>  int iommu_get_dma_cookie(struct iommu_domain *domain)
>>>  {
>>> -	struct iova_domain *iovad;
>>> +	struct iommu_dma_cookie *cookie;
>>>  
>>>  	if (domain->iova_cookie)
>>>  		return -EEXIST;
>>>  
>>> -	iovad = kzalloc(sizeof(*iovad), GFP_KERNEL);
>>> -	domain->iova_cookie = iovad;
>>> +	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
>>> +	if (!cookie)
>>> +		return -ENOMEM;
>>>  
>>> -	return iovad ? 0 : -ENOMEM;
>>> +	spin_lock_init(&cookie->msi_lock);
>>> +	INIT_LIST_HEAD(&cookie->msi_page_list);
>>> +	domain->iova_cookie = cookie;
>>> +	return 0;
>>>  }
>>>  EXPORT_SYMBOL(iommu_get_dma_cookie);
>>>  
>>> @@ -63,14 +86,20 @@ EXPORT_SYMBOL(iommu_get_dma_cookie);
>>>   */
>>>  void iommu_put_dma_cookie(struct iommu_domain *domain)
>>>  {
>>> -	struct iova_domain *iovad = domain->iova_cookie;
>>> +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
>>> +	struct iommu_dma_msi_page *msi, *tmp;
>>>  
>>> -	if (!iovad)
>>> +	if (!cookie)
>>>  		return;
>>>  
>>> -	if (iovad->granule)
>>> -		put_iova_domain(iovad);
>>> -	kfree(iovad);
>>> +	if (cookie->iovad.granule)
>>> +		put_iova_domain(&cookie->iovad);
>>> +
>>> +	list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list,
>>> list) {
>>> +		list_del(&msi->list);
>>> +		kfree(msi);
>>> +	}
>>> +	kfree(cookie);
>>>  	domain->iova_cookie = NULL;
>>>  }
>>>  EXPORT_SYMBOL(iommu_put_dma_cookie);
>>> @@ -88,7 +117,7 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
>>>   */
>>>  int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t
>>> base, u64 size) {
>>> -	struct iova_domain *iovad = domain->iova_cookie;
>>> +	struct iova_domain *iovad = cookie_iovad(domain);
>>>  	unsigned long order, base_pfn, end_pfn;
>>>  
>>>  	if (!iovad)
>>> @@ -155,7 +184,7 @@ int dma_direction_to_prot(enum
>>> dma_data_direction dir, bool coherent) static struct iova
>>> *__alloc_iova(struct iommu_domain *domain, size_t size, dma_addr_t
>>> dma_limit) {
>>> -	struct iova_domain *iovad = domain->iova_cookie;
>>> +	struct iova_domain *iovad = cookie_iovad(domain);
>>>  	unsigned long shift = iova_shift(iovad);
>>>  	unsigned long length = iova_align(iovad, size) >> shift;
>>>  
>>> @@ -171,7 +200,7 @@ static struct iova *__alloc_iova(struct
>>> iommu_domain *domain, size_t size, /* The IOVA allocator knows what
>>> we mapped, so just unmap whatever that was */ static void
>>> __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr)
>>> {
>>> -	struct iova_domain *iovad = domain->iova_cookie;
>>> +	struct iova_domain *iovad = cookie_iovad(domain);
>>>  	unsigned long shift = iova_shift(iovad);
>>>  	unsigned long pfn = dma_addr >> shift;
>>>  	struct iova *iova = find_iova(iovad, pfn);
>>> @@ -294,7 +323,7 @@ struct page **iommu_dma_alloc(struct device
>>> *dev, size_t size, gfp_t gfp, void (*flush_page)(struct device *,
>>> const void *, phys_addr_t)) {
>>>  	struct iommu_domain *domain =
>>> iommu_get_domain_for_dev(dev);
>>> -	struct iova_domain *iovad = domain->iova_cookie;
>>> +	struct iova_domain *iovad = cookie_iovad(domain);
>>>  	struct iova *iova;
>>>  	struct page **pages;
>>>  	struct sg_table sgt;
>>> @@ -386,7 +415,7 @@ dma_addr_t iommu_dma_map_page(struct device
>>> *dev, struct page *page, {
>>>  	dma_addr_t dma_addr;
>>>  	struct iommu_domain *domain =
>>> iommu_get_domain_for_dev(dev);
>>> -	struct iova_domain *iovad = domain->iova_cookie;
>>> +	struct iova_domain *iovad = cookie_iovad(domain);
>>>  	phys_addr_t phys = page_to_phys(page) + offset;
>>>  	size_t iova_off = iova_offset(iovad, phys);
>>>  	size_t len = iova_align(iovad, size + iova_off);
>>> @@ -495,7 +524,7 @@ int iommu_dma_map_sg(struct device *dev, struct
>>> scatterlist *sg, int nents, int prot)
>>>  {
>>>  	struct iommu_domain *domain =
>>> iommu_get_domain_for_dev(dev);
>>> -	struct iova_domain *iovad = domain->iova_cookie;
>>> +	struct iova_domain *iovad = cookie_iovad(domain);
>>>  	struct iova *iova;
>>>  	struct scatterlist *s, *prev = NULL;
>>>  	dma_addr_t dma_addr;
>>> @@ -587,3 +616,85 @@ int iommu_dma_mapping_error(struct device
>>> *dev, dma_addr_t dma_addr) {
>>>  	return dma_addr == DMA_ERROR_CODE;
>>>  }
>>> +
>>> +static int __iommu_dma_map_msi_page(struct device *dev, struct
>>> msi_msg *msg,
>>> +		struct iommu_domain *domain, struct
>>> iommu_dma_msi_page **ppage) +{
>>> +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
>>> +	struct iommu_dma_msi_page *msi_page;
>>> +	struct iova_domain *iovad = &cookie->iovad;
>>> +	struct iova *iova;
>>> +	phys_addr_t msi_addr = (u64)msg->address_hi << 32 |
>>> msg->address_lo;
>>> +	int ret, prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;  
>> In my series I ended up putting the memory attributes as a property of
>> the doorbell, advised to do so by Marc. Here we hard freeze them. Do
>> you foresee all the doorbells ill have the same attributes?
> 
> I can't think of any good reason any device would want to read from an
> ITS or GICv2m doorbell, execute it, or allow the interconnect to reorder
> or cache its MSI writes. If some crazy hardware comes along to prove my
> assumption wrong then we'll revisit it, but right now I just want the
> simplest possible solution for PCI DMA ops to not break MSIs.
OK looks reasonable to me. This puts into question the msi_doorbell
structure we defined together with Marc comprising size, memory
protection, per_cpu attributes but well ...
> 
>>> +
>>> +	msi_page = kzalloc(sizeof(*msi_page), GFP_ATOMIC);
>>> +	if (!msi_page)
>>> +		return -ENOMEM;
>>> +
>>> +	iova = __alloc_iova(domain, iovad->granule,
>>> dma_get_mask(dev));
>>> +	if (!iova) {
>>> +		ret = -ENOSPC;
>>> +		goto out_free_page;
>>> +	}
>>> +
>>> +	msi_page->iova = iova_dma_addr(iovad, iova);
>>> +	ret = iommu_map(domain, msi_page->iova, msi_addr &
>>> ~iova_mask(iovad),
>>> +			iovad->granule, prot);
>>> +	if (ret)
>>> +		goto out_free_iova;
>>> +
>>> +	msi_page->phys_hi = msg->address_hi;
>>> +	msi_page->phys_lo = msg->address_lo;
>>> +	INIT_LIST_HEAD(&msi_page->list);
>>> +	list_add(&msi_page->list, &cookie->msi_page_list);
>>> +	*ppage = msi_page;
>>> +	return 0;
>>> +
>>> +out_free_iova:
>>> +	__free_iova(iovad, iova);
>>> +out_free_page:
>>> +	kfree(msi_page);
>>> +	return ret;
>>> +}
>>> +
>>> +void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)  
>> Marc told in the past it was reasonable to consider adding a size
>> parameter to the allocate function. Obviously you don't have the same
>> concern as I had in the passthrough series where the window aperture
>> is set by the userspace but well that is just for checking.
> 
> The beauty is that at this level we really don't care. It's simply
> "here's an address from the irqchip driver, is there a mapping in this
> domain which covers it?" The only possible use of knowing a size
> here would be if it happens to correspond to a larger page size we
> could use for the mapping, but that would entail a bunch of complication
> for what seems like a highly tenuous benefit, and right now I just want
> the simplest possible solution for PCI DMA ops to not break MSIs.
Yes it was the case discussed with Marc where the doorbell could span
over several pages. This definitively does not look like the HW we
currently handle. I endeavored to take this comment into account and
effectively this brings quite a significant extra complexity ;-)
> 
>>
>>> +{
>>> +	struct device *dev =
>>> msi_desc_to_dev(irq_get_msi_desc(irq));
>>> +	struct iommu_domain *domain =
>>> iommu_get_domain_for_dev(dev);
>>> +	struct iova_domain *iovad;
>>> +	struct iommu_dma_cookie *cookie;
>>> +	struct iommu_dma_msi_page *msi_page;
>>> +	int ret = 0;
>>> +
>>> +	if (!domain || !domain->iova_cookie)
>>> +		return;
>>> +
>>> +	cookie = domain->iova_cookie;
>>> +	iovad = &cookie->iovad;
>>> +
>>> +	spin_lock(&cookie->msi_lock);
>>> +	list_for_each_entry(msi_page, &cookie->msi_page_list, list)
>>> +		if (msi_page->phys_hi == msg->address_hi &&
>>> +		    msi_page->phys_lo - msg->address_lo <
>>> iovad->granule)
>>> +			goto unlock;
>>> +
>>> +	ret = __iommu_dma_map_msi_page(dev, msg, domain,
>>> &msi_page); +unlock:
>>> +	spin_unlock(&cookie->msi_lock);
>>> +
>>> +	if (!ret) {
>>> +		msg->address_hi = upper_32_bits(msi_page->iova);
>>> +		msg->address_lo &= iova_mask(iovad);
>>> +		msg->address_lo += lower_32_bits(msi_page->iova);
>>> +	} else {
>>> +		/*
>>> +		 * We're called from a void callback, so the best
>>> we can do is
>>> +		 * 'fail' by filling the message with obviously
>>> bogus values.
>>> +		 * Since we got this far due to an IOMMU being
>>> present, it's
>>> +		 * not like the existing address would have worked
>>> anyway...
>>> +		 */
>>> +		msg->address_hi = ~0U;
>>> +		msg->address_lo = ~0U;
>>> +		msg->data = ~0U;
>>> +	}
>>> +}
>>> diff --git a/drivers/irqchip/irq-gic-v2m.c
>>> b/drivers/irqchip/irq-gic-v2m.c index 35eb7ac5d21f..863e073c6f7f
>>> 100644 --- a/drivers/irqchip/irq-gic-v2m.c
>>> +++ b/drivers/irqchip/irq-gic-v2m.c
>>> @@ -16,6 +16,7 @@
>>>  #define pr_fmt(fmt) "GICv2m: " fmt
>>>  
>>>  #include <linux/acpi.h>
>>> +#include <linux/dma-iommu.h>
>>>  #include <linux/irq.h>
>>>  #include <linux/irqdomain.h>
>>>  #include <linux/kernel.h>
>>> @@ -108,6 +109,8 @@ static void gicv2m_compose_msi_msg(struct
>>> irq_data *data, struct msi_msg *msg) 
>>>  	if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
>>>  		msg->data -= v2m->spi_offset;
>>> +
>>> +	iommu_dma_map_msi_msg(data->irq, msg);  
>> In the past we identified that msi_compose was not authorized to sleep
>> (https://lkml.org/lkml/2016/3/10/216) since potentialy called in
>> atomic context.
>>
>> This is why in my passthrough series I was forced to move the mapping
>> in msi_domain_alloc, which also has the benefit to happen earlier and
>> is able to fail whereas the compose cannot due, to the subsequent
>> BUG_ON. Has things changed since this notice which now allow to do
>> the mapping here?
> 
> We've got a non-sleeping spinlock covering the lookup, and new pages are
> allocated with GFP_ATOMIC; what have I missed? Any IOMMU driver whose
> iommu_map() implementation might sleep already can't use this layer, as
> DMA API calls may come from atomic context as well.
Yes my point was about the iommu_map() which potentially can sleep.
> 
> The "oh well, at least we tried" failure case (I've added a WARN_ON
> now for good measure) is largely mitigated by the fact that 99% of the
> time in practice we'll just be mapping one page once per domain and
> hitting it on subsequent lookups. If someone's already consumed all the
> available IOVA space before that page is mapped, or the IOMMU page
> tables are corrupted, then the device won't be able to do DMA anyway, so
> the fact that it can't raise its MSI is likely moot.
Fair enough. The error handling however is weak if the userspace becomes
the provider of the IOVA window which is not your concern here though.
> 
>>
>>>  }
>>>  
>>>  static struct irq_chip gicv2m_irq_chip = {
>>> diff --git a/drivers/irqchip/irq-gic-v3-its.c
>>> b/drivers/irqchip/irq-gic-v3-its.c index 7ceaba81efb4..73f4f10dc204
>>> 100644 --- a/drivers/irqchip/irq-gic-v3-its.c
>>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>>> @@ -18,6 +18,7 @@
>>>  #include <linux/bitmap.h>
>>>  #include <linux/cpu.h>
>>>  #include <linux/delay.h>
>>> +#include <linux/dma-iommu.h>
>>>  #include <linux/interrupt.h>
>>>  #include <linux/log2.h>
>>>  #include <linux/mm.h>
>>> @@ -655,6 +656,8 @@ static void its_irq_compose_msi_msg(struct
>>> irq_data *d, struct msi_msg *msg) msg->address_lo		=
>>> addr & ((1UL << 32) - 1); msg->address_hi		= addr >>
>>> 32; msg->data		= its_get_event_id(d);
>>> +
>>> +	iommu_dma_map_msi_msg(d->irq, msg);
>>>  }
>>>  
>>>  static struct irq_chip its_irq_chip = {
>>> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
>>> index 81c5c8d167ad..5ee806e41b5c 100644
>>> --- a/include/linux/dma-iommu.h
>>> +++ b/include/linux/dma-iommu.h
>>> @@ -21,6 +21,7 @@
>>>  
>>>  #ifdef CONFIG_IOMMU_DMA
>>>  #include <linux/iommu.h>
>>> +#include <linux/msi.h>
>>>  
>>>  int iommu_dma_init(void);
>>>  
>>> @@ -62,9 +63,13 @@ void iommu_dma_unmap_sg(struct device *dev,
>>> struct scatterlist *sg, int nents, int iommu_dma_supported(struct
>>> device *dev, u64 mask); int iommu_dma_mapping_error(struct device
>>> *dev, dma_addr_t dma_addr); 
>>> +/* The DMA API isn't _quite_ the whole story, though... */
>>> +void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);  
>> So I understand the patch currently addresses dma-mapping use case.
>> What about the passthrough use case? Here you obviously propose a
>> simpler version but it also looks to me it skips some comments we
>> collected in the past which resulted in the direction taken before:
> 
> This is only intended to be the lowest-level remapping machinery -
> clearly the guest MSI case has a lot more complexity on top, but that
> doesn't apply on the host side, and right now I just want the simplest
> possible solution for PCI DMA ops to not break MSIs.
> 
>> - generic API to allocate msi iovas
>> - msi_geometry semantic recommended by Alex
>> - the handling of the size parameter as recommended by Marc
>> - separation of allocation/enumeration for msi_domain_allocate_irqs
>> /msi_compose separation.
> 
> I have also thrown together an illustrative patch for plugging
> this into VFIO domains [1] following some internal discussion, but
> that's about as far as I was planning to go myself - AFAICS all your
> MSI geometry and VFIO bits remain valid, I just looked at the
> msi_cookie stuff and found it really didn't tie in with DMA ops at all
> well.

>  
>>
>> For passthrough we also have to care about the safety issue, the
>> window size computation. Please can we collaborate to converge on a
>> unified solution?
> 
> I remain adamant that the safety thing is a property of the irqchip and
> the irqchip alone (I've also come to realise that iommu_capable() is
> fundamentally unworkable altogether, but that's another story). Thus
> as I see it, getting the low-level remapping aspect out of the way in a
> common manner leaves the rest of the guest MSI problem firmly between
> VFIO and the MSI layer, now that we've got a much clearer view of
> it thanks to your efforts. What do you think?

Well I tried to compare our approaches:

1) I put the MSI mapping code in msi-iommu as a layer on top of
dma-mapping whereas you put it directly in dma-mapping
2) you removed the size parameter management (Marc's guidance)
3) you removed the iommu mapping list iterator since you don't need it
4) you simplified the lock mechanism using atomic flag and considering
iommu_map cannot sleep
5) you removed ref counting (as I could do too) since the removal can be
taken in charge at iommu domain destruction
6) you do the allocation at compose time (still considering iommu_map
cannot sleep) whereas I do it at MSI enable time, with the drawback of
poor error handling and late notice; This definitively simplifies the
overall solution.

I think your approach kills my doorbell registration approach, since you
removed most of the doorbell attributes (size, mem protection
attributes, per-cpu vs global phys addrs space). So globally I think
most of my part II MSI layer series becomes irrelevant (iommu mapping &
iova retrieval part definitively). Keeping the registration API for just
allowing MSI controller enumeration & safety coarse assessment looks
overkill.

So we restart from scratch with respect to the enumeration and safety
coarse assessment for passthrough use case.

Another altenative could be to do
a) use you dma-iommu additions +  your
http://linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=991effe0712750ce24f6a0b2e2e3f8f57d4a9910
b) add iommu map iterator in dma-iommu
c) do the allocation at MSI enable time and retrieval at compose time,
based on my registration API. which would improve the error handling.

Whatever the fate of my series and besides the first comment above, you
can add my R-b since I reviewed your code ;-)

Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric

> 
> After all, right now... well, y'know ;)
> 
> Robin.
> 
> [1]:http://linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=991effe0712750ce24f6a0b2e2e3f8f57d4a9910
> 
>>
>> Best Regards
>>
>> Eric
>>
>>> +
>>>  #else
>>>  
>>>  struct iommu_domain;
>>> +struct msi_msg;
>>>  
>>>  static inline int iommu_dma_init(void)
>>>  {
>>> @@ -80,6 +85,10 @@ static inline void iommu_put_dma_cookie(struct
>>> iommu_domain *domain) {
>>>  }
>>>  
>>> +static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg
>>> *msg) +{
>>> +}
>>> +
>>>  #endif	/* CONFIG_IOMMU_DMA */
>>>  #endif	/* __KERNEL__ */
>>>  #endif	/* __DMA_IOMMU_H */
>>>   
>>
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 19/19] iommu/dma: Add support for mapping MSIs
@ 2016-08-31 20:51         ` Auger Eric
  0 siblings, 0 replies; 61+ messages in thread
From: Auger Eric @ 2016-08-31 20:51 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, Jason Cooper,
	punit.agrawal-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Marc Zyngier,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA, Thomas Gleixner,
	nd-5wv7dgnIgG8

Hi Robin,

On 26/08/2016 03:17, Robin Murphy wrote:
> Hi Eric,
> 
> On Fri, 26 Aug 2016 00:25:34 +0200
> Auger Eric <eric.auger-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> 
>> Hi Robin,
>>
>> On 23/08/2016 21:05, Robin Murphy wrote:
>>> When an MSI doorbell is located downstream of an IOMMU, attaching
>>> devices to a DMA ops domain and switching on translation leads to a
>>> rude shock when their attempt to write to the physical address
>>> returned by the irqchip driver faults (or worse, writes into some
>>> already-mapped buffer) and no interrupt is forthcoming.
>>>
>>> Address this by adding a hook for relevant irqchip drivers to call
>>> from their compose_msi_msg() callback, to swizzle the physical
>>> address with an appropriatly-mapped IOVA for any device attached to
>>> one of our DMA ops domains.
>>>
>>> CC: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
>>> CC: Jason Cooper <jason-NLaQJdtUoK4Be96aLqz0jA@public.gmane.org>
>>> CC: Marc Zyngier <marc.zyngier-5wv7dgnIgG8@public.gmane.org>
>>> CC: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>>> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
>>> ---
>>>  drivers/iommu/dma-iommu.c        | 141
>>> ++++++++++++++++++++++++++++++++++-----
>>> drivers/irqchip/irq-gic-v2m.c    |   3 +
>>> drivers/irqchip/irq-gic-v3-its.c |   3 +
>>> include/linux/dma-iommu.h        |   9 +++ 4 files changed, 141
>>> insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>>> index 00c8a08d56e7..330cce60cad9 100644
>>> --- a/drivers/iommu/dma-iommu.c
>>> +++ b/drivers/iommu/dma-iommu.c
>>> @@ -25,10 +25,29 @@
>>>  #include <linux/huge_mm.h>
>>>  #include <linux/iommu.h>
>>>  #include <linux/iova.h>
>>> +#include <linux/irq.h>
>>>  #include <linux/mm.h>
>>>  #include <linux/scatterlist.h>
>>>  #include <linux/vmalloc.h>
>>>  
>>> +struct iommu_dma_msi_page {
>>> +	struct list_head	list;
>>> +	dma_addr_t		iova;
The iova address here corresponds to the page iova address and not to
the iova address mapped onto the phys_hi/phys_lo address. Might be worth
a comment since it is not obvious or populate with the right iova?
>>> +	u32			phys_lo;
>>> +	u32			phys_hi;
>>> +};
>>> +
>>> +struct iommu_dma_cookie {
>>> +	struct iova_domain	iovad;
>>> +	struct list_head	msi_page_list;
>>> +	spinlock_t		msi_lock;
>>> +};
>>> +
>>> +static inline struct iova_domain *cookie_iovad(struct iommu_domain
>>> *domain) +{
>>> +	return &((struct iommu_dma_cookie
>>> *)domain->iova_cookie)->iovad; +}
>>> +
>>>  int iommu_dma_init(void)
>>>  {
>>>  	return iova_cache_get();
>>> @@ -43,15 +62,19 @@ int iommu_dma_init(void)
>>>   */
>>>  int iommu_get_dma_cookie(struct iommu_domain *domain)
>>>  {
>>> -	struct iova_domain *iovad;
>>> +	struct iommu_dma_cookie *cookie;
>>>  
>>>  	if (domain->iova_cookie)
>>>  		return -EEXIST;
>>>  
>>> -	iovad = kzalloc(sizeof(*iovad), GFP_KERNEL);
>>> -	domain->iova_cookie = iovad;
>>> +	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
>>> +	if (!cookie)
>>> +		return -ENOMEM;
>>>  
>>> -	return iovad ? 0 : -ENOMEM;
>>> +	spin_lock_init(&cookie->msi_lock);
>>> +	INIT_LIST_HEAD(&cookie->msi_page_list);
>>> +	domain->iova_cookie = cookie;
>>> +	return 0;
>>>  }
>>>  EXPORT_SYMBOL(iommu_get_dma_cookie);
>>>  
>>> @@ -63,14 +86,20 @@ EXPORT_SYMBOL(iommu_get_dma_cookie);
>>>   */
>>>  void iommu_put_dma_cookie(struct iommu_domain *domain)
>>>  {
>>> -	struct iova_domain *iovad = domain->iova_cookie;
>>> +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
>>> +	struct iommu_dma_msi_page *msi, *tmp;
>>>  
>>> -	if (!iovad)
>>> +	if (!cookie)
>>>  		return;
>>>  
>>> -	if (iovad->granule)
>>> -		put_iova_domain(iovad);
>>> -	kfree(iovad);
>>> +	if (cookie->iovad.granule)
>>> +		put_iova_domain(&cookie->iovad);
>>> +
>>> +	list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list,
>>> list) {
>>> +		list_del(&msi->list);
>>> +		kfree(msi);
>>> +	}
>>> +	kfree(cookie);
>>>  	domain->iova_cookie = NULL;
>>>  }
>>>  EXPORT_SYMBOL(iommu_put_dma_cookie);
>>> @@ -88,7 +117,7 @@ EXPORT_SYMBOL(iommu_put_dma_cookie);
>>>   */
>>>  int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t
>>> base, u64 size) {
>>> -	struct iova_domain *iovad = domain->iova_cookie;
>>> +	struct iova_domain *iovad = cookie_iovad(domain);
>>>  	unsigned long order, base_pfn, end_pfn;
>>>  
>>>  	if (!iovad)
>>> @@ -155,7 +184,7 @@ int dma_direction_to_prot(enum
>>> dma_data_direction dir, bool coherent) static struct iova
>>> *__alloc_iova(struct iommu_domain *domain, size_t size, dma_addr_t
>>> dma_limit) {
>>> -	struct iova_domain *iovad = domain->iova_cookie;
>>> +	struct iova_domain *iovad = cookie_iovad(domain);
>>>  	unsigned long shift = iova_shift(iovad);
>>>  	unsigned long length = iova_align(iovad, size) >> shift;
>>>  
>>> @@ -171,7 +200,7 @@ static struct iova *__alloc_iova(struct
>>> iommu_domain *domain, size_t size, /* The IOVA allocator knows what
>>> we mapped, so just unmap whatever that was */ static void
>>> __iommu_dma_unmap(struct iommu_domain *domain, dma_addr_t dma_addr)
>>> {
>>> -	struct iova_domain *iovad = domain->iova_cookie;
>>> +	struct iova_domain *iovad = cookie_iovad(domain);
>>>  	unsigned long shift = iova_shift(iovad);
>>>  	unsigned long pfn = dma_addr >> shift;
>>>  	struct iova *iova = find_iova(iovad, pfn);
>>> @@ -294,7 +323,7 @@ struct page **iommu_dma_alloc(struct device
>>> *dev, size_t size, gfp_t gfp, void (*flush_page)(struct device *,
>>> const void *, phys_addr_t)) {
>>>  	struct iommu_domain *domain =
>>> iommu_get_domain_for_dev(dev);
>>> -	struct iova_domain *iovad = domain->iova_cookie;
>>> +	struct iova_domain *iovad = cookie_iovad(domain);
>>>  	struct iova *iova;
>>>  	struct page **pages;
>>>  	struct sg_table sgt;
>>> @@ -386,7 +415,7 @@ dma_addr_t iommu_dma_map_page(struct device
>>> *dev, struct page *page, {
>>>  	dma_addr_t dma_addr;
>>>  	struct iommu_domain *domain =
>>> iommu_get_domain_for_dev(dev);
>>> -	struct iova_domain *iovad = domain->iova_cookie;
>>> +	struct iova_domain *iovad = cookie_iovad(domain);
>>>  	phys_addr_t phys = page_to_phys(page) + offset;
>>>  	size_t iova_off = iova_offset(iovad, phys);
>>>  	size_t len = iova_align(iovad, size + iova_off);
>>> @@ -495,7 +524,7 @@ int iommu_dma_map_sg(struct device *dev, struct
>>> scatterlist *sg, int nents, int prot)
>>>  {
>>>  	struct iommu_domain *domain =
>>> iommu_get_domain_for_dev(dev);
>>> -	struct iova_domain *iovad = domain->iova_cookie;
>>> +	struct iova_domain *iovad = cookie_iovad(domain);
>>>  	struct iova *iova;
>>>  	struct scatterlist *s, *prev = NULL;
>>>  	dma_addr_t dma_addr;
>>> @@ -587,3 +616,85 @@ int iommu_dma_mapping_error(struct device
>>> *dev, dma_addr_t dma_addr) {
>>>  	return dma_addr == DMA_ERROR_CODE;
>>>  }
>>> +
>>> +static int __iommu_dma_map_msi_page(struct device *dev, struct
>>> msi_msg *msg,
>>> +		struct iommu_domain *domain, struct
>>> iommu_dma_msi_page **ppage) +{
>>> +	struct iommu_dma_cookie *cookie = domain->iova_cookie;
>>> +	struct iommu_dma_msi_page *msi_page;
>>> +	struct iova_domain *iovad = &cookie->iovad;
>>> +	struct iova *iova;
>>> +	phys_addr_t msi_addr = (u64)msg->address_hi << 32 |
>>> msg->address_lo;
>>> +	int ret, prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;  
>> In my series I ended up putting the memory attributes as a property of
>> the doorbell, advised to do so by Marc. Here we hard freeze them. Do
>> you foresee all the doorbells ill have the same attributes?
> 
> I can't think of any good reason any device would want to read from an
> ITS or GICv2m doorbell, execute it, or allow the interconnect to reorder
> or cache its MSI writes. If some crazy hardware comes along to prove my
> assumption wrong then we'll revisit it, but right now I just want the
> simplest possible solution for PCI DMA ops to not break MSIs.
OK looks reasonable to me. This puts into question the msi_doorbell
structure we defined together with Marc comprising size, memory
protection, per_cpu attributes but well ...
> 
>>> +
>>> +	msi_page = kzalloc(sizeof(*msi_page), GFP_ATOMIC);
>>> +	if (!msi_page)
>>> +		return -ENOMEM;
>>> +
>>> +	iova = __alloc_iova(domain, iovad->granule,
>>> dma_get_mask(dev));
>>> +	if (!iova) {
>>> +		ret = -ENOSPC;
>>> +		goto out_free_page;
>>> +	}
>>> +
>>> +	msi_page->iova = iova_dma_addr(iovad, iova);
>>> +	ret = iommu_map(domain, msi_page->iova, msi_addr &
>>> ~iova_mask(iovad),
>>> +			iovad->granule, prot);
>>> +	if (ret)
>>> +		goto out_free_iova;
>>> +
>>> +	msi_page->phys_hi = msg->address_hi;
>>> +	msi_page->phys_lo = msg->address_lo;
>>> +	INIT_LIST_HEAD(&msi_page->list);
>>> +	list_add(&msi_page->list, &cookie->msi_page_list);
>>> +	*ppage = msi_page;
>>> +	return 0;
>>> +
>>> +out_free_iova:
>>> +	__free_iova(iovad, iova);
>>> +out_free_page:
>>> +	kfree(msi_page);
>>> +	return ret;
>>> +}
>>> +
>>> +void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)  
>> Marc told in the past it was reasonable to consider adding a size
>> parameter to the allocate function. Obviously you don't have the same
>> concern as I had in the passthrough series where the window aperture
>> is set by the userspace but well that is just for checking.
> 
> The beauty is that at this level we really don't care. It's simply
> "here's an address from the irqchip driver, is there a mapping in this
> domain which covers it?" The only possible use of knowing a size
> here would be if it happens to correspond to a larger page size we
> could use for the mapping, but that would entail a bunch of complication
> for what seems like a highly tenuous benefit, and right now I just want
> the simplest possible solution for PCI DMA ops to not break MSIs.
Yes it was the case discussed with Marc where the doorbell could span
over several pages. This definitively does not look like the HW we
currently handle. I endeavored to take this comment into account and
effectively this brings quite a significant extra complexity ;-)
> 
>>
>>> +{
>>> +	struct device *dev =
>>> msi_desc_to_dev(irq_get_msi_desc(irq));
>>> +	struct iommu_domain *domain =
>>> iommu_get_domain_for_dev(dev);
>>> +	struct iova_domain *iovad;
>>> +	struct iommu_dma_cookie *cookie;
>>> +	struct iommu_dma_msi_page *msi_page;
>>> +	int ret = 0;
>>> +
>>> +	if (!domain || !domain->iova_cookie)
>>> +		return;
>>> +
>>> +	cookie = domain->iova_cookie;
>>> +	iovad = &cookie->iovad;
>>> +
>>> +	spin_lock(&cookie->msi_lock);
>>> +	list_for_each_entry(msi_page, &cookie->msi_page_list, list)
>>> +		if (msi_page->phys_hi == msg->address_hi &&
>>> +		    msi_page->phys_lo - msg->address_lo <
>>> iovad->granule)
>>> +			goto unlock;
>>> +
>>> +	ret = __iommu_dma_map_msi_page(dev, msg, domain,
>>> &msi_page); +unlock:
>>> +	spin_unlock(&cookie->msi_lock);
>>> +
>>> +	if (!ret) {
>>> +		msg->address_hi = upper_32_bits(msi_page->iova);
>>> +		msg->address_lo &= iova_mask(iovad);
>>> +		msg->address_lo += lower_32_bits(msi_page->iova);
>>> +	} else {
>>> +		/*
>>> +		 * We're called from a void callback, so the best
>>> we can do is
>>> +		 * 'fail' by filling the message with obviously
>>> bogus values.
>>> +		 * Since we got this far due to an IOMMU being
>>> present, it's
>>> +		 * not like the existing address would have worked
>>> anyway...
>>> +		 */
>>> +		msg->address_hi = ~0U;
>>> +		msg->address_lo = ~0U;
>>> +		msg->data = ~0U;
>>> +	}
>>> +}
>>> diff --git a/drivers/irqchip/irq-gic-v2m.c
>>> b/drivers/irqchip/irq-gic-v2m.c index 35eb7ac5d21f..863e073c6f7f
>>> 100644 --- a/drivers/irqchip/irq-gic-v2m.c
>>> +++ b/drivers/irqchip/irq-gic-v2m.c
>>> @@ -16,6 +16,7 @@
>>>  #define pr_fmt(fmt) "GICv2m: " fmt
>>>  
>>>  #include <linux/acpi.h>
>>> +#include <linux/dma-iommu.h>
>>>  #include <linux/irq.h>
>>>  #include <linux/irqdomain.h>
>>>  #include <linux/kernel.h>
>>> @@ -108,6 +109,8 @@ static void gicv2m_compose_msi_msg(struct
>>> irq_data *data, struct msi_msg *msg) 
>>>  	if (v2m->flags & GICV2M_NEEDS_SPI_OFFSET)
>>>  		msg->data -= v2m->spi_offset;
>>> +
>>> +	iommu_dma_map_msi_msg(data->irq, msg);  
>> In the past we identified that msi_compose was not authorized to sleep
>> (https://lkml.org/lkml/2016/3/10/216) since potentialy called in
>> atomic context.
>>
>> This is why in my passthrough series I was forced to move the mapping
>> in msi_domain_alloc, which also has the benefit to happen earlier and
>> is able to fail whereas the compose cannot due, to the subsequent
>> BUG_ON. Has things changed since this notice which now allow to do
>> the mapping here?
> 
> We've got a non-sleeping spinlock covering the lookup, and new pages are
> allocated with GFP_ATOMIC; what have I missed? Any IOMMU driver whose
> iommu_map() implementation might sleep already can't use this layer, as
> DMA API calls may come from atomic context as well.
Yes my point was about the iommu_map() which potentially can sleep.
> 
> The "oh well, at least we tried" failure case (I've added a WARN_ON
> now for good measure) is largely mitigated by the fact that 99% of the
> time in practice we'll just be mapping one page once per domain and
> hitting it on subsequent lookups. If someone's already consumed all the
> available IOVA space before that page is mapped, or the IOMMU page
> tables are corrupted, then the device won't be able to do DMA anyway, so
> the fact that it can't raise its MSI is likely moot.
Fair enough. The error handling however is weak if the userspace becomes
the provider of the IOVA window which is not your concern here though.
> 
>>
>>>  }
>>>  
>>>  static struct irq_chip gicv2m_irq_chip = {
>>> diff --git a/drivers/irqchip/irq-gic-v3-its.c
>>> b/drivers/irqchip/irq-gic-v3-its.c index 7ceaba81efb4..73f4f10dc204
>>> 100644 --- a/drivers/irqchip/irq-gic-v3-its.c
>>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>>> @@ -18,6 +18,7 @@
>>>  #include <linux/bitmap.h>
>>>  #include <linux/cpu.h>
>>>  #include <linux/delay.h>
>>> +#include <linux/dma-iommu.h>
>>>  #include <linux/interrupt.h>
>>>  #include <linux/log2.h>
>>>  #include <linux/mm.h>
>>> @@ -655,6 +656,8 @@ static void its_irq_compose_msi_msg(struct
>>> irq_data *d, struct msi_msg *msg) msg->address_lo		=
>>> addr & ((1UL << 32) - 1); msg->address_hi		= addr >>
>>> 32; msg->data		= its_get_event_id(d);
>>> +
>>> +	iommu_dma_map_msi_msg(d->irq, msg);
>>>  }
>>>  
>>>  static struct irq_chip its_irq_chip = {
>>> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
>>> index 81c5c8d167ad..5ee806e41b5c 100644
>>> --- a/include/linux/dma-iommu.h
>>> +++ b/include/linux/dma-iommu.h
>>> @@ -21,6 +21,7 @@
>>>  
>>>  #ifdef CONFIG_IOMMU_DMA
>>>  #include <linux/iommu.h>
>>> +#include <linux/msi.h>
>>>  
>>>  int iommu_dma_init(void);
>>>  
>>> @@ -62,9 +63,13 @@ void iommu_dma_unmap_sg(struct device *dev,
>>> struct scatterlist *sg, int nents, int iommu_dma_supported(struct
>>> device *dev, u64 mask); int iommu_dma_mapping_error(struct device
>>> *dev, dma_addr_t dma_addr); 
>>> +/* The DMA API isn't _quite_ the whole story, though... */
>>> +void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);  
>> So I understand the patch currently addresses dma-mapping use case.
>> What about the passthrough use case? Here you obviously propose a
>> simpler version but it also looks to me it skips some comments we
>> collected in the past which resulted in the direction taken before:
> 
> This is only intended to be the lowest-level remapping machinery -
> clearly the guest MSI case has a lot more complexity on top, but that
> doesn't apply on the host side, and right now I just want the simplest
> possible solution for PCI DMA ops to not break MSIs.
> 
>> - generic API to allocate msi iovas
>> - msi_geometry semantic recommended by Alex
>> - the handling of the size parameter as recommended by Marc
>> - separation of allocation/enumeration for msi_domain_allocate_irqs
>> /msi_compose separation.
> 
> I have also thrown together an illustrative patch for plugging
> this into VFIO domains [1] following some internal discussion, but
> that's about as far as I was planning to go myself - AFAICS all your
> MSI geometry and VFIO bits remain valid, I just looked at the
> msi_cookie stuff and found it really didn't tie in with DMA ops at all
> well.

>  
>>
>> For passthrough we also have to care about the safety issue, the
>> window size computation. Please can we collaborate to converge on a
>> unified solution?
> 
> I remain adamant that the safety thing is a property of the irqchip and
> the irqchip alone (I've also come to realise that iommu_capable() is
> fundamentally unworkable altogether, but that's another story). Thus
> as I see it, getting the low-level remapping aspect out of the way in a
> common manner leaves the rest of the guest MSI problem firmly between
> VFIO and the MSI layer, now that we've got a much clearer view of
> it thanks to your efforts. What do you think?

Well I tried to compare our approaches:

1) I put the MSI mapping code in msi-iommu as a layer on top of
dma-mapping whereas you put it directly in dma-mapping
2) you removed the size parameter management (Marc's guidance)
3) you removed the iommu mapping list iterator since you don't need it
4) you simplified the lock mechanism using atomic flag and considering
iommu_map cannot sleep
5) you removed ref counting (as I could do too) since the removal can be
taken in charge at iommu domain destruction
6) you do the allocation at compose time (still considering iommu_map
cannot sleep) whereas I do it at MSI enable time, with the drawback of
poor error handling and late notice; This definitively simplifies the
overall solution.

I think your approach kills my doorbell registration approach, since you
removed most of the doorbell attributes (size, mem protection
attributes, per-cpu vs global phys addrs space). So globally I think
most of my part II MSI layer series becomes irrelevant (iommu mapping &
iova retrieval part definitively). Keeping the registration API for just
allowing MSI controller enumeration & safety coarse assessment looks
overkill.

So we restart from scratch with respect to the enumeration and safety
coarse assessment for passthrough use case.

Another altenative could be to do
a) use you dma-iommu additions +  your
http://linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=991effe0712750ce24f6a0b2e2e3f8f57d4a9910
b) add iommu map iterator in dma-iommu
c) do the allocation at MSI enable time and retrieval at compose time,
based on my registration API. which would improve the error handling.

Whatever the fate of my series and besides the first comment above, you
can add my R-b since I reviewed your code ;-)

Reviewed-by: Eric Auger <eric.auger-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Thanks

Eric

> 
> After all, right now... well, y'know ;)
> 
> Robin.
> 
> [1]:http://linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=991effe0712750ce24f6a0b2e2e3f8f57d4a9910
> 
>>
>> Best Regards
>>
>> Eric
>>
>>> +
>>>  #else
>>>  
>>>  struct iommu_domain;
>>> +struct msi_msg;
>>>  
>>>  static inline int iommu_dma_init(void)
>>>  {
>>> @@ -80,6 +85,10 @@ static inline void iommu_put_dma_cookie(struct
>>> iommu_domain *domain) {
>>>  }
>>>  
>>> +static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg
>>> *msg) +{
>>> +}
>>> +
>>>  #endif	/* CONFIG_IOMMU_DMA */
>>>  #endif	/* __KERNEL__ */
>>>  #endif	/* __DMA_IOMMU_H */
>>>   
>>
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 18/19] iommu/arm-smmu: Set domain geometry
       [not found]     ` <d6cedec16fe96a081ea2f9f27378dd1a6f406c72.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-08-31 21:00       ` Auger Eric
  0 siblings, 0 replies; 61+ messages in thread
From: Auger Eric @ 2016-08-31 21:00 UTC (permalink / raw)
  To: Robin Murphy, joro-zLv9SwRftAIdnm+yROfE0A,
	will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

Hi,

On 23/08/2016 21:05, Robin Murphy wrote:
> For non-aperture-based IOMMUs, the domain geometry seems to have become
> the de-facto way of indicating the input address space size. That is
> quite a useful thing from the users' perspective, so let's do the same.
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/arm-smmu-v3.c | 2 ++
>  drivers/iommu/arm-smmu.c    | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 72b996aa7460..9c56bd194dc2 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -1574,6 +1574,8 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
>  		return -ENOMEM;
>  
>  	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> +	domain->geometry.aperture_end = (1UL << ias) - 1;
> +	domain->geometry.force_aperture = true;
>  	smmu_domain->pgtbl_ops = pgtbl_ops;
>  
>  	ret = finalise_stage_fn(smmu_domain, &pgtbl_cfg);
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 85bc74d8fca0..112918d787eb 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -913,6 +913,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>  
>  	/* Update the domain's page sizes to reflect the page table format */
>  	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> +	domain->geometry.aperture_end = (1UL << ias) - 1;
> +	domain->geometry.force_aperture = true;
>  
>  	/* Initialise the context bank with our page table cfg */
>  	arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
> 
Reviewed-by: Eric Auger <eric.auger-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>


Eric
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 00/19] Generic DT bindings for PCI IOMMUs and ARM SMMU
  2016-08-23 19:15     ` Robin Murphy
@ 2016-09-01  3:49         ` Anup Patel
  -1 siblings, 0 replies; 61+ messages in thread
From: Anup Patel @ 2016-09-01  3:49 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Linux ARM Kernel, Device Tree, Punit Agrawal, Will Deacon,
	Linux IOMMU, thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

Hi Robin,

What are the chances of having this series in Linux-4.9?

Regards,
Anup
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v5 00/19] Generic DT bindings for PCI IOMMUs and ARM SMMU
@ 2016-09-01  3:49         ` Anup Patel
  0 siblings, 0 replies; 61+ messages in thread
From: Anup Patel @ 2016-09-01  3:49 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Robin,

What are the chances of having this series in Linux-4.9?

Regards,
Anup

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 00/19] Generic DT bindings for PCI IOMMUs and ARM SMMU
  2016-09-01  3:49         ` Anup Patel
@ 2016-09-01 10:10             ` Robin Murphy
  -1 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-09-01 10:10 UTC (permalink / raw)
  To: Anup Patel
  Cc: Device Tree, Punit Agrawal, Will Deacon, Linux IOMMU,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA, Linux ARM Kernel

On 01/09/16 04:49, Anup Patel wrote:
> Hi Robin,
> 
> What are the chances of having this series in Linux-4.9?

Well, I'm planning to do everything I can over the next few weeks to
make that happen, but ultimately it's down to maintainers, not me ;)

I'll certainly be posting a v6 next week to address the feedback so far,
but the more people who can find time to give it a proper review
(particularly, say, in terms of systems with funky PCI topology) the better!

Robin.

> 
> Regards,
> Anup
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v5 00/19] Generic DT bindings for PCI IOMMUs and ARM SMMU
@ 2016-09-01 10:10             ` Robin Murphy
  0 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-09-01 10:10 UTC (permalink / raw)
  To: linux-arm-kernel

On 01/09/16 04:49, Anup Patel wrote:
> Hi Robin,
> 
> What are the chances of having this series in Linux-4.9?

Well, I'm planning to do everything I can over the next few weeks to
make that happen, but ultimately it's down to maintainers, not me ;)

I'll certainly be posting a v6 next week to address the feedback so far,
but the more people who can find time to give it a proper review
(particularly, say, in terms of systems with funky PCI topology) the better!

Robin.

> 
> Regards,
> Anup
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 04/19] iommu/of: Introduce iommu_fwspec
       [not found]         ` <20160831172856.GI29505-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 12:07           ` Robin Murphy
       [not found]             ` <900f3dcb-217c-4fb3-2f7d-15572f31a0c0-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-09-01 12:07 UTC (permalink / raw)
  To: Will Deacon
  Cc: joro-zLv9SwRftAIdnm+yROfE0A,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	devicetree-u79uwXL29TY76Z2rM5mHXA, lorenzo.pieralisi-5wv7dgnIgG8,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

On 31/08/16 18:28, Will Deacon wrote:
> On Tue, Aug 23, 2016 at 08:05:15PM +0100, Robin Murphy wrote:
>> Introduce a common structure to hold the per-device firmware data that
>> non-architectural IOMMU drivers generally need to keep track of.
>> Initially this is DT-specific to complement the existing of_iommu
>> support code, but will generalise further once other firmware methods
>> (e.g. ACPI IORT) come along.
>>
>> Ultimately the aim is to promote the fwspec to a first-class member of
>> struct device, and handle the init/free automatically in the firmware
>> code. That way we can have API calls look for dev->fwspec->iommu_ops
>> before falling back to dev->bus->iommu_ops, and thus gracefully handle
>> those troublesome multi-IOMMU systems which we currently cannot. To
>> start with, though, make use of the existing archdata field and delegate
>> the init/free to drivers to allow an incremental conversion rather than
>> the impractical pain of trying to attempt everything in one go.
>>
>> Suggested-by: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
>> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
>> ---
>>
>> v5: Fix shocking num_ids oversight.
>>
>>  drivers/iommu/of_iommu.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  include/linux/of_iommu.h | 15 ++++++++++++++
>>  2 files changed, 67 insertions(+)
>>
>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>> index 1a65cc806898..bec51eb47b0d 100644
>> --- a/drivers/iommu/of_iommu.c
>> +++ b/drivers/iommu/of_iommu.c
>> @@ -219,3 +219,55 @@ static int __init of_iommu_init(void)
>>  	return 0;
>>  }
>>  postcore_initcall_sync(of_iommu_init);
>> +
>> +int iommu_fwspec_init(struct device *dev, struct device_node *iommu_np)
>> +{
>> +	struct iommu_fwspec *fwspec = dev->archdata.iommu;
>> +
>> +	if (fwspec)
>> +		return 0;
>> +
>> +	fwspec = kzalloc(sizeof(*fwspec), GFP_KERNEL);
>> +	if (!fwspec)
>> +		return -ENOMEM;
>> +
>> +	fwspec->iommu_np = of_node_get(iommu_np);
>> +	fwspec->iommu_ops = of_iommu_get_ops(iommu_np);
>> +	dev->archdata.iommu = fwspec;
>> +	return 0;
>> +}
>> +
>> +void iommu_fwspec_free(struct device *dev)
>> +{
>> +	struct iommu_fwspec *fwspec = dev->archdata.iommu;
>> +
>> +	if (fwspec) {
>> +		of_node_put(fwspec->iommu_np);
>> +		kfree(fwspec);
>> +	}
>> +}
>> +
>> +int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids)
>> +{
>> +	struct iommu_fwspec *fwspec = dev->archdata.iommu;
>> +	size_t size;
>> +
>> +	if (!fwspec)
>> +		return -EINVAL;
>> +
>> +	size = offsetof(struct iommu_fwspec, ids[fwspec->num_ids + num_ids]);
>> +	fwspec = krealloc(dev->archdata.iommu, size, GFP_KERNEL);
>> +	if (!fwspec)
>> +		return -ENOMEM;
>> +
>> +	while (num_ids--)
>> +		fwspec->ids[fwspec->num_ids++] = *ids++;
>> +
>> +	dev->archdata.iommu = fwspec;
> 
> It might just be me, but I find this really fiddly to read. The fact
> that you realloc the whole fwspec, rather than just the array isn't
> helping, but I also think that while loop would be much better off as
> a for loop, using the index as, well, an index into the ids array and
> fwspec->ids array.

Sure - copying one array into the tail end of another is always going to
be boring, ugly code, which I feel compelled to make as compact as
possible so as not to distract from the more interesting code, but I
guess that's self-defeating if it then no longer looks like something
simple and boring to skip over. I'll expend a few more precious lines on
turning it back into something staid and sensible ;)

My argument for embedding the IDs directly in the fwspec is that for
most devices there's only likely to be a single one anyway, and other
than a brief period up until add_device() they're then going to be fixed
for the lifetime of the device, so saving a little memory fragmentation
and a level of indirection on all subsequent uses is certainly not going
to be detrimental (plus it slightly simplifies the cleanup/free cases
here as well).

Robin.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 04/19] iommu/of: Introduce iommu_fwspec
       [not found]             ` <900f3dcb-217c-4fb3-2f7d-15572f31a0c0-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 12:31               ` Will Deacon
       [not found]                 ` <20160901123158.GE6721-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 61+ messages in thread
From: Will Deacon @ 2016-09-01 12:31 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On Thu, Sep 01, 2016 at 01:07:19PM +0100, Robin Murphy wrote:
> On 31/08/16 18:28, Will Deacon wrote:
> > On Tue, Aug 23, 2016 at 08:05:15PM +0100, Robin Murphy wrote:
> >> +int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids)
> >> +{
> >> +	struct iommu_fwspec *fwspec = dev->archdata.iommu;
> >> +	size_t size;
> >> +
> >> +	if (!fwspec)
> >> +		return -EINVAL;
> >> +
> >> +	size = offsetof(struct iommu_fwspec, ids[fwspec->num_ids + num_ids]);
> >> +	fwspec = krealloc(dev->archdata.iommu, size, GFP_KERNEL);
> >> +	if (!fwspec)
> >> +		return -ENOMEM;
> >> +
> >> +	while (num_ids--)
> >> +		fwspec->ids[fwspec->num_ids++] = *ids++;
> >> +
> >> +	dev->archdata.iommu = fwspec;
> > 
> > It might just be me, but I find this really fiddly to read. The fact
> > that you realloc the whole fwspec, rather than just the array isn't
> > helping, but I also think that while loop would be much better off as
> > a for loop, using the index as, well, an index into the ids array and
> > fwspec->ids array.
> 
> Sure - copying one array into the tail end of another is always going to
> be boring, ugly code, which I feel compelled to make as compact as
> possible so as not to distract from the more interesting code, but I
> guess that's self-defeating if it then no longer looks like something
> simple and boring to skip over. I'll expend a few more precious lines on
> turning it back into something staid and sensible ;)

Why do you need to make the copy explicit? If you ensure that the array
is NULL in a freshly initialised fwspec, then you can either kmalloc
it when adding the IDs, or krealloc it if you need to append to an
array that's already been initialised. It's pretty much the same as you
have already, just operating on the array as opposed to the containing
structure.

Will

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 04/19] iommu/of: Introduce iommu_fwspec
       [not found]                 ` <20160901123158.GE6721-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 13:25                   ` Robin Murphy
  0 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-09-01 13:25 UTC (permalink / raw)
  To: Will Deacon
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On 01/09/16 13:31, Will Deacon wrote:
> On Thu, Sep 01, 2016 at 01:07:19PM +0100, Robin Murphy wrote:
>> On 31/08/16 18:28, Will Deacon wrote:
>>> On Tue, Aug 23, 2016 at 08:05:15PM +0100, Robin Murphy wrote:
>>>> +int iommu_fwspec_add_ids(struct device *dev, u32 *ids, int num_ids)
>>>> +{
>>>> +	struct iommu_fwspec *fwspec = dev->archdata.iommu;
>>>> +	size_t size;
>>>> +
>>>> +	if (!fwspec)
>>>> +		return -EINVAL;
>>>> +
>>>> +	size = offsetof(struct iommu_fwspec, ids[fwspec->num_ids + num_ids]);
>>>> +	fwspec = krealloc(dev->archdata.iommu, size, GFP_KERNEL);
>>>> +	if (!fwspec)
>>>> +		return -ENOMEM;
>>>> +
>>>> +	while (num_ids--)
>>>> +		fwspec->ids[fwspec->num_ids++] = *ids++;
>>>> +
>>>> +	dev->archdata.iommu = fwspec;
>>>
>>> It might just be me, but I find this really fiddly to read. The fact
>>> that you realloc the whole fwspec, rather than just the array isn't
>>> helping, but I also think that while loop would be much better off as
>>> a for loop, using the index as, well, an index into the ids array and
>>> fwspec->ids array.
>>
>> Sure - copying one array into the tail end of another is always going to
>> be boring, ugly code, which I feel compelled to make as compact as
>> possible so as not to distract from the more interesting code, but I
>> guess that's self-defeating if it then no longer looks like something
>> simple and boring to skip over. I'll expend a few more precious lines on
>> turning it back into something staid and sensible ;)
> 
> Why do you need to make the copy explicit?

Er, because we're filling the *new* ID(s) from the caller-provided
pointer into the now-enlarged array - I don't see how one could do that
implicitly. Tell you what, I'll also rename the variables to be less
confusing while I'm cleaning up the loop.

Robin.

> If you ensure that the array
> is NULL in a freshly initialised fwspec, then you can either kmalloc
> it when adding the IDs, or krealloc it if you need to append to an
> array that's already been initialised. It's pretty much the same as you
> have already, just operating on the array as opposed to the containing
> structure.
> 
> Will
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 14/19] iommu/arm-smmu: Intelligent SMR allocation
       [not found]     ` <693b7fdd58be254297eb43ac8f5e035beb5226b2.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 15:17       ` Lorenzo Pieralisi
  2016-09-01 17:59         ` Robin Murphy
  0 siblings, 1 reply; 61+ messages in thread
From: Lorenzo Pieralisi @ 2016-09-01 15:17 UTC (permalink / raw)
  To: Robin Murphy
  Cc: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

On Tue, Aug 23, 2016 at 08:05:25PM +0100, Robin Murphy wrote:
> Stream Match Registers are one of the more awkward parts of the SMMUv2
> architecture; there are typically never enough to assign one to each
> stream ID in the system, and configuring them such that a single ID
> matches multiple entries is catastrophically bad - at best, every
> transaction raises a global fault; at worst, they go *somewhere*.
> 
> To address the former issue, we can mask ID bits such that a single
> register may be used to match multiple IDs belonging to the same device
> or group, but doing so also heightens the risk of the latter problem
> (which can be nasty to debug).
> 
> Tackle both problems at once by replacing the simple bitmap allocator
> with something much cleverer. Now that we have convenient in-memory
> representations of the stream mapping table, it becomes straightforward
> to properly validate new SMR entries against the current state, opening
> the door to arbitrary masking and SMR sharing.
> 
> Another feature which falls out of this is that with IDs shared by
> separate devices being automatically accounted for, simply associating a
> group pointer with the S2CR offers appropriate group allocation almost
> for free, so hook that up in the process.
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/arm-smmu.c | 192 ++++++++++++++++++++++++++++-------------------
>  1 file changed, 114 insertions(+), 78 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 17bf871030c6..88f82eb8d1fe 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -302,6 +302,8 @@ enum arm_smmu_implementation {
>  };
>  
>  struct arm_smmu_s2cr {
> +	struct iommu_group		*group;
> +	int				count;
>  	enum arm_smmu_s2cr_type		type;
>  	enum arm_smmu_s2cr_privcfg	privcfg;
>  	u8				cbndx;
> @@ -363,6 +365,7 @@ struct arm_smmu_device {
>  	u16				smr_mask_mask;
>  	struct arm_smmu_smr		*smrs;
>  	struct arm_smmu_s2cr		*s2crs;
> +	struct mutex			stream_map_mutex;
>  
>  	unsigned long			va_size;
>  	unsigned long			ipa_size;
> @@ -1016,23 +1019,6 @@ static void arm_smmu_domain_free(struct iommu_domain *domain)
>  	kfree(smmu_domain);
>  }
>  
> -static int arm_smmu_alloc_smr(struct arm_smmu_device *smmu)
> -{
> -	int i;
> -
> -	for (i = 0; i < smmu->num_mapping_groups; i++)
> -		if (!cmpxchg(&smmu->smrs[i].valid, false, true))
> -			return i;
> -
> -	return INVALID_SMENDX;
> -}
> -
> -static void arm_smmu_free_smr(struct arm_smmu_device *smmu, int idx)
> -{
> -	writel_relaxed(~SMR_VALID, ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_SMR(idx));
> -	WRITE_ONCE(smmu->smrs[idx].valid, false);
> -}
> -
>  static void arm_smmu_write_smr(struct arm_smmu_device *smmu, int idx)
>  {
>  	struct arm_smmu_smr *smr = smmu->smrs + idx;
> @@ -1060,49 +1046,110 @@ static void arm_smmu_write_sme(struct arm_smmu_device *smmu, int idx)
>  		arm_smmu_write_smr(smmu, idx);
>  }
>  
> -static int arm_smmu_master_alloc_smes(struct arm_smmu_device *smmu,
> -				      struct arm_smmu_master_cfg *cfg)
> +static int arm_smmu_find_sme(struct arm_smmu_device *smmu, u16 id, u16 mask)
>  {
>  	struct arm_smmu_smr *smrs = smmu->smrs;
> -	int i, idx;
> +	int i, idx = -ENOSPC;
>  
> -	/* Allocate the SMRs on the SMMU */
> -	for_each_cfg_sme(cfg, i, idx) {
> -		if (idx >= 0)
> -			return -EEXIST;
> +	/* Stream indexing is blissfully easy */
> +	if (!smrs)
> +		return id;
>  
> -		/* ...except on stream indexing hardware, of course */
> -		if (!smrs) {
> -			cfg->smendx[i] = cfg->streamids[i];
> +	/* Validating SMRs is... less so */
> +	for (i = 0; i < smmu->num_mapping_groups; ++i) {
> +		if (!smrs[i].valid) {
> +			if (idx < 0)
> +				idx = i;

This is to stash an "empty" entry index in case none matches, right ?
Let me say it is not that obvious, that deserves a comment since it
is hard to follow.

>  			continue;
>  		}
>  
> -		idx = arm_smmu_alloc_smr(smmu);
> -		if (idx < 0) {
> -			dev_err(smmu->dev, "failed to allocate free SMR\n");
> -			goto err_free_smrs;
> -		}
> -		cfg->smendx[i] = idx;
> +		/* Exact matches are good */
> +		if (mask == smrs[i].mask && id == smrs[i].id)
> +			return i;
>  
> -		smrs[idx].id = cfg->streamids[i];
> -		smrs[idx].mask = 0; /* We don't currently share SMRs */
> +		/* New unmasked IDs matching existing masks we can cope with */
> +		if (!mask && !((smrs[i].id ^ id) & ~smrs[i].mask))
> +			return i;
> +
> +		/* Overlapping masks are right out */
> +		if (mask & smrs[i].mask)
> +			return -EINVAL;
> +
> +		/* Distinct masks must match unambiguous ranges */
> +		if (mask && !((smrs[i].id ^ id) & ~(smrs[i].mask | mask)))
> +			return -EINVAL;

And this is to _prevent_ an entry to match the input streamid
range (masked streamid) because it would create an ambiguous match ?

Basically here you keep looping because this function is at the
same time allocating and validating the SMRS (ie you want to make
sure there are no valid entries that clash with the streamid you
are currently allocating).

It is a tad complicated and a bit hard to parse. I wonder if it would
not be better to split into two SMRS validation and allocation
functions, I think I fully grasped what you want to achieve but it is
not that trivial.

Thanks,
Lorenzo

>  	}
>  
> -	if (!smrs)
> -		return 0;
> +	return idx;
> +}
> +
> +static bool arm_smmu_free_sme(struct arm_smmu_device *smmu, int idx)
> +{
> +	if (--smmu->s2crs[idx].count)
> +		return false;
> +
> +	smmu->s2crs[idx] = s2cr_init_val;
> +	if (smmu->smrs)
> +		smmu->smrs[idx].valid = false;
> +
> +	return true;
> +}
> +
> +static int arm_smmu_master_alloc_smes(struct device *dev)
> +{
> +	struct arm_smmu_master_cfg *cfg = dev->archdata.iommu;
> +	struct arm_smmu_device *smmu = cfg->smmu;
> +	struct arm_smmu_smr *smrs = smmu->smrs;
> +	struct iommu_group *group;
> +	int i, idx, ret;
> +
> +	mutex_lock(&smmu->stream_map_mutex);
> +	/* Figure out a viable stream map entry allocation */
> +	for_each_cfg_sme(cfg, i, idx) {
> +		if (idx >= 0) {
> +			ret = -EEXIST;
> +			goto out_err;
> +		}
> +
> +		ret = arm_smmu_find_sme(smmu, cfg->streamids[i], 0);
> +		if (ret < 0)
> +			goto out_err;
> +
> +		idx = ret;
> +		if (smrs && smmu->s2crs[idx].count == 0) {
> +			smrs[idx].id = cfg->streamids[i];
> +			smrs[idx].mask = 0; /* We don't currently share SMRs */
> +			smrs[idx].valid = true;
> +		}
> +		smmu->s2crs[idx].count++;
> +		cfg->smendx[i] = (s16)idx;
> +	}
> +
> +	group = iommu_group_get_for_dev(dev);
> +	if (!group)
> +		group = ERR_PTR(-ENOMEM);
> +	if (IS_ERR(group)) {
> +		ret = PTR_ERR(group);
> +		goto out_err;
> +	}
> +	iommu_group_put(group);
>  
>  	/* It worked! Now, poke the actual hardware */
> -	for_each_cfg_sme(cfg, i, idx)
> -		arm_smmu_write_smr(smmu, idx);
> +	for_each_cfg_sme(cfg, i, idx) {
> +		arm_smmu_write_sme(smmu, idx);
> +		smmu->s2crs[idx].group = group;
> +	}
>  
> +	mutex_unlock(&smmu->stream_map_mutex);
>  	return 0;
>  
> -err_free_smrs:
> +out_err:
>  	while (i--) {
> -		arm_smmu_free_smr(smmu, cfg->smendx[i]);
> +		arm_smmu_free_sme(smmu, cfg->smendx[i]);
>  		cfg->smendx[i] = INVALID_SMENDX;
>  	}
> -	return -ENOSPC;
> +	mutex_unlock(&smmu->stream_map_mutex);
> +	return ret;
>  }
>  
>  static void arm_smmu_master_free_smes(struct arm_smmu_master_cfg *cfg)
> @@ -1110,43 +1157,23 @@ static void arm_smmu_master_free_smes(struct arm_smmu_master_cfg *cfg)
>  	struct arm_smmu_device *smmu = cfg->smmu;
>  	int i, idx;
>  
> -	/*
> -	 * We *must* clear the S2CR first, because freeing the SMR means
> -	 * that it can be re-allocated immediately.
> -	 */
> +	mutex_lock(&smmu->stream_map_mutex);
>  	for_each_cfg_sme(cfg, i, idx) {
> -		/* An IOMMU group is torn down by the first device to be removed */
> -		if (idx < 0)
> -			return;
> -
> -		smmu->s2crs[idx] = s2cr_init_val;
> -		arm_smmu_write_s2cr(smmu, idx);
> -	}
> -	/* Sync S2CR updates before touching anything else */
> -	__iowmb();
> -
> -	/* Invalidate the SMRs before freeing back to the allocator */
> -	for_each_cfg_sme(cfg, i, idx) {
> -		if (smmu->smrs)
> -			arm_smmu_free_smr(smmu, idx);
> -
> +		if (arm_smmu_free_sme(smmu, idx))
> +			arm_smmu_write_sme(smmu, idx);
>  		cfg->smendx[i] = INVALID_SMENDX;
>  	}
> +	mutex_unlock(&smmu->stream_map_mutex);
>  }
>  
>  static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
>  				      struct arm_smmu_master_cfg *cfg)
>  {
> -	int i, idx, ret = 0;
>  	struct arm_smmu_device *smmu = smmu_domain->smmu;
>  	struct arm_smmu_s2cr *s2cr = smmu->s2crs;
>  	enum arm_smmu_s2cr_type type = S2CR_TYPE_TRANS;
>  	u8 cbndx = smmu_domain->cfg.cbndx;
> -
> -	if (cfg->smendx[0] < 0)
> -		ret = arm_smmu_master_alloc_smes(smmu, cfg);
> -	if (ret)
> -		return ret;
> +	int i, idx;
>  
>  	/*
>  	 * FIXME: This won't be needed once we have IOMMU-backed DMA ops
> @@ -1158,9 +1185,8 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
>  		type = S2CR_TYPE_BYPASS;
>  
>  	for_each_cfg_sme(cfg, i, idx) {
> -		/* Devices in an IOMMU group may already be configured */
>  		if (type == s2cr[idx].type && cbndx == s2cr[idx].cbndx)
> -			break;
> +			continue;
>  
>  		s2cr[idx].type = type;
>  		s2cr[idx].privcfg = S2CR_PRIVCFG_UNPRIV;
> @@ -1320,7 +1346,6 @@ static bool arm_smmu_capable(enum iommu_cap cap)
>  static int arm_smmu_add_device(struct device *dev)
>  {
>  	struct arm_smmu_master_cfg *cfg;
> -	struct iommu_group *group;
>  	int i, ret;
>  
>  	ret = arm_smmu_register_legacy_master(dev);
> @@ -1340,13 +1365,9 @@ static int arm_smmu_add_device(struct device *dev)
>  		cfg->smendx[i] = INVALID_SMENDX;
>  	}
>  
> -	group = iommu_group_get_for_dev(dev);
> -	if (IS_ERR(group)) {
> -		ret = PTR_ERR(group);
> -		goto out_free;
> -	}
> -	iommu_group_put(group);
> -	return 0;
> +	ret = arm_smmu_master_alloc_smes(dev);
> +	if (!ret)
> +		return ret;
>  
>  out_free:
>  	kfree(cfg);
> @@ -1369,7 +1390,21 @@ static void arm_smmu_remove_device(struct device *dev)
>  
>  static struct iommu_group *arm_smmu_device_group(struct device *dev)
>  {
> -	struct iommu_group *group;
> +	struct arm_smmu_master_cfg *cfg = dev->archdata.iommu;
> +	struct arm_smmu_device *smmu = cfg->smmu;
> +	struct iommu_group *group = NULL;
> +	int i, idx;
> +
> +	for_each_cfg_sme(cfg, i, idx) {
> +		if (group && smmu->s2crs[idx].group &&
> +		    group != smmu->s2crs[idx].group)
> +			return ERR_PTR(-EINVAL);
> +
> +		group = smmu->s2crs[idx].group;
> +	}
> +
> +	if (group)
> +		return group;
>  
>  	if (dev_is_pci(dev))
>  		group = pci_device_group(dev);
> @@ -1652,6 +1687,7 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
>  		smmu->s2crs[i] = s2cr_init_val;
>  
>  	smmu->num_mapping_groups = size;
> +	mutex_init(&smmu->stream_map_mutex);
>  
>  	if (smmu->version < ARM_SMMU_V2 || !(id & ID0_PTFS_NO_AARCH32)) {
>  		smmu->features |= ARM_SMMU_FEAT_FMT_AARCH32_L;
> -- 
> 2.8.1.dirty
> 
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 00/19] Generic DT bindings for PCI IOMMUs and ARM SMMU
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (18 preceding siblings ...)
  2016-08-23 19:15     ` Robin Murphy
@ 2016-09-01 15:22   ` Lorenzo Pieralisi
  2016-09-01 19:05   ` Will Deacon
  20 siblings, 0 replies; 61+ messages in thread
From: Lorenzo Pieralisi @ 2016-09-01 15:22 UTC (permalink / raw)
  To: Robin Murphy
  Cc: joro-zLv9SwRftAIdnm+yROfE0A, will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	jean-philippe.brucker-5wv7dgnIgG8, punit.agrawal-5wv7dgnIgG8,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA,
	eric.auger-H+wXaHxf7aLQT0dZR+AlfA

On Tue, Aug 23, 2016 at 08:05:11PM +0100, Robin Murphy wrote:
> Hi all,
> 
> At long last I've finished the big SMMUv2 rework, so here's everything
> all together for a v5. As a quick breakdown:
> 
> Patches 1-3 are the core PCI part, all acked and ready to go. No code
> changes from v4.
> 
> Patch 4 is merely bugfixed from v4 for simplicity, as I've not yet
> managed to take as close a look at Lorenzo's follow-on work as I'd like.
> 
> Patches 5-7 (SMMUv3) are mostly unchanged beyond a slight tweak to #5.
> 
> Patches 8-17 are the all-new SMMUv2 rework.
> 
> Patch 18 goes along with the fix already in 4.8-rc3 to help avoid 64-bit
> DMA masks going wrong now that DMA ops will be enabled.
> 
> Finally, patch 19 addresses the previous problem of having to choose
> between DMA ops or working MSIs. This is currently at the end as
> moving it before #17 would require a further interim SMMUv2 patch, and
> a 19-patch series is already quite enough...
> 
> I've pushed out a branch based on iommu/next to the usual place:
> 
> git://linux-arm.org/linux-rm iommu/generic-v5

I tested my ACPI IORT SMMU series on top of it on FVP and Juno (and
respective SMMU versions) and have not spotted any issues so far, so
feel free to add my:

Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi-5wv7dgnIgG8@public.gmane.org>

> Thanks,
> Robin.
> ---
> 
> Mark Rutland (1):
>   Docs: dt: add PCI IOMMU map bindings
> 
> Robin Murphy (18):
>   of/irq: Break out msi-map lookup (again)
>   iommu/of: Handle iommu-map property for PCI
>   iommu/of: Introduce iommu_fwspec
>   iommu/arm-smmu: Implement of_xlate() for SMMUv3
>   iommu/arm-smmu: Support non-PCI devices with SMMUv3
>   iommu/arm-smmu: Set PRIVCFG in stage 1 STEs
>   iommu/arm-smmu: Handle stream IDs more dynamically
>   iommu/arm-smmu: Consolidate stream map entry state
>   iommu/arm-smmu: Keep track of S2CR state
>   iommu/arm-smmu: Refactor mmu-masters handling
>   iommu/arm-smmu: Streamline SMMU data lookups
>   iommu/arm-smmu: Add a stream map entry iterator
>   iommu/arm-smmu: Intelligent SMR allocation
>   iommu/arm-smmu: Convert to iommu_fwspec
>   Docs: dt: document ARM SMMU generic binding usage
>   iommu/arm-smmu: Wire up generic configuration support
>   iommu/arm-smmu: Set domain geometry
>   iommu/dma: Add support for mapping MSIs
> 
>  .../devicetree/bindings/iommu/arm,smmu.txt         |  63 +-
>  .../devicetree/bindings/pci/pci-iommu.txt          | 171 ++++
>  drivers/iommu/Kconfig                              |   2 +-
>  drivers/iommu/arm-smmu-v3.c                        | 347 ++++----
>  drivers/iommu/arm-smmu.c                           | 952 ++++++++++-----------
>  drivers/iommu/dma-iommu.c                          | 141 ++-
>  drivers/iommu/of_iommu.c                           |  95 +-
>  drivers/irqchip/irq-gic-v2m.c                      |   3 +
>  drivers/irqchip/irq-gic-v3-its.c                   |   3 +
>  drivers/of/irq.c                                   |  78 +-
>  drivers/of/of_pci.c                                | 102 +++
>  include/linux/dma-iommu.h                          |   9 +
>  include/linux/of_iommu.h                           |  15 +
>  include/linux/of_pci.h                             |  10 +
>  14 files changed, 1208 insertions(+), 783 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/pci/pci-iommu.txt
> 
> -- 
> 2.8.1.dirty
> 
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 05/19] iommu/arm-smmu: Implement of_xlate() for SMMUv3
       [not found]     ` <6088007f60a24b36a3bf965b62521f99cd908019.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 17:06       ` Will Deacon
       [not found]         ` <20160901170604.GP6721-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 61+ messages in thread
From: Will Deacon @ 2016-09-01 17:06 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

Hi Robin,

On Tue, Aug 23, 2016 at 08:05:16PM +0100, Robin Murphy wrote:
> Now that we can properly describe the mapping between PCI RIDs and
> stream IDs via "iommu-map", and have it fed it to the driver
> automatically via of_xlate(), rework the SMMUv3 driver to benefit from
> that, and get rid of the current misuse of the "iommus" binding.
> 
> Since having of_xlate wired up means that masters will now be given the
> appropriate DMA ops, we also need to make sure that default domains work
> properly. This necessitates dispensing with the "whole group at a time"
> notion for attaching to a domain, as devices which share a group get
> attached to the group's default domain one by one as they are initially
> probed.
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
> 
> v5: Simplify init code, use firmware-agnostic (and more efficient)
>     driver-based instance lookup

I'm largely happy with this, just one question below...

> @@ -2649,7 +2602,14 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>  	platform_set_drvdata(pdev, smmu);
>  
>  	/* Reset the device */
> -	return arm_smmu_device_reset(smmu);
> +	ret = arm_smmu_device_reset(smmu);
> +	if (ret)
> +		return ret;

... if we fail the probe at this point, the drvdata remains set. Do you
need to clear it, or we can we guarantee that nobody is going to try
arm_smmu_get_by_node with the (failed) SMMU's device node?

Alternatively, we could postpone setting the drvdata until the very end
of probe.

Will

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 06/19] iommu/arm-smmu: Support non-PCI devices with SMMUv3
       [not found]     ` <207d0ae38c5b01b7cf7e48231a4d01bac453b57c.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 17:08       ` Will Deacon
  0 siblings, 0 replies; 61+ messages in thread
From: Will Deacon @ 2016-09-01 17:08 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On Tue, Aug 23, 2016 at 08:05:17PM +0100, Robin Murphy wrote:
> With the device <-> stream ID relationship suitably abstracted and
> of_xlate() hooked up, the PCI dependency now looks, and is, entirely
> arbitrary. Any bus using the of_dma_configure() mechanism will work,
> so extend support to the platform and AMBA buses which do just that.
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/Kconfig       |  2 +-
>  drivers/iommu/arm-smmu-v3.c | 40 ++++++++++++++++++++++++++++++++--------
>  2 files changed, 33 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index d432ca828472..8ee54d71c7eb 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -309,7 +309,7 @@ config ARM_SMMU
>  
>  config ARM_SMMU_V3
>  	bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
> -	depends on ARM64 && PCI
> +	depends on ARM64
>  	select IOMMU_API
>  	select IOMMU_IO_PGTABLE_LPAE
>  	select GENERIC_MSI_IRQ_DOMAIN
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 094babff64a6..e0384f7afb03 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -35,6 +35,8 @@
>  #include <linux/pci.h>
>  #include <linux/platform_device.h>
>  
> +#include <linux/amba/bus.h>
> +
>  #include "io-pgtable.h"
>  
>  /* MMIO registers */
> @@ -1830,6 +1832,23 @@ static void arm_smmu_remove_device(struct device *dev)
>  	iommu_fwspec_free(dev);
>  }
>  
> +static struct iommu_group *arm_smmu_device_group(struct device *dev)
> +{
> +	struct iommu_group *group;
> +
> +	/*
> +	 * We don't support devices sharing stream IDs other than PCI RID
> +	 * aliases, since the necessary ID-to-device lookup becomes rather
> +	 * impractical given a potential sparse 32-bit stream ID space.
> +	 */
> +	if (dev_is_pci(dev))
> +		group = pci_device_group(dev);
> +	else
> +		group = generic_device_group(dev);
> +
> +	return group;
> +}

It's a pity that this ends up in the driver(s), but I can live with it
for now.

Acked-by: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>

Will

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 07/19] iommu/arm-smmu: Set PRIVCFG in stage 1 STEs
       [not found]     ` <1cda9861ce3ede6c2de9c6c4f2294549808b421b.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 17:19       ` Will Deacon
  0 siblings, 0 replies; 61+ messages in thread
From: Will Deacon @ 2016-09-01 17:19 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On Tue, Aug 23, 2016 at 08:05:18PM +0100, Robin Murphy wrote:
> Implement the SMMUv3 equivalent of d346180e70b9 ("iommu/arm-smmu: Treat
> all device transactions as unprivileged"), so that once again those
> pesky DMA controllers with their privileged instruction fetches don't
> unexpectedly fault in stage 1 domains due to VMSAv8 rules.
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/arm-smmu-v3.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)

Acked-by: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>

Will

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 05/19] iommu/arm-smmu: Implement of_xlate() for SMMUv3
       [not found]         ` <20160901170604.GP6721-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 17:40           ` Robin Murphy
  0 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-09-01 17:40 UTC (permalink / raw)
  To: Will Deacon
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On 01/09/16 18:06, Will Deacon wrote:
> Hi Robin,
> 
> On Tue, Aug 23, 2016 at 08:05:16PM +0100, Robin Murphy wrote:
>> Now that we can properly describe the mapping between PCI RIDs and
>> stream IDs via "iommu-map", and have it fed it to the driver
>> automatically via of_xlate(), rework the SMMUv3 driver to benefit from
>> that, and get rid of the current misuse of the "iommus" binding.
>>
>> Since having of_xlate wired up means that masters will now be given the
>> appropriate DMA ops, we also need to make sure that default domains work
>> properly. This necessitates dispensing with the "whole group at a time"
>> notion for attaching to a domain, as devices which share a group get
>> attached to the group's default domain one by one as they are initially
>> probed.
>>
>> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
>> ---
>>
>> v5: Simplify init code, use firmware-agnostic (and more efficient)
>>     driver-based instance lookup
> 
> I'm largely happy with this, just one question below...
> 
>> @@ -2649,7 +2602,14 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
>>  	platform_set_drvdata(pdev, smmu);
>>  
>>  	/* Reset the device */
>> -	return arm_smmu_device_reset(smmu);
>> +	ret = arm_smmu_device_reset(smmu);
>> +	if (ret)
>> +		return ret;
> 
> ... if we fail the probe at this point, the drvdata remains set. Do you
> need to clear it, or we can we guarantee that nobody is going to try
> arm_smmu_get_by_node with the (failed) SMMU's device node?

The device only gets added to the driver's list by driver_bound(), and
really_probe() will bail before it calls that if we return nonzero from
the probe function here. Since get_by_node() is thus safe, and .remove()
shouldn't be called given that .probe() failed, I can't see a legitimate
situation in which leaving behind a stale pointer in the drvdata of an
unbound device might be problematic.

Robin.

> 
> Alternatively, we could postpone setting the drvdata until the very end
> of probe.
> 
> Will
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 14/19] iommu/arm-smmu: Intelligent SMR allocation
  2016-09-01 15:17       ` Lorenzo Pieralisi
@ 2016-09-01 17:59         ` Robin Murphy
  0 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-09-01 17:59 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	will.deacon-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On 01/09/16 16:17, Lorenzo Pieralisi wrote:
[...]
>> +static int arm_smmu_find_sme(struct arm_smmu_device *smmu, u16 id, u16 mask)
>>  {
>>  	struct arm_smmu_smr *smrs = smmu->smrs;
>> -	int i, idx;
>> +	int i, idx = -ENOSPC;
>>  
>> -	/* Allocate the SMRs on the SMMU */
>> -	for_each_cfg_sme(cfg, i, idx) {
>> -		if (idx >= 0)
>> -			return -EEXIST;
>> +	/* Stream indexing is blissfully easy */
>> +	if (!smrs)
>> +		return id;
>>  
>> -		/* ...except on stream indexing hardware, of course */
>> -		if (!smrs) {
>> -			cfg->smendx[i] = cfg->streamids[i];
>> +	/* Validating SMRs is... less so */
>> +	for (i = 0; i < smmu->num_mapping_groups; ++i) {
>> +		if (!smrs[i].valid) {
>> +			if (idx < 0)
>> +				idx = i;
> 
> This is to stash an "empty" entry index in case none matches, right ?
> Let me say it is not that obvious, that deserves a comment since it
> is hard to follow.

Yes, since we have to iterate through every SMR anyway, this just keeps
a note of the first free entry we see along the way. It _could_ be
simplified slightly further by being made unconditional so that we end
up allocating top-down instead, although that would pessimize the
early-out cases below. I'll certainly comment it, though.

>>  			continue;
>>  		}
>>  
>> -		idx = arm_smmu_alloc_smr(smmu);
>> -		if (idx < 0) {
>> -			dev_err(smmu->dev, "failed to allocate free SMR\n");
>> -			goto err_free_smrs;
>> -		}
>> -		cfg->smendx[i] = idx;
>> +		/* Exact matches are good */
>> +		if (mask == smrs[i].mask && id == smrs[i].id)
>> +			return i;
>>  
>> -		smrs[idx].id = cfg->streamids[i];
>> -		smrs[idx].mask = 0; /* We don't currently share SMRs */
>> +		/* New unmasked IDs matching existing masks we can cope with */
>> +		if (!mask && !((smrs[i].id ^ id) & ~smrs[i].mask))
>> +			return i;
>> +
>> +		/* Overlapping masks are right out */
>> +		if (mask & smrs[i].mask)
>> +			return -EINVAL;
>> +
>> +		/* Distinct masks must match unambiguous ranges */
>> +		if (mask && !((smrs[i].id ^ id) & ~(smrs[i].mask | mask)))
>> +			return -EINVAL;
> 
> And this is to _prevent_ an entry to match the input streamid
> range (masked streamid) because it would create an ambiguous match ?

Indeed; say we have the (ID,mask) pair (0x0100,0x000f) already in an
SMR, then we'd allow (0x0000,0x00f0), since bit 8 still uniquely matches
one or the other, but we'd have to reject (0x0100,0x00f0) as that would
create a conflict for ID 0x0100. In general, unless there are distinct
bits outside both masks then there will always exist at least one ID
capable of causing a conflict.

> Basically here you keep looping because this function is at the
> same time allocating and validating the SMRS (ie you want to make
> sure there are no valid entries that clash with the streamid you
> are currently allocating).
> 
> It is a tad complicated and a bit hard to parse. I wonder if it would
> not be better to split into two SMRS validation and allocation
> functions, I think I fully grasped what you want to achieve but it is
> not that trivial.

Things were actually considerably more complicated until I realised the
neatness of hoisting the allocation (which itself is all of 4 lines
here) up into the same function. A real logical separation of concerns
would involve iterating through the whole array two-and-a-bit times,
first validating against conflicts, then looking to return an existing
entry which already matches, then finally falling back to allocating a
free entry. Even disregarding the efficiency angle, I'm not convinced
that having more code spread about with similar but subtly different
repetition would actually be any easier to follow (unfortunately earlier
implementations of this have been rebased into oblivion so I can't pull
one out to demonstrate). I'll definitely beef up the comments here - on
reflection these are more notes to myself than actual explanations for
others...

Robin.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 08/19] iommu/arm-smmu: Handle stream IDs more dynamically
       [not found]     ` <36f71a07fbc6037ca664bdcc540650f893081dd1.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 18:17       ` Will Deacon
  0 siblings, 0 replies; 61+ messages in thread
From: Will Deacon @ 2016-09-01 18:17 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On Tue, Aug 23, 2016 at 08:05:19PM +0100, Robin Murphy wrote:
> Rather than assuming fixed worst-case values for stream IDs and SMR
> masks, keep track of whatever implemented bits the hardware actually
> reports. This also obviates the slightly questionable validation of SMR
> fields in isolation - rather than aborting the whole SMMU probe for a
> hardware configuration which is still architecturally valid, we can
> simply refuse masters later if they try to claim an unrepresentable ID
> or mask (which almost certainly implies a DT error anyway).
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/arm-smmu.c | 43 ++++++++++++++++++++++---------------------
>  1 file changed, 22 insertions(+), 21 deletions(-)

Looks good to me:

Acked-by: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>

Will

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 09/19] iommu/arm-smmu: Consolidate stream map entry state
       [not found]     ` <26fcf7d3138816b9546a3dcc2bbbc2f229f34c91.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 18:32       ` Will Deacon
       [not found]         ` <20160901183257.GT6721-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 61+ messages in thread
From: Will Deacon @ 2016-09-01 18:32 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On Tue, Aug 23, 2016 at 08:05:20PM +0100, Robin Murphy wrote:
> In order to consider SMR masking, we really want to be able to validate
> ID/mask pairs against existing SMR contents to prevent stream match
> conflicts, which at best would cause transactions to fault unexpectedly,
> and at worst lead to silent unpredictable behaviour. With our SMMU
> instance data holding only an allocator bitmap, and the SMR values
> themselves scattered across master configs hanging off devices which we
> may have no way of finding, there's essentially no way short of digging
> everything back out of the hardware. Similarly, the thought of power
> management ops to support suspend/resume faces the exact same problem.
> 
> By massaging the software state into a closer shape to the underlying
> hardware, everything comes together quite nicely; the allocator and the
> high-level view of the data become a single centralised state which we
> can easily keep track of, and to which any updates can be validated in
> full before being synchronised to the hardware itself.
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/arm-smmu.c | 147 +++++++++++++++++++++++++++--------------------
>  1 file changed, 86 insertions(+), 61 deletions(-)

[...]

> +static int arm_smmu_master_alloc_smes(struct arm_smmu_device *smmu,
> +				      struct arm_smmu_master_cfg *cfg)
> +{
> +	struct arm_smmu_smr *smrs = smmu->smrs;
> +	int i, idx;
>  
>  	/* Allocate the SMRs on the SMMU */
>  	for (i = 0; i < cfg->num_streamids; ++i) {
> -		int idx = __arm_smmu_alloc_bitmap(smmu->smr_map, 0,
> -						  smmu->num_mapping_groups);
> +		if (cfg->smendx[i] >= 0)
> +			return -EEXIST;
> +
> +		/* ...except on stream indexing hardware, of course */
> +		if (!smrs) {
> +			cfg->smendx[i] = cfg->streamids[i];
> +			continue;
> +		}
> +
> +		idx = arm_smmu_alloc_smr(smmu);
>  		if (idx < 0) {
>  			dev_err(smmu->dev, "failed to allocate free SMR\n");
>  			goto err_free_smrs;
>  		}
> +		cfg->smendx[i] = idx;
>  
> -		smrs[i] = (struct arm_smmu_smr) {
> -			.idx	= idx,
> -			.mask	= 0, /* We don't currently share SMRs */
> -			.id	= cfg->streamids[i],
> -		};
> +		smrs[idx].id = cfg->streamids[i];
> +		smrs[idx].mask = 0; /* We don't currently share SMRs */
>  	}
>  
> +	if (!smrs)
> +		return 0;
> +
>  	/* It worked! Now, poke the actual hardware */
> -	for (i = 0; i < cfg->num_streamids; ++i) {
> -		u32 reg = SMR_VALID | smrs[i].id << SMR_ID_SHIFT |
> -			  smrs[i].mask << SMR_MASK_SHIFT;
> -		writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_SMR(smrs[i].idx));
> -	}
> +	for (i = 0; i < cfg->num_streamids; ++i)
> +		arm_smmu_write_smr(smmu, cfg->smendx[i]);
>  
> -	cfg->smrs = smrs;
>  	return 0;
>  
>  err_free_smrs:
> -	while (--i >= 0)
> -		__arm_smmu_free_bitmap(smmu->smr_map, smrs[i].idx);
> -	kfree(smrs);
> +	while (i--) {
> +		arm_smmu_free_smr(smmu, cfg->smendx[i]);
> +		cfg->smendx[i] = INVALID_SMENDX;
> +	}

Hmm, don't you have an off-by-one here? At least, looking at the final
code, we branch to out_err from within the for_each_cfg_sme loop, but
before we've incremented smmu->s2crs[idx].count, so the arm_smmu_free_smr
will erroneously decrement that.

Will

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 10/19] iommu/arm-smmu: Keep track of S2CR state
       [not found]     ` <e086741acfd0959671d184203ef758c824c8d7da.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 18:42       ` Will Deacon
       [not found]         ` <20160901184259.GU6721-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 61+ messages in thread
From: Will Deacon @ 2016-09-01 18:42 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On Tue, Aug 23, 2016 at 08:05:21PM +0100, Robin Murphy wrote:
> Making S2CRs first-class citizens within the driver with a high-level
> representation of their state offers a neat solution to a few problems:
> 
> Firstly, the information about which context a device's stream IDs are
> associated with is already present by necessity in the S2CR. With that
> state easily accessible we can refer directly to it and obviate the need
> to track an IOMMU domain in each device's archdata (its earlier purpose
> of enforcing correct attachment of multi-device groups now being handled
> by the IOMMU core itself).
> 
> Secondly, the core API now deprecates explicit domain detach and expects
> domain attach to move devices smoothly from one domain to another; for
> SMMUv2, this notion maps directly to simply rewriting the S2CRs assigned
> to the device. By giving the driver a suitable abstraction of those
> S2CRs to work with, we can massively reduce the overhead of the current
> heavy-handed "detach, free resources, reallocate resources, attach"
> approach.
> 
> Thirdly, making the software state hardware-shaped and attached to the
> SMMU instance once again makes suspend/resume of this register group
> that much simpler to implement in future.
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/arm-smmu.c | 159 +++++++++++++++++++++++++++--------------------
>  1 file changed, 93 insertions(+), 66 deletions(-)

[...]

> @@ -1145,9 +1198,16 @@ static void arm_smmu_master_free_smes(struct arm_smmu_device *smmu,
>  static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
>  				      struct arm_smmu_master_cfg *cfg)
>  {
> -	int i, ret;
> +	int i, ret = 0;
>  	struct arm_smmu_device *smmu = smmu_domain->smmu;
> -	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
> +	struct arm_smmu_s2cr *s2cr = smmu->s2crs;
> +	enum arm_smmu_s2cr_type type = S2CR_TYPE_TRANS;
> +	u8 cbndx = smmu_domain->cfg.cbndx;
> +
> +	if (cfg->smendx[0] < 0)

Shouldn't that be INVALID_SMENDX?

Will

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 09/19] iommu/arm-smmu: Consolidate stream map entry state
       [not found]         ` <20160901183257.GT6721-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 18:45           ` Robin Murphy
       [not found]             ` <6d3209ff-51ad-30ca-867b-ce62105e6699-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 61+ messages in thread
From: Robin Murphy @ 2016-09-01 18:45 UTC (permalink / raw)
  To: Will Deacon
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On 01/09/16 19:32, Will Deacon wrote:
> On Tue, Aug 23, 2016 at 08:05:20PM +0100, Robin Murphy wrote:
>> In order to consider SMR masking, we really want to be able to validate
>> ID/mask pairs against existing SMR contents to prevent stream match
>> conflicts, which at best would cause transactions to fault unexpectedly,
>> and at worst lead to silent unpredictable behaviour. With our SMMU
>> instance data holding only an allocator bitmap, and the SMR values
>> themselves scattered across master configs hanging off devices which we
>> may have no way of finding, there's essentially no way short of digging
>> everything back out of the hardware. Similarly, the thought of power
>> management ops to support suspend/resume faces the exact same problem.
>>
>> By massaging the software state into a closer shape to the underlying
>> hardware, everything comes together quite nicely; the allocator and the
>> high-level view of the data become a single centralised state which we
>> can easily keep track of, and to which any updates can be validated in
>> full before being synchronised to the hardware itself.
>>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>  drivers/iommu/arm-smmu.c | 147 +++++++++++++++++++++++++++--------------------
>>  1 file changed, 86 insertions(+), 61 deletions(-)
> 
> [...]
> 
>> +static int arm_smmu_master_alloc_smes(struct arm_smmu_device *smmu,
>> +				      struct arm_smmu_master_cfg *cfg)
>> +{
>> +	struct arm_smmu_smr *smrs = smmu->smrs;
>> +	int i, idx;
>>  
>>  	/* Allocate the SMRs on the SMMU */
>>  	for (i = 0; i < cfg->num_streamids; ++i) {
>> -		int idx = __arm_smmu_alloc_bitmap(smmu->smr_map, 0,
>> -						  smmu->num_mapping_groups);
>> +		if (cfg->smendx[i] >= 0)
>> +			return -EEXIST;
>> +
>> +		/* ...except on stream indexing hardware, of course */
>> +		if (!smrs) {
>> +			cfg->smendx[i] = cfg->streamids[i];
>> +			continue;
>> +		}
>> +
>> +		idx = arm_smmu_alloc_smr(smmu);
>>  		if (idx < 0) {
>>  			dev_err(smmu->dev, "failed to allocate free SMR\n");
>>  			goto err_free_smrs;
>>  		}
>> +		cfg->smendx[i] = idx;
>>  
>> -		smrs[i] = (struct arm_smmu_smr) {
>> -			.idx	= idx,
>> -			.mask	= 0, /* We don't currently share SMRs */
>> -			.id	= cfg->streamids[i],
>> -		};
>> +		smrs[idx].id = cfg->streamids[i];
>> +		smrs[idx].mask = 0; /* We don't currently share SMRs */
>>  	}
>>  
>> +	if (!smrs)
>> +		return 0;
>> +
>>  	/* It worked! Now, poke the actual hardware */
>> -	for (i = 0; i < cfg->num_streamids; ++i) {
>> -		u32 reg = SMR_VALID | smrs[i].id << SMR_ID_SHIFT |
>> -			  smrs[i].mask << SMR_MASK_SHIFT;
>> -		writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_SMR(smrs[i].idx));
>> -	}
>> +	for (i = 0; i < cfg->num_streamids; ++i)
>> +		arm_smmu_write_smr(smmu, cfg->smendx[i]);
>>  
>> -	cfg->smrs = smrs;
>>  	return 0;
>>  
>>  err_free_smrs:
>> -	while (--i >= 0)
>> -		__arm_smmu_free_bitmap(smmu->smr_map, smrs[i].idx);
>> -	kfree(smrs);
>> +	while (i--) {
>> +		arm_smmu_free_smr(smmu, cfg->smendx[i]);
>> +		cfg->smendx[i] = INVALID_SMENDX;
>> +	}
> 
> Hmm, don't you have an off-by-one here? At least, looking at the final
> code, we branch to out_err from within the for_each_cfg_sme loop, but
> before we've incremented smmu->s2crs[idx].count, so the arm_smmu_free_smr
> will erroneously decrement that.

Given that s2crs only doesn't exist until patch 10, and s2crs->count
doesn't exist until patch 14, I'd have to say pick one of:

a) no

b) ¯\_(ツ)_/¯

Robin.

> 
> Will
> 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 11/19] iommu/arm-smmu: Refactor mmu-masters handling
       [not found]     ` <00301aa60323bb94588d078f2962feea0cb45c72.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 18:47       ` Will Deacon
  0 siblings, 0 replies; 61+ messages in thread
From: Will Deacon @ 2016-09-01 18:47 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On Tue, Aug 23, 2016 at 08:05:22PM +0100, Robin Murphy wrote:
> To be able to support the generic bindings and handle of_xlate() calls,
> we need to be able to associate SMMUs and stream IDs directly with
> devices *before* allocating IOMMU groups. Furthermore, to support real
> default domains with multi-device groups we also have to handle domain
> attach on a per-device basis, as the "whole group at a time" assumption
> fails to properly handle subsequent devices added to a group after the
> first has already triggered default domain creation and attachment.
> 
> To that end, use the now-vacant dev->archdata.iommu field for easy
> config and SMMU instance lookup, and unify config management by chopping
> down the platform-device-specific tree and probing the "mmu-masters"
> property on-demand instead. This may add a bit of one-off overhead to
> initially adding a new device, but we're about to deprecate that binding
> in favour of the inherently-more-efficient generic ones anyway.
> 
> For the sake of simplicity, this patch does temporarily regress the case
> of aliasing PCI devices by losing the duplicate stream ID detection that
> the previous per-group config had. Stay tuned, because we'll be back to
> fix that in a better and more general way momentarily...
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/arm-smmu.c | 382 +++++++++++++----------------------------------
>  1 file changed, 107 insertions(+), 275 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 22c093030322..9066fd1399d4 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -317,18 +317,13 @@ struct arm_smmu_smr {
>  };
>  
>  struct arm_smmu_master_cfg {
> +	struct arm_smmu_device		*smmu;
>  	int				num_streamids;
>  	u16				streamids[MAX_MASTER_STREAMIDS];
>  	s16				smendx[MAX_MASTER_STREAMIDS];
>  };
>  #define INVALID_SMENDX			-1
>  
> -struct arm_smmu_master {
> -	struct device_node		*of_node;
> -	struct rb_node			node;
> -	struct arm_smmu_master_cfg	cfg;
> -};
> -
>  struct arm_smmu_device {
>  	struct device			*dev;
>  
> @@ -376,7 +371,6 @@ struct arm_smmu_device {
>  	unsigned int			*irqs;
>  
>  	struct list_head		list;
> -	struct rb_root			masters;
>  
>  	u32				cavium_id_base; /* Specific to Cavium */
>  };
> @@ -415,12 +409,6 @@ struct arm_smmu_domain {
>  	struct iommu_domain		domain;
>  };
>  
> -struct arm_smmu_phandle_args {
> -	struct device_node *np;
> -	int args_count;
> -	uint32_t args[MAX_MASTER_STREAMIDS];
> -};
> -
>  static DEFINE_SPINLOCK(arm_smmu_devices_lock);
>  static LIST_HEAD(arm_smmu_devices);
>  
> @@ -462,132 +450,89 @@ static struct device_node *dev_get_dev_node(struct device *dev)
>  
>  		while (!pci_is_root_bus(bus))
>  			bus = bus->parent;
> -		return bus->bridge->parent->of_node;
> +		return of_node_get(bus->bridge->parent->of_node);
>  	}
>  
> -	return dev->of_node;
> +	return of_node_get(dev->of_node);
>  }
>  
> -static struct arm_smmu_master *find_smmu_master(struct arm_smmu_device *smmu,
> -						struct device_node *dev_node)
> +static int __arm_smmu_get_pci_sid(struct pci_dev *pdev, u16 alias, void *data)
>  {
> -	struct rb_node *node = smmu->masters.rb_node;
> -
> -	while (node) {
> -		struct arm_smmu_master *master;
> -
> -		master = container_of(node, struct arm_smmu_master, node);
> -
> -		if (dev_node < master->of_node)
> -			node = node->rb_left;
> -		else if (dev_node > master->of_node)
> -			node = node->rb_right;
> -		else
> -			return master;
> -	}
> -
> -	return NULL;
> +	*((__be32 *)data) = cpu_to_be32(alias);
> +	return 0; /* Continue walking */
>  }
>  
> -static struct arm_smmu_master_cfg *
> -find_smmu_master_cfg(struct device *dev)
> +static int __find_legacy_master_phandle(struct device *dev, void *data)
>  {
> -	struct arm_smmu_master_cfg *cfg = NULL;
> -	struct iommu_group *group = iommu_group_get(dev);
> +	struct of_phandle_iterator *it = *(void **)data;
> +	struct device_node *np = it->node;
> +	int err;
>  
> -	if (group) {
> -		cfg = iommu_group_get_iommudata(group);
> -		iommu_group_put(group);
> -	}
> -
> -	return cfg;
> -}
> -
> -static int insert_smmu_master(struct arm_smmu_device *smmu,
> -			      struct arm_smmu_master *master)
> -{
> -	struct rb_node **new, *parent;
> -
> -	new = &smmu->masters.rb_node;
> -	parent = NULL;
> -	while (*new) {
> -		struct arm_smmu_master *this
> -			= container_of(*new, struct arm_smmu_master, node);
> -
> -		parent = *new;
> -		if (master->of_node < this->of_node)
> -			new = &((*new)->rb_left);
> -		else if (master->of_node > this->of_node)
> -			new = &((*new)->rb_right);
> -		else
> -			return -EEXIST;
> -	}
> -
> -	rb_link_node(&master->node, parent, new);
> -	rb_insert_color(&master->node, &smmu->masters);
> -	return 0;
> -}
> -
> -static int register_smmu_master(struct arm_smmu_device *smmu,
> -				struct device *dev,
> -				struct arm_smmu_phandle_args *masterspec)
> -{
> -	int i;
> -	struct arm_smmu_master *master;
> -
> -	master = find_smmu_master(smmu, masterspec->np);
> -	if (master) {
> -		dev_err(dev,
> -			"rejecting multiple registrations for master device %s\n",
> -			masterspec->np->name);
> -		return -EBUSY;
> -	}
> -
> -	if (masterspec->args_count > MAX_MASTER_STREAMIDS) {
> -		dev_err(dev,
> -			"reached maximum number (%d) of stream IDs for master device %s\n",
> -			MAX_MASTER_STREAMIDS, masterspec->np->name);
> -		return -ENOSPC;
> -	}
> -
> -	master = devm_kzalloc(dev, sizeof(*master), GFP_KERNEL);
> -	if (!master)
> -		return -ENOMEM;
> -
> -	master->of_node			= masterspec->np;
> -	master->cfg.num_streamids	= masterspec->args_count;
> -
> -	for (i = 0; i < master->cfg.num_streamids; ++i) {
> -		u16 streamid = masterspec->args[i];
> -
> -		if (!(smmu->features & ARM_SMMU_FEAT_STREAM_MATCH) &&
> -		     (streamid >= smmu->num_mapping_groups)) {
> -			dev_err(dev,
> -				"stream ID for master device %s greater than maximum allowed (%d)\n",
> -				masterspec->np->name, smmu->num_mapping_groups);
> -			return -ERANGE;
> +	of_for_each_phandle(it, err, dev->of_node, "mmu-masters",
> +			    "#stream-id-cells", 0)
> +		if (it->node == np) {
> +			*(void **)data = dev;
> +			return 1;
>  		}
> -		master->cfg.streamids[i] = streamid;
> -		master->cfg.smendx[i] = INVALID_SMENDX;
> -	}
> -	return insert_smmu_master(smmu, master);
> +	it->node = np;
> +	return err;
>  }
>  
> -static struct arm_smmu_device *find_smmu_for_device(struct device *dev)
> +static int arm_smmu_register_legacy_master(struct device *dev)
>  {
>  	struct arm_smmu_device *smmu;
> -	struct arm_smmu_master *master = NULL;
> -	struct device_node *dev_node = dev_get_dev_node(dev);
> +	struct arm_smmu_master_cfg *cfg;
> +	struct device_node *np;
> +	struct of_phandle_iterator it;
> +	void *data = &it;
> +	__be32 pci_sid;
> +	int err;
>  
> +	np = dev_get_dev_node(dev);
> +	if (!np || !of_find_property(np, "#stream-id-cells", NULL)) {
> +		of_node_put(np);
> +		return -ENODEV;
> +	}
> +
> +	it.node = np;
>  	spin_lock(&arm_smmu_devices_lock);
>  	list_for_each_entry(smmu, &arm_smmu_devices, list) {
> -		master = find_smmu_master(smmu, dev_node);
> -		if (master)
> +		err = __find_legacy_master_phandle(smmu->dev, &data);
> +		if (err)
>  			break;
>  	}
>  	spin_unlock(&arm_smmu_devices_lock);
> +	of_node_put(np);
> +	if (err == 0)
> +		return -ENODEV;
> +	if (err < 0)
> +		return err;
>  
> -	return master ? smmu : NULL;
> +	if (it.cur_count > MAX_MASTER_STREAMIDS) {
> +		dev_err(smmu->dev,
> +			"reached maximum number (%d) of stream IDs for master device %s\n",
> +			MAX_MASTER_STREAMIDS, dev_name(dev));
> +		return -ENOSPC;
> +	}
> +	if (dev_is_pci(dev)) {
> +		/* "mmu-masters" assumes Stream ID == Requester ID */
> +		pci_for_each_dma_alias(to_pci_dev(dev), __arm_smmu_get_pci_sid,
> +				       &pci_sid);
> +		it.cur = &pci_sid;
> +		it.cur_count = 1;
> +	}
> +
> +	cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
> +	if (!cfg)
> +		return -ENOMEM;
> +
> +	cfg->smmu = smmu;
> +	dev->archdata.iommu = cfg;
> +
> +	while (it.cur_count--)
> +		cfg->streamids[cfg->num_streamids++] = be32_to_cpup(it.cur++);

I pronounce this construct, the "Murphy Device"! At least it didn't survive
until the end of the series :p

Will

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 15/19] iommu/arm-smmu: Convert to iommu_fwspec
       [not found]     ` <221f668d606abdfb4d6ee6da2c5f568c57ceccdd.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 18:53       ` Will Deacon
  0 siblings, 0 replies; 61+ messages in thread
From: Will Deacon @ 2016-09-01 18:53 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On Tue, Aug 23, 2016 at 08:05:26PM +0100, Robin Murphy wrote:
> In the final step of preparation for full generic configuration support,
> swap our fixed-size master_cfg for the generic iommu_fwspec. For the
> legacy DT bindings, the driver simply gets to act as its own 'firmware'.
> Farewell, arbitrary MAX_MASTER_STREAMIDS!
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/arm-smmu.c | 139 ++++++++++++++++++++++++++---------------------
>  1 file changed, 77 insertions(+), 62 deletions(-)

[...]

>  static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>  {
>  	int ret;
> +	struct iommu_fwspec *fwspec = dev_iommu_fwspec(dev);
> +	struct arm_smmu_device *smmu = fwspec_smmu(fwspec);
>  	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> -	struct arm_smmu_master_cfg *cfg = dev->archdata.iommu;
>  
> -	if (!cfg) {
> +	if (!fwspec || fwspec->iommu_ops != &arm_smmu_ops) {

As mentioned off-list, you have already dereferenced fwspec before this
NULL check.

Will

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 09/19] iommu/arm-smmu: Consolidate stream map entry state
       [not found]             ` <6d3209ff-51ad-30ca-867b-ce62105e6699-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 18:54               ` Will Deacon
  0 siblings, 0 replies; 61+ messages in thread
From: Will Deacon @ 2016-09-01 18:54 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On Thu, Sep 01, 2016 at 07:45:51PM +0100, Robin Murphy wrote:
> On 01/09/16 19:32, Will Deacon wrote:
> > On Tue, Aug 23, 2016 at 08:05:20PM +0100, Robin Murphy wrote:
> >> -	while (--i >= 0)
> >> -		__arm_smmu_free_bitmap(smmu->smr_map, smrs[i].idx);
> >> -	kfree(smrs);
> >> +	while (i--) {
> >> +		arm_smmu_free_smr(smmu, cfg->smendx[i]);
> >> +		cfg->smendx[i] = INVALID_SMENDX;
> >> +	}
> > 
> > Hmm, don't you have an off-by-one here? At least, looking at the final
> > code, we branch to out_err from within the for_each_cfg_sme loop, but
> > before we've incremented smmu->s2crs[idx].count, so the arm_smmu_free_smr
> > will erroneously decrement that.
> 
> Given that s2crs only doesn't exist until patch 10, and s2crs->count
> doesn't exist until patch 14, I'd have to say pick one of:
> 
> a) no
> 
> b) ¯\_(ツ)_/¯

You forgot:

c) I completely misread the code

So it's all fine!

Will
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 10/19] iommu/arm-smmu: Keep track of S2CR state
       [not found]         ` <20160901184259.GU6721-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 19:00           ` Robin Murphy
  0 siblings, 0 replies; 61+ messages in thread
From: Robin Murphy @ 2016-09-01 19:00 UTC (permalink / raw)
  To: Will Deacon
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On 01/09/16 19:42, Will Deacon wrote:
> On Tue, Aug 23, 2016 at 08:05:21PM +0100, Robin Murphy wrote:
>> Making S2CRs first-class citizens within the driver with a high-level
>> representation of their state offers a neat solution to a few problems:
>>
>> Firstly, the information about which context a device's stream IDs are
>> associated with is already present by necessity in the S2CR. With that
>> state easily accessible we can refer directly to it and obviate the need
>> to track an IOMMU domain in each device's archdata (its earlier purpose
>> of enforcing correct attachment of multi-device groups now being handled
>> by the IOMMU core itself).
>>
>> Secondly, the core API now deprecates explicit domain detach and expects
>> domain attach to move devices smoothly from one domain to another; for
>> SMMUv2, this notion maps directly to simply rewriting the S2CRs assigned
>> to the device. By giving the driver a suitable abstraction of those
>> S2CRs to work with, we can massively reduce the overhead of the current
>> heavy-handed "detach, free resources, reallocate resources, attach"
>> approach.
>>
>> Thirdly, making the software state hardware-shaped and attached to the
>> SMMU instance once again makes suspend/resume of this register group
>> that much simpler to implement in future.
>>
>> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
>> ---
>>  drivers/iommu/arm-smmu.c | 159 +++++++++++++++++++++++++++--------------------
>>  1 file changed, 93 insertions(+), 66 deletions(-)
> 
> [...]
> 
>> @@ -1145,9 +1198,16 @@ static void arm_smmu_master_free_smes(struct arm_smmu_device *smmu,
>>  static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
>>  				      struct arm_smmu_master_cfg *cfg)
>>  {
>> -	int i, ret;
>> +	int i, ret = 0;
>>  	struct arm_smmu_device *smmu = smmu_domain->smmu;
>> -	void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
>> +	struct arm_smmu_s2cr *s2cr = smmu->s2crs;
>> +	enum arm_smmu_s2cr_type type = S2CR_TYPE_TRANS;
>> +	u8 cbndx = smmu_domain->cfg.cbndx;
>> +
>> +	if (cfg->smendx[0] < 0)
> 
> Shouldn't that be INVALID_SMENDX?

...which is defined as -1, and thus less than zero. I have no objection,
however, to changing this (and equivalent instances) to an explicit "if
(foo == INVALID_SMENDX" if that's clearer, as I can't foresee any
reasonable need for additional "invalid" values.

Robin.

> 
> Will
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 17/19] iommu/arm-smmu: Wire up generic configuration support
       [not found]     ` <4439250e01ac071bae8f03a5ccf107ed7ddc0b49.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
@ 2016-09-01 19:02       ` Will Deacon
  0 siblings, 0 replies; 61+ messages in thread
From: Will Deacon @ 2016-09-01 19:02 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On Tue, Aug 23, 2016 at 08:05:28PM +0100, Robin Murphy wrote:
> With everything else now in place, fill in an of_xlate callback and the
> appropriate registration to plumb into the generic configuration
> machinery, and watch everything just work.
> 
> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
> ---
>  drivers/iommu/arm-smmu.c | 168 ++++++++++++++++++++++++++++++-----------------
>  1 file changed, 107 insertions(+), 61 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index ea22beb58b59..85bc74d8fca0 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -43,6 +43,7 @@
>  #include <linux/of_address.h>
>  #include <linux/of_device.h>
>  #include <linux/of_iommu.h>
> +#include <linux/of_platform.h>
>  #include <linux/pci.h>
>  #include <linux/platform_device.h>
>  #include <linux/slab.h>
> @@ -418,6 +419,8 @@ struct arm_smmu_option_prop {
>  
>  static atomic_t cavium_smmu_context_count = ATOMIC_INIT(0);
>  
> +static bool legacy_binding_used;

I think we need to be a bit stricter here, and force all SMMUs to probe
using the same binding. I think that just boils down to checking against
this flag in the probe routine when we're not the first device.

will

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v5 00/19] Generic DT bindings for PCI IOMMUs and ARM SMMU
       [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
                     ` (19 preceding siblings ...)
  2016-09-01 15:22   ` Lorenzo Pieralisi
@ 2016-09-01 19:05   ` Will Deacon
  20 siblings, 0 replies; 61+ messages in thread
From: Will Deacon @ 2016-09-01 19:05 UTC (permalink / raw)
  To: Robin Murphy
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, punit.agrawal-5wv7dgnIgG8,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	thunder.leizhen-hv44wF8Li93QT0dZR+AlfA

On Tue, Aug 23, 2016 at 08:05:11PM +0100, Robin Murphy wrote:
> At long last I've finished the big SMMUv2 rework, so here's everything
> all together for a v5. As a quick breakdown:
> 
> Patches 1-3 are the core PCI part, all acked and ready to go. No code
> changes from v4.
> 
> Patch 4 is merely bugfixed from v4 for simplicity, as I've not yet
> managed to take as close a look at Lorenzo's follow-on work as I'd like.
> 
> Patches 5-7 (SMMUv3) are mostly unchanged beyond a slight tweak to #5.
> 
> Patches 8-17 are the all-new SMMUv2 rework.
> 
> Patch 18 goes along with the fix already in 4.8-rc3 to help avoid 64-bit
> DMA masks going wrong now that DMA ops will be enabled.
> 
> Finally, patch 19 addresses the previous problem of having to choose
> between DMA ops or working MSIs. This is currently at the end as
> moving it before #17 would require a further interim SMMUv2 patch, and
> a 19-patch series is already quite enough...

So this is all looking pretty good to me, modulo the handful of comments
to address. The arm-smmu.c changes are pretty tough to review, given the
necessary amount of refactoring to get where you want to get, but the
end result looks good and the series does bisect.

Thanks,

Will

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2016-09-01 19:05 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-23 19:05 [PATCH v5 00/19] Generic DT bindings for PCI IOMMUs and ARM SMMU Robin Murphy
     [not found] ` <cover.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-08-23 19:05   ` [PATCH v5 01/19] Docs: dt: add PCI IOMMU map bindings Robin Murphy
2016-08-23 19:05   ` [PATCH v5 02/19] of/irq: Break out msi-map lookup (again) Robin Murphy
2016-08-23 19:05   ` [PATCH v5 03/19] iommu/of: Handle iommu-map property for PCI Robin Murphy
     [not found]     ` <93909648835867008b21cb688a1d7db238d3641a.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-08-31 15:43       ` Will Deacon
2016-08-23 19:05   ` [PATCH v5 04/19] iommu/of: Introduce iommu_fwspec Robin Murphy
     [not found]     ` <3e8eaf4fd65833fecc62828214aee81f6ca6c190.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-08-31 17:28       ` Will Deacon
     [not found]         ` <20160831172856.GI29505-5wv7dgnIgG8@public.gmane.org>
2016-09-01 12:07           ` Robin Murphy
     [not found]             ` <900f3dcb-217c-4fb3-2f7d-15572f31a0c0-5wv7dgnIgG8@public.gmane.org>
2016-09-01 12:31               ` Will Deacon
     [not found]                 ` <20160901123158.GE6721-5wv7dgnIgG8@public.gmane.org>
2016-09-01 13:25                   ` Robin Murphy
2016-08-23 19:05   ` [PATCH v5 05/19] iommu/arm-smmu: Implement of_xlate() for SMMUv3 Robin Murphy
     [not found]     ` <6088007f60a24b36a3bf965b62521f99cd908019.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-09-01 17:06       ` Will Deacon
     [not found]         ` <20160901170604.GP6721-5wv7dgnIgG8@public.gmane.org>
2016-09-01 17:40           ` Robin Murphy
2016-08-23 19:05   ` [PATCH v5 06/19] iommu/arm-smmu: Support non-PCI devices with SMMUv3 Robin Murphy
     [not found]     ` <207d0ae38c5b01b7cf7e48231a4d01bac453b57c.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-09-01 17:08       ` Will Deacon
2016-08-23 19:05   ` [PATCH v5 07/19] iommu/arm-smmu: Set PRIVCFG in stage 1 STEs Robin Murphy
     [not found]     ` <1cda9861ce3ede6c2de9c6c4f2294549808b421b.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-09-01 17:19       ` Will Deacon
2016-08-23 19:05   ` [PATCH v5 08/19] iommu/arm-smmu: Handle stream IDs more dynamically Robin Murphy
     [not found]     ` <36f71a07fbc6037ca664bdcc540650f893081dd1.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-09-01 18:17       ` Will Deacon
2016-08-23 19:05   ` [PATCH v5 09/19] iommu/arm-smmu: Consolidate stream map entry state Robin Murphy
     [not found]     ` <26fcf7d3138816b9546a3dcc2bbbc2f229f34c91.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-09-01 18:32       ` Will Deacon
     [not found]         ` <20160901183257.GT6721-5wv7dgnIgG8@public.gmane.org>
2016-09-01 18:45           ` Robin Murphy
     [not found]             ` <6d3209ff-51ad-30ca-867b-ce62105e6699-5wv7dgnIgG8@public.gmane.org>
2016-09-01 18:54               ` Will Deacon
2016-08-23 19:05   ` [PATCH v5 10/19] iommu/arm-smmu: Keep track of S2CR state Robin Murphy
     [not found]     ` <e086741acfd0959671d184203ef758c824c8d7da.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-09-01 18:42       ` Will Deacon
     [not found]         ` <20160901184259.GU6721-5wv7dgnIgG8@public.gmane.org>
2016-09-01 19:00           ` Robin Murphy
2016-08-23 19:05   ` [PATCH v5 11/19] iommu/arm-smmu: Refactor mmu-masters handling Robin Murphy
     [not found]     ` <00301aa60323bb94588d078f2962feea0cb45c72.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-09-01 18:47       ` Will Deacon
2016-08-23 19:05   ` [PATCH v5 12/19] iommu/arm-smmu: Streamline SMMU data lookups Robin Murphy
2016-08-23 19:05   ` [PATCH v5 13/19] iommu/arm-smmu: Add a stream map entry iterator Robin Murphy
2016-08-23 19:05   ` [PATCH v5 14/19] iommu/arm-smmu: Intelligent SMR allocation Robin Murphy
     [not found]     ` <693b7fdd58be254297eb43ac8f5e035beb5226b2.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-09-01 15:17       ` Lorenzo Pieralisi
2016-09-01 17:59         ` Robin Murphy
2016-08-23 19:05   ` [PATCH v5 15/19] iommu/arm-smmu: Convert to iommu_fwspec Robin Murphy
     [not found]     ` <221f668d606abdfb4d6ee6da2c5f568c57ceccdd.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-09-01 18:53       ` Will Deacon
2016-08-23 19:05   ` [PATCH v5 16/19] Docs: dt: document ARM SMMU generic binding usage Robin Murphy
     [not found]     ` <b4f0eca93ac944c3430297b97c703e1bc54846d7.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-08-29 15:44       ` Rob Herring
2016-08-23 19:05   ` [PATCH v5 17/19] iommu/arm-smmu: Wire up generic configuration support Robin Murphy
     [not found]     ` <4439250e01ac071bae8f03a5ccf107ed7ddc0b49.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-09-01 19:02       ` Will Deacon
2016-08-23 19:05   ` [PATCH v5 18/19] iommu/arm-smmu: Set domain geometry Robin Murphy
     [not found]     ` <d6cedec16fe96a081ea2f9f27378dd1a6f406c72.1471975357.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2016-08-31 21:00       ` Auger Eric
2016-08-23 19:15   ` [PATCH v5 00/19] Generic DT bindings for PCI IOMMUs and ARM SMMU Robin Murphy
2016-08-23 19:15     ` Robin Murphy
     [not found]     ` <3a9a9369-d8cd-66f4-9344-965c80894bb6-5wv7dgnIgG8@public.gmane.org>
2016-09-01  3:49       ` Anup Patel
2016-09-01  3:49         ` Anup Patel
     [not found]         ` <CAALAos-OzPG=+aU8eKEZtx6EFXytPXq09k3QweHvtYCD=mN0mQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-09-01 10:10           ` Robin Murphy
2016-09-01 10:10             ` Robin Murphy
2016-09-01 15:22   ` Lorenzo Pieralisi
2016-09-01 19:05   ` Will Deacon
2016-08-23 19:05 ` [PATCH v5 19/19] iommu/dma: Add support for mapping MSIs Robin Murphy
2016-08-23 19:05   ` Robin Murphy
2016-08-24  8:16   ` Thomas Gleixner
2016-08-24  8:16     ` Thomas Gleixner
2016-08-24 10:06     ` Robin Murphy
2016-08-24 10:06       ` Robin Murphy
2016-08-25 22:25   ` Auger Eric
2016-08-25 22:25     ` Auger Eric
2016-08-26  1:17     ` Robin Murphy
2016-08-26  1:17       ` Robin Murphy
2016-08-31 20:51       ` Auger Eric
2016-08-31 20:51         ` Auger Eric

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.