* [PATCH 0/4] iommu: add qcom_iommu for early "B" family devices
@ 2017-08-03 10:47 Rob Clark
       [not found] ` <20170803104800.18624-1-robdclark@gmail.com>
  2017-08-03 10:47 ` [PATCH 3/4] iommu: add qcom_iommu Rob Clark
  0 siblings, 2 replies; 24+ messages in thread
From: Rob Clark @ 2017-08-03 10:47 UTC (permalink / raw)
  To: iommu, linux-arm-msm
  Cc: Archit Taneja, Rob Herring, Will Deacon, Sricharan, Mark Rutland,
	Robin Murphy, Rob Clark

This series adds an iommu driver for Qualcomm "B" family devices, which do
implement the ARM SMMU spec, but not in a way that arm-smmu can support.

(I initially added support to arm-smmu, but it was decided that approach
was too intrusive and it would be cleaner to have a separate driver.)

I should note that all the dependencies for this driver have been merged
since 4.12, and it is the last piece needed to have another ARM device
(gpu/display/video codec/etc) fully enabled upstream.

One minor change: a couple of #defines and the MMU500 bits have moved back
to arm-smmu.c, as suggested by Will.

Rob Clark (3):
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu

Stanimir Varbanov (1):
  iommu: qcom: initialize secure page table

 .../devicetree/bindings/iommu/qcom,iommu.txt       | 121 +++
 drivers/iommu/Kconfig                              |  10 +
 drivers/iommu/Makefile                             |   1 +
 drivers/iommu/arm-smmu-regs.h                      | 220 +++++
 drivers/iommu/arm-smmu.c                           | 211 +----
 drivers/iommu/qcom_iommu.c                         | 932 +++++++++++++++++++++
 6 files changed, 1293 insertions(+), 202 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c

-- 
2.13.0

^ permalink raw reply	[flat|nested] 24+ messages in thread
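
To make the incompatibility concrete: on these devices the context-bank
programming model is the standard ARM SMMU one, so the context-bank
register layout can be shared; only the global configuration space is
off limits.  A minimal, hypothetical sketch of enabling translation for
one context bank, using the offsets that patch 2 moves into
arm-smmu-regs.h (example_enable_ctx is illustrative, not part of the
series):

#include <linux/io.h>

#include "arm-smmu-regs.h"

/*
 * Hypothetical helper: enable translation and fault reporting for one
 * context bank.  The SCTLR layout is the architected ARM SMMU one, so
 * the defines shared with arm-smmu apply unchanged; what qcom_iommu
 * cannot do is touch the (secured) global GR0/GR1 register space.
 */
static void example_enable_ctx(void __iomem *ctx_base)
{
	u32 reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
		  SCTLR_S1_ASIDPNE | SCTLR_M;

	writel_relaxed(reg, ctx_base + ARM_SMMU_CB_SCTLR);
}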

* [PATCH 1/4] Docs: dt: document qcom iommu bindings
@ 2017-08-03 10:47     ` Rob Clark
  0 siblings, 0 replies; 24+ messages in thread
From: Rob Clark @ 2017-08-03 10:47 UTC (permalink / raw)
  To: iommu, linux-arm-msm
  Cc: Archit Taneja, Rob Herring, Will Deacon, Sricharan, Mark Rutland,
	Robin Murphy, Rob Clark, devicetree, Joerg Roedel, Rob Herring,
	linux-kernel

Cc: devicetree@vger.kernel.org
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Tested-by: Archit Taneja <architt@codeaurora.org>
---
 .../devicetree/bindings/iommu/qcom,iommu.txt       | 121 +++++++++++++++++++++
 1 file changed, 121 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
new file mode 100644
index 000000000000..b2641ceb2b40
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
@@ -0,0 +1,121 @@
+* QCOM IOMMU v1 Implementation
+
+Qualcomm "B" family devices which are not compatible with arm-smmu have
+a similar looking IOMMU but without access to the global register space,
+and optionally requiring additional configuration to route context irqs
+to non-secure vs secure interrupt line.
+
+** Required properties:
+
+- compatible       : Should be one of:
+
+                        "qcom,msm8916-iommu"
+
+                     Followed by "qcom,msm-iommu-v1".
+
+- clock-names      : Should be a pair of "iface" (required for the IOMMU's
+                     register group access) and "bus" (required for the
+                     IOMMU's underlying bus access).
+
+- clocks           : Phandles for respective clocks described by
+                     clock-names.
+
+- #address-cells   : Must be 1.
+
+- #size-cells      : Must be 1.
+
+- #iommu-cells     : Must be 1.  The index identifies the context-bank number.
+
+- ranges           : Base address and size of the iommu context banks.
+
+- qcom,iommu-secure-id  : The secure-id used for SCM configuration calls.
+
+- List of sub-nodes, one per translation context bank.  Each sub-node
+  has the following required properties:
+
+  - compatible     : Should be one of:
+        - "qcom,msm-iommu-v1-ns"  : non-secure context bank
+        - "qcom,msm-iommu-v1-sec" : secure context bank
+  - reg            : Base address and size of context bank within the iommu
+  - interrupts     : The context fault irq.
+
+** Optional properties:
+
+- reg              : Base address and size of the SMMU local base; should
+                     only be specified if the iommu requires configuration
+                     for routing of context bank IRQs to secure vs. non-
+                     secure lines (i.e. if the iommu contains secure
+                     context banks).
+
+
+** Examples:
+
+	apps_iommu: iommu@1e20000 {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		#iommu-cells = <1>;
+		compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+		ranges = <0 0x1e20000 0x40000>;
+		reg = <0x1ef0000 0x3000>;
+		clocks = <&gcc GCC_SMMU_CFG_CLK>,
+			 <&gcc GCC_APSS_TCU_CLK>;
+		clock-names = "iface", "bus";
+		qcom,iommu-secure-id = <17>;
+
+		// mdp_0:
+		iommu-ctx@4000 {
+			compatible = "qcom,msm-iommu-v1-ns";
+			reg = <0x4000 0x1000>;
+			interrupts = <GIC_SPI 70 IRQ_TYPE_LEVEL_HIGH>;
+		};
+
+		// venus_ns:
+		iommu-ctx@5000 {
+			compatible = "qcom,msm-iommu-v1-sec";
+			reg = <0x5000 0x1000>;
+			interrupts = <GIC_SPI 70 IRQ_TYPE_LEVEL_HIGH>;
+		};
+	};
+
+	gpu_iommu: iommu@1f08000 {
+		#address-cells = <1>;
+		#size-cells = <1>;
+		#iommu-cells = <1>;
+		compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+		ranges = <0 0x1f08000 0x10000>;
+		clocks = <&gcc GCC_SMMU_CFG_CLK>,
+			 <&gcc GCC_GFX_TCU_CLK>;
+		clock-names = "iface", "bus";
+		qcom,iommu-secure-id = <18>;
+
+		// gfx3d_user:
+		iommu-ctx@1000 {
+			compatible = "qcom,msm-iommu-v1-ns";
+			reg = <0x1000 0x1000>;
+			interrupts = <GIC_SPI 241 IRQ_TYPE_LEVEL_HIGH>;
+		};
+
+		// gfx3d_priv:
+		iommu-ctx@2000 {
+			compatible = "qcom,msm-iommu-v1-ns";
+			reg = <0x2000 0x1000>;
+			interrupts = <GIC_SPI 242 IRQ_TYPE_LEVEL_HIGH>;
+		};
+	};
+
+	...
+
+	venus: video-codec@1d00000 {
+		...
+		iommus = <&apps_iommu 5>;
+	};
+
+	mdp: mdp@1a01000 {
+		...
+		iommus = <&apps_iommu 4>;
+	};
+
+	gpu@1c00000 {
+		...
+		iommus = <&gpu_iommu 1>, <&gpu_iommu 2>;
+	};
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 24+ messages in thread
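
For readers tracing how the single iommu-specifier cell from the
examples above (e.g. "iommus = <&apps_iommu 5>") reaches a driver: it
is delivered via the of_xlate callback as spec->args[0].  A
hypothetical sketch (example_of_xlate is illustrative only; the
series' real callback is in patch 3):

#include <linux/device.h>
#include <linux/iommu.h>
#include <linux/of.h>

/*
 * Hypothetical of_xlate sketch: the one cell permitted by
 * #iommu-cells = <1> arrives as spec->args[0] and, per this binding,
 * names the context bank (which the driver treats as 1:1 with the
 * ASID).
 */
static int example_of_xlate(struct device *dev, struct of_phandle_args *spec)
{
	u32 asid = spec->args[0];

	if (spec->args_count != 1)
		return -EINVAL;

	/* record the id against the device for later attach/map calls */
	return iommu_fwspec_add_ids(dev, &asid, 1);
}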

* [PATCH 2/4] iommu: arm-smmu: split out register defines
@ 2017-08-03 10:47     ` Rob Clark
  0 siblings, 0 replies; 24+ messages in thread
From: Rob Clark @ 2017-08-03 10:47 UTC (permalink / raw)
  To: iommu, linux-arm-msm
  Cc: Archit Taneja, Rob Herring, Will Deacon, Sricharan, Mark Rutland,
	Robin Murphy, Rob Clark, Joerg Roedel, linux-kernel,
	linux-arm-kernel

I want to re-use some of these for qcom_iommu, which has (roughly) the
same context-bank registers.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Tested-by: Archit Taneja <architt@codeaurora.org>
---
 drivers/iommu/arm-smmu-regs.h | 220 ++++++++++++++++++++++++++++++++++++++++++
 drivers/iommu/arm-smmu.c      | 211 ++--------------------------------------
 2 files changed, 229 insertions(+), 202 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-regs.h

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
new file mode 100644
index 000000000000..a1226e4ab5f8
--- /dev/null
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -0,0 +1,220 @@
+/*
+ * IOMMU API for ARM architected SMMU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#ifndef _ARM_SMMU_REGS_H
+#define _ARM_SMMU_REGS_H
+
+/* Configuration registers */
+#define ARM_SMMU_GR0_sCR0		0x0
+#define sCR0_CLIENTPD			(1 << 0)
+#define sCR0_GFRE			(1 << 1)
+#define sCR0_GFIE			(1 << 2)
+#define sCR0_EXIDENABLE			(1 << 3)
+#define sCR0_GCFGFRE			(1 << 4)
+#define sCR0_GCFGFIE			(1 << 5)
+#define sCR0_USFCFG			(1 << 10)
+#define sCR0_VMIDPNE			(1 << 11)
+#define sCR0_PTM			(1 << 12)
+#define sCR0_FB				(1 << 13)
+#define sCR0_VMID16EN			(1 << 31)
+#define sCR0_BSU_SHIFT			14
+#define sCR0_BSU_MASK			0x3
+
+/* Auxiliary Configuration register */
+#define ARM_SMMU_GR0_sACR		0x10
+
+/* Identification registers */
+#define ARM_SMMU_GR0_ID0		0x20
+#define ARM_SMMU_GR0_ID1		0x24
+#define ARM_SMMU_GR0_ID2		0x28
+#define ARM_SMMU_GR0_ID3		0x2c
+#define ARM_SMMU_GR0_ID4		0x30
+#define ARM_SMMU_GR0_ID5		0x34
+#define ARM_SMMU_GR0_ID6		0x38
+#define ARM_SMMU_GR0_ID7		0x3c
+#define ARM_SMMU_GR0_sGFSR		0x48
+#define ARM_SMMU_GR0_sGFSYNR0		0x50
+#define ARM_SMMU_GR0_sGFSYNR1		0x54
+#define ARM_SMMU_GR0_sGFSYNR2		0x58
+
+#define ID0_S1TS			(1 << 30)
+#define ID0_S2TS			(1 << 29)
+#define ID0_NTS				(1 << 28)
+#define ID0_SMS				(1 << 27)
+#define ID0_ATOSNS			(1 << 26)
+#define ID0_PTFS_NO_AARCH32		(1 << 25)
+#define ID0_PTFS_NO_AARCH32S		(1 << 24)
+#define ID0_CTTW			(1 << 14)
+#define ID0_NUMIRPT_SHIFT		16
+#define ID0_NUMIRPT_MASK		0xff
+#define ID0_NUMSIDB_SHIFT		9
+#define ID0_NUMSIDB_MASK		0xf
+#define ID0_EXIDS			(1 << 8)
+#define ID0_NUMSMRG_SHIFT		0
+#define ID0_NUMSMRG_MASK		0xff
+
+#define ID1_PAGESIZE			(1 << 31)
+#define ID1_NUMPAGENDXB_SHIFT		28
+#define ID1_NUMPAGENDXB_MASK		7
+#define ID1_NUMS2CB_SHIFT		16
+#define ID1_NUMS2CB_MASK		0xff
+#define ID1_NUMCB_SHIFT			0
+#define ID1_NUMCB_MASK			0xff
+
+#define ID2_OAS_SHIFT			4
+#define ID2_OAS_MASK			0xf
+#define ID2_IAS_SHIFT			0
+#define ID2_IAS_MASK			0xf
+#define ID2_UBS_SHIFT			8
+#define ID2_UBS_MASK			0xf
+#define ID2_PTFS_4K			(1 << 12)
+#define ID2_PTFS_16K			(1 << 13)
+#define ID2_PTFS_64K			(1 << 14)
+#define ID2_VMID16			(1 << 15)
+
+#define ID7_MAJOR_SHIFT			4
+#define ID7_MAJOR_MASK			0xf
+
+/* Global TLB invalidation */
+#define ARM_SMMU_GR0_TLBIVMID		0x64
+#define ARM_SMMU_GR0_TLBIALLNSNH	0x68
+#define ARM_SMMU_GR0_TLBIALLH		0x6c
+#define ARM_SMMU_GR0_sTLBGSYNC		0x70
+#define ARM_SMMU_GR0_sTLBGSTATUS	0x74
+#define sTLBGSTATUS_GSACTIVE		(1 << 0)
+
+/* Stream mapping registers */
+#define ARM_SMMU_GR0_SMR(n)		(0x800 + ((n) << 2))
+#define SMR_VALID			(1 << 31)
+#define SMR_MASK_SHIFT			16
+#define SMR_ID_SHIFT			0
+
+#define ARM_SMMU_GR0_S2CR(n)		(0xc00 + ((n) << 2))
+#define S2CR_CBNDX_SHIFT		0
+#define S2CR_CBNDX_MASK			0xff
+#define S2CR_EXIDVALID			(1 << 10)
+#define S2CR_TYPE_SHIFT			16
+#define S2CR_TYPE_MASK			0x3
+enum arm_smmu_s2cr_type {
+	S2CR_TYPE_TRANS,
+	S2CR_TYPE_BYPASS,
+	S2CR_TYPE_FAULT,
+};
+
+#define S2CR_PRIVCFG_SHIFT		24
+#define S2CR_PRIVCFG_MASK		0x3
+enum arm_smmu_s2cr_privcfg {
+	S2CR_PRIVCFG_DEFAULT,
+	S2CR_PRIVCFG_DIPAN,
+	S2CR_PRIVCFG_UNPRIV,
+	S2CR_PRIVCFG_PRIV,
+};
+
+/* Context bank attribute registers */
+#define ARM_SMMU_GR1_CBAR(n)		(0x0 + ((n) << 2))
+#define CBAR_VMID_SHIFT			0
+#define CBAR_VMID_MASK			0xff
+#define CBAR_S1_BPSHCFG_SHIFT		8
+#define CBAR_S1_BPSHCFG_MASK		3
+#define CBAR_S1_BPSHCFG_NSH		3
+#define CBAR_S1_MEMATTR_SHIFT		12
+#define CBAR_S1_MEMATTR_MASK		0xf
+#define CBAR_S1_MEMATTR_WB		0xf
+#define CBAR_TYPE_SHIFT			16
+#define CBAR_TYPE_MASK			0x3
+#define CBAR_TYPE_S2_TRANS		(0 << CBAR_TYPE_SHIFT)
+#define CBAR_TYPE_S1_TRANS_S2_BYPASS	(1 << CBAR_TYPE_SHIFT)
+#define CBAR_TYPE_S1_TRANS_S2_FAULT	(2 << CBAR_TYPE_SHIFT)
+#define CBAR_TYPE_S1_TRANS_S2_TRANS	(3 << CBAR_TYPE_SHIFT)
+#define CBAR_IRPTNDX_SHIFT		24
+#define CBAR_IRPTNDX_MASK		0xff
+
+#define ARM_SMMU_GR1_CBA2R(n)		(0x800 + ((n) << 2))
+#define CBA2R_RW64_32BIT		(0 << 0)
+#define CBA2R_RW64_64BIT		(1 << 0)
+#define CBA2R_VMID_SHIFT		16
+#define CBA2R_VMID_MASK			0xffff
+
+#define ARM_SMMU_CB_SCTLR		0x0
+#define ARM_SMMU_CB_ACTLR		0x4
+#define ARM_SMMU_CB_RESUME		0x8
+#define ARM_SMMU_CB_TTBCR2		0x10
+#define ARM_SMMU_CB_TTBR0		0x20
+#define ARM_SMMU_CB_TTBR1		0x28
+#define ARM_SMMU_CB_TTBCR		0x30
+#define ARM_SMMU_CB_CONTEXTIDR		0x34
+#define ARM_SMMU_CB_S1_MAIR0		0x38
+#define ARM_SMMU_CB_S1_MAIR1		0x3c
+#define ARM_SMMU_CB_PAR			0x50
+#define ARM_SMMU_CB_FSR			0x58
+#define ARM_SMMU_CB_FAR			0x60
+#define ARM_SMMU_CB_FSYNR0		0x68
+#define ARM_SMMU_CB_S1_TLBIVA		0x600
+#define ARM_SMMU_CB_S1_TLBIASID		0x610
+#define ARM_SMMU_CB_S1_TLBIVAL		0x620
+#define ARM_SMMU_CB_S2_TLBIIPAS2	0x630
+#define ARM_SMMU_CB_S2_TLBIIPAS2L	0x638
+#define ARM_SMMU_CB_TLBSYNC		0x7f0
+#define ARM_SMMU_CB_TLBSTATUS		0x7f4
+#define ARM_SMMU_CB_ATS1PR		0x800
+#define ARM_SMMU_CB_ATSR		0x8f0
+
+#define SCTLR_S1_ASIDPNE		(1 << 12)
+#define SCTLR_CFCFG			(1 << 7)
+#define SCTLR_CFIE			(1 << 6)
+#define SCTLR_CFRE			(1 << 5)
+#define SCTLR_E				(1 << 4)
+#define SCTLR_AFE			(1 << 2)
+#define SCTLR_TRE			(1 << 1)
+#define SCTLR_M				(1 << 0)
+
+#define CB_PAR_F			(1 << 0)
+
+#define ATSR_ACTIVE			(1 << 0)
+
+#define RESUME_RETRY			(0 << 0)
+#define RESUME_TERMINATE		(1 << 0)
+
+#define TTBCR2_SEP_SHIFT		15
+#define TTBCR2_SEP_UPSTREAM		(0x7 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_AS			(1 << 4)
+
+#define TTBRn_ASID_SHIFT		48
+
+#define FSR_MULTI			(1 << 31)
+#define FSR_SS				(1 << 30)
+#define FSR_UUT				(1 << 8)
+#define FSR_ASF				(1 << 7)
+#define FSR_TLBLKF			(1 << 6)
+#define FSR_TLBMCF			(1 << 5)
+#define FSR_EF				(1 << 4)
+#define FSR_PF				(1 << 3)
+#define FSR_AFF				(1 << 2)
+#define FSR_TF				(1 << 1)
+
+#define FSR_IGN				(FSR_AFF | FSR_ASF | \
+					 FSR_TLBMCF | FSR_TLBLKF)
+#define FSR_FAULT			(FSR_MULTI | FSR_SS | FSR_UUT | \
+					 FSR_EF | FSR_PF | FSR_TF | FSR_IGN)
+
+#define FSYNR0_WNR			(1 << 4)
+
+#endif /* _ARM_SMMU_REGS_H */
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index bc89b4d6c043..e5f008596998 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -54,6 +54,15 @@
 #include <linux/amba/bus.h>
 
 #include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define ARM_MMU500_ACTLR_CPRE		(1 << 1)
+
+#define ARM_MMU500_ACR_CACHE_LOCK	(1 << 26)
+#define ARM_MMU500_ACR_SMTNMB_TLBEN	(1 << 8)
+
+#define TLB_LOOP_TIMEOUT		1000000	/* 1s! */
+#define TLB_SPIN_COUNT			10
 
 /* Maximum number of context banks per SMMU */
 #define ARM_SMMU_MAX_CBS		128
@@ -83,211 +92,9 @@
 #define smmu_write_atomic_lq		writel_relaxed
 #endif
 
-/* Configuration registers */
-#define ARM_SMMU_GR0_sCR0		0x0
-#define sCR0_CLIENTPD			(1 << 0)
-#define sCR0_GFRE			(1 << 1)
-#define sCR0_GFIE			(1 << 2)
-#define sCR0_EXIDENABLE			(1 << 3)
-#define sCR0_GCFGFRE			(1 << 4)
-#define sCR0_GCFGFIE			(1 << 5)
-#define sCR0_USFCFG			(1 << 10)
-#define sCR0_VMIDPNE			(1 << 11)
-#define sCR0_PTM			(1 << 12)
-#define sCR0_FB				(1 << 13)
-#define sCR0_VMID16EN			(1 << 31)
-#define sCR0_BSU_SHIFT			14
-#define sCR0_BSU_MASK			0x3
-
-/* Auxiliary Configuration register */
-#define ARM_SMMU_GR0_sACR		0x10
-
-/* Identification registers */
-#define ARM_SMMU_GR0_ID0		0x20
-#define ARM_SMMU_GR0_ID1		0x24
-#define ARM_SMMU_GR0_ID2		0x28
-#define ARM_SMMU_GR0_ID3		0x2c
-#define ARM_SMMU_GR0_ID4		0x30
-#define ARM_SMMU_GR0_ID5		0x34
-#define ARM_SMMU_GR0_ID6		0x38
-#define ARM_SMMU_GR0_ID7		0x3c
-#define ARM_SMMU_GR0_sGFSR		0x48
-#define ARM_SMMU_GR0_sGFSYNR0		0x50
-#define ARM_SMMU_GR0_sGFSYNR1		0x54
-#define ARM_SMMU_GR0_sGFSYNR2		0x58
-
-#define ID0_S1TS			(1 << 30)
-#define ID0_S2TS			(1 << 29)
-#define ID0_NTS				(1 << 28)
-#define ID0_SMS				(1 << 27)
-#define ID0_ATOSNS			(1 << 26)
-#define ID0_PTFS_NO_AARCH32		(1 << 25)
-#define ID0_PTFS_NO_AARCH32S		(1 << 24)
-#define ID0_CTTW			(1 << 14)
-#define ID0_NUMIRPT_SHIFT		16
-#define ID0_NUMIRPT_MASK		0xff
-#define ID0_NUMSIDB_SHIFT		9
-#define ID0_NUMSIDB_MASK		0xf
-#define ID0_EXIDS			(1 << 8)
-#define ID0_NUMSMRG_SHIFT		0
-#define ID0_NUMSMRG_MASK		0xff
-
-#define ID1_PAGESIZE			(1 << 31)
-#define ID1_NUMPAGENDXB_SHIFT		28
-#define ID1_NUMPAGENDXB_MASK		7
-#define ID1_NUMS2CB_SHIFT		16
-#define ID1_NUMS2CB_MASK		0xff
-#define ID1_NUMCB_SHIFT			0
-#define ID1_NUMCB_MASK			0xff
-
-#define ID2_OAS_SHIFT			4
-#define ID2_OAS_MASK			0xf
-#define ID2_IAS_SHIFT			0
-#define ID2_IAS_MASK			0xf
-#define ID2_UBS_SHIFT			8
-#define ID2_UBS_MASK			0xf
-#define ID2_PTFS_4K			(1 << 12)
-#define ID2_PTFS_16K			(1 << 13)
-#define ID2_PTFS_64K			(1 << 14)
-#define ID2_VMID16			(1 << 15)
-
-#define ID7_MAJOR_SHIFT			4
-#define ID7_MAJOR_MASK			0xf
-
-/* Global TLB invalidation */
-#define ARM_SMMU_GR0_TLBIVMID		0x64
-#define ARM_SMMU_GR0_TLBIALLNSNH	0x68
-#define ARM_SMMU_GR0_TLBIALLH		0x6c
-#define ARM_SMMU_GR0_sTLBGSYNC		0x70
-#define ARM_SMMU_GR0_sTLBGSTATUS	0x74
-#define sTLBGSTATUS_GSACTIVE		(1 << 0)
-#define TLB_LOOP_TIMEOUT		1000000	/* 1s! */
-#define TLB_SPIN_COUNT			10
-
-/* Stream mapping registers */
-#define ARM_SMMU_GR0_SMR(n)		(0x800 + ((n) << 2))
-#define SMR_VALID			(1 << 31)
-#define SMR_MASK_SHIFT			16
-#define SMR_ID_SHIFT			0
-
-#define ARM_SMMU_GR0_S2CR(n)		(0xc00 + ((n) << 2))
-#define S2CR_CBNDX_SHIFT		0
-#define S2CR_CBNDX_MASK			0xff
-#define S2CR_EXIDVALID			(1 << 10)
-#define S2CR_TYPE_SHIFT			16
-#define S2CR_TYPE_MASK			0x3
-enum arm_smmu_s2cr_type {
-	S2CR_TYPE_TRANS,
-	S2CR_TYPE_BYPASS,
-	S2CR_TYPE_FAULT,
-};
-
-#define S2CR_PRIVCFG_SHIFT		24
-#define S2CR_PRIVCFG_MASK		0x3
-enum arm_smmu_s2cr_privcfg {
-	S2CR_PRIVCFG_DEFAULT,
-	S2CR_PRIVCFG_DIPAN,
-	S2CR_PRIVCFG_UNPRIV,
-	S2CR_PRIVCFG_PRIV,
-};
-
-/* Context bank attribute registers */
-#define ARM_SMMU_GR1_CBAR(n)		(0x0 + ((n) << 2))
-#define CBAR_VMID_SHIFT			0
-#define CBAR_VMID_MASK			0xff
-#define CBAR_S1_BPSHCFG_SHIFT		8
-#define CBAR_S1_BPSHCFG_MASK		3
-#define CBAR_S1_BPSHCFG_NSH		3
-#define CBAR_S1_MEMATTR_SHIFT		12
-#define CBAR_S1_MEMATTR_MASK		0xf
-#define CBAR_S1_MEMATTR_WB		0xf
-#define CBAR_TYPE_SHIFT			16
-#define CBAR_TYPE_MASK			0x3
-#define CBAR_TYPE_S2_TRANS		(0 << CBAR_TYPE_SHIFT)
-#define CBAR_TYPE_S1_TRANS_S2_BYPASS	(1 << CBAR_TYPE_SHIFT)
-#define CBAR_TYPE_S1_TRANS_S2_FAULT	(2 << CBAR_TYPE_SHIFT)
-#define CBAR_TYPE_S1_TRANS_S2_TRANS	(3 << CBAR_TYPE_SHIFT)
-#define CBAR_IRPTNDX_SHIFT		24
-#define CBAR_IRPTNDX_MASK		0xff
-
-#define ARM_SMMU_GR1_CBA2R(n)		(0x800 + ((n) << 2))
-#define CBA2R_RW64_32BIT		(0 << 0)
-#define CBA2R_RW64_64BIT		(1 << 0)
-#define CBA2R_VMID_SHIFT		16
-#define CBA2R_VMID_MASK			0xffff
-
 /* Translation context bank */
 #define ARM_SMMU_CB(smmu, n)	((smmu)->cb_base + ((n) << (smmu)->pgshift))
 
-#define ARM_SMMU_CB_SCTLR		0x0
-#define ARM_SMMU_CB_ACTLR		0x4
-#define ARM_SMMU_CB_RESUME		0x8
-#define ARM_SMMU_CB_TTBCR2		0x10
-#define ARM_SMMU_CB_TTBR0		0x20
-#define ARM_SMMU_CB_TTBR1		0x28
-#define ARM_SMMU_CB_TTBCR		0x30
-#define ARM_SMMU_CB_CONTEXTIDR		0x34
-#define ARM_SMMU_CB_S1_MAIR0		0x38
-#define ARM_SMMU_CB_S1_MAIR1		0x3c
-#define ARM_SMMU_CB_PAR			0x50
-#define ARM_SMMU_CB_FSR			0x58
-#define ARM_SMMU_CB_FAR			0x60
-#define ARM_SMMU_CB_FSYNR0		0x68
-#define ARM_SMMU_CB_S1_TLBIVA		0x600
-#define ARM_SMMU_CB_S1_TLBIASID		0x610
-#define ARM_SMMU_CB_S1_TLBIVAL		0x620
-#define ARM_SMMU_CB_S2_TLBIIPAS2	0x630
-#define ARM_SMMU_CB_S2_TLBIIPAS2L	0x638
-#define ARM_SMMU_CB_TLBSYNC		0x7f0
-#define ARM_SMMU_CB_TLBSTATUS		0x7f4
-#define ARM_SMMU_CB_ATS1PR		0x800
-#define ARM_SMMU_CB_ATSR		0x8f0
-
-#define SCTLR_S1_ASIDPNE		(1 << 12)
-#define SCTLR_CFCFG			(1 << 7)
-#define SCTLR_CFIE			(1 << 6)
-#define SCTLR_CFRE			(1 << 5)
-#define SCTLR_E				(1 << 4)
-#define SCTLR_AFE			(1 << 2)
-#define SCTLR_TRE			(1 << 1)
-#define SCTLR_M				(1 << 0)
-
-#define ARM_MMU500_ACTLR_CPRE		(1 << 1)
-
-#define ARM_MMU500_ACR_CACHE_LOCK	(1 << 26)
-#define ARM_MMU500_ACR_SMTNMB_TLBEN	(1 << 8)
-
-#define CB_PAR_F			(1 << 0)
-
-#define ATSR_ACTIVE			(1 << 0)
-
-#define RESUME_RETRY			(0 << 0)
-#define RESUME_TERMINATE		(1 << 0)
-
-#define TTBCR2_SEP_SHIFT		15
-#define TTBCR2_SEP_UPSTREAM		(0x7 << TTBCR2_SEP_SHIFT)
-#define TTBCR2_AS			(1 << 4)
-
-#define TTBRn_ASID_SHIFT		48
-
-#define FSR_MULTI			(1 << 31)
-#define FSR_SS				(1 << 30)
-#define FSR_UUT				(1 << 8)
-#define FSR_ASF				(1 << 7)
-#define FSR_TLBLKF			(1 << 6)
-#define FSR_TLBMCF			(1 << 5)
-#define FSR_EF				(1 << 4)
-#define FSR_PF				(1 << 3)
-#define FSR_AFF				(1 << 2)
-#define FSR_TF				(1 << 1)
-
-#define FSR_IGN				(FSR_AFF | FSR_ASF | \
-					 FSR_TLBMCF | FSR_TLBLKF)
-#define FSR_FAULT			(FSR_MULTI | FSR_SS | FSR_UUT | \
-					 FSR_EF | FSR_PF | FSR_TF | FSR_IGN)
-
-#define FSYNR0_WNR			(1 << 4)
-
 #define MSI_IOVA_BASE			0x8000000
 #define MSI_IOVA_LENGTH			0x100000
 
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 24+ messages in thread
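
As a consumer-side illustration of the split-out defines: a context
fault handler in either driver decodes the same FSR/FAR/FSYNR0 layout,
which is the point of moving it into a shared header.  A hypothetical
sketch (example_ctx_fault is illustrative; qcom_iommu's real handler
is in patch 3):

#include <linux/io.h>
#include <linux/io-64-nonatomic-hi-lo.h>
#include <linux/irqreturn.h>
#include <linux/printk.h>

#include "arm-smmu-regs.h"

/*
 * Hypothetical context-fault handler body: read the fault status,
 * syndrome and faulting address via the shared offsets, report, then
 * write FSR back to itself to clear the fault.
 */
static irqreturn_t example_ctx_fault(void __iomem *cb_base)
{
	u32 fsr = readl_relaxed(cb_base + ARM_SMMU_CB_FSR);
	u32 fsynr;
	u64 iova;

	if (!(fsr & FSR_FAULT))
		return IRQ_NONE;

	fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
	iova  = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);

	pr_err("unhandled context fault: fsr=0x%x, iova=0x%08llx, %s\n",
	       fsr, (unsigned long long)iova,
	       (fsynr & FSYNR0_WNR) ? "write" : "read");

	writel_relaxed(fsr, cb_base + ARM_SMMU_CB_FSR);	/* clear fault */

	return IRQ_HANDLED;
}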

* [PATCH 3/4] iommu: add qcom_iommu
  2017-08-03 10:47 [PATCH 0/4] iommu: add qcom_iommu for early "B" family devices Rob Clark
       [not found] ` <20170803104800.18624-1-robdclark@gmail.com>
@ 2017-08-03 10:47 ` Rob Clark
  1 sibling, 0 replies; 24+ messages in thread
From: Rob Clark @ 2017-08-03 10:47 UTC (permalink / raw)
  To: iommu, linux-arm-msm
  Cc: Archit Taneja, Rob Herring, Will Deacon, Sricharan, Mark Rutland,
	Robin Murphy, Rob Clark, Joerg Roedel, linux-kernel

An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that is compatible with how the arm-smmu
driver is designed.  It seems the secure world leaves SMMU_SCR1.GASRAE=1,
so the global register space is not accessible to the non-secure OS.
This means the driver needs to get its configuration from devicetree
instead of setting it up dynamically.

In the end, aside from the register definitions, there is not much code
to share with arm-smmu (beyond what has already been refactored out into
the pgtable helpers).

Signed-off-by: Rob Clark <robdclark@gmail.com>
Tested-by: Riku Voipio <riku.voipio@linaro.org>
Tested-by: Archit Taneja <architt@codeaurora.org>
---
 drivers/iommu/Kconfig      |  10 +
 drivers/iommu/Makefile     |   1 +
 drivers/iommu/qcom_iommu.c | 868 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 879 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index f73ff28f77e2..92f5fd2e0e4b 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -367,4 +367,14 @@ config MTK_IOMMU_V1
 
 	  if unsure, say N here.
 
+config QCOM_IOMMU
+	# Note: iommu drivers cannot (yet?) be built as modules
+	bool "Qualcomm IOMMU Support"
+	depends on ARCH_QCOM || COMPILE_TEST
+	select IOMMU_API
+	select IOMMU_IO_PGTABLE_LPAE
+	select ARM_DMA_USE_IOMMU
+	help
+	  Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b997d8e..b910aea813a1 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 000000000000..860cad1cb167
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,868 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include <linux/atomic.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/dma-iommu.h>
+#include <linux/dma-mapping.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/io-64-nonatomic-hi-lo.h>
+#include <linux/iommu.h>
+#include <linux/iopoll.h>
+#include <linux/kconfig.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/of_iommu.h>
+#include <linux/platform_device.h>
+#include <linux/pm.h>
+#include <linux/pm_runtime.h>
+#include <linux/qcom_scm.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS     0x2000
+
+struct qcom_iommu_ctx;
+
+struct qcom_iommu_dev {
+	/* IOMMU core code handle */
+	struct iommu_device	 iommu;
+	struct device		*dev;
+	struct clk		*iface_clk;
+	struct clk		*bus_clk;
+	void __iomem		*local_base;
+	u32			 sec_id;
+	u8			 num_ctxs;
+	struct qcom_iommu_ctx	*ctxs[0];   /* indexed by asid-1 */
+};
+
+struct qcom_iommu_ctx {
+	struct device		*dev;
+	void __iomem		*base;
+	bool			 secure_init;
+	u8			 asid;      /* asid and ctx bank # are 1:1 */
+};
+
+struct qcom_iommu_domain {
+	struct io_pgtable_ops	*pgtbl_ops;
+	spinlock_t		 pgtbl_lock;
+	struct mutex		 init_mutex; /* Protects iommu pointer */
+	struct iommu_domain	 domain;
+	struct qcom_iommu_dev	*iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev *to_iommu(struct iommu_fwspec *fwspec)
+{
+	if (!fwspec || fwspec->ops != &qcom_iommu_ops)
+		return NULL;
+	return fwspec->iommu_priv;
+}
+
+static struct qcom_iommu_ctx *to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	if (!qcom_iommu)
+		return NULL;
+	return qcom_iommu->ctxs[asid - 1];
+}
+
+static inline void
+iommu_writel(struct qcom_iommu_ctx *ctx, unsigned reg, u32 val)
+{
+	writel_relaxed(val, ctx->base + reg);
+}
+
+static inline void
+iommu_writeq(struct qcom_iommu_ctx *ctx, unsigned reg, u64 val)
+{
+	writeq_relaxed(val, ctx->base + reg);
+}
+
+static inline u32
+iommu_readl(struct qcom_iommu_ctx *ctx, unsigned reg)
+{
+	return readl_relaxed(ctx->base + reg);
+}
+
+static inline u64
+iommu_readq(struct qcom_iommu_ctx *ctx, unsigned reg)
+{
+	return readq_relaxed(ctx->base + reg);
+}
+
+static void qcom_iommu_tlb_sync(void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		unsigned int val, ret;
+
+		iommu_writel(ctx, ARM_SMMU_CB_TLBSYNC, 0);
+
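+		/* wait for the sync to complete: poll until TLBSTATUS bit 0 clears */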
+		ret = readl_poll_timeout(ctx->base + ARM_SMMU_CB_TLBSTATUS, val,
+					 (val & 0x1) == 0, 0, 5000000);
+		if (ret)
+			dev_err(ctx->dev, "timeout waiting for TLB SYNC\n");
+	}
+}
+
+static void qcom_iommu_tlb_inv_context(void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		iommu_writel(ctx, ARM_SMMU_CB_S1_TLBIASID, ctx->asid);
+	}
+
+	qcom_iommu_tlb_sync(cookie);
+}
+
+static void qcom_iommu_tlb_inv_range_nosync(unsigned long iova, size_t size,
+					    size_t granule, bool leaf, void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i, reg;
+
+	reg = leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		size_t s = size;
+
+		iova &= ~0xfffUL;	/* clear the low 12 bits, where the asid goes */
+		iova |= ctx->asid;
+		do {
+			iommu_writel(ctx, reg, iova);
+			iova += granule;
+		} while (s -= granule);
+	}
+}
+
+static const struct iommu_gather_ops qcom_gather_ops = {
+	.tlb_flush_all	= qcom_iommu_tlb_inv_context,
+	.tlb_add_flush	= qcom_iommu_tlb_inv_range_nosync,
+	.tlb_sync	= qcom_iommu_tlb_sync,
+};
+
+static irqreturn_t qcom_iommu_fault(int irq, void *dev)
+{
+	struct qcom_iommu_ctx *ctx = dev;
+	u32 fsr, fsynr;
+	u64 iova;
+
+	fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
+
+	if (!(fsr & FSR_FAULT))
+		return IRQ_NONE;
+
+	fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
+	iova = iommu_readq(ctx, ARM_SMMU_CB_FAR);
+
+	dev_err_ratelimited(ctx->dev,
+			    "Unhandled context fault: fsr=0x%x, "
+			    "iova=0x%016llx, fsynr=0x%x, cb=%d\n",
+			    fsr, iova, fsynr, ctx->asid);
+
+	iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
+
+	return IRQ_HANDLED;
+}
+
+static int qcom_iommu_init_domain(struct iommu_domain *domain,
+				  struct qcom_iommu_dev *qcom_iommu,
+				  struct iommu_fwspec *fwspec)
+{
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *pgtbl_ops;
+	struct io_pgtable_cfg pgtbl_cfg;
+	int i, ret = 0;
+	u32 reg;
+
+	mutex_lock(&qcom_domain->init_mutex);
+	if (qcom_domain->iommu)
+		goto out_unlock;
+
+	pgtbl_cfg = (struct io_pgtable_cfg) {
+		.pgsize_bitmap	= qcom_iommu_ops.pgsize_bitmap,
+		.ias		= 32,
+		.oas		= 40,
+		.tlb		= &qcom_gather_ops,
+		.iommu_dev	= qcom_iommu->dev,
+	};
+
+	qcom_domain->iommu = qcom_iommu;
+	pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, fwspec);
+	if (!pgtbl_ops) {
+		dev_err(qcom_iommu->dev, "failed to allocate pagetable ops\n");
+		ret = -ENOMEM;
+		goto out_clear_iommu;
+	}
+
+	/* Update the domain's page sizes to reflect the page table format */
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	domain->geometry.aperture_end = (1ULL << pgtbl_cfg.ias) - 1;
+	domain->geometry.force_aperture = true;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		if (!ctx->secure_init) {
+			ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, ctx->asid);
+			if (ret) {
+				dev_err(qcom_iommu->dev, "secure init failed: %d\n", ret);
+				goto out_clear_iommu;
+			}
+			ctx->secure_init = true;
+		}
+
+		/* TTBRs */
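+		/* (the context's asid is carried in each TTBR's upper bits) */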
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
+				((u64)ctx->asid << TTBRn_ASID_SHIFT));
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
+				((u64)ctx->asid << TTBRn_ASID_SHIFT));
+
+		/* TTBCR */
+		iommu_writel(ctx, ARM_SMMU_CB_TTBCR2,
+				(pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
+				TTBCR2_SEP_UPSTREAM);
+		iommu_writel(ctx, ARM_SMMU_CB_TTBCR,
+				pgtbl_cfg.arm_lpae_s1_cfg.tcr);
+
+		/* MAIRs (stage-1 only) */
+		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
+				pgtbl_cfg.arm_lpae_s1_cfg.mair[0]);
+		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR1,
+				pgtbl_cfg.arm_lpae_s1_cfg.mair[1]);
+
+		/* SCTLR */
+		reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
+			SCTLR_M | SCTLR_S1_ASIDPNE;
+
+		if (IS_ENABLED(CONFIG_BIG_ENDIAN))
+			reg |= SCTLR_E;
+
+		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
+	}
+
+	mutex_unlock(&qcom_domain->init_mutex);
+
+	/* Publish page table ops for map/unmap */
+	qcom_domain->pgtbl_ops = pgtbl_ops;
+
+	return 0;
+
+out_clear_iommu:
+	qcom_domain->iommu = NULL;
+out_unlock:
+	mutex_unlock(&qcom_domain->init_mutex);
+	return ret;
+}
+
+static struct iommu_domain *qcom_iommu_domain_alloc(unsigned type)
+{
+	struct qcom_iommu_domain *qcom_domain;
+
+	if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
+		return NULL;
+	/*
+	 * Allocate the domain and initialise some of its data structures.
+	 * We can't really do anything meaningful until we've added a
+	 * master.
+	 */
+	qcom_domain = kzalloc(sizeof(*qcom_domain), GFP_KERNEL);
+	if (!qcom_domain)
+		return NULL;
+
+	if (type == IOMMU_DOMAIN_DMA &&
+	    iommu_get_dma_cookie(&qcom_domain->domain)) {
+		kfree(qcom_domain);
+		return NULL;
+	}
+
+	mutex_init(&qcom_domain->init_mutex);
+	spin_lock_init(&qcom_domain->pgtbl_lock);
+
+	return &qcom_domain->domain;
+}
+
+static void qcom_iommu_domain_free(struct iommu_domain *domain)
+{
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+
+	if (WARN_ON(qcom_domain->iommu))    /* forgot to detach? */
+		return;
+
+	iommu_put_dma_cookie(domain);
+
+	/* NOTE: unmap can be called after client device is powered off,
+	 * for example, with GPUs or anything involving dma-buf.  So we
+	 * cannot rely on the device_link.  Make sure the IOMMU is on to
+	 * avoid unclocked accesses in the TLB inv path:
+	 */
+	pm_runtime_get_sync(qcom_domain->iommu->dev);
+
+	free_io_pgtable_ops(qcom_domain->pgtbl_ops);
+
+	pm_runtime_put_sync(qcom_domain->iommu->dev);
+
+	kfree(qcom_domain);
+}
+
+static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	int ret;
+
+	if (!qcom_iommu) {
+		dev_err(dev, "cannot attach to IOMMU, is it on the same bus?\n");
+		return -ENXIO;
+	}
+
+	/* Ensure that the domain is finalized */
+	pm_runtime_get_sync(qcom_iommu->dev);
+	ret = qcom_iommu_init_domain(domain, qcom_iommu, dev->iommu_fwspec);
+	pm_runtime_put_sync(qcom_iommu->dev);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Sanity check the domain. We don't support domains across
+	 * different IOMMUs.
+	 */
+	if (qcom_domain->iommu != qcom_iommu) {
+		dev_err(dev, "cannot attach to IOMMU %s while already "
+			"attached to domain on IOMMU %s\n",
+			dev_name(qcom_domain->iommu->dev),
+			dev_name(qcom_iommu->dev));
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	unsigned i;
+
+	if (!qcom_domain->iommu)
+		return;
+
+	pm_runtime_get_sync(qcom_iommu->dev);
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		/* Disable the context bank: */
+		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, 0);
+	}
+	pm_runtime_put_sync(qcom_iommu->dev);
+
+	qcom_domain->iommu = NULL;
+}
+
+static int qcom_iommu_map(struct iommu_domain *domain, unsigned long iova,
+			  phys_addr_t paddr, size_t size, int prot)
+{
+	int ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return -ENODEV;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->map(ops, iova, paddr, size, prot);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+	return ret;
+}
+
+static size_t qcom_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
+			       size_t size)
+{
+	size_t ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	/* NOTE: unmap can be called after client device is powered off,
+	 * for example, with GPUs or anything involving dma-buf.  So we
+	 * cannot rely on the device_link.  Make sure the IOMMU is on to
+	 * avoid unclocked accesses in the TLB inv path:
+	 */
+	pm_runtime_get_sync(qcom_domain->iommu->dev);
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->unmap(ops, iova, size);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+	pm_runtime_put_sync(qcom_domain->iommu->dev);
+
+	return ret;
+}
+
+static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
+					   dma_addr_t iova)
+{
+	phys_addr_t ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->iova_to_phys(ops, iova);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+
+	return ret;
+}
+
+static bool qcom_iommu_capable(enum iommu_cap cap)
+{
+	switch (cap) {
+	case IOMMU_CAP_CACHE_COHERENCY:
+		/*
+		 * Return true here as the SMMU can always send out coherent
+		 * requests.
+		 */
+		return true;
+	case IOMMU_CAP_NOEXEC:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static int qcom_iommu_add_device(struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+	struct iommu_group *group;
+	struct device_link *link;
+
+	if (!qcom_iommu)
+		return -ENODEV;
+
+	/*
+	 * Establish the link between iommu and master, so that the
+	 * iommu gets runtime enabled/disabled as per the master's
+	 * needs.
+	 */
+	link = device_link_add(dev, qcom_iommu->dev, DL_FLAG_PM_RUNTIME);
+	if (!link) {
+		dev_err(qcom_iommu->dev, "Unable to create device link between %s and %s\n",
+			dev_name(qcom_iommu->dev), dev_name(dev));
+		return -ENODEV;
+	}
+
+	group = iommu_group_get_for_dev(dev);
+	if (IS_ERR_OR_NULL(group))
+		return PTR_ERR_OR_ZERO(group);
+
+	iommu_group_put(group);
+	iommu_device_link(&qcom_iommu->iommu, dev);
+
+	return 0;
+}
+
+static void qcom_iommu_remove_device(struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+
+	if (!qcom_iommu)
+		return;
+
+	iommu_device_unlink(&qcom_iommu->iommu, dev);
+	iommu_group_remove_device(dev);
+	iommu_fwspec_free(dev);
+}
+
+static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
+{
+	struct qcom_iommu_dev *qcom_iommu;
+	struct platform_device *iommu_pdev;
+	unsigned asid = args->args[0];
+
+	if (args->args_count != 1) {
+		dev_err(dev, "incorrect number of iommu params found for %s "
+			"(found %d, expected 1)\n",
+			args->np->full_name, args->args_count);
+		return -EINVAL;
+	}
+
+	iommu_pdev = of_find_device_by_node(args->np);
+	if (WARN_ON(!iommu_pdev))
+		return -EINVAL;
+
+	qcom_iommu = platform_get_drvdata(iommu_pdev);
+
+	/* make sure the asid specified in dt is valid, so we don't have
+	 * to sanity check this elsewhere, since 'asid - 1' is used to
+	 * index into qcom_iommu->ctxs:
+	 */
+	if (WARN_ON(asid < 1) ||
+	    WARN_ON(asid > qcom_iommu->num_ctxs))
+		return -EINVAL;
+
+	if (!dev->iommu_fwspec->iommu_priv) {
+		dev->iommu_fwspec->iommu_priv = qcom_iommu;
+	} else {
+		/* make sure the device's iommus dt node isn't referring to
+		 * multiple different iommu devices.  Multiple context
+		 * banks are ok, but multiple devices are not:
+		 */
+		if (WARN_ON(qcom_iommu != dev->iommu_fwspec->iommu_priv))
+			return -EINVAL;
+	}
+
+	return iommu_fwspec_add_ids(dev, &asid, 1);
+}
+
+static const struct iommu_ops qcom_iommu_ops = {
+	.capable	= qcom_iommu_capable,
+	.domain_alloc	= qcom_iommu_domain_alloc,
+	.domain_free	= qcom_iommu_domain_free,
+	.attach_dev	= qcom_iommu_attach_dev,
+	.detach_dev	= qcom_iommu_detach_dev,
+	.map		= qcom_iommu_map,
+	.unmap		= qcom_iommu_unmap,
+	.map_sg		= default_iommu_map_sg,
+	.iova_to_phys	= qcom_iommu_iova_to_phys,
+	.add_device	= qcom_iommu_add_device,
+	.remove_device	= qcom_iommu_remove_device,
+	.device_group	= generic_device_group,
+	.of_xlate	= qcom_iommu_of_xlate,
+	.pgsize_bitmap	= SZ_4K | SZ_64K | SZ_1M | SZ_16M,
+};
+
+static int qcom_iommu_enable_clocks(struct qcom_iommu_dev *qcom_iommu)
+{
+	int ret;
+
+	ret = clk_prepare_enable(qcom_iommu->iface_clk);
+	if (ret) {
+		dev_err(qcom_iommu->dev, "Couldn't enable iface_clk\n");
+		return ret;
+	}
+
+	ret = clk_prepare_enable(qcom_iommu->bus_clk);
+	if (ret) {
+		dev_err(qcom_iommu->dev, "Couldn't enable bus_clk\n");
+		clk_disable_unprepare(qcom_iommu->iface_clk);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
+{
+	clk_disable_unprepare(qcom_iommu->bus_clk);
+	clk_disable_unprepare(qcom_iommu->iface_clk);
+}
+
+static int get_asid(const struct device_node *np)
+{
+	u32 reg;
+
+	/* read the "reg" property directly to get the relative address
+	 * of the context bank, and calculate the asid from that:
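+	 * (e.g. a context bank at offset 0x3000 from the iommu yields asid 3)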
+	 */
+	if (of_property_read_u32_index(np, "reg", 0, &reg))
+		return -ENODEV;
+
+	return reg / 0x1000;      /* context banks are 0x1000 apart */
+}
+
+static int qcom_iommu_ctx_probe(struct platform_device *pdev)
+{
+	struct qcom_iommu_ctx *ctx;
+	struct device *dev = &pdev->dev;
+	struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(dev->parent);
+	struct resource *res;
+	int ret, irq;
+
+	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	ctx->dev = dev;
+	platform_set_drvdata(pdev, ctx);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	ctx->base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(ctx->base))
+		return PTR_ERR(ctx->base);
+
+	irq = platform_get_irq(pdev, 0);
+	if (irq < 0) {
+		dev_err(dev, "failed to get irq\n");
+		return -ENODEV;
+	}
+
+	/* clear IRQs before registering fault handler, just in case the
+	 * boot-loader left us a surprise:
+	 */
+	iommu_writel(ctx, ARM_SMMU_CB_FSR, iommu_readl(ctx, ARM_SMMU_CB_FSR));
+
+	ret = devm_request_irq(dev, irq,
+			       qcom_iommu_fault,
+			       IRQF_SHARED,
+			       "qcom-iommu-fault",
+			       ctx);
+	if (ret) {
+		dev_err(dev, "failed to request IRQ %u\n", irq);
+		return ret;
+	}
+
+	ret = get_asid(dev->of_node);
+	if (ret < 0) {
+		dev_err(dev, "missing reg property\n");
+		return ret;
+	}
+
+	ctx->asid = ret;
+
+	dev_dbg(dev, "found asid %u\n", ctx->asid);
+
+	qcom_iommu->ctxs[ctx->asid - 1] = ctx;
+
+	return 0;
+}
+
+static int qcom_iommu_ctx_remove(struct platform_device *pdev)
+{
+	struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(pdev->dev.parent);
+	struct qcom_iommu_ctx *ctx = platform_get_drvdata(pdev);
+
+	platform_set_drvdata(pdev, NULL);
+
+	qcom_iommu->ctxs[ctx->asid - 1] = NULL;
+
+	return 0;
+}
+
+static const struct of_device_id ctx_of_match[] = {
+	{ .compatible = "qcom,msm-iommu-v1-ns" },
+	{ .compatible = "qcom,msm-iommu-v1-sec" },
+	{ /* sentinel */ }
+};
+
+static struct platform_driver qcom_iommu_ctx_driver = {
+	.driver	= {
+		.name		= "qcom-iommu-ctx",
+		.of_match_table	= of_match_ptr(ctx_of_match),
+	},
+	.probe	= qcom_iommu_ctx_probe,
+	.remove = qcom_iommu_ctx_remove,
+};
+
+static int qcom_iommu_device_probe(struct platform_device *pdev)
+{
+	struct device_node *child;
+	struct qcom_iommu_dev *qcom_iommu;
+	struct device *dev = &pdev->dev;
+	struct resource *res;
+	int ret, sz, max_asid = 0;
+
+	/* find the max asid (which is 1:1 to ctx bank idx), so we know how
+	 * many child ctx devices we have:
+	 */
+	for_each_child_of_node(dev->of_node, child)
+		max_asid = max(max_asid, get_asid(child));
+
+	sz = sizeof(*qcom_iommu) + (max_asid * sizeof(qcom_iommu->ctxs[0]));
+
+	qcom_iommu = devm_kzalloc(dev, sz, GFP_KERNEL);
+	if (!qcom_iommu)
+		return -ENOMEM;
+	qcom_iommu->num_ctxs = max_asid;
+	qcom_iommu->dev = dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (res)
+		qcom_iommu->local_base = devm_ioremap_resource(dev, res);
+
+	qcom_iommu->iface_clk = devm_clk_get(dev, "iface");
+	if (IS_ERR(qcom_iommu->iface_clk)) {
+		dev_err(dev, "failed to get iface clock\n");
+		return PTR_ERR(qcom_iommu->iface_clk);
+	}
+
+	qcom_iommu->bus_clk = devm_clk_get(dev, "bus");
+	if (IS_ERR(qcom_iommu->bus_clk)) {
+		dev_err(dev, "failed to get bus clock\n");
+		return PTR_ERR(qcom_iommu->bus_clk);
+	}
+
+	if (of_property_read_u32(dev->of_node, "qcom,iommu-secure-id",
+				 &qcom_iommu->sec_id)) {
+		dev_err(dev, "missing qcom,iommu-secure-id property\n");
+		return -ENODEV;
+	}
+
+	platform_set_drvdata(pdev, qcom_iommu);
+
+	pm_runtime_enable(dev);
+
+	/* register context bank devices, which are child nodes: */
+	ret = devm_of_platform_populate(dev);
+	if (ret) {
+		dev_err(dev, "Failed to populate iommu contexts\n");
+		return ret;
+	}
+
+	ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,
+				     dev_name(dev));
+	if (ret) {
+		dev_err(dev, "Failed to register iommu in sysfs\n");
+		return ret;
+	}
+
+	iommu_device_set_ops(&qcom_iommu->iommu, &qcom_iommu_ops);
+	iommu_device_set_fwnode(&qcom_iommu->iommu, dev->fwnode);
+
+	ret = iommu_device_register(&qcom_iommu->iommu);
+	if (ret) {
+		dev_err(dev, "Failed to register iommu\n");
+		return ret;
+	}
+
+	bus_set_iommu(&platform_bus_type, &qcom_iommu_ops);
+
+	if (qcom_iommu->local_base) {
+		pm_runtime_get_sync(dev);
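+		/* SMMU_INTR_SEL_NS: route all context bank irqs to non-secure */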
+		writel_relaxed(0xffffffff, qcom_iommu->local_base + SMMU_INTR_SEL_NS);
+		pm_runtime_put_sync(dev);
+	}
+
+	return 0;
+}
+
+static int qcom_iommu_device_remove(struct platform_device *pdev)
+{
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	bus_set_iommu(&platform_bus_type, NULL);
+
+	pm_runtime_force_suspend(&pdev->dev);
+	platform_set_drvdata(pdev, NULL);
+	iommu_device_sysfs_remove(&qcom_iommu->iommu);
+	iommu_device_unregister(&qcom_iommu->iommu);
+
+	return 0;
+}
+
+#ifdef CONFIG_PM
+static int qcom_iommu_resume(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	return qcom_iommu_enable_clocks(qcom_iommu);
+}
+
+static int qcom_iommu_suspend(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	qcom_iommu_disable_clocks(qcom_iommu);
+
+	return 0;
+}
+#endif
+
+static const struct dev_pm_ops qcom_iommu_pm_ops = {
+	SET_RUNTIME_PM_OPS(qcom_iommu_suspend, qcom_iommu_resume, NULL)
+	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+				pm_runtime_force_resume)
+};
+
+static const struct of_device_id qcom_iommu_of_match[] = {
+	{ .compatible = "qcom,msm-iommu-v1" },
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, qcom_iommu_of_match);
+
+static struct platform_driver qcom_iommu_driver = {
+	.driver	= {
+		.name		= "qcom-iommu",
+		.of_match_table	= of_match_ptr(qcom_iommu_of_match),
+		.pm		= &qcom_iommu_pm_ops,
+	},
+	.probe	= qcom_iommu_device_probe,
+	.remove	= qcom_iommu_device_remove,
+};
+
+static int __init qcom_iommu_init(void)
+{
+	int ret;
+
+	ret = platform_driver_register(&qcom_iommu_ctx_driver);
+	if (ret)
+		return ret;
+
+	ret = platform_driver_register(&qcom_iommu_driver);
+	if (ret)
+		platform_driver_unregister(&qcom_iommu_ctx_driver);
+
+	return ret;
+}
+
+static void __exit qcom_iommu_exit(void)
+{
+	platform_driver_unregister(&qcom_iommu_driver);
+	platform_driver_unregister(&qcom_iommu_ctx_driver);
+}
+
+module_init(qcom_iommu_init);
+module_exit(qcom_iommu_exit);
+
+IOMMU_OF_DECLARE(qcom_iommu_dev, "qcom,msm-iommu-v1", NULL);
+
+MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
+MODULE_LICENSE("GPL v2");
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 4/4] iommu: qcom: initialize secure page table
@ 2017-08-03 10:47     ` Rob Clark
  0 siblings, 0 replies; 24+ messages in thread
From: Rob Clark @ 2017-08-03 10:47 UTC (permalink / raw)
  To: iommu, linux-arm-msm
  Cc: Archit Taneja, Rob Herring, Will Deacon, Sricharan, Mark Rutland,
	Robin Murphy, Stanimir Varbanov, Rob Clark, Joerg Roedel,
	linux-kernel

From: Stanimir Varbanov <stanimir.varbanov@linaro.org>

This gets the secure page table size, allocates memory for the secure
pagetables, and passes the physical address to TrustZone.

Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Tested-by: Archit Taneja <architt@codeaurora.org>
---
 drivers/iommu/qcom_iommu.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 860cad1cb167..48b62aa52787 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -604,6 +604,51 @@ static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
 	clk_disable_unprepare(qcom_iommu->iface_clk);
 }
 
+static int qcom_iommu_sec_ptbl_init(struct device *dev)
+{
+	size_t psize = 0;
+	unsigned int spare = 0;
+	void *cpu_addr;
+	dma_addr_t paddr;
+	unsigned long attrs;
+	static bool allocated = false;
+	int ret;
+
+	if (allocated)
+		return 0;
+
+	ret = qcom_scm_iommu_secure_ptbl_size(spare, &psize);
+	if (ret) {
+		dev_err(dev, "failed to get iommu secure pgtable size (%d)\n",
+			ret);
+		return ret;
+	}
+
+	dev_info(dev, "iommu sec: pgtable size: %zu\n", psize);
+
+	attrs = DMA_ATTR_NO_KERNEL_MAPPING;
+
+	cpu_addr = dma_alloc_attrs(dev, psize, &paddr, GFP_KERNEL, attrs);
+	if (!cpu_addr) {
+		dev_err(dev, "failed to allocate %zu bytes for pgtable\n",
+			psize);
+		return -ENOMEM;
+	}
+
+	ret = qcom_scm_iommu_secure_ptbl_init(paddr, psize, spare);
+	if (ret) {
+		dev_err(dev, "failed to init iommu pgtable (%d)\n", ret);
+		goto free_mem;
+	}
+
+	allocated = true;
+	return 0;
+
+free_mem:
+	dma_free_attrs(dev, psize, cpu_addr, paddr, attrs);
+	return ret;
+}
+
 static int get_asid(const struct device_node *np)
 {
 	u32 reg;
@@ -700,6 +745,17 @@ static struct platform_driver qcom_iommu_ctx_driver = {
 	.remove = qcom_iommu_ctx_remove,
 };
 
+static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
+{
+	struct device_node *child;
+
+	for_each_child_of_node(qcom_iommu->dev->of_node, child)
+		if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
+			return true;
+
+	return false;
+}
+
 static int qcom_iommu_device_probe(struct platform_device *pdev)
 {
 	struct device_node *child;
@@ -744,6 +800,14 @@ static int qcom_iommu_device_probe(struct platform_device *pdev)
 		return -ENODEV;
 	}
 
+	if (qcom_iommu_has_secure_context(qcom_iommu)) {
+		ret = qcom_iommu_sec_ptbl_init(dev);
+		if (ret) {
+			dev_err(dev, "cannot init secure pg table(%d)\n", ret);
+			return ret;
+		}
+	}
+
 	platform_set_drvdata(pdev, qcom_iommu);
 
 	pm_runtime_enable(dev);
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/4] iommu: add qcom_iommu
       [not found] ` <20170626124352.21726-1-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-06-26 12:43   ` Rob Clark
  0 siblings, 0 replies; 24+ messages in thread
From: Rob Clark @ 2017-06-26 12:43 UTC (permalink / raw)
  To: iommu
  Cc: Mark Rutland, Rob Herring, linux-arm-msm, Will Deacon

An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that is compatible with how the arm-smmu
driver is designed.  It seems SMMU_SCR1.GASRAE=1 so the global register
space is not accessible.  This means it needs to get configuration from
devicetree instead of setting it up dynamically.

In the end, other than register definitions, there is not much code to
share with arm-smmu (other than what has already been refactored out
into the pgtable helpers).

Signed-off-by: Rob Clark <robdclark@gmail.com>
Tested-by: Riku Voipio <riku.voipio@linaro.org>
---
v1: original
v2: bindings cleanups and kconfig issues that kbuild robot pointed out
v3: fix issues pointed out by Rob H. and actually make device removal
    work
v4: fix WARN_ON() splats reported by Archit
v5: some fixes to build as a module.. note that it cannot actually
    be built as a module yet (at minimum a bunch of other iommu syms
    that are needed are not exported, but there may be more to it
    than that), but at least qcom_iommu is ready should it become
    possible to build iommu drivers as modules.
v6: Add additional pm-runtime get/puts around paths that can hit
    TLB inv, to avoid unclocked register access if device using the
    iommu is not powered on.  And pre-emptively clear interrupts
    before registering IRQ handler just in case the bootloader has
    left us a surprise.
v7: Address review comments from Robin (don't associate iommu_group
    with context bank, table lookup instead of list to find context
    bank, etc)
v8: Fix silly bug on detach.  Actually Robin already pointed it out
    but I somehow overlooked that comment when preparing v7.

 drivers/iommu/Kconfig      |  10 +
 drivers/iommu/Makefile     |   1 +
 drivers/iommu/qcom_iommu.c | 857 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 868 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6ee3a25..aa4b628 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -367,4 +367,14 @@ config MTK_IOMMU_V1
 
 	  if unsure, say N here.
 
+config QCOM_IOMMU
+	# Note: iommu drivers cannot (yet?) be built as modules
+	bool "Qualcomm IOMMU Support"
+	depends on ARCH_QCOM || COMPILE_TEST
+	select IOMMU_API
+	select IOMMU_IO_PGTABLE_LPAE
+	select ARM_DMA_USE_IOMMU
+	help
+	  Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 0000000..33e984e
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,857 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include <linux/atomic.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/dma-iommu.h>
+#include <linux/dma-mapping.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/io-64-nonatomic-hi-lo.h>
+#include <linux/iommu.h>
+#include <linux/iopoll.h>
+#include <linux/kconfig.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/of_iommu.h>
+#include <linux/platform_device.h>
+#include <linux/pm.h>
+#include <linux/pm_runtime.h>
+#include <linux/qcom_scm.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS     0x2000
+
+struct qcom_iommu_ctx;
+
+struct qcom_iommu_dev {
+	/* IOMMU core code handle */
+	struct iommu_device	 iommu;
+	struct device		*dev;
+	struct clk		*iface_clk;
+	struct clk		*bus_clk;
+	void __iomem		*local_base;
+	u32			 sec_id;
+	u8			 num_ctxs;
+	struct qcom_iommu_ctx	*ctxs[0];   /* indexed by asid-1 */
+};
+
+struct qcom_iommu_ctx {
+	struct device		*dev;
+	void __iomem		*base;
+	bool			 secure_init;
+	u8			 asid;      /* asid and ctx bank # are 1:1 */
+};
+
+struct qcom_iommu_domain {
+	struct io_pgtable_ops	*pgtbl_ops;
+	spinlock_t		 pgtbl_lock;
+	struct mutex		 init_mutex; /* Protects iommu pointer */
+	struct iommu_domain	 domain;
+	struct qcom_iommu_dev	*iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev *to_iommu(struct iommu_fwspec *fwspec)
+{
+	if (!fwspec || fwspec->ops != &qcom_iommu_ops)
+		return NULL;
+	return fwspec->iommu_priv;
+}
+
+static struct qcom_iommu_ctx *to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	if (!qcom_iommu)
+		return NULL;
+	return qcom_iommu->ctxs[asid - 1];
+}
+
+static inline void
+iommu_writel(struct qcom_iommu_ctx *ctx, unsigned reg, u32 val)
+{
+	writel_relaxed(val, ctx->base + reg);
+}
+
+static inline void
+iommu_writeq(struct qcom_iommu_ctx *ctx, unsigned reg, u64 val)
+{
+	writeq_relaxed(val, ctx->base + reg);
+}
+
+static inline u32
+iommu_readl(struct qcom_iommu_ctx *ctx, unsigned reg)
+{
+	return readl_relaxed(ctx->base + reg);
+}
+
+static inline u64
+iommu_readq(struct qcom_iommu_ctx *ctx, unsigned reg)
+{
+	return readq_relaxed(ctx->base + reg);
+}
+
+static void qcom_iommu_tlb_sync(void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		unsigned int val, ret;
+
+		iommu_writel(ctx, ARM_SMMU_CB_TLBSYNC, 0);
+
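+		/* wait for the sync to complete: poll until TLBSTATUS bit 0 clears */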
+		ret = readl_poll_timeout(ctx->base + ARM_SMMU_CB_TLBSTATUS, val,
+					 (val & 0x1) == 0, 0, 5000000);
+		if (ret)
+			dev_err(ctx->dev, "timeout waiting for TLB SYNC\n");
+	}
+}
+
+static void qcom_iommu_tlb_inv_context(void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		iommu_writel(ctx, ARM_SMMU_CB_S1_TLBIASID, ctx->asid);
+	}
+
+	qcom_iommu_tlb_sync(cookie);
+}
+
+static void qcom_iommu_tlb_inv_range_nosync(unsigned long iova, size_t size,
+					    size_t granule, bool leaf, void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i, reg;
+
+	reg = leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		size_t s = size;
+
+		iova &= ~0xfffUL;	/* clear the low 12 bits, where the asid goes */
+		iova |= ctx->asid;
+		do {
+			iommu_writel(ctx, reg, iova);
+			iova += granule;
+		} while (s -= granule);
+	}
+}
+
+static const struct iommu_gather_ops qcom_gather_ops = {
+	.tlb_flush_all	= qcom_iommu_tlb_inv_context,
+	.tlb_add_flush	= qcom_iommu_tlb_inv_range_nosync,
+	.tlb_sync	= qcom_iommu_tlb_sync,
+};
+
+static irqreturn_t qcom_iommu_fault(int irq, void *dev)
+{
+	struct qcom_iommu_ctx *ctx = dev;
+	u32 fsr, fsynr;
+	u64 iova;
+
+	fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
+
+	if (!(fsr & FSR_FAULT))
+		return IRQ_NONE;
+
+	fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
+	iova = iommu_readq(ctx, ARM_SMMU_CB_FAR);
+
+	dev_err_ratelimited(ctx->dev,
+			    "Unhandled context fault: fsr=0x%x, "
+			    "iova=0x%016llx, fsynr=0x%x, cb=%d\n",
+			    fsr, iova, fsynr, ctx->asid);
+
+	iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
+
+	return IRQ_HANDLED;
+}
+
+static int qcom_iommu_init_domain(struct iommu_domain *domain,
+				  struct qcom_iommu_dev *qcom_iommu,
+				  struct iommu_fwspec *fwspec)
+{
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *pgtbl_ops;
+	struct io_pgtable_cfg pgtbl_cfg;
+	int i, ret = 0;
+	u32 reg;
+
+	mutex_lock(&qcom_domain->init_mutex);
+	if (qcom_domain->iommu)
+		goto out_unlock;
+
+	pgtbl_cfg = (struct io_pgtable_cfg) {
+		.pgsize_bitmap	= qcom_iommu_ops.pgsize_bitmap,
+		.ias		= 32,
+		.oas		= 40,
+		.tlb		= &qcom_gather_ops,
+		.iommu_dev	= qcom_iommu->dev,
+	};
+
+	qcom_domain->iommu = qcom_iommu;
+	pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, fwspec);
+	if (!pgtbl_ops) {
+		dev_err(qcom_iommu->dev, "failed to allocate pagetable ops\n");
+		ret = -ENOMEM;
+		goto out_clear_iommu;
+	}
+
+	/* Update the domain's page sizes to reflect the page table format */
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	domain->geometry.aperture_end = (1ULL << pgtbl_cfg.ias) - 1;
+	domain->geometry.force_aperture = true;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		if (!ctx->secure_init) {
+			ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, ctx->asid);
+			if (ret) {
+				dev_err(qcom_iommu->dev, "secure init failed: %d\n", ret);
+				goto out_clear_iommu;
+			}
+			ctx->secure_init = true;
+		}
+
+		/* TTBRs */
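+		/* (the context's asid is carried in each TTBR's upper bits) */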
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
+				((u64)ctx->asid << TTBRn_ASID_SHIFT));
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
+				((u64)ctx->asid << TTBRn_ASID_SHIFT));
+
+		/* TTBCR */
+		iommu_writel(ctx, ARM_SMMU_CB_TTBCR2,
+				(pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
+				TTBCR2_SEP_UPSTREAM);
+		iommu_writel(ctx, ARM_SMMU_CB_TTBCR,
+				pgtbl_cfg.arm_lpae_s1_cfg.tcr);
+
+		/* MAIRs (stage-1 only) */
+		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
+				pgtbl_cfg.arm_lpae_s1_cfg.mair[0]);
+		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR1,
+				pgtbl_cfg.arm_lpae_s1_cfg.mair[1]);
+
+		/* SCTLR */
+		reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
+			SCTLR_M | SCTLR_S1_ASIDPNE;
+
+		if (IS_ENABLED(CONFIG_BIG_ENDIAN))
+			reg |= SCTLR_E;
+
+		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
+	}
+
+	mutex_unlock(&qcom_domain->init_mutex);
+
+	/* Publish page table ops for map/unmap */
+	qcom_domain->pgtbl_ops = pgtbl_ops;
+
+	return 0;
+
+out_clear_iommu:
+	qcom_domain->iommu = NULL;
+out_unlock:
+	mutex_unlock(&qcom_domain->init_mutex);
+	return ret;
+}
+
+static struct iommu_domain *qcom_iommu_domain_alloc(unsigned type)
+{
+	struct qcom_iommu_domain *qcom_domain;
+
+	if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
+		return NULL;
+	/*
+	 * Allocate the domain and initialise some of its data structures.
+	 * We can't really do anything meaningful until we've added a
+	 * master.
+	 */
+	qcom_domain = kzalloc(sizeof(*qcom_domain), GFP_KERNEL);
+	if (!qcom_domain)
+		return NULL;
+
+	if (type == IOMMU_DOMAIN_DMA &&
+	    iommu_get_dma_cookie(&qcom_domain->domain)) {
+		kfree(qcom_domain);
+		return NULL;
+	}
+
+	mutex_init(&qcom_domain->init_mutex);
+	spin_lock_init(&qcom_domain->pgtbl_lock);
+
+	return &qcom_domain->domain;
+}
+
+static void qcom_iommu_domain_free(struct iommu_domain *domain)
+{
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+
+	if (WARN_ON(qcom_domain->iommu))    /* forgot to detach? */
+		return;
+
+	iommu_put_dma_cookie(domain);
+	kfree(qcom_domain);
+}
+
+static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	int ret;
+
+	if (!qcom_iommu) {
+		dev_err(dev, "cannot attach to IOMMU, is it on the same bus?\n");
+		return -ENXIO;
+	}
+
+	/* Ensure that the domain is finalized */
+	pm_runtime_get_sync(qcom_iommu->dev);
+	ret = qcom_iommu_init_domain(domain, qcom_iommu, dev->iommu_fwspec);
+	pm_runtime_put_sync(qcom_iommu->dev);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Sanity check the domain. We don't support domains across
+	 * different IOMMUs.
+	 */
+	if (qcom_domain->iommu != qcom_iommu) {
+		dev_err(dev, "cannot attach to IOMMU %s while already "
+			"attached to domain on IOMMU %s\n",
+			dev_name(qcom_domain->iommu->dev),
+			dev_name(qcom_iommu->dev));
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	unsigned i;
+
+	if (!qcom_domain->iommu)
+		return;
+
+	pm_runtime_get_sync(qcom_iommu->dev);
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		/* Disable the context bank: */
+		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, 0);
+	}
+	free_io_pgtable_ops(qcom_domain->pgtbl_ops);
+	pm_runtime_put_sync(qcom_iommu->dev);
+
+	qcom_domain->iommu = NULL;
+}
+
+static int qcom_iommu_map(struct iommu_domain *domain, unsigned long iova,
+			  phys_addr_t paddr, size_t size, int prot)
+{
+	int ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return -ENODEV;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->map(ops, iova, paddr, size, prot);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+	return ret;
+}
+
+static size_t qcom_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
+			       size_t size)
+{
+	size_t ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	/* NOTE: unmap can be called after client device is powered off,
+	 * for example, with GPUs or anything involving dma-buf.  So we
+	 * cannot rely on the device_link.  Make sure the IOMMU is on to
+	 * avoid unclocked accesses in the TLB inv path:
+	 */
+	pm_runtime_get_sync(qcom_domain->iommu->dev);
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->unmap(ops, iova, size);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+	pm_runtime_put_sync(qcom_domain->iommu->dev);
+
+	return ret;
+}
+
+static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
+					   dma_addr_t iova)
+{
+	phys_addr_t ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->iova_to_phys(ops, iova);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+
+	return ret;
+}
+
+static bool qcom_iommu_capable(enum iommu_cap cap)
+{
+	switch (cap) {
+	case IOMMU_CAP_CACHE_COHERENCY:
+		/*
+		 * Return true here as the SMMU can always send out coherent
+		 * requests.
+		 */
+		return true;
+	case IOMMU_CAP_NOEXEC:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static int qcom_iommu_add_device(struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+	struct iommu_group *group;
+	struct device_link *link;
+
+	if (!qcom_iommu)
+		return -ENODEV;
+
+	/*
+	 * Establish the link between iommu and master, so that the
+	 * iommu gets runtime enabled/disabled as per the master's
+	 * needs.
+	 */
+	link = device_link_add(dev, qcom_iommu->dev, DL_FLAG_PM_RUNTIME);
+	if (!link) {
+		dev_err(qcom_iommu->dev, "Unable to create device link between %s and %s\n",
+			dev_name(qcom_iommu->dev), dev_name(dev));
+		return -ENODEV;
+	}
+
+	group = iommu_group_get_for_dev(dev);
+	if (IS_ERR_OR_NULL(group))
+		return PTR_ERR_OR_ZERO(group);
+
+	iommu_group_put(group);
+	iommu_device_link(&qcom_iommu->iommu, dev);
+
+	return 0;
+}
+
+static void qcom_iommu_remove_device(struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+
+	if (!qcom_iommu)
+		return;
+
+	iommu_device_unlink(&qcom_iommu->iommu, dev);
+	iommu_group_remove_device(dev);
+	iommu_fwspec_free(dev);
+}
+
+static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
+{
+	struct qcom_iommu_dev *qcom_iommu;
+	struct platform_device *iommu_pdev;
+	unsigned asid = args->args[0];
+
+	if (args->args_count != 1) {
+		dev_err(dev, "incorrect number of iommu params found for %s "
+			"(found %d, expected 1)\n",
+			args->np->full_name, args->args_count);
+		return -EINVAL;
+	}
+
+	iommu_pdev = of_find_device_by_node(args->np);
+	if (WARN_ON(!iommu_pdev))
+		return -EINVAL;
+
+	qcom_iommu = platform_get_drvdata(iommu_pdev);
+
+	/* make sure the asid specified in dt is valid, so we don't have
+	 * to sanity check this elsewhere, since 'asid - 1' is used to
+	 * index into qcom_iommu->ctxs:
+	 */
+	if (WARN_ON(asid < 1) ||
+	    WARN_ON(asid > qcom_iommu->num_ctxs))
+		return -EINVAL;
+
+	if (!dev->iommu_fwspec->iommu_priv) {
+		dev->iommu_fwspec->iommu_priv = qcom_iommu;
+	} else {
+		/* make sure the device's iommus dt node isn't referring to
+		 * multiple different iommu devices.  Multiple context
+		 * banks are ok, but multiple devices are not:
+		 */
+		if (WARN_ON(qcom_iommu != dev->iommu_fwspec->iommu_priv))
+			return -EINVAL;
+	}
+
+	return iommu_fwspec_add_ids(dev, &asid, 1);
+}
+
+static const struct iommu_ops qcom_iommu_ops = {
+	.capable	= qcom_iommu_capable,
+	.domain_alloc	= qcom_iommu_domain_alloc,
+	.domain_free	= qcom_iommu_domain_free,
+	.attach_dev	= qcom_iommu_attach_dev,
+	.detach_dev	= qcom_iommu_detach_dev,
+	.map		= qcom_iommu_map,
+	.unmap		= qcom_iommu_unmap,
+	.map_sg		= default_iommu_map_sg,
+	.iova_to_phys	= qcom_iommu_iova_to_phys,
+	.add_device	= qcom_iommu_add_device,
+	.remove_device	= qcom_iommu_remove_device,
+	.device_group	= generic_device_group,
+	.of_xlate	= qcom_iommu_of_xlate,
+	.pgsize_bitmap	= SZ_4K | SZ_64K | SZ_1M | SZ_16M,
+};
+
+static int qcom_iommu_enable_clocks(struct qcom_iommu_dev *qcom_iommu)
+{
+	int ret;
+
+	ret = clk_prepare_enable(qcom_iommu->iface_clk);
+	if (ret) {
+		dev_err(qcom_iommu->dev, "Couldn't enable iface_clk\n");
+		return ret;
+	}
+
+	ret = clk_prepare_enable(qcom_iommu->bus_clk);
+	if (ret) {
+		dev_err(qcom_iommu->dev, "Couldn't enable bus_clk\n");
+		clk_disable_unprepare(qcom_iommu->iface_clk);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
+{
+	clk_disable_unprepare(qcom_iommu->bus_clk);
+	clk_disable_unprepare(qcom_iommu->iface_clk);
+}
+
+static int get_asid(const struct device_node *np)
+{
+	u32 reg;
+
+	/* read the "reg" property directly to get the relative address
+	 * of the context bank, and calculate the asid from that:
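+	 * (e.g. a context bank at offset 0x3000 from the iommu yields asid 3)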
+	 */
+	if (of_property_read_u32_index(np, "reg", 0, &reg))
+		return -ENODEV;
+
+	return reg / 0x1000;      /* context banks are 0x1000 apart */
+}
+
+static int qcom_iommu_ctx_probe(struct platform_device *pdev)
+{
+	struct qcom_iommu_ctx *ctx;
+	struct device *dev = &pdev->dev;
+	struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(dev->parent);
+	struct resource *res;
+	int ret, irq;
+
+	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	ctx->dev = dev;
+	platform_set_drvdata(pdev, ctx);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	ctx->base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(ctx->base))
+		return PTR_ERR(ctx->base);
+
+	irq = platform_get_irq(pdev, 0);
+	if (irq < 0) {
+		dev_err(dev, "failed to get irq\n");
+		return -ENODEV;
+	}
+
+	/* clear IRQs before registering fault handler, just in case the
+	 * boot-loader left us a surprise:
+	 */
+	iommu_writel(ctx, ARM_SMMU_CB_FSR, iommu_readl(ctx, ARM_SMMU_CB_FSR));
+
+	ret = devm_request_irq(dev, irq,
+			       qcom_iommu_fault,
+			       IRQF_SHARED,
+			       "qcom-iommu-fault",
+			       ctx);
+	if (ret) {
+		dev_err(dev, "failed to request IRQ %u\n", irq);
+		return ret;
+	}
+
+	ret = get_asid(dev->of_node);
+	if (ret < 0) {
+		dev_err(dev, "missing reg property\n");
+		return ret;
+	}
+
+	ctx->asid = ret;
+
+	dev_dbg(dev, "found asid %u\n", ctx->asid);
+
+	qcom_iommu->ctxs[ctx->asid - 1] = ctx;
+
+	return 0;
+}
+
+static int qcom_iommu_ctx_remove(struct platform_device *pdev)
+{
+	struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(pdev->dev.parent);
+	struct qcom_iommu_ctx *ctx = platform_get_drvdata(pdev);
+
+	platform_set_drvdata(pdev, NULL);
+
+	qcom_iommu->ctxs[ctx->asid - 1] = NULL;
+
+	return 0;
+}
+
+static const struct of_device_id ctx_of_match[] = {
+	{ .compatible = "qcom,msm-iommu-v1-ns" },
+	{ .compatible = "qcom,msm-iommu-v1-sec" },
+	{ /* sentinel */ }
+};
+
+static struct platform_driver qcom_iommu_ctx_driver = {
+	.driver	= {
+		.name		= "qcom-iommu-ctx",
+		.of_match_table	= of_match_ptr(ctx_of_match),
+	},
+	.probe	= qcom_iommu_ctx_probe,
+	.remove = qcom_iommu_ctx_remove,
+};
+
+static int qcom_iommu_device_probe(struct platform_device *pdev)
+{
+	struct device_node *child;
+	struct qcom_iommu_dev *qcom_iommu;
+	struct device *dev = &pdev->dev;
+	struct resource *res;
+	int ret, sz, max_asid = 0;
+
+	/* find the max asid (which is 1:1 to ctx bank idx), so we know how
+	 * many child ctx devices we have:
+	 */
+	for_each_child_of_node(dev->of_node, child)
+		max_asid = max(max_asid, get_asid(child));
+
+	sz = sizeof(*qcom_iommu) + (max_asid * sizeof(qcom_iommu->ctxs[0]));
+
+	qcom_iommu = devm_kzalloc(dev, sz, GFP_KERNEL);
+	if (!qcom_iommu)
+		return -ENOMEM;
+	qcom_iommu->num_ctxs = max_asid;
+	qcom_iommu->dev = dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (res)
+		qcom_iommu->local_base = devm_ioremap_resource(dev, res);
+
+	qcom_iommu->iface_clk = devm_clk_get(dev, "iface");
+	if (IS_ERR(qcom_iommu->iface_clk)) {
+		dev_err(dev, "failed to get iface clock\n");
+		return PTR_ERR(qcom_iommu->iface_clk);
+	}
+
+	qcom_iommu->bus_clk = devm_clk_get(dev, "bus");
+	if (IS_ERR(qcom_iommu->bus_clk)) {
+		dev_err(dev, "failed to get bus clock\n");
+		return PTR_ERR(qcom_iommu->bus_clk);
+	}
+
+	if (of_property_read_u32(dev->of_node, "qcom,iommu-secure-id",
+				 &qcom_iommu->sec_id)) {
+		dev_err(dev, "missing qcom,iommu-secure-id property\n");
+		return -ENODEV;
+	}
+
+	platform_set_drvdata(pdev, qcom_iommu);
+
+	pm_runtime_enable(dev);
+
+	/* register context bank devices, which are child nodes: */
+	ret = devm_of_platform_populate(dev);
+	if (ret) {
+		dev_err(dev, "Failed to populate iommu contexts\n");
+		return ret;
+	}
+
+	ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,
+				     dev_name(dev));
+	if (ret) {
+		dev_err(dev, "Failed to register iommu in sysfs\n");
+		return ret;
+	}
+
+	iommu_device_set_ops(&qcom_iommu->iommu, &qcom_iommu_ops);
+	iommu_device_set_fwnode(&qcom_iommu->iommu, dev->fwnode);
+
+	ret = iommu_device_register(&qcom_iommu->iommu);
+	if (ret) {
+		dev_err(dev, "Failed to register iommu\n");
+		return ret;
+	}
+
+	bus_set_iommu(&platform_bus_type, &qcom_iommu_ops);
+
+	if (qcom_iommu->local_base) {
+		pm_runtime_get_sync(dev);
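+		/* SMMU_INTR_SEL_NS: route all context bank irqs to non-secure */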
+		writel_relaxed(0xffffffff, qcom_iommu->local_base + SMMU_INTR_SEL_NS);
+		pm_runtime_put_sync(dev);
+	}
+
+	return 0;
+}
+
+static int qcom_iommu_device_remove(struct platform_device *pdev)
+{
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	bus_set_iommu(&platform_bus_type, NULL);
+
+	pm_runtime_force_suspend(&pdev->dev);
+	platform_set_drvdata(pdev, NULL);
+	iommu_device_sysfs_remove(&qcom_iommu->iommu);
+	iommu_device_unregister(&qcom_iommu->iommu);
+
+	return 0;
+}
+
+#ifdef CONFIG_PM
+static int qcom_iommu_resume(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	return qcom_iommu_enable_clocks(qcom_iommu);
+}
+
+static int qcom_iommu_suspend(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	qcom_iommu_disable_clocks(qcom_iommu);
+
+	return 0;
+}
+#endif
+
+static const struct dev_pm_ops qcom_iommu_pm_ops = {
+	SET_RUNTIME_PM_OPS(qcom_iommu_suspend, qcom_iommu_resume, NULL)
+	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+				pm_runtime_force_resume)
+};
+
+static const struct of_device_id qcom_iommu_of_match[] = {
+	{ .compatible = "qcom,msm-iommu-v1" },
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, qcom_iommu_of_match);
+
+static struct platform_driver qcom_iommu_driver = {
+	.driver	= {
+		.name		= "qcom-iommu",
+		.of_match_table	= of_match_ptr(qcom_iommu_of_match),
+		.pm		= &qcom_iommu_pm_ops,
+	},
+	.probe	= qcom_iommu_device_probe,
+	.remove	= qcom_iommu_device_remove,
+};
+
+static int __init qcom_iommu_init(void)
+{
+	int ret;
+
+	ret = platform_driver_register(&qcom_iommu_ctx_driver);
+	if (ret)
+		return ret;
+
+	ret = platform_driver_register(&qcom_iommu_driver);
+	if (ret)
+		platform_driver_unregister(&qcom_iommu_ctx_driver);
+
+	return ret;
+}
+
+static void __exit qcom_iommu_exit(void)
+{
+	platform_driver_unregister(&qcom_iommu_driver);
+	platform_driver_unregister(&qcom_iommu_ctx_driver);
+}
+
+module_init(qcom_iommu_init);
+module_exit(qcom_iommu_exit);
+
+IOMMU_OF_DECLARE(qcom_iommu_dev, "qcom,msm-iommu-v1", NULL);
+
+MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
+MODULE_LICENSE("GPL v2");
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] iommu: add qcom_iommu
  2017-06-13 12:17 ` Rob Clark
@ 2017-06-16 13:29   ` Riku Voipio
  0 siblings, 0 replies; 24+ messages in thread
From: Riku Voipio @ 2017-06-16 13:29 UTC (permalink / raw)
  To: Rob Clark
  Cc: iommu, linux-arm-msm, Robin Murphy, Rob Herring, Will Deacon,
	Sricharan, Mark Rutland, Stanimir Varbanov, Archit Taneja

On 13 June 2017 at 15:17, Rob Clark <robdclark@gmail.com> wrote:
> An iommu driver for Qualcomm "B" family devices which do implement the
> ARM SMMU spec, but not in a way that is compatible with how the arm-smmu
> driver is designed.  It seems SMMU_SCR1.GASRAE=1, so the global register
> space is not accessible.  This means it needs to get configuration from
> devicetree instead of setting it up dynamically.
>
> In the end, other than register definitions, there is not much code to
> share with arm-smmu (other than what has already been refactored out
> into the pgtable helpers).

After adding this series and the related device tree changes on top of
4.12-rc5, 3D graphics now works on Dragonboard 410c. Great work!

Tested-by: Riku Voipio <riku.voipio@linaro.org>

> Signed-off-by: Rob Clark <robdclark@gmail.com>
> 
> [.. changelog and full 868-line patch snipped, unchanged from the
>    original posting below ..]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 3/4] iommu: add qcom_iommu
       [not found] <CAF6AEGsGCWCASL=L6Z8_0TGWV6b1ozBND3tZHLn=y5AAJ=1JEA@mail.gmail.com>
@ 2017-06-13 12:17 ` Rob Clark
  2017-06-16 13:29   ` Riku Voipio
  0 siblings, 1 reply; 24+ messages in thread
From: Rob Clark @ 2017-06-13 12:17 UTC (permalink / raw)
  To: iommu
  Cc: linux-arm-msm, Robin Murphy, Rob Herring, Will Deacon, Sricharan,
	Mark Rutland, Stanimir Varbanov, Archit Taneja, Rob Clark

An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that is compatible with how the arm-smmu
driver is designed.  It seems SMMU_SCR1.GASRAE=1, so the global register
space is not accessible.  This means it needs to get configuration from
devicetree instead of setting it up dynamically.

In the end, other than register definitions, there is not much code to
share with arm-smmu (other than what has already been refactored out
into the pgtable helpers).

Signed-off-by: Rob Clark <robdclark@gmail.com>
---
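(Background: with SCR1.GASRAE=1 the global register space is presumably
reserved to the secure world, so unlike arm-smmu this driver never
touches global SMR/S2CR state -- every register access goes through a
context bank's own mapping, e.g.

	iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);	/* ctx->base + reg */

and the remaining configuration comes from devicetree.)
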
v1: original
v2: bindings cleanups and kconfig issues that kbuild robot pointed out
v3: fix issues pointed out by Rob H. and actually make device removal
    work
v4: fix WARN_ON() splats reported by Archit
v5: some fixes to build as a module.. note that it cannot actually
    be built as a module yet (at minimum a bunch of other iommu syms
    that are needed are not exported, but there may be more to it
    than that), but at least qcom_iommu is ready should it become
    possible to build iommu drivers as modules.
v6: Add additional pm-runtime get/puts around paths that can hit
    TLB inv, to avoid unclocked register access if the device using
    the iommu is not powered on.  And pre-emptively clear interrupts
    before registering the IRQ handler, just in case the bootloader
    has left us a surprise.  (A sketch of the get/put pattern follows
    the changelog.)
v7: Address review comments from Robin (don't associate iommu_group
    with context bank, table lookup instead of list to find context
    bank, etc)
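
A minimal sketch of the pm-runtime pattern from v6, lifted from the
unmap path in the patch below (domain_free uses the same bracketing):

	/* unmap can run after the client device is powered off (GPUs,
	 * anything involving dma-buf), so keep the IOMMU clocked across
	 * any path that can issue TLB invalidation:
	 */
	pm_runtime_get_sync(qcom_domain->iommu->dev);
	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
	ret = ops->unmap(ops, iova, size);
	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
	pm_runtime_put_sync(qcom_domain->iommu->dev);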

 drivers/iommu/Kconfig      |  10 +
 drivers/iommu/Makefile     |   1 +
 drivers/iommu/qcom_iommu.c | 868 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 879 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c
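
For reference, how a master's "iommus" cell resolves to a context bank
in this version (a condensed sketch; names as in the patch, with the
0x1000 stride and asid-1 indexing taken from get_asid() and to_ctx()):

	/* "iommus = <&iommu asid>": of_xlate() validates the asid and
	 * records it.  asids map 1:1 to context banks, and bank N sits
	 * at reg offset N * 0x1000, so get_asid() is just reg / 0x1000.
	 */
	unsigned asid = args->args[0];
	struct qcom_iommu_ctx *ctx = qcom_iommu->ctxs[asid - 1];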

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6ee3a25..aa4b628 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -367,4 +367,14 @@ config MTK_IOMMU_V1
 
 	  if unsure, say N here.
 
+config QCOM_IOMMU
+	# Note: iommu drivers cannot (yet?) be built as modules
+	bool "Qualcomm IOMMU Support"
+	depends on ARCH_QCOM || COMPILE_TEST
+	select IOMMU_API
+	select IOMMU_IO_PGTABLE_LPAE
+	select ARM_DMA_USE_IOMMU
+	help
+	  Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 0000000..860cad1
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,868 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include <linux/atomic.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/dma-iommu.h>
+#include <linux/dma-mapping.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/io-64-nonatomic-hi-lo.h>
+#include <linux/iommu.h>
+#include <linux/iopoll.h>
+#include <linux/kconfig.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/of_iommu.h>
+#include <linux/platform_device.h>
+#include <linux/pm.h>
+#include <linux/pm_runtime.h>
+#include <linux/qcom_scm.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS     0x2000
+
+struct qcom_iommu_ctx;
+
+struct qcom_iommu_dev {
+	/* IOMMU core code handle */
+	struct iommu_device	 iommu;
+	struct device		*dev;
+	struct clk		*iface_clk;
+	struct clk		*bus_clk;
+	void __iomem		*local_base;
+	u32			 sec_id;
+	u8			 num_ctxs;
+	struct qcom_iommu_ctx	*ctxs[0];   /* indexed by asid-1 */
+};
+
+struct qcom_iommu_ctx {
+	struct device		*dev;
+	void __iomem		*base;
+	bool			 secure_init;
+	u8			 asid;      /* asid and ctx bank # are 1:1 */
+};
+
+struct qcom_iommu_domain {
+	struct io_pgtable_ops	*pgtbl_ops;
+	spinlock_t		 pgtbl_lock;
+	struct mutex		 init_mutex; /* Protects iommu pointer */
+	struct iommu_domain	 domain;
+	struct qcom_iommu_dev	*iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev * to_iommu(struct iommu_fwspec *fwspec)
+{
+	if (!fwspec || fwspec->ops != &qcom_iommu_ops)
+		return NULL;
+	return fwspec->iommu_priv;
+}
+
+static struct qcom_iommu_ctx * to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	if (!qcom_iommu)
+		return NULL;
+	return qcom_iommu->ctxs[asid - 1];
+}
+
+static inline void
+iommu_writel(struct qcom_iommu_ctx *ctx, unsigned reg, u32 val)
+{
+	writel_relaxed(val, ctx->base + reg);
+}
+
+static inline void
+iommu_writeq(struct qcom_iommu_ctx *ctx, unsigned reg, u64 val)
+{
+	writeq_relaxed(val, ctx->base + reg);
+}
+
+static inline u32
+iommu_readl(struct qcom_iommu_ctx *ctx, unsigned reg)
+{
+	return readl_relaxed(ctx->base + reg);
+}
+
+static inline u64
+iommu_readq(struct qcom_iommu_ctx *ctx, unsigned reg)
+{
+	return readq_relaxed(ctx->base + reg);
+}
+
+static void qcom_iommu_tlb_sync(void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		unsigned int val, ret;
+
+		iommu_writel(ctx, ARM_SMMU_CB_TLBSYNC, 0);
+
+		ret = readl_poll_timeout(ctx->base + ARM_SMMU_CB_TLBSTATUS, val,
+					 (val & 0x1) == 0, 0, 5000000);
+		if (ret)
+			dev_err(ctx->dev, "timeout waiting for TLB SYNC\n");
+	}
+}
+
+static void qcom_iommu_tlb_inv_context(void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		iommu_writel(ctx, ARM_SMMU_CB_S1_TLBIASID, ctx->asid);
+	}
+
+	qcom_iommu_tlb_sync(cookie);
+}
+
+static void qcom_iommu_tlb_inv_range_nosync(unsigned long iova, size_t size,
+					    size_t granule, bool leaf, void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i, reg;
+
+	reg = leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		size_t s = size;
+
+		iova &= ~12UL;
+		iova |= ctx->asid;
+		do {
+			iommu_writel(ctx, reg, iova);
+			iova += granule;
+		} while (s -= granule);
+	}
+}
+
+static const struct iommu_gather_ops qcom_gather_ops = {
+	.tlb_flush_all	= qcom_iommu_tlb_inv_context,
+	.tlb_add_flush	= qcom_iommu_tlb_inv_range_nosync,
+	.tlb_sync	= qcom_iommu_tlb_sync,
+};
+
+static irqreturn_t qcom_iommu_fault(int irq, void *dev)
+{
+	struct qcom_iommu_ctx *ctx = dev;
+	u32 fsr, fsynr;
+	u64 iova;
+
+	fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
+
+	if (!(fsr & FSR_FAULT))
+		return IRQ_NONE;
+
+	fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
+	iova = iommu_readq(ctx, ARM_SMMU_CB_FAR);
+
+	dev_err_ratelimited(ctx->dev,
+			    "Unhandled context fault: fsr=0x%x, "
+			    "iova=0x%016llx, fsynr=0x%x, cb=%d\n",
+			    fsr, iova, fsynr, ctx->asid);
+
+	iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
+
+	return IRQ_HANDLED;
+}
+
+static int qcom_iommu_init_domain(struct iommu_domain *domain,
+				  struct qcom_iommu_dev *qcom_iommu,
+				  struct iommu_fwspec *fwspec)
+{
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *pgtbl_ops;
+	struct io_pgtable_cfg pgtbl_cfg;
+	int i, ret = 0;
+	u32 reg;
+
+	mutex_lock(&qcom_domain->init_mutex);
+	if (qcom_domain->iommu)
+		goto out_unlock;
+
+	pgtbl_cfg = (struct io_pgtable_cfg) {
+		.pgsize_bitmap	= qcom_iommu_ops.pgsize_bitmap,
+		.ias		= 32,
+		.oas		= 40,
+		.tlb		= &qcom_gather_ops,
+		.iommu_dev	= qcom_iommu->dev,
+	};
+
+	qcom_domain->iommu = qcom_iommu;
+	pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, fwspec);
+	if (!pgtbl_ops) {
+		dev_err(qcom_iommu->dev, "failed to allocate pagetable ops\n");
+		ret = -ENOMEM;
+		goto out_clear_iommu;
+	}
+
+	/* Update the domain's page sizes to reflect the page table format */
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	domain->geometry.aperture_end = (1ULL << pgtbl_cfg.ias) - 1;
+	domain->geometry.force_aperture = true;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		if (!ctx->secure_init) {
+			ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, ctx->asid);
+			if (ret) {
+				dev_err(qcom_iommu->dev, "secure init failed: %d\n", ret);
+				goto out_clear_iommu;
+			}
+			ctx->secure_init = true;
+		}
+
+		/* TTBRs */
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
+				((u64)ctx->asid << TTBRn_ASID_SHIFT));
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
+				((u64)ctx->asid << TTBRn_ASID_SHIFT));
+
+		/* TTBCR */
+		iommu_writel(ctx, ARM_SMMU_CB_TTBCR2,
+				(pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
+				TTBCR2_SEP_UPSTREAM);
+		iommu_writel(ctx, ARM_SMMU_CB_TTBCR,
+				pgtbl_cfg.arm_lpae_s1_cfg.tcr);
+
+		/* MAIRs (stage-1 only) */
+		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
+				pgtbl_cfg.arm_lpae_s1_cfg.mair[0]);
+		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR1,
+				pgtbl_cfg.arm_lpae_s1_cfg.mair[1]);
+
+		/* SCTLR */
+		reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
+			SCTLR_M | SCTLR_S1_ASIDPNE;
+
+		if (IS_ENABLED(CONFIG_BIG_ENDIAN))
+			reg |= SCTLR_E;
+
+		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
+	}
+
+	mutex_unlock(&qcom_domain->init_mutex);
+
+	/* Publish page table ops for map/unmap */
+	qcom_domain->pgtbl_ops = pgtbl_ops;
+
+	return 0;
+
+out_clear_iommu:
+	qcom_domain->iommu = NULL;
+out_unlock:
+	mutex_unlock(&qcom_domain->init_mutex);
+	return ret;
+}
+
+static struct iommu_domain *qcom_iommu_domain_alloc(unsigned type)
+{
+	struct qcom_iommu_domain *qcom_domain;
+
+	if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
+		return NULL;
+	/*
+	 * Allocate the domain and initialise some of its data structures.
+	 * We can't really do anything meaningful until we've added a
+	 * master.
+	 */
+	qcom_domain = kzalloc(sizeof(*qcom_domain), GFP_KERNEL);
+	if (!qcom_domain)
+		return NULL;
+
+	if (type == IOMMU_DOMAIN_DMA &&
+	    iommu_get_dma_cookie(&qcom_domain->domain)) {
+		kfree(qcom_domain);
+		return NULL;
+	}
+
+	mutex_init(&qcom_domain->init_mutex);
+	spin_lock_init(&qcom_domain->pgtbl_lock);
+
+	return &qcom_domain->domain;
+}
+
+static void qcom_iommu_domain_free(struct iommu_domain *domain)
+{
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+
+	if (WARN_ON(qcom_domain->iommu))    /* forgot to detach? */
+		return;
+
+	iommu_put_dma_cookie(domain);
+
+	/* NOTE: unmap can be called after client device is powered off,
+	 * for example, with GPUs or anything involving dma-buf.  So we
+	 * cannot rely on the device_link.  Make sure the IOMMU is on to
+	 * avoid unclocked accesses in the TLB inv path:
+	 */
+	pm_runtime_get_sync(qcom_domain->iommu->dev);
+
+	free_io_pgtable_ops(qcom_domain->pgtbl_ops);
+
+	pm_runtime_put_sync(qcom_domain->iommu->dev);
+
+	kfree(qcom_domain);
+}
+
+static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	int ret;
+
+	if (!qcom_iommu) {
+		dev_err(dev, "cannot attach to IOMMU, is it on the same bus?\n");
+		return -ENXIO;
+	}
+
+	/* Ensure that the domain is finalized */
+	pm_runtime_get_sync(qcom_iommu->dev);
+	ret = qcom_iommu_init_domain(domain, qcom_iommu, dev->iommu_fwspec);
+	pm_runtime_put_sync(qcom_iommu->dev);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Sanity check the domain. We don't support domains across
+	 * different IOMMUs.
+	 */
+	if (qcom_domain->iommu != qcom_iommu) {
+		dev_err(dev, "cannot attach to IOMMU %s while already "
+			"attached to domain on IOMMU %s\n",
+			dev_name(qcom_domain->iommu->dev),
+			dev_name(qcom_iommu->dev));
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	unsigned i;
+
+	if (!qcom_domain->iommu)
+		return;
+
+	pm_runtime_get_sync(qcom_iommu->dev);
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		/* Disable the context bank: */
+		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, 0);
+	}
+	pm_runtime_put_sync(qcom_iommu->dev);
+
+	qcom_domain->iommu = NULL;
+}
+
+static int qcom_iommu_map(struct iommu_domain *domain, unsigned long iova,
+			  phys_addr_t paddr, size_t size, int prot)
+{
+	int ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return -ENODEV;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->map(ops, iova, paddr, size, prot);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+	return ret;
+}
+
+static size_t qcom_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
+			       size_t size)
+{
+	size_t ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	/* NOTE: unmap can be called after client device is powered off,
+	 * for example, with GPUs or anything involving dma-buf.  So we
+	 * cannot rely on the device_link.  Make sure the IOMMU is on to
+	 * avoid unclocked accesses in the TLB inv path:
+	 */
+	pm_runtime_get_sync(qcom_domain->iommu->dev);
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->unmap(ops, iova, size);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+	pm_runtime_put_sync(qcom_domain->iommu->dev);
+
+	return ret;
+}
+
+static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
+					   dma_addr_t iova)
+{
+	phys_addr_t ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->iova_to_phys(ops, iova);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+
+	return ret;
+}
+
+static bool qcom_iommu_capable(enum iommu_cap cap)
+{
+	switch (cap) {
+	case IOMMU_CAP_CACHE_COHERENCY:
+		/*
+		 * Return true here as the SMMU can always send out coherent
+		 * requests.
+		 */
+		return true;
+	case IOMMU_CAP_NOEXEC:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static int qcom_iommu_add_device(struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+	struct iommu_group *group;
+	struct device_link *link;
+
+	if (!qcom_iommu)
+		return -ENODEV;
+
+	/*
+	 * Establish the link between iommu and master, so that the
+	 * iommu gets runtime enabled/disabled as per the master's
+	 * needs.
+	 */
+	link = device_link_add(dev, qcom_iommu->dev, DL_FLAG_PM_RUNTIME);
+	if (!link) {
+		dev_err(qcom_iommu->dev, "Unable to create device link between %s and %s\n",
+			dev_name(qcom_iommu->dev), dev_name(dev));
+		return -ENODEV;
+	}
+
+	group = iommu_group_get_for_dev(dev);
+	if (IS_ERR_OR_NULL(group))
+		return PTR_ERR_OR_ZERO(group);
+
+	iommu_group_put(group);
+	iommu_device_link(&qcom_iommu->iommu, dev);
+
+	return 0;
+}
+
+static void qcom_iommu_remove_device(struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+
+	if (!qcom_iommu)
+		return;
+
+	iommu_device_unlink(&qcom_iommu->iommu, dev);
+	iommu_group_remove_device(dev);
+	iommu_fwspec_free(dev);
+}
+
+static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
+{
+	struct qcom_iommu_dev *qcom_iommu;
+	struct platform_device *iommu_pdev;
+	unsigned asid = args->args[0];
+
+	if (args->args_count != 1) {
+		dev_err(dev, "incorrect number of iommu params found for %s "
+			"(found %d, expected 1)\n",
+			args->np->full_name, args->args_count);
+		return -EINVAL;
+	}
+
+	iommu_pdev = of_find_device_by_node(args->np);
+	if (WARN_ON(!iommu_pdev))
+		return -EINVAL;
+
+	qcom_iommu = platform_get_drvdata(iommu_pdev);
+
+	/* make sure the asid specified in dt is valid, so we don't have
+	 * to sanity check this elsewhere, since 'asid - 1' is used to
+	 * index into qcom_iommu->ctxs:
+	 */
+	if (WARN_ON(asid < 1) ||
+	    WARN_ON(asid > qcom_iommu->num_ctxs))
+		return -EINVAL;
+
+	if (!dev->iommu_fwspec->iommu_priv) {
+		dev->iommu_fwspec->iommu_priv = qcom_iommu;
+	} else {
+		/* make sure devices iommus dt node isn't referring to
+		 * multiple different iommu devices.  Multiple context
+		 * banks are ok, but multiple devices are not:
+		 */
+		if (WARN_ON(qcom_iommu != dev->iommu_fwspec->iommu_priv))
+			return -EINVAL;
+	}
+
+	return iommu_fwspec_add_ids(dev, &asid, 1);
+}
+
+static const struct iommu_ops qcom_iommu_ops = {
+	.capable	= qcom_iommu_capable,
+	.domain_alloc	= qcom_iommu_domain_alloc,
+	.domain_free	= qcom_iommu_domain_free,
+	.attach_dev	= qcom_iommu_attach_dev,
+	.detach_dev	= qcom_iommu_detach_dev,
+	.map		= qcom_iommu_map,
+	.unmap		= qcom_iommu_unmap,
+	.map_sg		= default_iommu_map_sg,
+	.iova_to_phys	= qcom_iommu_iova_to_phys,
+	.add_device	= qcom_iommu_add_device,
+	.remove_device	= qcom_iommu_remove_device,
+	.device_group	= generic_device_group,
+	.of_xlate	= qcom_iommu_of_xlate,
+	.pgsize_bitmap	= SZ_4K | SZ_64K | SZ_1M | SZ_16M,
+};
+
+static int qcom_iommu_enable_clocks(struct qcom_iommu_dev *qcom_iommu)
+{
+	int ret;
+
+	ret = clk_prepare_enable(qcom_iommu->iface_clk);
+	if (ret) {
+		dev_err(qcom_iommu->dev, "Couldn't enable iface_clk\n");
+		return ret;
+	}
+
+	ret = clk_prepare_enable(qcom_iommu->bus_clk);
+	if (ret) {
+		dev_err(qcom_iommu->dev, "Couldn't enable bus_clk\n");
+		clk_disable_unprepare(qcom_iommu->iface_clk);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
+{
+	clk_disable_unprepare(qcom_iommu->bus_clk);
+	clk_disable_unprepare(qcom_iommu->iface_clk);
+}
+
+static int get_asid(const struct device_node *np)
+{
+	u32 reg;
+
+	/* read the "reg" property directly to get the relative address
+	 * of the context bank, and calculate the asid from that:
+	 */
+	if (of_property_read_u32_index(np, "reg", 0, &reg))
+		return -ENODEV;
+
+	return reg / 0x1000;      /* context banks are 0x1000 apart */
+}
+
+static int qcom_iommu_ctx_probe(struct platform_device *pdev)
+{
+	struct qcom_iommu_ctx *ctx;
+	struct device *dev = &pdev->dev;
+	struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(dev->parent);
+	struct resource *res;
+	int ret, irq;
+
+	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	ctx->dev = dev;
+	platform_set_drvdata(pdev, ctx);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	ctx->base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(ctx->base))
+		return PTR_ERR(ctx->base);
+
+	irq = platform_get_irq(pdev, 0);
+	if (irq < 0) {
+		dev_err(dev, "failed to get irq\n");
+		return -ENODEV;
+	}
+
+	/* clear IRQs before registering fault handler, just in case the
+	 * boot-loader left us a surprise:
+	 */
+	iommu_writel(ctx, ARM_SMMU_CB_FSR, iommu_readl(ctx, ARM_SMMU_CB_FSR));
+
+	ret = devm_request_irq(dev, irq,
+			       qcom_iommu_fault,
+			       IRQF_SHARED,
+			       "qcom-iommu-fault",
+			       ctx);
+	if (ret) {
+		dev_err(dev, "failed to request IRQ %u\n", irq);
+		return ret;
+	}
+
+	ret = get_asid(dev->of_node);
+	if (ret < 0) {
+		dev_err(dev, "missing reg property\n");
+		return ret;
+	}
+
+	ctx->asid = ret;
+
+	dev_dbg(dev, "found asid %u\n", ctx->asid);
+
+	qcom_iommu->ctxs[ctx->asid - 1] = ctx;
+
+	return 0;
+}
+
+static int qcom_iommu_ctx_remove(struct platform_device *pdev)
+{
+	struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(pdev->dev.parent);
+	struct qcom_iommu_ctx *ctx = platform_get_drvdata(pdev);
+
+	platform_set_drvdata(pdev, NULL);
+
+	qcom_iommu->ctxs[ctx->asid - 1] = NULL;
+
+	return 0;
+}
+
+static const struct of_device_id ctx_of_match[] = {
+	{ .compatible = "qcom,msm-iommu-v1-ns" },
+	{ .compatible = "qcom,msm-iommu-v1-sec" },
+	{ /* sentinel */ }
+};
+
+static struct platform_driver qcom_iommu_ctx_driver = {
+	.driver	= {
+		.name		= "qcom-iommu-ctx",
+		.of_match_table	= of_match_ptr(ctx_of_match),
+	},
+	.probe	= qcom_iommu_ctx_probe,
+	.remove = qcom_iommu_ctx_remove,
+};
+
+static int qcom_iommu_device_probe(struct platform_device *pdev)
+{
+	struct device_node *child;
+	struct qcom_iommu_dev *qcom_iommu;
+	struct device *dev = &pdev->dev;
+	struct resource *res;
+	int ret, sz, max_asid = 0;
+
+	/* find the max asid (which is 1:1 to ctx bank idx), so we know how
+	 * many child ctx devices we have:
+	 */
+	for_each_child_of_node(dev->of_node, child)
+		max_asid = max(max_asid, get_asid(child));
+
+	sz = sizeof(*qcom_iommu) + (max_asid * sizeof(qcom_iommu->ctxs[0]));
+
+	qcom_iommu = devm_kzalloc(dev, sz, GFP_KERNEL);
+	if (!qcom_iommu)
+		return -ENOMEM;
+	qcom_iommu->num_ctxs = max_asid;
+	qcom_iommu->dev = dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (res) {
+		qcom_iommu->local_base = devm_ioremap_resource(dev, res);
+		if (IS_ERR(qcom_iommu->local_base))
+			return PTR_ERR(qcom_iommu->local_base);
+	}
+
+	qcom_iommu->iface_clk = devm_clk_get(dev, "iface");
+	if (IS_ERR(qcom_iommu->iface_clk)) {
+		dev_err(dev, "failed to get iface clock\n");
+		return PTR_ERR(qcom_iommu->iface_clk);
+	}
+
+	qcom_iommu->bus_clk = devm_clk_get(dev, "bus");
+	if (IS_ERR(qcom_iommu->bus_clk)) {
+		dev_err(dev, "failed to get bus clock\n");
+		return PTR_ERR(qcom_iommu->bus_clk);
+	}
+
+	if (of_property_read_u32(dev->of_node, "qcom,iommu-secure-id",
+				 &qcom_iommu->sec_id)) {
+		dev_err(dev, "missing qcom,iommu-secure-id property\n");
+		return -ENODEV;
+	}
+
+	platform_set_drvdata(pdev, qcom_iommu);
+
+	pm_runtime_enable(dev);
+
+	/* register context bank devices, which are child nodes: */
+	ret = devm_of_platform_populate(dev);
+	if (ret) {
+		dev_err(dev, "Failed to populate iommu contexts\n");
+		return ret;
+	}
+
+	ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,
+				     dev_name(dev));
+	if (ret) {
+		dev_err(dev, "Failed to register iommu in sysfs\n");
+		return ret;
+	}
+
+	iommu_device_set_ops(&qcom_iommu->iommu, &qcom_iommu_ops);
+	iommu_device_set_fwnode(&qcom_iommu->iommu, dev->fwnode);
+
+	ret = iommu_device_register(&qcom_iommu->iommu);
+	if (ret) {
+		dev_err(dev, "Failed to register iommu\n");
+		return ret;
+	}
+
+	bus_set_iommu(&platform_bus_type, &qcom_iommu_ops);
+
+	if (qcom_iommu->local_base) {
+		pm_runtime_get_sync(dev);
+		writel_relaxed(0xffffffff, qcom_iommu->local_base + SMMU_INTR_SEL_NS);
+		pm_runtime_put_sync(dev);
+	}
+
+	return 0;
+}
+
+static int qcom_iommu_device_remove(struct platform_device *pdev)
+{
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	bus_set_iommu(&platform_bus_type, NULL);
+
+	pm_runtime_force_suspend(&pdev->dev);
+	platform_set_drvdata(pdev, NULL);
+	iommu_device_sysfs_remove(&qcom_iommu->iommu);
+	iommu_device_unregister(&qcom_iommu->iommu);
+
+	return 0;
+}
+
+#ifdef CONFIG_PM
+static int qcom_iommu_resume(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	return qcom_iommu_enable_clocks(qcom_iommu);
+}
+
+static int qcom_iommu_suspend(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	qcom_iommu_disable_clocks(qcom_iommu);
+
+	return 0;
+}
+#endif
+
+static const struct dev_pm_ops qcom_iommu_pm_ops = {
+	SET_RUNTIME_PM_OPS(qcom_iommu_suspend, qcom_iommu_resume, NULL)
+	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+				pm_runtime_force_resume)
+};
+
+static const struct of_device_id qcom_iommu_of_match[] = {
+	{ .compatible = "qcom,msm-iommu-v1" },
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, qcom_iommu_of_match);
+
+static struct platform_driver qcom_iommu_driver = {
+	.driver	= {
+		.name		= "qcom-iommu",
+		.of_match_table	= of_match_ptr(qcom_iommu_of_match),
+		.pm		= &qcom_iommu_pm_ops,
+	},
+	.probe	= qcom_iommu_device_probe,
+	.remove	= qcom_iommu_device_remove,
+};
+
+static int __init qcom_iommu_init(void)
+{
+	int ret;
+
+	ret = platform_driver_register(&qcom_iommu_ctx_driver);
+	if (ret)
+		return ret;
+
+	ret = platform_driver_register(&qcom_iommu_driver);
+	if (ret)
+		platform_driver_unregister(&qcom_iommu_ctx_driver);
+
+	return ret;
+}
+
+static void __exit qcom_iommu_exit(void)
+{
+	platform_driver_unregister(&qcom_iommu_driver);
+	platform_driver_unregister(&qcom_iommu_ctx_driver);
+}
+
+module_init(qcom_iommu_init);
+module_exit(qcom_iommu_exit);
+
+IOMMU_OF_DECLARE(qcom_iommu_dev, "qcom,msm-iommu-v1", NULL);
+
+MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
+MODULE_LICENSE("GPL v2");
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] iommu: add qcom_iommu
       [not found]         ` <47a738b1-7da5-7043-c16c-4159c6211f7e-5wv7dgnIgG8@public.gmane.org>
  2017-05-26 19:12           ` Rob Clark
@ 2017-06-12 13:25           ` Rob Clark
  1 sibling, 0 replies; 24+ messages in thread
From: Rob Clark @ 2017-06-12 13:25 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Mark Rutland, Rob Herring, linux-arm-msm, Will Deacon,
	Stanimir Varbanov,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Fri, May 26, 2017 at 8:56 AM, Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> wrote:
>> +     struct iommu_group      *group;
>
> This feels weird, since a device can be associated with multiple
> contexts, but only one group, so group-per-context is somewhat redundant
> and smacks of being in the wrong place. Does the firmware ever map
> multiple devices to the same context?


so, actually it seems like I can dump all of this, and just plug
generic_device_group directly into the iommu ops without needing to
care about tracking the iommu_group myself.  At least this appears to work.
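
i.e. just this change to the ops table (sketch):

	static const struct iommu_ops qcom_iommu_ops = {
		/* ... other ops unchanged ... */
		.device_group	= generic_device_group,
	};

and then the per-ctx group pointer and qcom_iommu_device_group() can
go away entirely.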

BR,
-R

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 3/4] iommu: add qcom_iommu
  2017-06-01 13:58 [PATCH 0/4] iommu: add qcom_iommu for early "B" family devices Rob Clark
@ 2017-06-01 13:58 ` Rob Clark
  0 siblings, 0 replies; 24+ messages in thread
From: Rob Clark @ 2017-06-01 13:58 UTC (permalink / raw)
  To: iommu
  Cc: linux-arm-msm, Rob Herring, Robin Murphy, Will Deacon,
	Mark Rutland, Sricharan, Archit Taneja, Stanimir Varbanov,
	Rob Clark

An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that is compatible with how the arm-smmu
driver is designed.  It seems SMMU_SCR1.GASRAE=1, so the global register
space is not accessible.  This means it needs to get configuration from
devicetree instead of setting it up dynamically.

In the end, other than register definitions, there is not much code to
share with arm-smmu (other than what has already been refactored out
into the pgtable helpers).

Signed-off-by: Rob Clark <robdclark@gmail.com>
---
 drivers/iommu/Kconfig      |  10 +
 drivers/iommu/Makefile     |   1 +
 drivers/iommu/qcom_iommu.c | 901 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 912 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6ee3a25..aa4b628 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -367,4 +367,14 @@ config MTK_IOMMU_V1
 
 	  if unsure, say N here.
 
+config QCOM_IOMMU
+	# Note: iommu drivers cannot (yet?) be built as modules
+	bool "Qualcomm IOMMU Support"
+	depends on ARCH_QCOM || COMPILE_TEST
+	select IOMMU_API
+	select IOMMU_IO_PGTABLE_LPAE
+	select ARM_DMA_USE_IOMMU
+	help
+	  Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 0000000..3b578e6
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,901 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include <linux/atomic.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/dma-iommu.h>
+#include <linux/dma-mapping.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/io-64-nonatomic-hi-lo.h>
+#include <linux/iommu.h>
+#include <linux/iopoll.h>
+#include <linux/kconfig.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/of_iommu.h>
+#include <linux/platform_device.h>
+#include <linux/pm.h>
+#include <linux/pm_runtime.h>
+#include <linux/qcom_scm.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS     0x2000
+
+struct qcom_iommu_ctx;
+
+struct qcom_iommu_dev {
+	/* IOMMU core code handle */
+	struct iommu_device	 iommu;
+	struct device		*dev;
+	struct clk		*iface_clk;
+	struct clk		*bus_clk;
+	void __iomem		*local_base;
+	u32			 sec_id;
+	u8			 num_ctxs;
+	struct qcom_iommu_ctx	*ctxs[0];   /* indexed by asid-1 */
+};
+
+struct qcom_iommu_ctx {
+	struct device		*dev;
+	void __iomem		*base;
+	bool			 secure_init;
+	u8			 asid;      /* asid and ctx bank # are 1:1 */
+	struct iommu_group	*group;
+};
+
+struct qcom_iommu_domain {
+	struct io_pgtable_ops	*pgtbl_ops;
+	spinlock_t		 pgtbl_lock;
+	struct mutex		 init_mutex; /* Protects iommu pointer */
+	struct iommu_domain	 domain;
+	struct qcom_iommu_dev	*iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev * to_iommu(struct iommu_fwspec *fwspec)
+{
+	if (!fwspec || fwspec->ops != &qcom_iommu_ops)
+		return NULL;
+	return fwspec->iommu_priv;
+}
+
+static struct qcom_iommu_ctx * to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	if (!qcom_iommu)
+		return NULL;
+	return qcom_iommu->ctxs[asid - 1];
+}
+
+static inline void
+iommu_writel(struct qcom_iommu_ctx *ctx, unsigned reg, u32 val)
+{
+	writel_relaxed(val, ctx->base + reg);
+}
+
+static inline void
+iommu_writeq(struct qcom_iommu_ctx *ctx, unsigned reg, u64 val)
+{
+	writeq_relaxed(val, ctx->base + reg);
+}
+
+static inline u32
+iommu_readl(struct qcom_iommu_ctx *ctx, unsigned reg)
+{
+	return readl_relaxed(ctx->base + reg);
+}
+
+static inline u64
+iommu_readq(struct qcom_iommu_ctx *ctx, unsigned reg)
+{
+	return readq_relaxed(ctx->base + reg);
+}
+
+static void qcom_iommu_tlb_sync(void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		unsigned int val, ret;
+
+		iommu_writel(ctx, ARM_SMMU_CB_TLBSYNC, 0);
+
+		ret = readl_poll_timeout(ctx->base + ARM_SMMU_CB_TLBSTATUS, val,
+					 (val & 0x1) == 0, 0, 5000000);
+		if (ret)
+			dev_err(ctx->dev, "timeout waiting for TLB SYNC\n");
+	}
+}
+
+static void qcom_iommu_tlb_inv_context(void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		iommu_writel(ctx, ARM_SMMU_CB_S1_TLBIASID, ctx->asid);
+	}
+
+	qcom_iommu_tlb_sync(cookie);
+}
+
+static void qcom_iommu_tlb_inv_range_nosync(unsigned long iova, size_t size,
+					    size_t granule, bool leaf, void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i, reg;
+
+	reg = leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		size_t s = size;
+
+		iova &= ~0xfffUL;
+		iova |= ctx->asid;
+		do {
+			iommu_writel(ctx, reg, iova);
+			iova += granule;
+		} while (s -= granule);
+	}
+}
+
+static const struct iommu_gather_ops qcom_gather_ops = {
+	.tlb_flush_all	= qcom_iommu_tlb_inv_context,
+	.tlb_add_flush	= qcom_iommu_tlb_inv_range_nosync,
+	.tlb_sync	= qcom_iommu_tlb_sync,
+};
+
+static irqreturn_t qcom_iommu_fault(int irq, void *dev)
+{
+	struct qcom_iommu_ctx *ctx = dev;
+	u32 fsr, fsynr;
+	u64 iova;
+
+	fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
+
+	if (!(fsr & FSR_FAULT))
+		return IRQ_NONE;
+
+	fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
+	iova = iommu_readq(ctx, ARM_SMMU_CB_FAR);
+
+	dev_err_ratelimited(ctx->dev,
+			    "Unhandled context fault: fsr=0x%x, "
+			    "iova=0x%016llx, fsynr=0x%x, cb=%d\n",
+			    fsr, iova, fsynr, ctx->asid);
+
+	iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
+
+	return IRQ_HANDLED;
+}
+
+static int qcom_iommu_init_domain(struct iommu_domain *domain,
+				  struct qcom_iommu_dev *qcom_iommu,
+				  struct iommu_fwspec *fwspec)
+{
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *pgtbl_ops;
+	struct io_pgtable_cfg pgtbl_cfg;
+	int i, ret = 0;
+	u32 reg;
+
+	mutex_lock(&qcom_domain->init_mutex);
+	if (qcom_domain->iommu)
+		goto out_unlock;
+
+	pgtbl_cfg = (struct io_pgtable_cfg) {
+		.pgsize_bitmap	= qcom_iommu_ops.pgsize_bitmap,
+		.ias		= 32,
+		.oas		= 40,
+		.tlb		= &qcom_gather_ops,
+		.iommu_dev	= qcom_iommu->dev,
+	};
+
+	qcom_domain->iommu = qcom_iommu;
+	pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, fwspec);
+	if (!pgtbl_ops) {
+		dev_err(qcom_iommu->dev, "failed to allocate pagetable ops\n");
+		ret = -ENOMEM;
+		goto out_clear_iommu;
+	}
+
+	/* Update the domain's page sizes to reflect the page table format */
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	domain->geometry.aperture_end = (1ULL << pgtbl_cfg.ias) - 1;
+	domain->geometry.force_aperture = true;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		if (!ctx->secure_init) {
+			ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, ctx->asid);
+			if (ret) {
+				dev_err(qcom_iommu->dev, "secure init failed: %d\n", ret);
+				goto out_clear_iommu;
+			}
+			ctx->secure_init = true;
+		}
+
+		/* TTBRs */
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
+				((u64)ctx->asid << TTBRn_ASID_SHIFT));
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
+				((u64)ctx->asid << TTBRn_ASID_SHIFT));
+
+		/* TTBCR */
+		iommu_writel(ctx, ARM_SMMU_CB_TTBCR2,
+				(pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
+				TTBCR2_SEP_UPSTREAM);
+		iommu_writel(ctx, ARM_SMMU_CB_TTBCR,
+				pgtbl_cfg.arm_lpae_s1_cfg.tcr);
+
+		/* MAIRs (stage-1 only) */
+		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
+				pgtbl_cfg.arm_lpae_s1_cfg.mair[0]);
+		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR1,
+				pgtbl_cfg.arm_lpae_s1_cfg.mair[1]);
+
+		/* SCTLR */
+		reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
+			SCTLR_M | SCTLR_S1_ASIDPNE;
+
+		if (IS_ENABLED(CONFIG_BIG_ENDIAN))
+			reg |= SCTLR_E;
+
+		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
+	}
+
+	mutex_unlock(&qcom_domain->init_mutex);
+
+	/* Publish page table ops for map/unmap */
+	qcom_domain->pgtbl_ops = pgtbl_ops;
+
+	return 0;
+
+out_clear_iommu:
+	qcom_domain->iommu = NULL;
+out_unlock:
+	mutex_unlock(&qcom_domain->init_mutex);
+	return ret;
+}
+
+static struct iommu_domain *qcom_iommu_domain_alloc(unsigned type)
+{
+	struct qcom_iommu_domain *qcom_domain;
+
+	if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
+		return NULL;
+	/*
+	 * Allocate the domain and initialise some of its data structures.
+	 * We can't really do anything meaningful until we've added a
+	 * master.
+	 */
+	qcom_domain = kzalloc(sizeof(*qcom_domain), GFP_KERNEL);
+	if (!qcom_domain)
+		return NULL;
+
+	if (type == IOMMU_DOMAIN_DMA &&
+	    iommu_get_dma_cookie(&qcom_domain->domain)) {
+		kfree(qcom_domain);
+		return NULL;
+	}
+
+	mutex_init(&qcom_domain->init_mutex);
+	spin_lock_init(&qcom_domain->pgtbl_lock);
+
+	return &qcom_domain->domain;
+}
+
+static void qcom_iommu_domain_free(struct iommu_domain *domain)
+{
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+
+	iommu_put_dma_cookie(domain);
+
+	if (qcom_domain->iommu) {
+		/* NOTE: unmap can be called after client device is powered
+		 * off, for example, with GPUs or anything involving dma-buf.
+		 * So we cannot rely on the device_link.  Make sure the IOMMU
+		 * is on to avoid unclocked accesses in the TLB inv path:
+		 */
+		pm_runtime_get_sync(qcom_domain->iommu->dev);
+		free_io_pgtable_ops(qcom_domain->pgtbl_ops);
+		pm_runtime_put_sync(qcom_domain->iommu->dev);
+	}
+
+	kfree(qcom_domain);
+}
+
+static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	int ret;
+
+	if (!qcom_iommu) {
+		dev_err(dev, "cannot attach to IOMMU, is it on the same bus?\n");
+		return -ENXIO;
+	}
+
+	/* Ensure that the domain is finalized */
+	pm_runtime_get_sync(qcom_iommu->dev);
+	ret = qcom_iommu_init_domain(domain, qcom_iommu, dev->iommu_fwspec);
+	pm_runtime_put_sync(qcom_iommu->dev);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Sanity check the domain. We don't support domains across
+	 * different IOMMUs.
+	 */
+	if (qcom_domain->iommu != qcom_iommu) {
+		dev_err(dev, "cannot attach to IOMMU %s while already "
+			"attached to domain on IOMMU %s\n",
+			dev_name(qcom_domain->iommu->dev),
+			dev_name(qcom_iommu->dev));
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	unsigned i;
+
+	if (!qcom_domain->iommu)
+		return;
+
+	pm_runtime_get_sync(qcom_iommu->dev);
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		/* Disable the context bank: */
+		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, 0);
+	}
+	pm_runtime_put_sync(qcom_iommu->dev);
+}
+
+static int qcom_iommu_map(struct iommu_domain *domain, unsigned long iova,
+			  phys_addr_t paddr, size_t size, int prot)
+{
+	int ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return -ENODEV;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->map(ops, iova, paddr, size, prot);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+	return ret;
+}
+
+static size_t qcom_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
+			       size_t size)
+{
+	size_t ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	/* NOTE: unmap can be called after client device is powered off,
+	 * for example, with GPUs or anything involving dma-buf.  So we
+	 * cannot rely on the device_link.  Make sure the IOMMU is on to
+	 * avoid unclocked accesses in the TLB inv path:
+	 */
+	pm_runtime_get_sync(qcom_domain->iommu->dev);
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->unmap(ops, iova, size);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+	pm_runtime_put_sync(qcom_domain->iommu->dev);
+
+	return ret;
+}
+
+static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
+					   dma_addr_t iova)
+{
+	phys_addr_t ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->iova_to_phys(ops, iova);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+
+	return ret;
+}
+
+static bool qcom_iommu_capable(enum iommu_cap cap)
+{
+	switch (cap) {
+	case IOMMU_CAP_CACHE_COHERENCY:
+		/*
+		 * Return true here as the SMMU can always send out coherent
+		 * requests.
+		 */
+		return true;
+	case IOMMU_CAP_NOEXEC:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static int qcom_iommu_add_device(struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+	struct iommu_group *group;
+	struct device_link *link;
+
+	if (!qcom_iommu)
+		return -ENODEV;
+
+	/*
+	 * Establish the link between iommu and master, so that the
+	 * iommu gets runtime enabled/disabled as per the master's
+	 * needs.
+	 */
+	link = device_link_add(dev, qcom_iommu->dev, DL_FLAG_PM_RUNTIME);
+	if (!link) {
+		dev_err(qcom_iommu->dev, "Unable to create device link between %s and %s\n",
+			dev_name(qcom_iommu->dev), dev_name(dev));
+		return -ENODEV;
+	}
+
+	group = iommu_group_get_for_dev(dev);
+	if (IS_ERR_OR_NULL(group))
+		return PTR_ERR_OR_ZERO(group);
+
+	iommu_group_put(group);
+	iommu_device_link(&qcom_iommu->iommu, dev);
+
+	return 0;
+}
+
+static void qcom_iommu_remove_device(struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+
+	if (!qcom_iommu)
+		return;
+
+	iommu_device_unlink(&qcom_iommu->iommu, dev);
+	iommu_group_remove_device(dev);
+	iommu_fwspec_free(dev);
+}
+
+static struct iommu_group *qcom_iommu_device_group(struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_group *group = NULL;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		if (!ctx)
+			return ERR_PTR(-ENODEV);
+
+		if (group && ctx->group && group != ctx->group)
+			return ERR_PTR(-EINVAL);
+
+		group = ctx->group;
+	}
+
+	if (group)
+		return iommu_group_ref_get(group);
+
+	group = generic_device_group(dev);
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		ctx->group = iommu_group_ref_get(group);
+	}
+
+	return group;
+}
+
+static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
+{
+	struct qcom_iommu_dev *qcom_iommu;
+	struct platform_device *iommu_pdev;
+	unsigned asid = args->args[0];
+
+	if (args->args_count != 1) {
+		dev_err(dev, "incorrect number of iommu params found for %s "
+			"(found %d, expected 1)\n",
+			args->np->full_name, args->args_count);
+		return -EINVAL;
+	}
+
+	iommu_pdev = of_find_device_by_node(args->np);
+	if (WARN_ON(!iommu_pdev))
+		return -EINVAL;
+
+	qcom_iommu = platform_get_drvdata(iommu_pdev);
+
+	/* make sure the asid specified in dt is valid, so we don't have
+	 * to sanity check this elsewhere, since 'asid - 1' is used to
+	 * index into qcom_iommu->ctxs:
+	 */
+	if (WARN_ON(asid < 1) ||
+	    WARN_ON(asid > qcom_iommu->num_ctxs))
+		return -EINVAL;
+
+	if (!dev->iommu_fwspec->iommu_priv) {
+		dev->iommu_fwspec->iommu_priv = qcom_iommu;
+	} else {
+		/* make sure devices iommus dt node isn't referring to
+		 * multiple different iommu devices.  Multiple context
+		 * banks are ok, but multiple devices are not:
+		 */
+		if (WARN_ON(qcom_iommu != dev->iommu_fwspec->iommu_priv))
+			return -EINVAL;
+	}
+
+	return iommu_fwspec_add_ids(dev, &asid, 1);
+}
+
+static const struct iommu_ops qcom_iommu_ops = {
+	.capable	= qcom_iommu_capable,
+	.domain_alloc	= qcom_iommu_domain_alloc,
+	.domain_free	= qcom_iommu_domain_free,
+	.attach_dev	= qcom_iommu_attach_dev,
+	.detach_dev	= qcom_iommu_detach_dev,
+	.map		= qcom_iommu_map,
+	.unmap		= qcom_iommu_unmap,
+	.map_sg		= default_iommu_map_sg,
+	.iova_to_phys	= qcom_iommu_iova_to_phys,
+	.add_device	= qcom_iommu_add_device,
+	.remove_device	= qcom_iommu_remove_device,
+	.device_group	= qcom_iommu_device_group,
+	.of_xlate	= qcom_iommu_of_xlate,
+	.pgsize_bitmap	= SZ_4K | SZ_64K | SZ_1M | SZ_16M,
+};
+
+static int qcom_iommu_enable_clocks(struct qcom_iommu_dev *qcom_iommu)
+{
+	int ret;
+
+	ret = clk_prepare_enable(qcom_iommu->iface_clk);
+	if (ret) {
+		dev_err(qcom_iommu->dev, "Couldn't enable iface_clk\n");
+		return ret;
+	}
+
+	ret = clk_prepare_enable(qcom_iommu->bus_clk);
+	if (ret) {
+		dev_err(qcom_iommu->dev, "Couldn't enable bus_clk\n");
+		clk_disable_unprepare(qcom_iommu->iface_clk);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
+{
+	clk_disable_unprepare(qcom_iommu->bus_clk);
+	clk_disable_unprepare(qcom_iommu->iface_clk);
+}
+
+static int get_asid(const struct device_node *np)
+{
+	u32 reg;
+
+	/* read the "reg" property directly to get the relative address
+	 * of the context bank, and calculate the asid from that:
+	 */
+	if (of_property_read_u32_index(np, "reg", 0, &reg))
+		return -ENODEV;
+
+	return reg / 0x1000;      /* context banks are 0x1000 apart */
+}
+
+static int qcom_iommu_ctx_probe(struct platform_device *pdev)
+{
+	struct qcom_iommu_ctx *ctx;
+	struct device *dev = &pdev->dev;
+	struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(dev->parent);
+	struct resource *res;
+	int ret, irq;
+
+	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	ctx->dev = dev;
+	platform_set_drvdata(pdev, ctx);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	ctx->base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(ctx->base))
+		return PTR_ERR(ctx->base);
+
+	irq = platform_get_irq(pdev, 0);
+	if (irq < 0) {
+		dev_err(dev, "failed to get irq\n");
+		return -ENODEV;
+	}
+
+	/* clear IRQs before registering fault handler, just in case the
+	 * boot-loader left us a surprise:
+	 */
+	iommu_writel(ctx, ARM_SMMU_CB_FSR, iommu_readl(ctx, ARM_SMMU_CB_FSR));
+
+	ret = devm_request_irq(dev, irq,
+			       qcom_iommu_fault,
+			       IRQF_SHARED,
+			       "qcom-iommu-fault",
+			       ctx);
+	if (ret) {
+		dev_err(dev, "failed to request IRQ %u\n", irq);
+		return ret;
+	}
+
+	ret = get_asid(dev->of_node);
+	if (ret < 0) {
+		dev_err(dev, "missing reg property\n");
+		return ret;
+	}
+
+	ctx->asid = ret;
+
+	dev_dbg(dev, "found asid %u\n", ctx->asid);
+
+	qcom_iommu->ctxs[ctx->asid - 1] = ctx;
+
+	return 0;
+}
+
+static int qcom_iommu_ctx_remove(struct platform_device *pdev)
+{
+	struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(pdev->dev.parent);
+	struct qcom_iommu_ctx *ctx = platform_get_drvdata(pdev);
+
+	iommu_group_put(ctx->group);
+	platform_set_drvdata(pdev, NULL);
+
+	qcom_iommu->ctxs[ctx->asid - 1] = NULL;
+
+	return 0;
+}
+
+static const struct of_device_id ctx_of_match[] = {
+	{ .compatible = "qcom,msm-iommu-v1-ns" },
+	{ .compatible = "qcom,msm-iommu-v1-sec" },
+	{ /* sentinel */ }
+};
+
+static struct platform_driver qcom_iommu_ctx_driver = {
+	.driver	= {
+		.name		= "qcom-iommu-ctx",
+		.of_match_table	= of_match_ptr(ctx_of_match),
+	},
+	.probe	= qcom_iommu_ctx_probe,
+	.remove = qcom_iommu_ctx_remove,
+};
+
+static int qcom_iommu_device_probe(struct platform_device *pdev)
+{
+	struct device_node *child;
+	struct qcom_iommu_dev *qcom_iommu;
+	struct device *dev = &pdev->dev;
+	struct resource *res;
+	int ret, sz, max_asid = 0;
+
+	/* find the max asid (which is 1:1 to ctx bank idx), so we know how
+	 * many child ctx devices we have:
+	 */
+	for_each_child_of_node(dev->of_node, child)
+		max_asid = max(max_asid, get_asid(child));
+
+	sz = sizeof(*qcom_iommu) + (max_asid * sizeof(qcom_iommu->ctxs[0]));
+
+	qcom_iommu = devm_kzalloc(dev, sz, GFP_KERNEL);
+	if (!qcom_iommu)
+		return -ENOMEM;
+	qcom_iommu->num_ctxs = max_asid;
+	qcom_iommu->dev = dev;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (res) {
+		qcom_iommu->local_base = devm_ioremap_resource(dev, res);
+		if (IS_ERR(qcom_iommu->local_base))
+			return PTR_ERR(qcom_iommu->local_base);
+	}
+
+	qcom_iommu->iface_clk = devm_clk_get(dev, "iface");
+	if (IS_ERR(qcom_iommu->iface_clk)) {
+		dev_err(dev, "failed to get iface clock\n");
+		return PTR_ERR(qcom_iommu->iface_clk);
+	}
+
+	qcom_iommu->bus_clk = devm_clk_get(dev, "bus");
+	if (IS_ERR(qcom_iommu->bus_clk)) {
+		dev_err(dev, "failed to get bus clock\n");
+		return PTR_ERR(qcom_iommu->bus_clk);
+	}
+
+	if (of_property_read_u32(dev->of_node, "qcom,iommu-secure-id",
+				 &qcom_iommu->sec_id)) {
+		dev_err(dev, "missing qcom,iommu-secure-id property\n");
+		return -ENODEV;
+	}
+
+	platform_set_drvdata(pdev, qcom_iommu);
+
+	pm_runtime_enable(dev);
+
+	/* register context bank devices, which are child nodes: */
+	ret = devm_of_platform_populate(dev);
+	if (ret) {
+		dev_err(dev, "Failed to populate iommu contexts\n");
+		return ret;
+	}
+
+	ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,
+				     dev_name(dev));
+	if (ret) {
+		dev_err(dev, "Failed to register iommu in sysfs\n");
+		return ret;
+	}
+
+	iommu_device_set_ops(&qcom_iommu->iommu, &qcom_iommu_ops);
+	iommu_device_set_fwnode(&qcom_iommu->iommu, dev->fwnode);
+
+	ret = iommu_device_register(&qcom_iommu->iommu);
+	if (ret) {
+		dev_err(dev, "Failed to register iommu\n");
+		return ret;
+	}
+
+	bus_set_iommu(&platform_bus_type, &qcom_iommu_ops);
+
+	if (qcom_iommu->local_base) {
+		pm_runtime_get_sync(dev);
+		writel_relaxed(0xffffffff, qcom_iommu->local_base + SMMU_INTR_SEL_NS);
+		pm_runtime_put_sync(dev);
+	}
+
+	return 0;
+}
+
+static int qcom_iommu_device_remove(struct platform_device *pdev)
+{
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	bus_set_iommu(&platform_bus_type, NULL);
+
+	pm_runtime_force_suspend(&pdev->dev);
+	platform_set_drvdata(pdev, NULL);
+	iommu_device_sysfs_remove(&qcom_iommu->iommu);
+	iommu_device_unregister(&qcom_iommu->iommu);
+
+	return 0;
+}
+
+#ifdef CONFIG_PM
+static int qcom_iommu_resume(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	return qcom_iommu_enable_clocks(qcom_iommu);
+}
+
+static int qcom_iommu_suspend(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	qcom_iommu_disable_clocks(qcom_iommu);
+
+	return 0;
+}
+#endif
+
+static const struct dev_pm_ops qcom_iommu_pm_ops = {
+	SET_RUNTIME_PM_OPS(qcom_iommu_suspend, qcom_iommu_resume, NULL)
+	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+				pm_runtime_force_resume)
+};
+
+static const struct of_device_id qcom_iommu_of_match[] = {
+	{ .compatible = "qcom,msm-iommu-v1" },
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, qcom_iommu_of_match);
+
+static struct platform_driver qcom_iommu_driver = {
+	.driver	= {
+		.name		= "qcom-iommu",
+		.of_match_table	= of_match_ptr(qcom_iommu_of_match),
+		.pm		= &qcom_iommu_pm_ops,
+	},
+	.probe	= qcom_iommu_device_probe,
+	.remove	= qcom_iommu_device_remove,
+};
+
+static int __init qcom_iommu_init(void)
+{
+	int ret;
+
+	ret = platform_driver_register(&qcom_iommu_ctx_driver);
+	if (ret)
+		return ret;
+
+	ret = platform_driver_register(&qcom_iommu_driver);
+	if (ret)
+		platform_driver_unregister(&qcom_iommu_ctx_driver);
+
+	return ret;
+}
+
+static void __exit qcom_iommu_exit(void)
+{
+	platform_driver_unregister(&qcom_iommu_driver);
+	platform_driver_unregister(&qcom_iommu_ctx_driver);
+}
+
+module_init(qcom_iommu_init);
+module_exit(qcom_iommu_exit);
+
+IOMMU_OF_DECLARE(qcom_iommu_dev, "qcom,msm-iommu-v1", NULL);
+
+MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
+MODULE_LICENSE("GPL v2");
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] iommu: add qcom_iommu
       [not found]         ` <47a738b1-7da5-7043-c16c-4159c6211f7e-5wv7dgnIgG8@public.gmane.org>
@ 2017-05-26 19:12           ` Rob Clark
  2017-06-12 13:25           ` Rob Clark
  1 sibling, 0 replies; 24+ messages in thread
From: Rob Clark @ 2017-05-26 19:12 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Mark Rutland, Rob Herring, linux-arm-msm, Will Deacon,
	Stanimir Varbanov,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Fri, May 26, 2017 at 8:56 AM, Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> wrote:
> On 25/05/17 18:33, Rob Clark wrote:
>> An iommu driver for Qualcomm "B" family devices which do not completely
>> implement the ARM SMMU spec.  These devices have context-bank register
>> layout that is similar to ARM SMMU, but no global register space (or at
>> least not one that is accessible).
>
> I still object to this description, because the SMMU_SCR1.GASRAE = 1
> usage model is explicitly *specified* by the ARM SMMU spec! It's merely
> that the arm-smmu driver is designed for the case where we do have
> control of the global space and stage 2 contexts.

hmm, ok.. well, I've no idea what the secure world is doing, but it sounds
plausible that GASRAE is set to 1.. at least that would match how
things behave.

In that case, I wonder if the driver should have a more generic name
than "qcom_iommu" (and likewise for compat strings, etc)?  I've really
no idea if qcom is the only one doing this.  In either case,
suggestions welcome.  (I had assumed someone would have bikeshedded
the name/compat-strings by now ;-))

>> Signed-off-by: Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> ---
>> v1: original
>> v2: bindings cleanups and kconfig issues that kbuild robot pointed out
>> v3: fix issues pointed out by Rob H. and actually make device removal
>>     work
>> v4: fix WARN_ON() splats reported by Archit
>> v5: some fixes to build as a module.. note that it cannot actually
>>     be built as a module yet (at minimum a bunch of other iommu syms
>>     that are needed are not exported, but there may be more to it
>>     than that), but at least qcom_iommu is ready should it become
>>     possible to build iommu drivers as modules.
>
> Note that with the 4.12 probe-deferral changes, modules totally work!
> For any master which probed before the IOMMU driver was loaded, you can
> then hook them up after the fact by just unbinding and rebinding their
> drivers - it's really cool.

hmm, ok, last time I tried this was 4.11 + iommu-next for 4.12 (plus a
couple other -next trees), since 4.12-rc1 wasn't out yet.. but at that
time, we needed at least a few EXPORT_SYMBOL()s, plus probably some
sort of fix for the iommu bug I was trying to fix/paper-over in
<20170505180837.11326-1-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> (at least if you wanted
module unload to work).  For the former issue, I can send patches to
add EXPORT_SYMBOL()s (or is EXPORT_SYMBOL_GPL() preferred?).. for
the latter, well I spend 80% of my time working on the userspace part of
the gpu driver stack, and 80% of my kernel time working in drm, so I'll
leave this to someone who spends more than 4% of their time working on
the iommu subsystem ;-)

>> v6: Add additional pm-runtime get/puts around paths that can hit
>>     TLB inv, to avoid unclocked register access if device using the
>>     iommu is not powered on.  And pre-emptively clear interrupts
>>     before registering IRQ handler just in case the bootloader has
>>     left us a surpise.
>>
>>  drivers/iommu/Kconfig      |  10 +
>>  drivers/iommu/Makefile     |   1 +
>>  drivers/iommu/qcom_iommu.c | 878 +++++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 889 insertions(+)
>>  create mode 100644 drivers/iommu/qcom_iommu.c
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index 6ee3a25..aa4b628 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -367,4 +367,14 @@ config MTK_IOMMU_V1
>>
>>         if unsure, say N here.
>>
>> +config QCOM_IOMMU
>> +     # Note: iommu drivers cannot (yet?) be built as modules
>> +     bool "Qualcomm IOMMU Support"
>> +     depends on ARCH_QCOM || COMPILE_TEST
>> +     select IOMMU_API
>> +     select IOMMU_IO_PGTABLE_LPAE
>> +     select ARM_DMA_USE_IOMMU
>> +     help
>> +       Support for IOMMU on certain Qualcomm SoCs.
>> +
>>  endif # IOMMU_SUPPORT
>> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
>> index 195f7b9..b910aea 100644
>> --- a/drivers/iommu/Makefile
>> +++ b/drivers/iommu/Makefile
>> @@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
>>  obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
>>  obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
>>  obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
>> +obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
>> diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
>> new file mode 100644
>> index 0000000..bfaf97c
>> --- /dev/null
>> +++ b/drivers/iommu/qcom_iommu.c
>> @@ -0,0 +1,878 @@
>> +/*
>> + * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> + *
>> + * Copyright (C) 2013 ARM Limited
>> + * Copyright (C) 2017 Red Hat
>> + */
>> +
>> +#include <linux/atomic.h>
>> +#include <linux/clk.h>
>> +#include <linux/delay.h>
>> +#include <linux/dma-iommu.h>
>> +#include <linux/dma-mapping.h>
>> +#include <linux/err.h>
>> +#include <linux/interrupt.h>
>> +#include <linux/io.h>
>> +#include <linux/io-64-nonatomic-hi-lo.h>
>> +#include <linux/iommu.h>
>> +#include <linux/iopoll.h>
>> +#include <linux/kconfig.h>
>> +#include <linux/module.h>
>> +#include <linux/mutex.h>
>> +#include <linux/of.h>
>> +#include <linux/of_address.h>
>> +#include <linux/of_device.h>
>> +#include <linux/of_iommu.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/pm.h>
>> +#include <linux/pm_runtime.h>
>> +#include <linux/qcom_scm.h>
>> +#include <linux/slab.h>
>> +#include <linux/spinlock.h>
>> +
>> +#include "io-pgtable.h"
>> +#include "arm-smmu-regs.h"
>> +
>> +#define SMMU_INTR_SEL_NS     0x2000
>> +
>> +struct qcom_iommu_dev {
>> +     /* IOMMU core code handle */
>> +     struct iommu_device      iommu;
>> +     struct device           *dev;
>> +     struct clk              *iface_clk;
>> +     struct clk              *bus_clk;
>> +     void __iomem            *local_base;
>> +     u32                      sec_id;
>> +     struct list_head         context_list;   /* list of qcom_iommu_context */
>
> Why not just an array? You can already determine at probe time what the
> maximum size needs to be, and it looks like it would make a fair chunk
> of the context code less horrible by turning it into simple indexing.

I guess if there is an easy way to figure out the number of child nodes,
then that would simplify things.
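
e.g. something like this in probe (sketch, untested):

	/* assumes the ctx banks are the only child nodes of the iommu node */
	int num_ctxs = of_get_child_count(dev->of_node);

	/* tail array of ctx pointers, indexed by asid-1 */
	qcom_iommu = devm_kzalloc(dev, sizeof(*qcom_iommu) +
				  num_ctxs * sizeof(qcom_iommu->ctxs[0]),
				  GFP_KERNEL);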

>> +};
>> +
>> +struct qcom_iommu_ctx {
>> +     struct device           *dev;
>> +     void __iomem            *base;
>> +     unsigned int             irq;
>> +     bool                     secure_init;
>> +     u32                      asid;      /* asid and ctx bank # are 1:1 */
>
> u32? ASIDs are only 8 bits, and even that's already twice the maximum
> possible number of contexts...

I guess I was thinking that it wouldn't change the structure size
either way, but I can change it to u8

>> +     struct iommu_group      *group;
>
> This feels weird, since a device can be associated with multiple
> contexts, but only one group, so group-per-context is somewhat redundant
> and smacks of being in the wrong place. Does the firmware ever map
> multiple devices to the same context?

I *think* it is always the other direction.. one or more cb's to a device.

tbh, I don't quite remember my reasoning here.. it already seems like
a long time ago that I wrote this.  But I do remember being a bit
fuzzy about what iommu_group was.  (Some kerneldoc really wouldn't
hurt)

>> +     struct list_head         node;      /* head in qcom_iommu_device::context_list */
>> +};
>> +
>> +struct qcom_iommu_domain {
>> +     struct io_pgtable_ops   *pgtbl_ops;
>> +     spinlock_t               pgtbl_lock;
>> +     struct mutex             init_mutex; /* Protects iommu pointer */
>> +     struct iommu_domain      domain;
>> +     struct qcom_iommu_dev   *iommu;
>> +};
>> +
>> +static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
>> +{
>> +     return container_of(dom, struct qcom_iommu_domain, domain);
>> +}
>> +
>> +static const struct iommu_ops qcom_iommu_ops;
>> +
>> +static struct qcom_iommu_dev * __to_iommu(struct iommu_fwspec *fwspec)
>> +{
>> +     if (!fwspec || fwspec->ops != &qcom_iommu_ops)
>> +             return NULL;
>> +     return fwspec->iommu_priv;
>> +}
>> +
>> +static struct qcom_iommu_dev * to_iommu(struct iommu_fwspec *fwspec)
>> +{
>> +     struct qcom_iommu_dev *qcom_iommu = __to_iommu(fwspec);
>> +     WARN_ON(!qcom_iommu);
>
> This seems unnecessary - if your private data has somehow disappeared
> from under your nose you've almost certainly got much bigger problems.

this sorta thing is useful when debugging changes to the driver, IMO..
but I can remove it if you feel strongly.

>> +     return qcom_iommu;
>> +}
>> +
>> +static struct qcom_iommu_ctx * to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
>> +{
>> +     struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
>> +     struct qcom_iommu_ctx *ctx;
>> +
>> +     if (!qcom_iommu)
>> +             return NULL;
>> +
>> +     list_for_each_entry(ctx, &qcom_iommu->context_list, node)
>> +             if (ctx->asid == asid)
>> +                     return ctx;
>> +
>> +     WARN(1, "no ctx for asid %u\n", asid);
>
> Verify this once in of_xlate() or add_device(), and reject the device
> up-front if the DT turns out to be bogus. Anything more than that is
> edging towards WARN_ON(1 != 1) paranoia territory.

ok
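
i.e. validate the asid once in of_xlate(), something like this (sketch,
assuming the ctxs-array conversion above, with a num_ctxs count):

	/* 'asid - 1' is used to index into qcom_iommu->ctxs, so reject
	 * anything out of range up-front:
	 */
	if (WARN_ON(asid < 1) || WARN_ON(asid > qcom_iommu->num_ctxs))
		return -EINVAL;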

>> +     return NULL;
>> +}
>> +
>> +static inline void
>> +iommu_writel(struct qcom_iommu_ctx *ctx, unsigned reg, u32 val)
>> +{
>> +     writel_relaxed(val, ctx->base + reg);
>> +}
>> +
>> +static inline void
>> +iommu_writeq(struct qcom_iommu_ctx *ctx, unsigned reg, u64 val)
>> +{
>> +     writeq_relaxed(val, ctx->base + reg);
>> +}
>> +
>> +static inline u32
>> +iommu_readl(struct qcom_iommu_ctx *ctx, unsigned reg)
>> +{
>> +     return readl_relaxed(ctx->base + reg);
>> +}
>> +
>> +static inline u64
>> +iommu_readq(struct qcom_iommu_ctx *ctx, unsigned reg)
>> +{
>> +     return readq_relaxed(ctx->base + reg);
>> +}
>> +
>> +static void __sync_tlb(struct qcom_iommu_ctx *ctx)
>> +{
>> +     unsigned int val;
>> +     unsigned int ret;
>> +
>> +     iommu_writel(ctx, ARM_SMMU_CB_TLBSYNC, 0);
>> +
>> +     ret = readl_poll_timeout(ctx->base + ARM_SMMU_CB_TLBSTATUS, val,
>> +                              (val & 0x1) == 0, 0, 5000000);
>> +     if (ret)
>> +             dev_err(ctx->dev, "timeout waiting for TLB SYNC\n");
>> +}
>> +
>> +static void qcom_iommu_tlb_sync(void *cookie)
>> +{
>> +     struct iommu_fwspec *fwspec = cookie;
>> +     unsigned i;
>> +
>> +     for (i = 0; i < fwspec->num_ids; i++)
>> +             __sync_tlb(to_ctx(fwspec, fwspec->ids[i]));
>> +}
>> +
>> +static void qcom_iommu_tlb_inv_context(void *cookie)
>> +{
>> +     struct iommu_fwspec *fwspec = cookie;
>> +     unsigned i;
>> +
>> +     for (i = 0; i < fwspec->num_ids; i++) {
>> +             struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
>> +
>> +             iommu_writel(ctx, ARM_SMMU_CB_S1_TLBIASID, ctx->asid);
>> +             __sync_tlb(ctx);
>> +     }
>
> Wouldn't it be nicer and more efficient to issue all the invalidations
> first, *then* wait for them all to finish? That way you also wouldn't
> need to split up sync_tlb() either.

hmm, I suppose so.. although unsplitting sync_tlb() would double the
number of to_ctx()'s.  But if I can make context_list into an array,
that wouldn't be a problem.
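
Something like this, then (sketch):

	static void qcom_iommu_tlb_inv_context(void *cookie)
	{
		struct iommu_fwspec *fwspec = cookie;
		unsigned i;

		/* issue all the invalidations first... */
		for (i = 0; i < fwspec->num_ids; i++) {
			struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);

			iommu_writel(ctx, ARM_SMMU_CB_S1_TLBIASID, ctx->asid);
		}

		/* ...then wait once for all of them to finish */
		qcom_iommu_tlb_sync(cookie);
	}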

>> +}
>> +
>> +static void qcom_iommu_tlb_inv_range_nosync(unsigned long iova, size_t size,
>> +                                         size_t granule, bool leaf, void *cookie)
>> +{
>> +     struct iommu_fwspec *fwspec = cookie;
>> +     unsigned i, reg;
>> +
>> +     reg = leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA;
>> +
>> +     for (i = 0; i < fwspec->num_ids; i++) {
>> +             struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
>> +             size_t s = size;
>> +
>> +             iova &= ~12UL;
>> +             iova |= ctx->asid;
>> +             do {
>> +                     iommu_writel(ctx, reg, iova);
>> +                     iova += granule;
>> +             } while (s -= granule);
>> +     }
>> +}
>> +
>> +static const struct iommu_gather_ops qcom_gather_ops = {
>> +     .tlb_flush_all  = qcom_iommu_tlb_inv_context,
>> +     .tlb_add_flush  = qcom_iommu_tlb_inv_range_nosync,
>> +     .tlb_sync       = qcom_iommu_tlb_sync,
>> +};
>> +
>> +static irqreturn_t qcom_iommu_fault(int irq, void *dev)
>> +{
>> +     struct qcom_iommu_ctx *ctx = dev;
>> +     u32 fsr, fsynr;
>> +     unsigned long iova;
>> +
>> +     fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
>> +
>> +     if (!(fsr & FSR_FAULT))
>> +             return IRQ_NONE;
>> +
>> +     fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
>> +     iova = iommu_readq(ctx, ARM_SMMU_CB_FAR);
>
> readl()? There's not much point reading the upper word if it's just
> going to be immediately truncated (and it seems unlikely that you'd ever
> see an address outside the input range anyway).

I did have some uncertainty about whether we'll ever see a device
where we'd want to use something other than ARM_32_LPAE_S1..  but I
guess by the time the first 64b peripherals (adreno 5xx) appeared, I
think everything could use arm-smmu.  I probably didn't mean to use
'unsigned long' here though ;-)

>> +
>> +     dev_err_ratelimited(ctx->dev,
>> +                         "Unhandled context fault: fsr=0x%x, "
>> +                         "iova=0x%08lx, fsynr=0x%x, cb=%d\n",
>> +                         fsr, iova, fsynr, ctx->asid);
>> +
>> +     iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
>> +
>> +     return IRQ_HANDLED;
>> +}
>> +
>> +static int qcom_iommu_init_domain(struct iommu_domain *domain,
>> +                               struct qcom_iommu_dev *qcom_iommu,
>> +                               struct iommu_fwspec *fwspec)
>> +{
>> +     struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +     struct io_pgtable_ops *pgtbl_ops;
>> +     struct io_pgtable_cfg pgtbl_cfg;
>> +     int i, ret = 0;
>> +     u32 reg;
>> +
>> +     mutex_lock(&qcom_domain->init_mutex);
>> +     if (qcom_domain->iommu)
>> +             goto out_unlock;
>> +
>> +     pgtbl_cfg = (struct io_pgtable_cfg) {
>> +             .pgsize_bitmap  = qcom_iommu_ops.pgsize_bitmap,
>> +             .ias            = 32,
>> +             .oas            = 40,
>> +             .tlb            = &qcom_gather_ops,
>> +             .iommu_dev      = qcom_iommu->dev,
>> +     };
>> +
>> +     qcom_domain->iommu = qcom_iommu;
>> +     pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, fwspec);
>
> If more devices get attached to this domain later, how are we going to
> do TLB maintenance on their contexts?

At least on the snapdragon devices this isn't a scenario that I've
ever seen.  Whether it would be a real case on other devices where this
driver might be useful, I'm not entirely sure off the top of my head.

>> +     if (!pgtbl_ops) {
>> +             dev_err(qcom_iommu->dev, "failed to allocate pagetable ops\n");
>> +             ret = -ENOMEM;
>> +             goto out_clear_iommu;
>> +     }
>> +
>> +     /* Update the domain's page sizes to reflect the page table format */
>> +     domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
>> +     domain->geometry.aperture_end = (1ULL << pgtbl_cfg.ias) - 1;
>> +     domain->geometry.force_aperture = true;
>> +
>> +     for (i = 0; i < fwspec->num_ids; i++) {
>> +             struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
>> +
>> +             if (!ctx->secure_init) {
>> +                     ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, ctx->asid);
>> +                     if (ret) {
>> +                             dev_err(qcom_iommu->dev, "secure init failed: %d\n", ret);
>> +                             goto out_clear_iommu;
>> +                     }
>> +                     ctx->secure_init = true;
>> +             }
>> +
>> +             /* TTBRs */
>> +             iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
>> +                             pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
>> +                             ((u64)ctx->asid << TTBRn_ASID_SHIFT));
>> +             iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
>> +                             pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
>> +                             ((u64)ctx->asid << TTBRn_ASID_SHIFT));
>> +
>> +             /* TTBCR */
>> +             iommu_writel(ctx, ARM_SMMU_CB_TTBCR2,
>> +                             (pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
>> +                             TTBCR2_SEP_UPSTREAM);
>> +             iommu_writel(ctx, ARM_SMMU_CB_TTBCR,
>> +                             pgtbl_cfg.arm_lpae_s1_cfg.tcr);
>> +
>> +             /* MAIRs (stage-1 only) */
>> +             iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
>> +                             pgtbl_cfg.arm_lpae_s1_cfg.mair[0]);
>> +             iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR1,
>> +                             pgtbl_cfg.arm_lpae_s1_cfg.mair[1]);
>> +
>> +             /* SCTLR */
>> +             reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
>> +                     SCTLR_M | SCTLR_S1_ASIDPNE;
>> +
>> +             if (IS_ENABLED(CONFIG_BIG_ENDIAN))
>> +                     reg |= SCTLR_E;
>> +
>> +             iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
>> +     }
>> +
>> +     mutex_unlock(&qcom_domain->init_mutex);
>> +
>> +     /* Publish page table ops for map/unmap */
>> +     qcom_domain->pgtbl_ops = pgtbl_ops;
>> +
>> +     return 0;
>> +
>> +out_clear_iommu:
>> +     qcom_domain->iommu = NULL;
>> +out_unlock:
>> +     mutex_unlock(&qcom_domain->init_mutex);
>> +     return ret;
>> +}
>> +
>> +static struct iommu_domain *qcom_iommu_domain_alloc(unsigned type)
>> +{
>> +     struct qcom_iommu_domain *qcom_domain;
>> +
>> +     if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
>> +             return NULL;
>> +     /*
>> +      * Allocate the domain and initialise some of its data structures.
>> +      * We can't really do anything meaningful until we've added a
>> +      * master.
>> +      */
>
> If you don't have to wory about supporting multiple formats, you could
> do a bit more here (e.g. the domain geometry).

I guess we don't have to worry about other formats, so I suppose this
would be reasonable..
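
e.g. since the format is always ARM_32_LPAE_S1 (ias=32), something like
this could move into domain_alloc() already (sketch):

	/* 32-bit input address space is fixed for ARM_32_LPAE_S1 */
	qcom_domain->domain.geometry.aperture_end = (1ULL << 32) - 1;
	qcom_domain->domain.geometry.force_aperture = true;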

>> +     qcom_domain = kzalloc(sizeof(*qcom_domain), GFP_KERNEL);
>> +     if (!qcom_domain)
>> +             return NULL;
>> +
>> +     if (type == IOMMU_DOMAIN_DMA &&
>> +         iommu_get_dma_cookie(&qcom_domain->domain)) {
>> +             kfree(qcom_domain);
>> +             return NULL;
>> +     }
>> +
>> +     mutex_init(&qcom_domain->init_mutex);
>> +     spin_lock_init(&qcom_domain->pgtbl_lock);
>> +
>> +     return &qcom_domain->domain;
>> +}
>> +
>> +static void qcom_iommu_domain_free(struct iommu_domain *domain)
>> +{
>> +     struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +
>> +     if (WARN_ON(qcom_domain->iommu))    /* forgot to detach? */
>> +             return;
>> +
>> +     iommu_put_dma_cookie(domain);
>> +
>> +     /* NOTE: unmap can be called after client device is powered off,
>> +      * for example, with GPUs or anything involving dma-buf.  So we
>> +      * cannot rely on the device_link.  Make sure the IOMMU is on to
>> +      * avoid unclocked accesses in the TLB inv path:
>> +      */
>> +     pm_runtime_get_sync(qcom_domain->iommu->dev);
>
> So we only dereference qcom_domain->iommu if we know it to be NULL? :P

hmm, yeah.. I guess this should all move to detach
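
i.e. something like (sketch):

	static void qcom_iommu_domain_free(struct iommu_domain *domain)
	{
		struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);

		iommu_put_dma_cookie(domain);

		if (qcom_domain->iommu) {
			/* keep the IOMMU powered, since freeing the pgtable
			 * can hit the TLB inv path:
			 */
			pm_runtime_get_sync(qcom_domain->iommu->dev);
			free_io_pgtable_ops(qcom_domain->pgtbl_ops);
			pm_runtime_put_sync(qcom_domain->iommu->dev);
		}

		kfree(qcom_domain);
	}

with detach no longer clearing qcom_domain->iommu.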

>> +     free_io_pgtable_ops(qcom_domain->pgtbl_ops);
>> +
>> +     pm_runtime_put_sync(qcom_domain->iommu->dev);
>> +
>> +     kfree(qcom_domain);
>> +}
>> +
>> +static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
>> +{
>> +     struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
>> +     struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +     int ret;
>> +
>> +     if (!qcom_iommu) {
>> +             dev_err(dev, "cannot attach to IOMMU, is it on the same bus?\n");
>> +             return -ENXIO;
>> +     }
>> +
>> +     /* Ensure that the domain is finalized */
>> +     pm_runtime_get_sync(qcom_iommu->dev);
>> +     ret = qcom_iommu_init_domain(domain, qcom_iommu, dev->iommu_fwspec);
>> +     pm_runtime_put_sync(qcom_iommu->dev);
>> +     if (ret < 0)
>> +             return ret;
>> +
>> +     /*
>> +      * Sanity check the domain. We don't support domains across
>> +      * different IOMMUs.
>> +      */
>> +     if (qcom_domain->iommu != qcom_iommu) {
>> +             dev_err(dev, "cannot attach to IOMMU %s while already "
>> +                     "attached to domain on IOMMU %s\n",
>> +                     dev_name(qcom_domain->iommu->dev),
>> +                     dev_name(qcom_iommu->dev));
>> +             return -EINVAL;
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *dev)
>> +{
>> +     struct iommu_fwspec *fwspec = dev->iommu_fwspec;
>> +     struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
>
> This extra lookup is redundant with qcom_domain->iommu that you're
> accessing anyway.

ok

>> +     struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +     unsigned i;
>> +
>> +     if (!qcom_domain->iommu)
>> +             return;
>> +
>> +     pm_runtime_get_sync(qcom_iommu->dev);
>> +     for (i = 0; i < fwspec->num_ids; i++) {
>> +             struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
>> +
>> +             /* Disable the context bank: */
>> +             iommu_writel(ctx, ARM_SMMU_CB_SCTLR, 0);
>> +     }
>> +     pm_runtime_put_sync(qcom_iommu->dev);
>
> I was going to say it seems kinda bad that we won't do this for the
> second and subsequent devices we'd happily allow to be attached to this
> domain, but then I realise we'd also have silently not initialised their
> contexts in the first place :/

yeah, admittedly the multiple device situation isn't something that
I've seen on snapdragon.  Maybe I should just make sure there are some
WARN_ON()s in case someone manages to encounter that scenario..

>> +
>> +     qcom_domain->iommu = NULL;
>> +}
>> +
>> +static int qcom_iommu_map(struct iommu_domain *domain, unsigned long iova,
>> +                       phys_addr_t paddr, size_t size, int prot)
>> +{
>> +     int ret;
>> +     unsigned long flags;
>> +     struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +     struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
>> +
>> +     if (!ops)
>> +             return -ENODEV;
>> +
>> +     spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
>> +     ret = ops->map(ops, iova, paddr, size, prot);
>> +     spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
>> +     return ret;
>> +}
>> +
>> +static size_t qcom_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
>> +                            size_t size)
>> +{
>> +     size_t ret;
>> +     unsigned long flags;
>> +     struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +     struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
>> +
>> +     if (!ops)
>> +             return 0;
>> +
>> +     /* NOTE: unmap can be called after client device is powered off,
>> +      * for example, with GPUs or anything involving dma-buf.  So we
>> +      * cannot rely on the device_link.  Make sure the IOMMU is on to
>> +      * avoid unclocked accesses in the TLB inv path:
>> +      */
>> +     pm_runtime_get_sync(qcom_domain->iommu->dev);
>> +     spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
>> +     ret = ops->unmap(ops, iova, size);
>> +     spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
>> +     pm_runtime_put_sync(qcom_domain->iommu->dev);
>> +
>> +     return ret;
>> +}
>> +
>> +static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
>> +                                        dma_addr_t iova)
>> +{
>> +     phys_addr_t ret;
>> +     unsigned long flags;
>> +     struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +     struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
>> +
>> +     if (!ops)
>> +             return 0;
>> +
>> +     spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
>> +     ret = ops->iova_to_phys(ops, iova);
>> +     spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
>> +
>> +     return ret;
>> +}
>> +
>> +static bool qcom_iommu_capable(enum iommu_cap cap)
>> +{
>> +     switch (cap) {
>> +     case IOMMU_CAP_CACHE_COHERENCY:
>> +             /*
>> +              * Return true here as the SMMU can always send out coherent
>> +              * requests.
>> +              */
>
> This isn't true, but then the whole iommu_capable() interface is
> fundamentally unworkable anyway, so meh.
>
>> +             return true;
>> +     case IOMMU_CAP_NOEXEC:
>> +             return true;
>> +     default:
>> +             return false;
>> +     }
>> +}
>> +
>> +static int qcom_iommu_add_device(struct device *dev)
>> +{
>> +     struct qcom_iommu_dev *qcom_iommu = __to_iommu(dev->iommu_fwspec);
>> +     struct iommu_group *group;
>> +     struct device_link *link;
>> +
>> +     if (!qcom_iommu)
>> +             return -ENODEV;
>> +
>> +     /*
>> +      * Establish the link between iommu and master, so that the
>> +      * iommu gets runtime enabled/disabled as per the master's
>> +      * needs.
>> +      */
>> +     link = device_link_add(dev, qcom_iommu->dev, DL_FLAG_PM_RUNTIME);
>> +     if (!link) {
>> +             dev_err(qcom_iommu->dev, "Unable to create device link between %s and %s\n",
>> +                     dev_name(qcom_iommu->dev), dev_name(dev));
>> +             return -ENODEV;
>> +     }
>> +
>> +     group = iommu_group_get_for_dev(dev);
>> +     if (IS_ERR_OR_NULL(group))
>> +             return PTR_ERR_OR_ZERO(group);
>> +
>> +     iommu_group_put(group);
>> +     iommu_device_link(&qcom_iommu->iommu, dev);
>> +
>> +     return 0;
>> +}
>> +
>> +static void qcom_iommu_remove_device(struct device *dev)
>> +{
>> +     struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
>> +
>> +     if (!qcom_iommu)
>> +             return;
>> +
>> +     iommu_device_unlink(&qcom_iommu->iommu, dev);
>> +     iommu_group_remove_device(dev);
>> +     iommu_fwspec_free(dev);
>> +}
>> +
>> +static struct iommu_group *qcom_iommu_device_group(struct device *dev)
>> +{
>> +     struct iommu_fwspec *fwspec = dev->iommu_fwspec;
>> +     struct iommu_group *group = NULL;
>> +     unsigned i;
>> +
>> +     for (i = 0; i < fwspec->num_ids; i++) {
>> +             struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
>> +
>> +             if (group && ctx->group && group != ctx->group)
>> +                     return ERR_PTR(-EINVAL);
>> +
>> +             group = ctx->group;
>> +     }
>> +
>> +     if (group)
>> +             return iommu_group_ref_get(group);
>> +
>> +     group = generic_device_group(dev);
>> +
>> +     for (i = 0; i < fwspec->num_ids; i++) {
>> +             struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
>> +             ctx->group = iommu_group_ref_get(group);
>> +     }
>> +
>> +     return group;
>> +}
>> +
>> +static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
>> +{
>> +     struct platform_device *iommu_pdev;
>> +
>> +     if (args->args_count != 1) {
>> +             dev_err(dev, "incorrect number of iommu params found for %s "
>> +                     "(found %d, expected 1)\n",
>> +                     args->np->full_name, args->args_count);
>> +             return -EINVAL;
>> +     }
>> +
>> +     if (!dev->iommu_fwspec->iommu_priv) {
>> +             iommu_pdev = of_find_device_by_node(args->np);
>> +             if (WARN_ON(!iommu_pdev))
>> +                     return -EINVAL;
>> +
>> +             dev->iommu_fwspec->iommu_priv = platform_get_drvdata(iommu_pdev);
>> +     }
>> +
>> +     return iommu_fwspec_add_ids(dev, &args->args[0], 1);
>> +}
>> +
>> +static const struct iommu_ops qcom_iommu_ops = {
>> +     .capable        = qcom_iommu_capable,
>> +     .domain_alloc   = qcom_iommu_domain_alloc,
>> +     .domain_free    = qcom_iommu_domain_free,
>> +     .attach_dev     = qcom_iommu_attach_dev,
>> +     .detach_dev     = qcom_iommu_detach_dev,
>> +     .map            = qcom_iommu_map,
>> +     .unmap          = qcom_iommu_unmap,
>> +     .map_sg         = default_iommu_map_sg,
>> +     .iova_to_phys   = qcom_iommu_iova_to_phys,
>> +     .add_device     = qcom_iommu_add_device,
>> +     .remove_device  = qcom_iommu_remove_device,
>> +     .device_group   = qcom_iommu_device_group,
>> +     .of_xlate       = qcom_iommu_of_xlate,
>> +     .pgsize_bitmap  = SZ_4K | SZ_64K | SZ_1M | SZ_16M,
>> +};
>> +
>> +static int qcom_iommu_enable_clocks(struct qcom_iommu_dev *qcom_iommu)
>> +{
>> +     int ret;
>> +
>> +     ret = clk_prepare_enable(qcom_iommu->iface_clk);
>> +     if (ret) {
>> +             dev_err(qcom_iommu->dev, "Couldn't enable iface_clk\n");
>> +             return ret;
>> +     }
>> +
>> +     ret = clk_prepare_enable(qcom_iommu->bus_clk);
>> +     if (ret) {
>> +             dev_err(qcom_iommu->dev, "Couldn't enable bus_clk\n");
>> +             clk_disable_unprepare(qcom_iommu->iface_clk);
>> +             return ret;
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
>> +{
>> +     clk_disable_unprepare(qcom_iommu->bus_clk);
>> +     clk_disable_unprepare(qcom_iommu->iface_clk);
>> +}
>> +
>> +static int qcom_iommu_ctx_probe(struct platform_device *pdev)
>> +{
>> +     struct qcom_iommu_ctx *ctx;
>> +     struct device *dev = &pdev->dev;
>> +     struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(dev->parent);
>> +     struct resource *res;
>> +     int ret;
>> +     u32 reg;
>> +
>> +     ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
>> +     if (!ctx)
>> +             return -ENOMEM;
>> +
>> +     ctx->dev = dev;
>> +     platform_set_drvdata(pdev, ctx);
>> +
>> +     res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +     ctx->base = devm_ioremap_resource(dev, res);
>> +     if (IS_ERR(ctx->base))
>> +             return PTR_ERR(ctx->base);
>> +
>> +     ctx->irq = platform_get_irq(pdev, 0);
>> +     if (ctx->irq < 0) {
>> +             dev_err(dev, "failed to get irq\n");
>> +             return -ENODEV;
>> +     }
>> +
>> +     /* clear IRQs before registering fault handler, just in case the
>> +      * boot-loader left us a surprise:
>> +      */
>> +     iommu_writel(ctx, ARM_SMMU_CB_FSR, iommu_readl(ctx, ARM_SMMU_CB_FSR));
>> +
>> +     ret = devm_request_irq(dev, ctx->irq,
>> +                            qcom_iommu_fault,
>> +                            IRQF_SHARED,
>> +                            "qcom-iommu-fault",
>> +                            ctx);
>> +     if (ret) {
>> +             dev_err(dev, "failed to request IRQ %u\n", ctx->irq);
>> +             return ret;
>> +     }
>> +
>> +     /* read the "reg" property directly to get the relative address
>> +      * of the context bank, and calculate the asid from that:
>> +      */
>> +     if (of_property_read_u32_index(dev->of_node, "reg", 0, &reg)) {
>> +             dev_err(dev, "missing reg property\n");
>> +             return -ENODEV;
>> +     }
>> +
>> +     ctx->asid = reg / 0x1000;      /* context banks are 0x1000 apart */
>> +
>> +     dev_dbg(dev, "found asid %u\n", ctx->asid);
>> +
>> +     list_add_tail(&ctx->node, &qcom_iommu->context_list);
>> +
>> +     return 0;
>> +}
>> +
>> +static int qcom_iommu_ctx_remove(struct platform_device *pdev)
>> +{
>> +     struct qcom_iommu_ctx *ctx = platform_get_drvdata(pdev);
>> +
>> +     iommu_group_put(ctx->group);
>> +     platform_set_drvdata(pdev, NULL);
>> +
>> +     list_del(&ctx->node);
>> +
>> +     return 0;
>> +}
>> +
>> +static const struct of_device_id ctx_of_match[] = {
>> +     { .compatible = "qcom,msm-iommu-v1-ns" },
>> +     { .compatible = "qcom,msm-iommu-v1-sec" },
>> +     { /* sentinel */ }
>> +};
>> +
>> +static struct platform_driver qcom_iommu_ctx_driver = {
>> +     .driver = {
>> +             .name           = "qcom-iommu-ctx",
>> +             .of_match_table = of_match_ptr(ctx_of_match),
>> +     },
>> +     .probe  = qcom_iommu_ctx_probe,
>> +     .remove = qcom_iommu_ctx_remove,
>> +};
>> +
>> +static int qcom_iommu_device_probe(struct platform_device *pdev)
>> +{
>> +     struct qcom_iommu_dev *qcom_iommu;
>> +     struct device *dev = &pdev->dev;
>> +     struct resource *res;
>> +     int ret;
>> +
>> +     qcom_iommu = devm_kzalloc(dev, sizeof(*qcom_iommu), GFP_KERNEL);
>> +     if (!qcom_iommu)
>> +             return -ENOMEM;
>> +     qcom_iommu->dev = dev;
>> +
>> +     INIT_LIST_HEAD(&qcom_iommu->context_list);
>> +
>> +     res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +     if (res)
>> +             qcom_iommu->local_base = devm_ioremap_resource(dev, res);
>> +
>> +     qcom_iommu->iface_clk = devm_clk_get(dev, "iface");
>> +     if (IS_ERR(qcom_iommu->iface_clk)) {
>> +             dev_err(dev, "failed to get iface clock\n");
>> +             return PTR_ERR(qcom_iommu->iface_clk);
>> +     }
>> +
>> +     qcom_iommu->bus_clk = devm_clk_get(dev, "bus");
>> +     if (IS_ERR(qcom_iommu->bus_clk)) {
>> +             dev_err(dev, "failed to get bus clock\n");
>> +             return PTR_ERR(qcom_iommu->bus_clk);
>> +     }
>> +
>> +     if (of_property_read_u32(dev->of_node, "qcom,iommu-secure-id",
>> +                              &qcom_iommu->sec_id)) {
>> +             dev_err(dev, "missing qcom,iommu-secure-id property\n");
>> +             return -ENODEV;
>> +     }
>> +
>> +     platform_set_drvdata(pdev, qcom_iommu);
>> +
>> +     pm_runtime_enable(dev);
>> +
>> +     /* register context bank devices, which are child nodes: */
>> +     ret = devm_of_platform_populate(dev);
>> +     if (ret) {
>> +             dev_err(dev, "Failed to populate iommu contexts\n");
>> +             return ret;
>> +     }
>> +
>> +     ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,
>> +                                  dev_name(dev));
>> +     if (ret) {
>> +             dev_err(dev, "Failed to register iommu in sysfs\n");
>> +             return ret;
>> +     }
>> +
>> +     iommu_device_set_ops(&qcom_iommu->iommu, &qcom_iommu_ops);
>> +     iommu_device_set_fwnode(&qcom_iommu->iommu, dev->fwnode);
>> +
>> +     ret = iommu_device_register(&qcom_iommu->iommu);
>> +     if (ret) {
>> +             dev_err(dev, "Failed to register iommu\n");
>> +             return ret;
>> +     }
>> +
>> +     bus_set_iommu(&platform_bus_type, &qcom_iommu_ops);
>> +
>> +     if (qcom_iommu->local_base) {
>> +             pm_runtime_get_sync(dev);
>> +             writel_relaxed(0xffffffff, qcom_iommu->local_base + SMMU_INTR_SEL_NS);
>> +             pm_runtime_put_sync(dev);
>> +     }
>> +
>> +     return 0;
>
> You need to set your DMA masks somewhere and make sure it succeeded,
> especially given that what the platform bus gives you by default isn't
> big enough for an LPAE table walker.

ok
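
I guess something like this early in probe() would do it (sketch; 40
bits to match the .oas we pass to io-pgtable):

	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
	if (ret) {
		dev_err(dev, "failed to set DMA mask\n");
		return ret;
	}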

>> +}
>> +
>> +static int qcom_iommu_device_remove(struct platform_device *pdev)
>> +{
>> +     struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
>> +
>> +     bus_set_iommu(&platform_bus_type, NULL);
>
> This does nothing.

hmm, true.. which doesn't do much for my confidence in module unload ;-)

>> +     pm_runtime_force_suspend(&pdev->dev);
>> +     platform_set_drvdata(pdev, NULL);
>> +     iommu_device_sysfs_remove(&qcom_iommu->iommu);
>> +     iommu_device_unregister(&qcom_iommu->iommu);
>> +
>> +     return 0;
>> +}
>> +
>> +#ifdef CONFIG_PM
>
> I was under the impression that annotating PM callbacks as
> __maybe_unused was preferred these days, but I could be wrong.

you might be right.. I mostly looked at what other drivers were doing,
but of course that doesn't guarantee they've been updated to work in
the new shiny way.
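
fwiw, if that is indeed the preference, it looks like it would just be
a matter of dropping the #ifdef/#endif and annotating the callbacks,
e.g.:

	static int __maybe_unused qcom_iommu_resume(struct device *dev)
	{
		struct platform_device *pdev = to_platform_device(dev);
		struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);

		return qcom_iommu_enable_clocks(qcom_iommu);
	}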

BR,
-R

> Robin.
>
>> +static int qcom_iommu_resume(struct device *dev)
>> +{
>> +     struct platform_device *pdev = to_platform_device(dev);
>> +     struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
>> +
>> +     return qcom_iommu_enable_clocks(qcom_iommu);
>> +}
>> +
>> +static int qcom_iommu_suspend(struct device *dev)
>> +{
>> +     struct platform_device *pdev = to_platform_device(dev);
>> +     struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
>> +
>> +     qcom_iommu_disable_clocks(qcom_iommu);
>> +
>> +     return 0;
>> +}
>> +#endif
>> +
>> +static const struct dev_pm_ops qcom_iommu_pm_ops = {
>> +     SET_RUNTIME_PM_OPS(qcom_iommu_suspend, qcom_iommu_resume, NULL)
>> +     SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
>> +                             pm_runtime_force_resume)
>> +};
>> +
>> +static const struct of_device_id qcom_iommu_of_match[] = {
>> +     { .compatible = "qcom,msm-iommu-v1" },
>> +     { /* sentinel */ }
>> +};
>> +MODULE_DEVICE_TABLE(of, qcom_iommu_of_match);
>> +
>> +static struct platform_driver qcom_iommu_driver = {
>> +     .driver = {
>> +             .name           = "qcom-iommu",
>> +             .of_match_table = of_match_ptr(qcom_iommu_of_match),
>> +             .pm             = &qcom_iommu_pm_ops,
>> +     },
>> +     .probe  = qcom_iommu_device_probe,
>> +     .remove = qcom_iommu_device_remove,
>> +};
>> +
>> +static int __init qcom_iommu_init(void)
>> +{
>> +     int ret;
>> +
>> +     ret = platform_driver_register(&qcom_iommu_ctx_driver);
>> +     if (ret)
>> +             return ret;
>> +
>> +     ret = platform_driver_register(&qcom_iommu_driver);
>> +     if (ret)
>> +             platform_driver_unregister(&qcom_iommu_ctx_driver);
>> +
>> +     return ret;
>> +}
>> +
>> +static void __exit qcom_iommu_exit(void)
>> +{
>> +     platform_driver_unregister(&qcom_iommu_driver);
>> +     platform_driver_unregister(&qcom_iommu_ctx_driver);
>> +}
>> +
>> +module_init(qcom_iommu_init);
>> +module_exit(qcom_iommu_exit);
>> +
>> +IOMMU_OF_DECLARE(qcom_iommu_dev, "qcom,msm-iommu-v1", NULL);
>> +
>> +MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
>> +MODULE_LICENSE("GPL v2");
>>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] iommu: add qcom_iommu
       [not found]     ` <20170525173340.26904-4-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-05-26 12:56       ` Robin Murphy
       [not found]         ` <47a738b1-7da5-7043-c16c-4159c6211f7e-5wv7dgnIgG8@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Robin Murphy @ 2017-05-26 12:56 UTC (permalink / raw)
  To: Rob Clark, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: Mark Rutland, Rob Herring, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	Will Deacon, Stanimir Varbanov

On 25/05/17 18:33, Rob Clark wrote:
> An iommu driver for Qualcomm "B" family devices which do not completely
> implement the ARM SMMU spec.  These devices have a context-bank register
> layout that is similar to ARM SMMU, but no global register space (or at
> least not one that is accessible).

I still object to this description, because the SMMU_SCR1.GASRAE = 1
usage model is explicitly *specified* by the ARM SMMU spec! It's merely
that the arm-smmu driver is designed for the case where we do have
control of the global space and stage 2 contexts.

> Signed-off-by: Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
> v1: original
> v2: bindings cleanups and kconfig issues that kbuild robot pointed out
> v3: fix issues pointed out by Rob H. and actually make device removal
>     work
> v4: fix WARN_ON() splats reported by Archit
> v5: some fixes to build as a module.. note that it cannot actually
>     be built as a module yet (at minimum a bunch of other iommu syms
>     that are needed are not exported, but there may be more to it
>     than that), but at least qcom_iommu is ready should it become
>     possible to build iommu drivers as modules.

Note that with the 4.12 probe-deferral changes, modules totally work!
For any master which probed before the IOMMU driver was loaded, you can
then hook them up after the fact by just unbinding and rebinding their
drivers - it's really cool.

> v6: Add additional pm-runtime get/puts around paths that can hit
>     TLB inv, to avoid unclocked register access if the device using
>     the iommu is not powered on.  And pre-emptively clear interrupts
>     before registering IRQ handler just in case the bootloader has
>     left us a surprise.
> 
>  drivers/iommu/Kconfig      |  10 +
>  drivers/iommu/Makefile     |   1 +
>  drivers/iommu/qcom_iommu.c | 878 +++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 889 insertions(+)
>  create mode 100644 drivers/iommu/qcom_iommu.c
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 6ee3a25..aa4b628 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -367,4 +367,14 @@ config MTK_IOMMU_V1
>  
>  	  if unsure, say N here.
>  
> +config QCOM_IOMMU
> +	# Note: iommu drivers cannot (yet?) be built as modules
> +	bool "Qualcomm IOMMU Support"
> +	depends on ARCH_QCOM || COMPILE_TEST
> +	select IOMMU_API
> +	select IOMMU_IO_PGTABLE_LPAE
> +	select ARM_DMA_USE_IOMMU
> +	help
> +	  Support for IOMMU on certain Qualcomm SoCs.
> +
>  endif # IOMMU_SUPPORT
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 195f7b9..b910aea 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
>  obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
>  obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
>  obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
> +obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
> diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
> new file mode 100644
> index 0000000..bfaf97c
> --- /dev/null
> +++ b/drivers/iommu/qcom_iommu.c
> @@ -0,0 +1,878 @@
> +/*
> + * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + *
> + * Copyright (C) 2013 ARM Limited
> + * Copyright (C) 2017 Red Hat
> + */
> +
> +#include <linux/atomic.h>
> +#include <linux/clk.h>
> +#include <linux/delay.h>
> +#include <linux/dma-iommu.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/err.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/io-64-nonatomic-hi-lo.h>
> +#include <linux/iommu.h>
> +#include <linux/iopoll.h>
> +#include <linux/kconfig.h>
> +#include <linux/module.h>
> +#include <linux/mutex.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_device.h>
> +#include <linux/of_iommu.h>
> +#include <linux/platform_device.h>
> +#include <linux/pm.h>
> +#include <linux/pm_runtime.h>
> +#include <linux/qcom_scm.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +
> +#include "io-pgtable.h"
> +#include "arm-smmu-regs.h"
> +
> +#define SMMU_INTR_SEL_NS     0x2000
> +
> +struct qcom_iommu_dev {
> +	/* IOMMU core code handle */
> +	struct iommu_device	 iommu;
> +	struct device		*dev;
> +	struct clk		*iface_clk;
> +	struct clk		*bus_clk;
> +	void __iomem		*local_base;
> +	u32			 sec_id;
> +	struct list_head	 context_list;   /* list of qcom_iommu_context */

Why not just an array? You can already determine at probe time what the
maximum size needs to be, and it looks like it would make a fair chunk
of the context code less horrible by turning it into simple indexing.
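
i.e. something like (sketch):

	struct qcom_iommu_dev {
		/* IOMMU core code handle */
		struct iommu_device	 iommu;
		struct device		*dev;
		struct clk		*iface_clk;
		struct clk		*bus_clk;
		void __iomem		*local_base;
		u32			 sec_id;
		u32			 num_ctxs;
		struct qcom_iommu_ctx	*ctxs[0];   /* indexed by asid-1 */
	};

	static struct qcom_iommu_ctx * to_ctx(struct iommu_fwspec *fwspec,
					      unsigned asid)
	{
		struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);

		if (!qcom_iommu || asid < 1 || asid > qcom_iommu->num_ctxs)
			return NULL;

		return qcom_iommu->ctxs[asid - 1];
	}

with the allocation in probe() sized by counting the context child
nodes up-front.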

> +};
> +
> +struct qcom_iommu_ctx {
> +	struct device		*dev;
> +	void __iomem		*base;
> +	unsigned int		 irq;
> +	bool			 secure_init;
> +	u32			 asid;      /* asid and ctx bank # are 1:1 */

u32? ASIDs are only 8 bits, and even that's already twice the maximum
possible number of contexts...

> +	struct iommu_group	*group;

This feels weird, since a device can be associated with multiple
contexts, but only one group, so group-per-context is somewhat redundant
and smacks of being in the wrong place. Does the firmware ever map
multiple devices to the same context?

> +	struct list_head	 node;      /* head in qcom_iommu_device::context_list */
> +};
> +
> +struct qcom_iommu_domain {
> +	struct io_pgtable_ops	*pgtbl_ops;
> +	spinlock_t		 pgtbl_lock;
> +	struct mutex		 init_mutex; /* Protects iommu pointer */
> +	struct iommu_domain	 domain;
> +	struct qcom_iommu_dev	*iommu;
> +};
> +
> +static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
> +{
> +	return container_of(dom, struct qcom_iommu_domain, domain);
> +}
> +
> +static const struct iommu_ops qcom_iommu_ops;
> +
> +static struct qcom_iommu_dev * __to_iommu(struct iommu_fwspec *fwspec)
> +{
> +	if (!fwspec || fwspec->ops != &qcom_iommu_ops)
> +		return NULL;
> +	return fwspec->iommu_priv;
> +}
> +
> +static struct qcom_iommu_dev * to_iommu(struct iommu_fwspec *fwspec)
> +{
> +	struct qcom_iommu_dev *qcom_iommu = __to_iommu(fwspec);
> +	WARN_ON(!qcom_iommu);

This seems unnecessary - if your private data has somehow disappeared
from under your nose you've almost certainly got much bigger problems.

> +	return qcom_iommu;
> +}
> +
> +static struct qcom_iommu_ctx * to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
> +{
> +	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
> +	struct qcom_iommu_ctx *ctx;
> +
> +	if (!qcom_iommu)
> +		return NULL;
> +
> +	list_for_each_entry(ctx, &qcom_iommu->context_list, node)
> +		if (ctx->asid == asid)
> +			return ctx;
> +
> +	WARN(1, "no ctx for asid %u\n", asid);

Verify this once in of_xlate() or add_device(), and reject the device
up-front if the DT turns out to be bogus. Anything more than that is
edging towards WARN_ON(1 != 1) paranoia territory.
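
e.g. with the WARN dropped from to_ctx() as above, something like this
at the end of of_xlate() (sketch):

	/* reject a bogus DT up-front rather than finding out at
	 * map/unmap time:
	 */
	if (!to_ctx(dev->iommu_fwspec, args->args[0]))
		return -EINVAL;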

> +	return NULL;
> +}
> +
> +static inline void
> +iommu_writel(struct qcom_iommu_ctx *ctx, unsigned reg, u32 val)
> +{
> +	writel_relaxed(val, ctx->base + reg);
> +}
> +
> +static inline void
> +iommu_writeq(struct qcom_iommu_ctx *ctx, unsigned reg, u64 val)
> +{
> +	writeq_relaxed(val, ctx->base + reg);
> +}
> +
> +static inline u32
> +iommu_readl(struct qcom_iommu_ctx *ctx, unsigned reg)
> +{
> +	return readl_relaxed(ctx->base + reg);
> +}
> +
> +static inline u64
> +iommu_readq(struct qcom_iommu_ctx *ctx, unsigned reg)
> +{
> +	return readq_relaxed(ctx->base + reg);
> +}
> +
> +static void __sync_tlb(struct qcom_iommu_ctx *ctx)
> +{
> +	unsigned int val;
> +	unsigned int ret;
> +
> +	iommu_writel(ctx, ARM_SMMU_CB_TLBSYNC, 0);
> +
> +	ret = readl_poll_timeout(ctx->base + ARM_SMMU_CB_TLBSTATUS, val,
> +				 (val & 0x1) == 0, 0, 5000000);
> +	if (ret)
> +		dev_err(ctx->dev, "timeout waiting for TLB SYNC\n");
> +}
> +
> +static void qcom_iommu_tlb_sync(void *cookie)
> +{
> +	struct iommu_fwspec *fwspec = cookie;
> +	unsigned i;
> +
> +	for (i = 0; i < fwspec->num_ids; i++)
> +		__sync_tlb(to_ctx(fwspec, fwspec->ids[i]));
> +}
> +
> +static void qcom_iommu_tlb_inv_context(void *cookie)
> +{
> +	struct iommu_fwspec *fwspec = cookie;
> +	unsigned i;
> +
> +	for (i = 0; i < fwspec->num_ids; i++) {
> +		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +
> +		iommu_writel(ctx, ARM_SMMU_CB_S1_TLBIASID, ctx->asid);
> +		__sync_tlb(ctx);
> +	}

Wouldn't it be nicer and more efficient to issue all the invalidations
first, *then* wait for them all to finish? That way you also wouldn't
need to split up sync_tlb() either.
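
i.e. issue the ASID invalidation for every context first, then poll
them all with the existing sync callback, roughly:

	static void qcom_iommu_tlb_inv_context(void *cookie)
	{
		struct iommu_fwspec *fwspec = cookie;
		unsigned i;

		for (i = 0; i < fwspec->num_ids; i++) {
			struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);

			iommu_writel(ctx, ARM_SMMU_CB_S1_TLBIASID, ctx->asid);
		}

		qcom_iommu_tlb_sync(cookie);
	}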

> +}
> +
> +static void qcom_iommu_tlb_inv_range_nosync(unsigned long iova, size_t size,
> +					    size_t granule, bool leaf, void *cookie)
> +{
> +	struct iommu_fwspec *fwspec = cookie;
> +	unsigned i, reg;
> +
> +	reg = leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA;
> +
> +	for (i = 0; i < fwspec->num_ids; i++) {
> +		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +		size_t s = size;
> +
> +		iova &= ~0xfffUL;
> +		iova |= ctx->asid;
> +		do {
> +			iommu_writel(ctx, reg, iova);
> +			iova += granule;
> +		} while (s -= granule);
> +	}
> +}
> +
> +static const struct iommu_gather_ops qcom_gather_ops = {
> +	.tlb_flush_all	= qcom_iommu_tlb_inv_context,
> +	.tlb_add_flush	= qcom_iommu_tlb_inv_range_nosync,
> +	.tlb_sync	= qcom_iommu_tlb_sync,
> +};
> +
> +static irqreturn_t qcom_iommu_fault(int irq, void *dev)
> +{
> +	struct qcom_iommu_ctx *ctx = dev;
> +	u32 fsr, fsynr;
> +	unsigned long iova;
> +
> +	fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
> +
> +	if (!(fsr & FSR_FAULT))
> +		return IRQ_NONE;
> +
> +	fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
> +	iova = iommu_readq(ctx, ARM_SMMU_CB_FAR);

readl()? There's not much point reading the upper word if it's just
going to be immediately truncated (and it seems unlikely that you'd ever
see an address outside the input range anyway).
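
i.e. just:

	iova = iommu_readl(ctx, ARM_SMMU_CB_FAR);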

> +
> +	dev_err_ratelimited(ctx->dev,
> +			    "Unhandled context fault: fsr=0x%x, "
> +			    "iova=0x%08lx, fsynr=0x%x, cb=%d\n",
> +			    fsr, iova, fsynr, ctx->asid);
> +
> +	iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static int qcom_iommu_init_domain(struct iommu_domain *domain,
> +				  struct qcom_iommu_dev *qcom_iommu,
> +				  struct iommu_fwspec *fwspec)
> +{
> +	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +	struct io_pgtable_ops *pgtbl_ops;
> +	struct io_pgtable_cfg pgtbl_cfg;
> +	int i, ret = 0;
> +	u32 reg;
> +
> +	mutex_lock(&qcom_domain->init_mutex);
> +	if (qcom_domain->iommu)
> +		goto out_unlock;
> +
> +	pgtbl_cfg = (struct io_pgtable_cfg) {
> +		.pgsize_bitmap	= qcom_iommu_ops.pgsize_bitmap,
> +		.ias		= 32,
> +		.oas		= 40,
> +		.tlb		= &qcom_gather_ops,
> +		.iommu_dev	= qcom_iommu->dev,
> +	};
> +
> +	qcom_domain->iommu = qcom_iommu;
> +	pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, fwspec);

If more devices get attached to this domain later, how are we going to
do TLB maintenance on their contexts?

> +	if (!pgtbl_ops) {
> +		dev_err(qcom_iommu->dev, "failed to allocate pagetable ops\n");
> +		ret = -ENOMEM;
> +		goto out_clear_iommu;
> +	}
> +
> +	/* Update the domain's page sizes to reflect the page table format */
> +	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> +	domain->geometry.aperture_end = (1ULL << pgtbl_cfg.ias) - 1;
> +	domain->geometry.force_aperture = true;
> +
> +	for (i = 0; i < fwspec->num_ids; i++) {
> +		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +
> +		if (!ctx->secure_init) {
> +			ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, ctx->asid);
> +			if (ret) {
> +				dev_err(qcom_iommu->dev, "secure init failed: %d\n", ret);
> +				goto out_clear_iommu;
> +			}
> +			ctx->secure_init = true;
> +		}
> +
> +		/* TTBRs */
> +		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
> +				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
> +				((u64)ctx->asid << TTBRn_ASID_SHIFT));
> +		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
> +				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
> +				((u64)ctx->asid << TTBRn_ASID_SHIFT));
> +
> +		/* TTBCR */
> +		iommu_writel(ctx, ARM_SMMU_CB_TTBCR2,
> +				(pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
> +				TTBCR2_SEP_UPSTREAM);
> +		iommu_writel(ctx, ARM_SMMU_CB_TTBCR,
> +				pgtbl_cfg.arm_lpae_s1_cfg.tcr);
> +
> +		/* MAIRs (stage-1 only) */
> +		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
> +				pgtbl_cfg.arm_lpae_s1_cfg.mair[0]);
> +		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR1,
> +				pgtbl_cfg.arm_lpae_s1_cfg.mair[1]);
> +
> +		/* SCTLR */
> +		reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
> +			SCTLR_M | SCTLR_S1_ASIDPNE;
> +
> +		if (IS_ENABLED(CONFIG_BIG_ENDIAN))
> +			reg |= SCTLR_E;
> +
> +		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
> +	}
> +
> +	mutex_unlock(&qcom_domain->init_mutex);
> +
> +	/* Publish page table ops for map/unmap */
> +	qcom_domain->pgtbl_ops = pgtbl_ops;
> +
> +	return 0;
> +
> +out_clear_iommu:
> +	qcom_domain->iommu = NULL;
> +out_unlock:
> +	mutex_unlock(&qcom_domain->init_mutex);
> +	return ret;
> +}
> +
> +static struct iommu_domain *qcom_iommu_domain_alloc(unsigned type)
> +{
> +	struct qcom_iommu_domain *qcom_domain;
> +
> +	if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
> +		return NULL;
> +	/*
> +	 * Allocate the domain and initialise some of its data structures.
> +	 * We can't really do anything meaningful until we've added a
> +	 * master.
> +	 */

If you don't have to wory about supporting multiple formats, you could
do a bit more here (e.g. the domain geometry).
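
e.g. since this driver only ever uses the one ARM_32_LPAE_S1 format
with ias = 32, something like (sketch):

	qcom_domain->domain.geometry.aperture_end = (1ULL << 32) - 1;
	qcom_domain->domain.geometry.force_aperture = true;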

> +	qcom_domain = kzalloc(sizeof(*qcom_domain), GFP_KERNEL);
> +	if (!qcom_domain)
> +		return NULL;
> +
> +	if (type == IOMMU_DOMAIN_DMA &&
> +	    iommu_get_dma_cookie(&qcom_domain->domain)) {
> +		kfree(qcom_domain);
> +		return NULL;
> +	}
> +
> +	mutex_init(&qcom_domain->init_mutex);
> +	spin_lock_init(&qcom_domain->pgtbl_lock);
> +
> +	return &qcom_domain->domain;
> +}
> +
> +static void qcom_iommu_domain_free(struct iommu_domain *domain)
> +{
> +	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +
> +	if (WARN_ON(qcom_domain->iommu))    /* forgot to detach? */
> +		return;
> +
> +	iommu_put_dma_cookie(domain);
> +
> +	/* NOTE: unmap can be called after client device is powered off,
> +	 * for example, with GPUs or anything involving dma-buf.  So we
> +	 * cannot rely on the device_link.  Make sure the IOMMU is on to
> +	 * avoid unclocked accesses in the TLB inv path:
> +	 */
> +	pm_runtime_get_sync(qcom_domain->iommu->dev);

So we only dereference qcom_domain->iommu if we know it to be NULL? :P

> +	free_io_pgtable_ops(qcom_domain->pgtbl_ops);
> +
> +	pm_runtime_put_sync(qcom_domain->iommu->dev);
> +
> +	kfree(qcom_domain);
> +}
> +
> +static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
> +{
> +	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
> +	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +	int ret;
> +
> +	if (!qcom_iommu) {
> +		dev_err(dev, "cannot attach to IOMMU, is it on the same bus?\n");
> +		return -ENXIO;
> +	}
> +
> +	/* Ensure that the domain is finalized */
> +	pm_runtime_get_sync(qcom_iommu->dev);
> +	ret = qcom_iommu_init_domain(domain, qcom_iommu, dev->iommu_fwspec);
> +	pm_runtime_put_sync(qcom_iommu->dev);
> +	if (ret < 0)
> +		return ret;
> +
> +	/*
> +	 * Sanity check the domain. We don't support domains across
> +	 * different IOMMUs.
> +	 */
> +	if (qcom_domain->iommu != qcom_iommu) {
> +		dev_err(dev, "cannot attach to IOMMU %s while already "
> +			"attached to domain on IOMMU %s\n",
> +			dev_name(qcom_domain->iommu->dev),
> +			dev_name(qcom_iommu->dev));
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *dev)
> +{
> +	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
> +	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);

This extra lookup is redundant with qcom_domain->iommu that you're
accessing anyway.

> +	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +	unsigned i;
> +
> +	if (!qcom_domain->iommu)
> +		return;
> +
> +	pm_runtime_get_sync(qcom_iommu->dev);
> +	for (i = 0; i < fwspec->num_ids; i++) {
> +		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +
> +		/* Disable the context bank: */
> +		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, 0);
> +	}
> +	pm_runtime_put_sync(qcom_iommu->dev);

I was going to say it seems kinda bad that we won't do this for the
second and subsequent devices we'd happily allow to be attached to this
domain, but then I realise we'd also have silently not initialised their
contexts in the first place :/

> +
> +	qcom_domain->iommu = NULL;
> +}
> +
> +static int qcom_iommu_map(struct iommu_domain *domain, unsigned long iova,
> +			  phys_addr_t paddr, size_t size, int prot)
> +{
> +	int ret;
> +	unsigned long flags;
> +	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
> +
> +	if (!ops)
> +		return -ENODEV;
> +
> +	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
> +	ret = ops->map(ops, iova, paddr, size, prot);
> +	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
> +	return ret;
> +}
> +
> +static size_t qcom_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
> +			       size_t size)
> +{
> +	size_t ret;
> +	unsigned long flags;
> +	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
> +
> +	if (!ops)
> +		return 0;
> +
> +	/* NOTE: unmap can be called after client device is powered off,
> +	 * for example, with GPUs or anything involving dma-buf.  So we
> +	 * cannot rely on the device_link.  Make sure the IOMMU is on to
> +	 * avoid unclocked accesses in the TLB inv path:
> +	 */
> +	pm_runtime_get_sync(qcom_domain->iommu->dev);
> +	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
> +	ret = ops->unmap(ops, iova, size);
> +	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
> +	pm_runtime_put_sync(qcom_domain->iommu->dev);
> +
> +	return ret;
> +}
> +
> +static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
> +					   dma_addr_t iova)
> +{
> +	phys_addr_t ret;
> +	unsigned long flags;
> +	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
> +
> +	if (!ops)
> +		return 0;
> +
> +	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
> +	ret = ops->iova_to_phys(ops, iova);
> +	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
> +
> +	return ret;
> +}
> +
> +static bool qcom_iommu_capable(enum iommu_cap cap)
> +{
> +	switch (cap) {
> +	case IOMMU_CAP_CACHE_COHERENCY:
> +		/*
> +		 * Return true here as the SMMU can always send out coherent
> +		 * requests.
> +		 */

This isn't true, but then the whole iommu_capable() interface is
fundamentally unworkable anyway, so meh.

> +		return true;
> +	case IOMMU_CAP_NOEXEC:
> +		return true;
> +	default:
> +		return false;
> +	}
> +}
> +
> +static int qcom_iommu_add_device(struct device *dev)
> +{
> +	struct qcom_iommu_dev *qcom_iommu = __to_iommu(dev->iommu_fwspec);
> +	struct iommu_group *group;
> +	struct device_link *link;
> +
> +	if (!qcom_iommu)
> +		return -ENODEV;
> +
> +	/*
> +	 * Establish the link between iommu and master, so that the
> +	 * iommu gets runtime enabled/disabled as per the master's
> +	 * needs.
> +	 */
> +	link = device_link_add(dev, qcom_iommu->dev, DL_FLAG_PM_RUNTIME);
> +	if (!link) {
> +		dev_err(qcom_iommu->dev, "Unable to create device link between %s and %s\n",
> +			dev_name(qcom_iommu->dev), dev_name(dev));
> +		return -ENODEV;
> +	}
> +
> +	group = iommu_group_get_for_dev(dev);
> +	if (IS_ERR_OR_NULL(group))
> +		return PTR_ERR_OR_ZERO(group);
> +
> +	iommu_group_put(group);
> +	iommu_device_link(&qcom_iommu->iommu, dev);
> +
> +	return 0;
> +}
> +
> +static void qcom_iommu_remove_device(struct device *dev)
> +{
> +	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
> +
> +	if (!qcom_iommu)
> +		return;
> +
> +	iommu_device_unlink(&qcom_iommu->iommu, dev);
> +	iommu_group_remove_device(dev);
> +	iommu_fwspec_free(dev);
> +}
> +
> +static struct iommu_group *qcom_iommu_device_group(struct device *dev)
> +{
> +	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
> +	struct iommu_group *group = NULL;
> +	unsigned i;
> +
> +	for (i = 0; i < fwspec->num_ids; i++) {
> +		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +
> +		if (group && ctx->group && group != ctx->group)
> +			return ERR_PTR(-EINVAL);
> +
> +		group = ctx->group;
> +	}
> +
> +	if (group)
> +		return iommu_group_ref_get(group);
> +
> +	group = generic_device_group(dev);
> +
> +	for (i = 0; i < fwspec->num_ids; i++) {
> +		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +		ctx->group = iommu_group_ref_get(group);
> +	}
> +
> +	return group;
> +}
> +
> +static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
> +{
> +	struct platform_device *iommu_pdev;
> +
> +	if (args->args_count != 1) {
> +		dev_err(dev, "incorrect number of iommu params found for %s "
> +			"(found %d, expected 1)\n",
> +			args->np->full_name, args->args_count);
> +		return -EINVAL;
> +	}
> +
> +	if (!dev->iommu_fwspec->iommu_priv) {
> +		iommu_pdev = of_find_device_by_node(args->np);
> +		if (WARN_ON(!iommu_pdev))
> +			return -EINVAL;
> +
> +		dev->iommu_fwspec->iommu_priv = platform_get_drvdata(iommu_pdev);
> +	}
> +
> +	return iommu_fwspec_add_ids(dev, &args->args[0], 1);
> +}
> +
> +static const struct iommu_ops qcom_iommu_ops = {
> +	.capable	= qcom_iommu_capable,
> +	.domain_alloc	= qcom_iommu_domain_alloc,
> +	.domain_free	= qcom_iommu_domain_free,
> +	.attach_dev	= qcom_iommu_attach_dev,
> +	.detach_dev	= qcom_iommu_detach_dev,
> +	.map		= qcom_iommu_map,
> +	.unmap		= qcom_iommu_unmap,
> +	.map_sg		= default_iommu_map_sg,
> +	.iova_to_phys	= qcom_iommu_iova_to_phys,
> +	.add_device	= qcom_iommu_add_device,
> +	.remove_device	= qcom_iommu_remove_device,
> +	.device_group	= qcom_iommu_device_group,
> +	.of_xlate	= qcom_iommu_of_xlate,
> +	.pgsize_bitmap	= SZ_4K | SZ_64K | SZ_1M | SZ_16M,
> +};
> +
> +static int qcom_iommu_enable_clocks(struct qcom_iommu_dev *qcom_iommu)
> +{
> +	int ret;
> +
> +	ret = clk_prepare_enable(qcom_iommu->iface_clk);
> +	if (ret) {
> +		dev_err(qcom_iommu->dev, "Couldn't enable iface_clk\n");
> +		return ret;
> +	}
> +
> +	ret = clk_prepare_enable(qcom_iommu->bus_clk);
> +	if (ret) {
> +		dev_err(qcom_iommu->dev, "Couldn't enable bus_clk\n");
> +		clk_disable_unprepare(qcom_iommu->iface_clk);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
> +{
> +	clk_disable_unprepare(qcom_iommu->bus_clk);
> +	clk_disable_unprepare(qcom_iommu->iface_clk);
> +}
> +
> +static int qcom_iommu_ctx_probe(struct platform_device *pdev)
> +{
> +	struct qcom_iommu_ctx *ctx;
> +	struct device *dev = &pdev->dev;
> +	struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(dev->parent);
> +	struct resource *res;
> +	int ret;
> +	u32 reg;
> +
> +	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
> +	if (!ctx)
> +		return -ENOMEM;
> +
> +	ctx->dev = dev;
> +	platform_set_drvdata(pdev, ctx);
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	ctx->base = devm_ioremap_resource(dev, res);
> +	if (IS_ERR(ctx->base))
> +		return PTR_ERR(ctx->base);
> +
> +	ctx->irq = platform_get_irq(pdev, 0);
> +	if (ctx->irq < 0) {
> +		dev_err(dev, "failed to get irq\n");
> +		return -ENODEV;
> +	}
> +
> +	/* clear IRQs before registering fault handler, just in case the
> +	 * boot-loader left us a surprise:
> +	 */
> +	iommu_writel(ctx, ARM_SMMU_CB_FSR, iommu_readl(ctx, ARM_SMMU_CB_FSR));
> +
> +	ret = devm_request_irq(dev, ctx->irq,
> +			       qcom_iommu_fault,
> +			       IRQF_SHARED,
> +			       "qcom-iommu-fault",
> +			       ctx);
> +	if (ret) {
> +		dev_err(dev, "failed to request IRQ %u\n", ctx->irq);
> +		return ret;
> +	}
> +
> +	/* read the "reg" property directly to get the relative address
> +	 * of the context bank, and calculate the asid from that:
> +	 */
> +	if (of_property_read_u32_index(dev->of_node, "reg", 0, &reg)) {
> +		dev_err(dev, "missing reg property\n");
> +		return -ENODEV;
> +	}
> +
> +	ctx->asid = reg / 0x1000;      /* context banks are 0x1000 apart */
> +
> +	dev_dbg(dev, "found asid %u\n", ctx->asid);
> +
> +	list_add_tail(&ctx->node, &qcom_iommu->context_list);
> +
> +	return 0;
> +}
> +
> +static int qcom_iommu_ctx_remove(struct platform_device *pdev)
> +{
> +	struct qcom_iommu_ctx *ctx = platform_get_drvdata(pdev);
> +
> +	iommu_group_put(ctx->group);
> +	platform_set_drvdata(pdev, NULL);
> +
> +	list_del(&ctx->node);
> +
> +	return 0;
> +}
> +
> +static const struct of_device_id ctx_of_match[] = {
> +	{ .compatible = "qcom,msm-iommu-v1-ns" },
> +	{ .compatible = "qcom,msm-iommu-v1-sec" },
> +	{ /* sentinel */ }
> +};
> +
> +static struct platform_driver qcom_iommu_ctx_driver = {
> +	.driver	= {
> +		.name		= "qcom-iommu-ctx",
> +		.of_match_table	= of_match_ptr(ctx_of_match),
> +	},
> +	.probe	= qcom_iommu_ctx_probe,
> +	.remove = qcom_iommu_ctx_remove,
> +};
> +
> +static int qcom_iommu_device_probe(struct platform_device *pdev)
> +{
> +	struct qcom_iommu_dev *qcom_iommu;
> +	struct device *dev = &pdev->dev;
> +	struct resource *res;
> +	int ret;
> +
> +	qcom_iommu = devm_kzalloc(dev, sizeof(*qcom_iommu), GFP_KERNEL);
> +	if (!qcom_iommu)
> +		return -ENOMEM;
> +	qcom_iommu->dev = dev;
> +
> +	INIT_LIST_HEAD(&qcom_iommu->context_list);
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	if (res)
> +		qcom_iommu->local_base = devm_ioremap_resource(dev, res);
> +
> +	qcom_iommu->iface_clk = devm_clk_get(dev, "iface");
> +	if (IS_ERR(qcom_iommu->iface_clk)) {
> +		dev_err(dev, "failed to get iface clock\n");
> +		return PTR_ERR(qcom_iommu->iface_clk);
> +	}
> +
> +	qcom_iommu->bus_clk = devm_clk_get(dev, "bus");
> +	if (IS_ERR(qcom_iommu->bus_clk)) {
> +		dev_err(dev, "failed to get bus clock\n");
> +		return PTR_ERR(qcom_iommu->bus_clk);
> +	}
> +
> +	if (of_property_read_u32(dev->of_node, "qcom,iommu-secure-id",
> +				 &qcom_iommu->sec_id)) {
> +		dev_err(dev, "missing qcom,iommu-secure-id property\n");
> +		return -ENODEV;
> +	}
> +
> +	platform_set_drvdata(pdev, qcom_iommu);
> +
> +	pm_runtime_enable(dev);
> +
> +	/* register context bank devices, which are child nodes: */
> +	ret = devm_of_platform_populate(dev);
> +	if (ret) {
> +		dev_err(dev, "Failed to populate iommu contexts\n");
> +		return ret;
> +	}
> +
> +	ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,
> +				     dev_name(dev));
> +	if (ret) {
> +		dev_err(dev, "Failed to register iommu in sysfs\n");
> +		return ret;
> +	}
> +
> +	iommu_device_set_ops(&qcom_iommu->iommu, &qcom_iommu_ops);
> +	iommu_device_set_fwnode(&qcom_iommu->iommu, dev->fwnode);
> +
> +	ret = iommu_device_register(&qcom_iommu->iommu);
> +	if (ret) {
> +		dev_err(dev, "Failed to register iommu\n");
> +		return ret;
> +	}
> +
> +	bus_set_iommu(&platform_bus_type, &qcom_iommu_ops);
> +
> +	if (qcom_iommu->local_base) {
> +		pm_runtime_get_sync(dev);
> +		writel_relaxed(0xffffffff, qcom_iommu->local_base + SMMU_INTR_SEL_NS);
> +		pm_runtime_put_sync(dev);
> +	}
> +
> +	return 0;

You need to set your DMA masks somewhere and make sure it succeeded,
especially given that what the platform bus gives you by default isn't
big enough for an LPAE table walker.

> +}
> +
> +static int qcom_iommu_device_remove(struct platform_device *pdev)
> +{
> +	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
> +
> +	bus_set_iommu(&platform_bus_type, NULL);

This does nothing.

> +	pm_runtime_force_suspend(&pdev->dev);
> +	platform_set_drvdata(pdev, NULL);
> +	iommu_device_sysfs_remove(&qcom_iommu->iommu);
> +	iommu_device_unregister(&qcom_iommu->iommu);
> +
> +	return 0;
> +}
> +
> +#ifdef CONFIG_PM

I was under the impression that annotating PM callbacks as
__maybe_unused was preferred these days, but I could be wrong.

Robin.

> +static int qcom_iommu_resume(struct device *dev)
> +{
> +	struct platform_device *pdev = to_platform_device(dev);
> +	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
> +
> +	return qcom_iommu_enable_clocks(qcom_iommu);
> +}
> +
> +static int qcom_iommu_suspend(struct device *dev)
> +{
> +	struct platform_device *pdev = to_platform_device(dev);
> +	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
> +
> +	qcom_iommu_disable_clocks(qcom_iommu);
> +
> +	return 0;
> +}
> +#endif
> +
> +static const struct dev_pm_ops qcom_iommu_pm_ops = {
> +	SET_RUNTIME_PM_OPS(qcom_iommu_suspend, qcom_iommu_resume, NULL)
> +	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
> +				pm_runtime_force_resume)
> +};
> +
> +static const struct of_device_id qcom_iommu_of_match[] = {
> +	{ .compatible = "qcom,msm-iommu-v1" },
> +	{ /* sentinel */ }
> +};
> +MODULE_DEVICE_TABLE(of, qcom_iommu_of_match);
> +
> +static struct platform_driver qcom_iommu_driver = {
> +	.driver	= {
> +		.name		= "qcom-iommu",
> +		.of_match_table	= of_match_ptr(qcom_iommu_of_match),
> +		.pm		= &qcom_iommu_pm_ops,
> +	},
> +	.probe	= qcom_iommu_device_probe,
> +	.remove	= qcom_iommu_device_remove,
> +};
> +
> +static int __init qcom_iommu_init(void)
> +{
> +	int ret;
> +
> +	ret = platform_driver_register(&qcom_iommu_ctx_driver);
> +	if (ret)
> +		return ret;
> +
> +	ret = platform_driver_register(&qcom_iommu_driver);
> +	if (ret)
> +		platform_driver_unregister(&qcom_iommu_ctx_driver);
> +
> +	return ret;
> +}
> +
> +static void __exit qcom_iommu_exit(void)
> +{
> +	platform_driver_unregister(&qcom_iommu_driver);
> +	platform_driver_unregister(&qcom_iommu_ctx_driver);
> +}
> +
> +module_init(qcom_iommu_init);
> +module_exit(qcom_iommu_exit);
> +
> +IOMMU_OF_DECLARE(qcom_iommu_dev, "qcom,msm-iommu-v1", NULL);
> +
> +MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
> +MODULE_LICENSE("GPL v2");
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 3/4] iommu: add qcom_iommu
       [not found] ` <20170525173340.26904-1-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-05-25 17:33   ` Rob Clark
       [not found]     ` <20170525173340.26904-4-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Rob Clark @ 2017-05-25 17:33 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: Mark Rutland, Rob Herring, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	Will Deacon, Stanimir Varbanov

An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have a context-bank register
layout that is similar to ARM SMMU, but no global register space (or at
least not one that is accessible).

Signed-off-by: Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
v1: original
v2: bindings cleanups and kconfig issues that kbuild robot pointed out
v3: fix issues pointed out by Rob H. and actually make device removal
    work
v4: fix WARN_ON() splats reported by Archit
v5: some fixes to build as a module.. note that it cannot actually
    be built as a module yet (at minimum a bunch of other iommu syms
    that are needed are not exported, but there may be more to it
    than that), but at least qcom_iommu is ready should it become
    possible to build iommu drivers as modules.
v6: Add additional pm-runtime get/puts around paths that can hit
    TLB inv, to avoid unclocked register access if the device using
    the iommu is not powered on.  And pre-emptively clear interrupts
    before registering IRQ handler just in case the bootloader has
    left us a surprise.

 drivers/iommu/Kconfig      |  10 +
 drivers/iommu/Makefile     |   1 +
 drivers/iommu/qcom_iommu.c | 878 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 889 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6ee3a25..aa4b628 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -367,4 +367,14 @@ config MTK_IOMMU_V1
 
 	  if unsure, say N here.
 
+config QCOM_IOMMU
+	# Note: iommu drivers cannot (yet?) be built as modules
+	bool "Qualcomm IOMMU Support"
+	depends on ARCH_QCOM || COMPILE_TEST
+	select IOMMU_API
+	select IOMMU_IO_PGTABLE_LPAE
+	select ARM_DMA_USE_IOMMU
+	help
+	  Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 0000000..bfaf97c
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,878 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include <linux/atomic.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/dma-iommu.h>
+#include <linux/dma-mapping.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/io-64-nonatomic-hi-lo.h>
+#include <linux/iommu.h>
+#include <linux/iopoll.h>
+#include <linux/kconfig.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/of_iommu.h>
+#include <linux/platform_device.h>
+#include <linux/pm.h>
+#include <linux/pm_runtime.h>
+#include <linux/qcom_scm.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS     0x2000
+
+struct qcom_iommu_dev {
+	/* IOMMU core code handle */
+	struct iommu_device	 iommu;
+	struct device		*dev;
+	struct clk		*iface_clk;
+	struct clk		*bus_clk;
+	void __iomem		*local_base;
+	u32			 sec_id;
+	struct list_head	 context_list;   /* list of qcom_iommu_context */
+};
+
+struct qcom_iommu_ctx {
+	struct device		*dev;
+	void __iomem		*base;
+	unsigned int		 irq;
+	bool			 secure_init;
+	u32			 asid;      /* asid and ctx bank # are 1:1 */
+	struct iommu_group	*group;
+	struct list_head	 node;      /* head in qcom_iommu_device::context_list */
+};
+
+struct qcom_iommu_domain {
+	struct io_pgtable_ops	*pgtbl_ops;
+	spinlock_t		 pgtbl_lock;
+	struct mutex		 init_mutex; /* Protects iommu pointer */
+	struct iommu_domain	 domain;
+	struct qcom_iommu_dev	*iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev * __to_iommu(struct iommu_fwspec *fwspec)
+{
+	if (!fwspec || fwspec->ops != &qcom_iommu_ops)
+		return NULL;
+	return fwspec->iommu_priv;
+}
+
+static struct qcom_iommu_dev * to_iommu(struct iommu_fwspec *fwspec)
+{
+	struct qcom_iommu_dev *qcom_iommu = __to_iommu(fwspec);
+	WARN_ON(!qcom_iommu);
+	return qcom_iommu;
+}
+
+static struct qcom_iommu_ctx * to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	struct qcom_iommu_ctx *ctx;
+
+	if (!qcom_iommu)
+		return NULL;
+
+	list_for_each_entry(ctx, &qcom_iommu->context_list, node)
+		if (ctx->asid == asid)
+			return ctx;
+
+	WARN(1, "no ctx for asid %u\n", asid);
+	return NULL;
+}
+
+static inline void
+iommu_writel(struct qcom_iommu_ctx *ctx, unsigned reg, u32 val)
+{
+	writel_relaxed(val, ctx->base + reg);
+}
+
+static inline void
+iommu_writeq(struct qcom_iommu_ctx *ctx, unsigned reg, u64 val)
+{
+	writeq_relaxed(val, ctx->base + reg);
+}
+
+static inline u32
+iommu_readl(struct qcom_iommu_ctx *ctx, unsigned reg)
+{
+	return readl_relaxed(ctx->base + reg);
+}
+
+static inline u64
+iommu_readq(struct qcom_iommu_ctx *ctx, unsigned reg)
+{
+	return readq_relaxed(ctx->base + reg);
+}
+
+static void __sync_tlb(struct qcom_iommu_ctx *ctx)
+{
+	unsigned int val;
+	unsigned int ret;
+
+	iommu_writel(ctx, ARM_SMMU_CB_TLBSYNC, 0);
+
+	ret = readl_poll_timeout(ctx->base + ARM_SMMU_CB_TLBSTATUS, val,
+				 (val & 0x1) == 0, 0, 5000000);
+	if (ret)
+		dev_err(ctx->dev, "timeout waiting for TLB SYNC\n");
+}
+
+static void qcom_iommu_tlb_sync(void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++)
+		__sync_tlb(to_ctx(fwspec, fwspec->ids[i]));
+}
+
+static void qcom_iommu_tlb_inv_context(void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		iommu_writel(ctx, ARM_SMMU_CB_S1_TLBIASID, ctx->asid);
+		__sync_tlb(ctx);
+	}
+}
+
+static void qcom_iommu_tlb_inv_range_nosync(unsigned long iova, size_t size,
+					    size_t granule, bool leaf, void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i, reg;
+
+	reg = leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		size_t s = size;
+
+		iova &= ~0xfffUL;
+		iova |= ctx->asid;
+		do {
+			iommu_writel(ctx, reg, iova);
+			iova += granule;
+		} while (s -= granule);
+	}
+}
+
+static const struct iommu_gather_ops qcom_gather_ops = {
+	.tlb_flush_all	= qcom_iommu_tlb_inv_context,
+	.tlb_add_flush	= qcom_iommu_tlb_inv_range_nosync,
+	.tlb_sync	= qcom_iommu_tlb_sync,
+};
+
+static irqreturn_t qcom_iommu_fault(int irq, void *dev)
+{
+	struct qcom_iommu_ctx *ctx = dev;
+	u32 fsr, fsynr;
+	unsigned long iova;
+
+	fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
+
+	if (!(fsr & FSR_FAULT))
+		return IRQ_NONE;
+
+	fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
+	iova = iommu_readq(ctx, ARM_SMMU_CB_FAR);
+
+	dev_err_ratelimited(ctx->dev,
+			    "Unhandled context fault: fsr=0x%x, "
+			    "iova=0x%08lx, fsynr=0x%x, cb=%d\n",
+			    fsr, iova, fsynr, ctx->asid);
+
+	iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
+
+	return IRQ_HANDLED;
+}
+
+static int qcom_iommu_init_domain(struct iommu_domain *domain,
+				  struct qcom_iommu_dev *qcom_iommu,
+				  struct iommu_fwspec *fwspec)
+{
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *pgtbl_ops;
+	struct io_pgtable_cfg pgtbl_cfg;
+	int i, ret = 0;
+	u32 reg;
+
+	mutex_lock(&qcom_domain->init_mutex);
+	if (qcom_domain->iommu)
+		goto out_unlock;
+
+	pgtbl_cfg = (struct io_pgtable_cfg) {
+		.pgsize_bitmap	= qcom_iommu_ops.pgsize_bitmap,
+		.ias		= 32,
+		.oas		= 40,
+		.tlb		= &qcom_gather_ops,
+		.iommu_dev	= qcom_iommu->dev,
+	};
+
+	qcom_domain->iommu = qcom_iommu;
+	pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, fwspec);
+	if (!pgtbl_ops) {
+		dev_err(qcom_iommu->dev, "failed to allocate pagetable ops\n");
+		ret = -ENOMEM;
+		goto out_clear_iommu;
+	}
+
+	/* Update the domain's page sizes to reflect the page table format */
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	domain->geometry.aperture_end = (1ULL << pgtbl_cfg.ias) - 1;
+	domain->geometry.force_aperture = true;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		if (!ctx->secure_init) {
+			ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, ctx->asid);
+			if (ret) {
+				dev_err(qcom_iommu->dev, "secure init failed: %d\n", ret);
+				goto out_clear_iommu;
+			}
+			ctx->secure_init = true;
+		}
+
+		/* TTBRs */
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
+				((u64)ctx->asid << TTBRn_ASID_SHIFT));
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
+				((u64)ctx->asid << TTBRn_ASID_SHIFT));
+
+		/* TTBCR */
+		iommu_writel(ctx, ARM_SMMU_CB_TTBCR2,
+				(pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
+				TTBCR2_SEP_UPSTREAM);
+		iommu_writel(ctx, ARM_SMMU_CB_TTBCR,
+				pgtbl_cfg.arm_lpae_s1_cfg.tcr);
+
+		/* MAIRs (stage-1 only) */
+		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
+				pgtbl_cfg.arm_lpae_s1_cfg.mair[0]);
+		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR1,
+				pgtbl_cfg.arm_lpae_s1_cfg.mair[1]);
+
+		/* SCTLR */
+		reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
+			SCTLR_M | SCTLR_S1_ASIDPNE;
+
+		if (IS_ENABLED(CONFIG_BIG_ENDIAN))
+			reg |= SCTLR_E;
+
+		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
+	}
+
+	mutex_unlock(&qcom_domain->init_mutex);
+
+	/* Publish page table ops for map/unmap */
+	qcom_domain->pgtbl_ops = pgtbl_ops;
+
+	return 0;
+
+out_clear_iommu:
+	qcom_domain->iommu = NULL;
+out_unlock:
+	mutex_unlock(&qcom_domain->init_mutex);
+	return ret;
+}
+
+static struct iommu_domain *qcom_iommu_domain_alloc(unsigned type)
+{
+	struct qcom_iommu_domain *qcom_domain;
+
+	if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
+		return NULL;
+	/*
+	 * Allocate the domain and initialise some of its data structures.
+	 * We can't really do anything meaningful until we've added a
+	 * master.
+	 */
+	qcom_domain = kzalloc(sizeof(*qcom_domain), GFP_KERNEL);
+	if (!qcom_domain)
+		return NULL;
+
+	if (type == IOMMU_DOMAIN_DMA &&
+	    iommu_get_dma_cookie(&qcom_domain->domain)) {
+		kfree(qcom_domain);
+		return NULL;
+	}
+
+	mutex_init(&qcom_domain->init_mutex);
+	spin_lock_init(&qcom_domain->pgtbl_lock);
+
+	return &qcom_domain->domain;
+}
+
+static void qcom_iommu_domain_free(struct iommu_domain *domain)
+{
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+
+	if (WARN_ON(qcom_domain->iommu))    /* forgot to detach? */
+		return;
+
+	iommu_put_dma_cookie(domain);
+
+	/* NOTE: the domain must already be detached by this point, so
+	 * qcom_domain->iommu is NULL here and must not be dereferenced
+	 * (e.g. for runtime PM); only the io-pgtable remains to be
+	 * torn down:
+	 */
+	free_io_pgtable_ops(qcom_domain->pgtbl_ops);
+
+	kfree(qcom_domain);
+}
+
+static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	int ret;
+
+	if (!qcom_iommu) {
+		dev_err(dev, "cannot attach to IOMMU, is it on the same bus?\n");
+		return -ENXIO;
+	}
+
+	/* Ensure that the domain is finalized */
+	pm_runtime_get_sync(qcom_iommu->dev);
+	ret = qcom_iommu_init_domain(domain, qcom_iommu, dev->iommu_fwspec);
+	pm_runtime_put_sync(qcom_iommu->dev);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Sanity check the domain. We don't support domains across
+	 * different IOMMUs.
+	 */
+	if (qcom_domain->iommu != qcom_iommu) {
+		dev_err(dev, "cannot attach to IOMMU %s while already "
+			"attached to domain on IOMMU %s\n",
+			dev_name(qcom_domain->iommu->dev),
+			dev_name(qcom_iommu->dev));
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	unsigned i;
+
+	if (!qcom_domain->iommu)
+		return;
+
+	pm_runtime_get_sync(qcom_iommu->dev);
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		/* Disable the context bank: */
+		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, 0);
+	}
+	pm_runtime_put_sync(qcom_iommu->dev);
+
+	qcom_domain->iommu = NULL;
+}
+
+static int qcom_iommu_map(struct iommu_domain *domain, unsigned long iova,
+			  phys_addr_t paddr, size_t size, int prot)
+{
+	int ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return -ENODEV;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->map(ops, iova, paddr, size, prot);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+	return ret;
+}
+
+static size_t qcom_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
+			       size_t size)
+{
+	size_t ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	/* NOTE: unmap can be called after client device is powered off,
+	 * for example, with GPUs or anything involving dma-buf.  So we
+	 * cannot rely on the device_link.  Make sure the IOMMU is on to
+	 * avoid unclocked accesses in the TLB inv path:
+	 */
+	pm_runtime_get_sync(qcom_domain->iommu->dev);
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->unmap(ops, iova, size);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+	pm_runtime_put_sync(qcom_domain->iommu->dev);
+
+	return ret;
+}
+
+static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
+					   dma_addr_t iova)
+{
+	phys_addr_t ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->iova_to_phys(ops, iova);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+
+	return ret;
+}
+
+static bool qcom_iommu_capable(enum iommu_cap cap)
+{
+	switch (cap) {
+	case IOMMU_CAP_CACHE_COHERENCY:
+		/*
+		 * Return true here as the SMMU can always send out coherent
+		 * requests.
+		 */
+		return true;
+	case IOMMU_CAP_NOEXEC:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static int qcom_iommu_add_device(struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = __to_iommu(dev->iommu_fwspec);
+	struct iommu_group *group;
+	struct device_link *link;
+
+	if (!qcom_iommu)
+		return -ENODEV;
+
+	/*
+	 * Establish the link between iommu and master, so that the
+	 * iommu gets runtime enabled/disabled as per the master's
+	 * needs.
+	 */
+	link = device_link_add(dev, qcom_iommu->dev, DL_FLAG_PM_RUNTIME);
+	if (!link) {
+		dev_err(qcom_iommu->dev, "Unable to create device link between %s and %s\n",
+			dev_name(qcom_iommu->dev), dev_name(dev));
+		return -ENODEV;
+	}
+
+	group = iommu_group_get_for_dev(dev);
+	if (IS_ERR_OR_NULL(group))
+		return PTR_ERR_OR_ZERO(group);
+
+	iommu_group_put(group);
+	iommu_device_link(&qcom_iommu->iommu, dev);
+
+	return 0;
+}
+
+static void qcom_iommu_remove_device(struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+
+	if (!qcom_iommu)
+		return;
+
+	iommu_device_unlink(&qcom_iommu->iommu, dev);
+	iommu_group_remove_device(dev);
+	iommu_fwspec_free(dev);
+}
+
+static struct iommu_group *qcom_iommu_device_group(struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_group *group = NULL;
+	unsigned i;
+
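+	/* if any of this device's context banks already has a group,
+	 * reuse it (all of the device's context banks must agree):
+	 */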
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		if (group && ctx->group && group != ctx->group)
+			return ERR_PTR(-EINVAL);
+
+		group = ctx->group;
+	}
+
+	if (group)
+		return iommu_group_ref_get(group);
+
+	group = generic_device_group(dev);
+
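+	/* otherwise cache the freshly allocated group in each ctx: */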
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		ctx->group = iommu_group_ref_get(group);
+	}
+
+	return group;
+}
+
+static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
+{
+	struct platform_device *iommu_pdev;
+
+	if (args->args_count != 1) {
+		dev_err(dev, "incorrect number of iommu params found for %s "
+			"(found %d, expected 1)\n",
+			args->np->full_name, args->args_count);
+		return -EINVAL;
+	}
+
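+	/* stash the qcom_iommu_dev in the fwspec, so that attach/detach
+	 * and friends can recover it later via to_iommu():
+	 */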
+	if (!dev->iommu_fwspec->iommu_priv) {
+		iommu_pdev = of_find_device_by_node(args->np);
+		if (WARN_ON(!iommu_pdev))
+			return -EINVAL;
+
+		dev->iommu_fwspec->iommu_priv = platform_get_drvdata(iommu_pdev);
+	}
+
+	return iommu_fwspec_add_ids(dev, &args->args[0], 1);
+}
+
+static const struct iommu_ops qcom_iommu_ops = {
+	.capable	= qcom_iommu_capable,
+	.domain_alloc	= qcom_iommu_domain_alloc,
+	.domain_free	= qcom_iommu_domain_free,
+	.attach_dev	= qcom_iommu_attach_dev,
+	.detach_dev	= qcom_iommu_detach_dev,
+	.map		= qcom_iommu_map,
+	.unmap		= qcom_iommu_unmap,
+	.map_sg		= default_iommu_map_sg,
+	.iova_to_phys	= qcom_iommu_iova_to_phys,
+	.add_device	= qcom_iommu_add_device,
+	.remove_device	= qcom_iommu_remove_device,
+	.device_group	= qcom_iommu_device_group,
+	.of_xlate	= qcom_iommu_of_xlate,
+	.pgsize_bitmap	= SZ_4K | SZ_64K | SZ_1M | SZ_16M,
+};
+
+static int qcom_iommu_enable_clocks(struct qcom_iommu_dev *qcom_iommu)
+{
+	int ret;
+
+	ret = clk_prepare_enable(qcom_iommu->iface_clk);
+	if (ret) {
+		dev_err(qcom_iommu->dev, "Couldn't enable iface_clk\n");
+		return ret;
+	}
+
+	ret = clk_prepare_enable(qcom_iommu->bus_clk);
+	if (ret) {
+		dev_err(qcom_iommu->dev, "Couldn't enable bus_clk\n");
+		clk_disable_unprepare(qcom_iommu->iface_clk);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
+{
+	clk_disable_unprepare(qcom_iommu->bus_clk);
+	clk_disable_unprepare(qcom_iommu->iface_clk);
+}
+
+static int qcom_iommu_ctx_probe(struct platform_device *pdev)
+{
+	struct qcom_iommu_ctx *ctx;
+	struct device *dev = &pdev->dev;
+	struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(dev->parent);
+	struct resource *res;
+	int ret;
+	u32 reg;
+
+	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	ctx->dev = dev;
+	platform_set_drvdata(pdev, ctx);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	ctx->base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(ctx->base))
+		return PTR_ERR(ctx->base);
+
+	ctx->irq = platform_get_irq(pdev, 0);
+	if (ctx->irq < 0) {
+		dev_err(dev, "failed to get irq\n");
+		return -ENODEV;
+	}
+
+	/* clear IRQs before registering fault handler, just in case the
+	 * boot-loader left us a surprise:
+	 */
+	iommu_writel(ctx, ARM_SMMU_CB_FSR, iommu_readl(ctx, ARM_SMMU_CB_FSR));
+
+	ret = devm_request_irq(dev, ctx->irq,
+			       qcom_iommu_fault,
+			       IRQF_SHARED,
+			       "qcom-iommu-fault",
+			       ctx);
+	if (ret) {
+		dev_err(dev, "failed to request IRQ %u\n", ctx->irq);
+		return ret;
+	}
+
+	/* read the "reg" property directly to get the relative address
+	 * of the context bank, and calculate the asid from that:
+	 */
+	if (of_property_read_u32_index(dev->of_node, "reg", 0, &reg)) {
+		dev_err(dev, "missing reg property\n");
+		return -ENODEV;
+	}
+
+	ctx->asid = reg / 0x1000;      /* context banks are 0x1000 apart */
+
+	dev_dbg(dev, "found asid %u\n", ctx->asid);
+
+	list_add_tail(&ctx->node, &qcom_iommu->context_list);
+
+	return 0;
+}
+
+static int qcom_iommu_ctx_remove(struct platform_device *pdev)
+{
+	struct qcom_iommu_ctx *ctx = platform_get_drvdata(pdev);
+
+	iommu_group_put(ctx->group);
+	platform_set_drvdata(pdev, NULL);
+
+	list_del(&ctx->node);
+
+	return 0;
+}
+
+static const struct of_device_id ctx_of_match[] = {
+	{ .compatible = "qcom,msm-iommu-v1-ns" },
+	{ .compatible = "qcom,msm-iommu-v1-sec" },
+	{ /* sentinel */ }
+};
+
+static struct platform_driver qcom_iommu_ctx_driver = {
+	.driver	= {
+		.name		= "qcom-iommu-ctx",
+		.of_match_table	= of_match_ptr(ctx_of_match),
+	},
+	.probe	= qcom_iommu_ctx_probe,
+	.remove = qcom_iommu_ctx_remove,
+};
+
+static int qcom_iommu_device_probe(struct platform_device *pdev)
+{
+	struct qcom_iommu_dev *qcom_iommu;
+	struct device *dev = &pdev->dev;
+	struct resource *res;
+	int ret;
+
+	qcom_iommu = devm_kzalloc(dev, sizeof(*qcom_iommu), GFP_KERNEL);
+	if (!qcom_iommu)
+		return -ENOMEM;
+	qcom_iommu->dev = dev;
+
+	INIT_LIST_HEAD(&qcom_iommu->context_list);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (res) {
+		qcom_iommu->local_base = devm_ioremap_resource(dev, res);
+		if (IS_ERR(qcom_iommu->local_base))
+			return PTR_ERR(qcom_iommu->local_base);
+	}
+
+	qcom_iommu->iface_clk = devm_clk_get(dev, "iface");
+	if (IS_ERR(qcom_iommu->iface_clk)) {
+		dev_err(dev, "failed to get iface clock\n");
+		return PTR_ERR(qcom_iommu->iface_clk);
+	}
+
+	qcom_iommu->bus_clk = devm_clk_get(dev, "bus");
+	if (IS_ERR(qcom_iommu->bus_clk)) {
+		dev_err(dev, "failed to get bus clock\n");
+		return PTR_ERR(qcom_iommu->bus_clk);
+	}
+
+	if (of_property_read_u32(dev->of_node, "qcom,iommu-secure-id",
+				 &qcom_iommu->sec_id)) {
+		dev_err(dev, "missing qcom,iommu-secure-id property\n");
+		return -ENODEV;
+	}
+
+	platform_set_drvdata(pdev, qcom_iommu);
+
+	pm_runtime_enable(dev);
+
+	/* register context bank devices, which are child nodes: */
+	ret = devm_of_platform_populate(dev);
+	if (ret) {
+		dev_err(dev, "Failed to populate iommu contexts\n");
+		return ret;
+	}
+
+	ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,
+				     dev_name(dev));
+	if (ret) {
+		dev_err(dev, "Failed to register iommu in sysfs\n");
+		return ret;
+	}
+
+	iommu_device_set_ops(&qcom_iommu->iommu, &qcom_iommu_ops);
+	iommu_device_set_fwnode(&qcom_iommu->iommu, dev->fwnode);
+
+	ret = iommu_device_register(&qcom_iommu->iommu);
+	if (ret) {
+		dev_err(dev, "Failed to register iommu\n");
+		return ret;
+	}
+
+	bus_set_iommu(&platform_bus_type, &qcom_iommu_ops);
+
+	if (qcom_iommu->local_base) {
+		pm_runtime_get_sync(dev);
+		writel_relaxed(0xffffffff, qcom_iommu->local_base + SMMU_INTR_SEL_NS);
+		pm_runtime_put_sync(dev);
+	}
+
+	return 0;
+}
+
+static int qcom_iommu_device_remove(struct platform_device *pdev)
+{
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	bus_set_iommu(&platform_bus_type, NULL);
+
+	pm_runtime_force_suspend(&pdev->dev);
+	platform_set_drvdata(pdev, NULL);
+	iommu_device_sysfs_remove(&qcom_iommu->iommu);
+	iommu_device_unregister(&qcom_iommu->iommu);
+
+	return 0;
+}
+
+#ifdef CONFIG_PM
+static int qcom_iommu_resume(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	return qcom_iommu_enable_clocks(qcom_iommu);
+}
+
+static int qcom_iommu_suspend(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	qcom_iommu_disable_clocks(qcom_iommu);
+
+	return 0;
+}
+#endif
+
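+/* runtime PM just gates the iface/bus clocks; system sleep reuses the
+ * runtime PM callbacks via force_suspend/force_resume:
+ */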
+static const struct dev_pm_ops qcom_iommu_pm_ops = {
+	SET_RUNTIME_PM_OPS(qcom_iommu_suspend, qcom_iommu_resume, NULL)
+	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+				pm_runtime_force_resume)
+};
+
+static const struct of_device_id qcom_iommu_of_match[] = {
+	{ .compatible = "qcom,msm-iommu-v1" },
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, qcom_iommu_of_match);
+
+static struct platform_driver qcom_iommu_driver = {
+	.driver	= {
+		.name		= "qcom-iommu",
+		.of_match_table	= of_match_ptr(qcom_iommu_of_match),
+		.pm		= &qcom_iommu_pm_ops,
+	},
+	.probe	= qcom_iommu_device_probe,
+	.remove	= qcom_iommu_device_remove,
+};
+
+static int __init qcom_iommu_init(void)
+{
+	int ret;
+
+	ret = platform_driver_register(&qcom_iommu_ctx_driver);
+	if (ret)
+		return ret;
+
+	ret = platform_driver_register(&qcom_iommu_driver);
+	if (ret)
+		platform_driver_unregister(&qcom_iommu_ctx_driver);
+
+	return ret;
+}
+
+static void __exit qcom_iommu_exit(void)
+{
+	platform_driver_unregister(&qcom_iommu_driver);
+	platform_driver_unregister(&qcom_iommu_ctx_driver);
+}
+
+module_init(qcom_iommu_init);
+module_exit(qcom_iommu_exit);
+
+IOMMU_OF_DECLARE(qcom_iommu_dev, "qcom,msm-iommu-v1", NULL);
+
+MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
+MODULE_LICENSE("GPL v2");
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] iommu: add qcom_iommu
  2017-05-11 16:50         ` Rob Clark
@ 2017-05-12  3:52           ` Sricharan R
  0 siblings, 0 replies; 24+ messages in thread
From: Sricharan R @ 2017-05-12  3:52 UTC (permalink / raw)
  To: Rob Clark
  Cc: Mark Rutland, Rob Herring, linux-arm-msm, Will Deacon,
	Stanimir Varbanov,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Hi,

On 5/11/2017 10:20 PM, Rob Clark wrote:
> On Thu, May 11, 2017 at 11:08 AM, Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote:
>> Hi Rob,
>>
>> <snip..>
>>
>>> +static irqreturn_t qcom_iommu_fault(int irq, void *dev)
>>> +{
>>> +     struct qcom_iommu_ctx *ctx = dev;
>>> +     u32 fsr, fsynr;
>>> +     unsigned long iova;
>>> +
>>> +     fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
>>> +
>>> +     if (!(fsr & FSR_FAULT))
>>> +             return IRQ_NONE;
>>> +
>>> +     fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
>>> +     iova = iommu_readq(ctx, ARM_SMMU_CB_FAR);
>>> +
>>> +     dev_err_ratelimited(ctx->dev,
>>> +                         "Unhandled context fault: fsr=0x%x, "
>>> +                         "iova=0x%08lx, fsynr=0x%x, cb=%d\n",
>>> +                         fsr, iova, fsynr, ctx->asid);
>>> +
>>> +     iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
>>
>> Just thinking if the clocks should be enabled in the fault handler
>> for handling cases that would happen out of the master context.
>> While global faults are one case, that is anyway handled in the
>> secure world for this case. Something like the bootloader having used
>> the iommu and not handled the fault, so we get the fault in the kernel
>> the moment we enable the ctx. At least downstream seems to enable the
>> clocks in the fault handler explicitly.
> 
> hmm, I wonder if we should instead do something to clear interrupts
> when we initialize the context?
> 
> I guess we probably don't want to get fault irqs from the bootloader..

Right, better to clear it in the beginning and that could be added.


Regards,
 Sricharan

> 
> BR,
> -R
> 
>> Regards,
>>  Sricharan
>>
>>
>>> <snip..>

-- 
"QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] iommu: add qcom_iommu
  2017-05-11 15:08       ` Sricharan R
@ 2017-05-11 16:50         ` Rob Clark
  2017-05-12  3:52           ` Sricharan R
  0 siblings, 1 reply; 24+ messages in thread
From: Rob Clark @ 2017-05-11 16:50 UTC (permalink / raw)
  To: Sricharan R
  Cc: iommu, linux-arm-msm, Robin Murphy, Will Deacon, Mark Rutland,
	Stanimir Varbanov, Archit Taneja, Rob Herring

On Thu, May 11, 2017 at 11:08 AM, Sricharan R <sricharan@codeaurora.org> wrote:
> Hi Rob,
>
> <snip..>
>
>> +static irqreturn_t qcom_iommu_fault(int irq, void *dev)
>> +{
>> +     struct qcom_iommu_ctx *ctx = dev;
>> +     u32 fsr, fsynr;
>> +     unsigned long iova;
>> +
>> +     fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
>> +
>> +     if (!(fsr & FSR_FAULT))
>> +             return IRQ_NONE;
>> +
>> +     fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
>> +     iova = iommu_readq(ctx, ARM_SMMU_CB_FAR);
>> +
>> +     dev_err_ratelimited(ctx->dev,
>> +                         "Unhandled context fault: fsr=0x%x, "
>> +                         "iova=0x%08lx, fsynr=0x%x, cb=%d\n",
>> +                         fsr, iova, fsynr, ctx->asid);
>> +
>> +     iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
>
> Just thinking if the clocks should be enabled in the fault handler
> for handling cases that would happen out of the master context.
> While global faults are one case, that is anyway handled in the
> secure world for this case. Something like the bootloader having used
> the iommu and not handled the fault, so we get the fault in the kernel
> the moment we enable the ctx. At least downstream seems to enable the
> clocks in the fault handler explicitly.

hmm, I wonder if we should instead do something to clear interrupts
when we initialize the context?

I guess we probably don't want to get fault irqs from the bootloader..
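
Something along these lines at ctx init, before the fault handler is
registered, should do it (untested sketch):

	/* ack whatever faults the bootloader left pending: */
	iommu_writel(ctx, ARM_SMMU_CB_FSR, iommu_readl(ctx, ARM_SMMU_CB_FSR));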

BR,
-R

> Regards,
>  Sricharan
>
>
>> +
>> +     return IRQ_HANDLED;
>> +}
>> +
>> +static int qcom_iommu_init_domain(struct iommu_domain *domain,
>> +                               struct qcom_iommu_dev *qcom_iommu,
>> +                               struct iommu_fwspec *fwspec)
>> +{
>> +     struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +     struct io_pgtable_ops *pgtbl_ops;
>> +     struct io_pgtable_cfg pgtbl_cfg;
>> +     int i, ret = 0;
>> +     u32 reg;
>> +
>> +     mutex_lock(&qcom_domain->init_mutex);
>> +     if (qcom_domain->iommu)
>> +             goto out_unlock;
>> +
>> +     pgtbl_cfg = (struct io_pgtable_cfg) {
>> +             .pgsize_bitmap  = qcom_iommu_ops.pgsize_bitmap,
>> +             .ias            = 32,
>> +             .oas            = 40,
>> +             .tlb            = &qcom_gather_ops,
>> +             .iommu_dev      = qcom_iommu->dev,
>> +     };
>> +
>> +     qcom_domain->iommu = qcom_iommu;
>> +     pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, fwspec);
>> +     if (!pgtbl_ops) {
>> +             dev_err(qcom_iommu->dev, "failed to allocate pagetable ops\n");
>> +             ret = -ENOMEM;
>> +             goto out_clear_iommu;
>> +     }
>> +
>> +     /* Update the domain's page sizes to reflect the page table format */
>> +     domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
>> +     domain->geometry.aperture_end = (1ULL << pgtbl_cfg.ias) - 1;
>> +     domain->geometry.force_aperture = true;
>> +
>> +     for (i = 0; i < fwspec->num_ids; i++) {
>> +             struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
>> +
>> +             if (!ctx->secure_init) {
>> +                     ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, ctx->asid);
>> +                     if (ret) {
>> +                             dev_err(qcom_iommu->dev, "secure init failed: %d\n", ret);
>> +                             goto out_clear_iommu;
>> +                     }
>> +                     ctx->secure_init = true;
>> +             }
>> +
>> +             /* TTBRs */
>> +             iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
>> +                             pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
>> +                             ((u64)ctx->asid << TTBRn_ASID_SHIFT));
>> +             iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
>> +                             pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
>> +                             ((u64)ctx->asid << TTBRn_ASID_SHIFT));
>> +
>> +             /* TTBCR */
>> +             iommu_writel(ctx, ARM_SMMU_CB_TTBCR2,
>> +                             (pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
>> +                             TTBCR2_SEP_UPSTREAM);
>> +             iommu_writel(ctx, ARM_SMMU_CB_TTBCR,
>> +                             pgtbl_cfg.arm_lpae_s1_cfg.tcr);
>> +
>> +             /* MAIRs (stage-1 only) */
>> +             iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
>> +                             pgtbl_cfg.arm_lpae_s1_cfg.mair[0]);
>> +             iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR1,
>> +                             pgtbl_cfg.arm_lpae_s1_cfg.mair[1]);
>> +
>> +             /* SCTLR */
>> +             reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
>> +                     SCTLR_M | SCTLR_S1_ASIDPNE;
>> +
>> +             if (IS_ENABLED(CONFIG_BIG_ENDIAN))
>> +                     reg |= SCTLR_E;
>> +
>> +             iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
>> +     }
>> +
>> +     mutex_unlock(&qcom_domain->init_mutex);
>> +
>> +     /* Publish page table ops for map/unmap */
>> +     qcom_domain->pgtbl_ops = pgtbl_ops;
>> +
>> +     return 0;
>> +
>> +out_clear_iommu:
>> +     qcom_domain->iommu = NULL;
>> +out_unlock:
>> +     mutex_unlock(&qcom_domain->init_mutex);
>> +     return ret;
>> +}
>> +
>> +static struct iommu_domain *qcom_iommu_domain_alloc(unsigned type)
>> +{
>> +     struct qcom_iommu_domain *qcom_domain;
>> +
>> +     if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
>> +             return NULL;
>> +     /*
>> +      * Allocate the domain and initialise some of its data structures.
>> +      * We can't really do anything meaningful until we've added a
>> +      * master.
>> +      */
>> +     qcom_domain = kzalloc(sizeof(*qcom_domain), GFP_KERNEL);
>> +     if (!qcom_domain)
>> +             return NULL;
>> +
>> +     if (type == IOMMU_DOMAIN_DMA &&
>> +         iommu_get_dma_cookie(&qcom_domain->domain)) {
>> +             kfree(qcom_domain);
>> +             return NULL;
>> +     }
>> +
>> +     mutex_init(&qcom_domain->init_mutex);
>> +     spin_lock_init(&qcom_domain->pgtbl_lock);
>> +
>> +     return &qcom_domain->domain;
>> +}
>> +
>> +static void qcom_iommu_domain_free(struct iommu_domain *domain)
>> +{
>> +     struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +
>> +     if (WARN_ON(qcom_domain->iommu))    /* forgot to detach? */
>> +             return;
>> +
>> +     iommu_put_dma_cookie(domain);
>> +
>> +     free_io_pgtable_ops(qcom_domain->pgtbl_ops);
>> +
>> +     kfree(qcom_domain);
>> +}
>> +
>> +static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
>> +{
>> +     struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
>> +     struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +     int ret;
>> +
>> +     if (!qcom_iommu) {
>> +             dev_err(dev, "cannot attach to IOMMU, is it on the same bus?\n");
>> +             return -ENXIO;
>> +     }
>> +
>> +     /* Ensure that the domain is finalized */
>> +     pm_runtime_get_sync(qcom_iommu->dev);
>> +     ret = qcom_iommu_init_domain(domain, qcom_iommu, dev->iommu_fwspec);
>> +     pm_runtime_put_sync(qcom_iommu->dev);
>> +     if (ret < 0)
>> +             return ret;
>> +
>> +     /*
>> +      * Sanity check the domain. We don't support domains across
>> +      * different IOMMUs.
>> +      */
>> +     if (qcom_domain->iommu != qcom_iommu) {
>> +             dev_err(dev, "cannot attach to IOMMU %s while already "
>> +                     "attached to domain on IOMMU %s\n",
>> +                     dev_name(qcom_domain->iommu->dev),
>> +                     dev_name(qcom_iommu->dev));
>> +             return -EINVAL;
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *dev)
>> +{
>> +     struct iommu_fwspec *fwspec = dev->iommu_fwspec;
>> +     struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
>> +     struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +     unsigned i;
>> +
>> +     if (!qcom_domain->iommu)
>> +             return;
>> +
>> +     pm_runtime_get_sync(qcom_iommu->dev);
>> +     for (i = 0; i < fwspec->num_ids; i++) {
>> +             struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
>> +
>> +             /* Disable the context bank: */
>> +             iommu_writel(ctx, ARM_SMMU_CB_SCTLR, 0);
>> +     }
>> +     pm_runtime_put_sync(qcom_iommu->dev);
>> +
>> +     qcom_domain->iommu = NULL;
>> +}
>> +
>> +static int qcom_iommu_map(struct iommu_domain *domain, unsigned long iova,
>> +                       phys_addr_t paddr, size_t size, int prot)
>> +{
>> +     int ret;
>> +     unsigned long flags;
>> +     struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +     struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
>> +
>> +     if (!ops)
>> +             return -ENODEV;
>> +
>> +     spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
>> +     ret = ops->map(ops, iova, paddr, size, prot);
>> +     spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
>> +     return ret;
>> +}
>> +
>> +static size_t qcom_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
>> +                            size_t size)
>> +{
>> +     size_t ret;
>> +     unsigned long flags;
>> +     struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +     struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
>> +
>> +     if (!ops)
>> +             return 0;
>> +
>> +     spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
>> +     ret = ops->unmap(ops, iova, size);
>> +     spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
>> +     return ret;
>> +}
>> +
>> +static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
>> +                                        dma_addr_t iova)
>> +{
>> +     phys_addr_t ret;
>> +     unsigned long flags;
>> +     struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +     struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
>> +
>> +     if (!ops)
>> +             return 0;
>> +
>> +     spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
>> +     ret = ops->iova_to_phys(ops, iova);
>> +     spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
>> +
>> +     return ret;
>> +}
>> +
>> +static bool qcom_iommu_capable(enum iommu_cap cap)
>> +{
>> +     switch (cap) {
>> +     case IOMMU_CAP_CACHE_COHERENCY:
>> +             /*
>> +              * Return true here as the SMMU can always send out coherent
>> +              * requests.
>> +              */
>> +             return true;
>> +     case IOMMU_CAP_NOEXEC:
>> +             return true;
>> +     default:
>> +             return false;
>> +     }
>> +}
>> +
>> +static int qcom_iommu_add_device(struct device *dev)
>> +{
>> +     struct qcom_iommu_dev *qcom_iommu = __to_iommu(dev->iommu_fwspec);
>> +     struct iommu_group *group;
>> +     struct device_link *link;
>> +
>> +     if (!qcom_iommu)
>> +             return -ENODEV;
>> +
>> +     /*
>> +      * Establish the link between iommu and master, so that the
>> +      * iommu gets runtime enabled/disabled as per the master's
>> +      * needs.
>> +      */
>> +     link = device_link_add(dev, qcom_iommu->dev, DL_FLAG_PM_RUNTIME);
>> +     if (!link) {
>> +             dev_err(qcom_iommu->dev, "Unable to create device link between %s and %s\n",
>> +                     dev_name(qcom_iommu->dev), dev_name(dev));
>> +             return -ENODEV;
>> +     }
>> +
>> +     group = iommu_group_get_for_dev(dev);
>> +     if (IS_ERR_OR_NULL(group))
>> +             return PTR_ERR_OR_ZERO(group);
>> +
>> +     iommu_group_put(group);
>> +     iommu_device_link(&qcom_iommu->iommu, dev);
>> +
>> +     return 0;
>> +}
>> +
>> +static void qcom_iommu_remove_device(struct device *dev)
>> +{
>> +     struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
>> +
>> +     if (!qcom_iommu)
>> +             return;
>> +
>> +     iommu_device_unlink(&qcom_iommu->iommu, dev);
>> +     iommu_group_remove_device(dev);
>> +     iommu_fwspec_free(dev);
>> +}
>> +
>> +static struct iommu_group *qcom_iommu_device_group(struct device *dev)
>> +{
>> +     struct iommu_fwspec *fwspec = dev->iommu_fwspec;
>> +     struct iommu_group *group = NULL;
>> +     unsigned i;
>> +
>> +     for (i = 0; i < fwspec->num_ids; i++) {
>> +             struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
>> +
>> +             if (group && ctx->group && group != ctx->group)
>> +                     return ERR_PTR(-EINVAL);
>> +
>> +             group = ctx->group;
>> +     }
>> +
>> +     if (group)
>> +             return iommu_group_ref_get(group);
>> +
>> +     group = generic_device_group(dev);
>> +
>> +     for (i = 0; i < fwspec->num_ids; i++) {
>> +             struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
>> +             ctx->group = iommu_group_ref_get(group);
>> +     }
>> +
>> +     return group;
>> +}
>> +
>> +static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
>> +{
>> +     struct platform_device *iommu_pdev;
>> +
>> +     if (args->args_count != 1) {
>> +             dev_err(dev, "incorrect number of iommu params found for %s "
>> +                     "(found %d, expected 1)\n",
>> +                     args->np->full_name, args->args_count);
>> +             return -EINVAL;
>> +     }
>> +
>> +     if (!dev->iommu_fwspec->iommu_priv) {
>> +             iommu_pdev = of_find_device_by_node(args->np);
>> +             if (WARN_ON(!iommu_pdev))
>> +                     return -EINVAL;
>> +
>> +             dev->iommu_fwspec->iommu_priv = platform_get_drvdata(iommu_pdev);
>> +     }
>> +
>> +     return iommu_fwspec_add_ids(dev, &args->args[0], 1);
>> +}
>> +
>> +static const struct iommu_ops qcom_iommu_ops = {
>> +     .capable        = qcom_iommu_capable,
>> +     .domain_alloc   = qcom_iommu_domain_alloc,
>> +     .domain_free    = qcom_iommu_domain_free,
>> +     .attach_dev     = qcom_iommu_attach_dev,
>> +     .detach_dev     = qcom_iommu_detach_dev,
>> +     .map            = qcom_iommu_map,
>> +     .unmap          = qcom_iommu_unmap,
>> +     .map_sg         = default_iommu_map_sg,
>> +     .iova_to_phys   = qcom_iommu_iova_to_phys,
>> +     .add_device     = qcom_iommu_add_device,
>> +     .remove_device  = qcom_iommu_remove_device,
>> +     .device_group   = qcom_iommu_device_group,
>> +     .of_xlate       = qcom_iommu_of_xlate,
>> +     .pgsize_bitmap  = SZ_4K | SZ_64K | SZ_1M | SZ_16M,
>> +};
>> +
>> +static int qcom_iommu_enable_clocks(struct qcom_iommu_dev *qcom_iommu)
>> +{
>> +     int ret;
>> +
>> +     ret = clk_prepare_enable(qcom_iommu->iface_clk);
>> +     if (ret) {
>> +             dev_err(qcom_iommu->dev, "Couldn't enable iface_clk\n");
>> +             return ret;
>> +     }
>> +
>> +     ret = clk_prepare_enable(qcom_iommu->bus_clk);
>> +     if (ret) {
>> +             dev_err(qcom_iommu->dev, "Couldn't enable bus_clk\n");
>> +             clk_disable_unprepare(qcom_iommu->iface_clk);
>> +             return ret;
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
>> +{
>> +     clk_disable_unprepare(qcom_iommu->bus_clk);
>> +     clk_disable_unprepare(qcom_iommu->iface_clk);
>> +}
>> +
>> +static int qcom_iommu_ctx_probe(struct platform_device *pdev)
>> +{
>> +     struct qcom_iommu_ctx *ctx;
>> +     struct device *dev = &pdev->dev;
>> +     struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(dev->parent);
>> +     struct resource *res;
>> +     int ret;
>> +     u32 reg;
>> +
>> +     ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
>> +     if (!ctx)
>> +             return -ENOMEM;
>> +
>> +     ctx->dev = dev;
>> +     platform_set_drvdata(pdev, ctx);
>> +
>> +     res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +     ctx->base = devm_ioremap_resource(dev, res);
>> +     if (IS_ERR(ctx->base))
>> +             return PTR_ERR(ctx->base);
>> +
>> +     ctx->irq = platform_get_irq(pdev, 0);
>> +     if (ctx->irq < 0) {
>> +             dev_err(dev, "failed to get irq\n");
>> +             return -ENODEV;
>> +     }
>> +
>> +     ret = devm_request_irq(dev, ctx->irq,
>> +                            qcom_iommu_fault,
>> +                            IRQF_SHARED,
>> +                            "qcom-iommu-fault",
>> +                            ctx);
>> +     if (ret) {
>> +             dev_err(dev, "failed to request IRQ %u\n", ctx->irq);
>> +             return ret;
>> +     }
>> +
>> +     /* read the "reg" property directly to get the relative address
>> +      * of the context bank, and calculate the asid from that:
>> +      */
>> +     if (of_property_read_u32_index(dev->of_node, "reg", 0, &reg)) {
>> +             dev_err(dev, "missing reg property\n");
>> +             return -ENODEV;
>> +     }
>> +
>> +     ctx->asid = reg / 0x1000;      /* context banks are 0x1000 apart */
>> +
>> +     dev_dbg(dev, "found asid %u\n", ctx->asid);
>> +
>> +     list_add_tail(&ctx->node, &qcom_iommu->context_list);
>> +
>> +     return 0;
>> +}
>> +
>> +static int qcom_iommu_ctx_remove(struct platform_device *pdev)
>> +{
>> +     struct qcom_iommu_ctx *ctx = platform_get_drvdata(pdev);
>> +
>> +     iommu_group_put(ctx->group);
>> +     platform_set_drvdata(pdev, NULL);
>> +
>> +     list_del(&ctx->node);
>> +
>> +     return 0;
>> +}
>> +
>> +static const struct of_device_id ctx_of_match[] = {
>> +     { .compatible = "qcom,msm-iommu-v1-ns" },
>> +     { .compatible = "qcom,msm-iommu-v1-sec" },
>> +     { /* sentinel */ }
>> +};
>> +
>> +static struct platform_driver qcom_iommu_ctx_driver = {
>> +     .driver = {
>> +             .name           = "qcom-iommu-ctx",
>> +             .of_match_table = of_match_ptr(ctx_of_match),
>> +     },
>> +     .probe  = qcom_iommu_ctx_probe,
>> +     .remove = qcom_iommu_ctx_remove,
>> +};
>> +
>> +static int qcom_iommu_device_probe(struct platform_device *pdev)
>> +{
>> +     struct qcom_iommu_dev *qcom_iommu;
>> +     struct device *dev = &pdev->dev;
>> +     struct resource *res;
>> +     int ret;
>> +
>> +     qcom_iommu = devm_kzalloc(dev, sizeof(*qcom_iommu), GFP_KERNEL);
>> +     if (!qcom_iommu)
>> +             return -ENOMEM;
>> +     qcom_iommu->dev = dev;
>> +
>> +     INIT_LIST_HEAD(&qcom_iommu->context_list);
>> +
>> +     res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +     if (res)
>> +             qcom_iommu->local_base = devm_ioremap_resource(dev, res);
>> +
>> +     qcom_iommu->iface_clk = devm_clk_get(dev, "iface");
>> +     if (IS_ERR(qcom_iommu->iface_clk)) {
>> +             dev_err(dev, "failed to get iface clock\n");
>> +             return PTR_ERR(qcom_iommu->iface_clk);
>> +     }
>> +
>> +     qcom_iommu->bus_clk = devm_clk_get(dev, "bus");
>> +     if (IS_ERR(qcom_iommu->bus_clk)) {
>> +             dev_err(dev, "failed to get bus clock\n");
>> +             return PTR_ERR(qcom_iommu->bus_clk);
>> +     }
>> +
>> +     if (of_property_read_u32(dev->of_node, "qcom,iommu-secure-id",
>> +                              &qcom_iommu->sec_id)) {
>> +             dev_err(dev, "missing qcom,iommu-secure-id property\n");
>> +             return -ENODEV;
>> +     }
>> +
>> +     platform_set_drvdata(pdev, qcom_iommu);
>> +
>> +     /* register context bank devices, which are child nodes: */
>> +     ret = devm_of_platform_populate(dev);
>> +     if (ret) {
>> +             dev_err(dev, "Failed to populate iommu contexts\n");
>> +             return ret;
>> +     }
>> +
>> +     ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,
>> +                                  dev_name(dev));
>> +     if (ret) {
>> +             dev_err(dev, "Failed to register iommu in sysfs\n");
>> +             return ret;
>> +     }
>> +
>> +     iommu_device_set_ops(&qcom_iommu->iommu, &qcom_iommu_ops);
>> +     iommu_device_set_fwnode(&qcom_iommu->iommu, dev->fwnode);
>> +
>> +     ret = iommu_device_register(&qcom_iommu->iommu);
>> +     if (ret) {
>> +             dev_err(dev, "Failed to register iommu\n");
>> +             return ret;
>> +     }
>> +
>> +     pm_runtime_enable(dev);
>> +     bus_set_iommu(&platform_bus_type, &qcom_iommu_ops);
>> +
>> +     if (qcom_iommu->local_base) {
>> +             pm_runtime_get_sync(dev);
>> +             writel_relaxed(0xffffffff, qcom_iommu->local_base + SMMU_INTR_SEL_NS);
>> +             pm_runtime_put_sync(dev);
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +static int qcom_iommu_device_remove(struct platform_device *pdev)
>> +{
>> +     struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
>> +
>> +     bus_set_iommu(&platform_bus_type, NULL);
>> +
>> +     pm_runtime_force_suspend(&pdev->dev);
>> +     platform_set_drvdata(pdev, NULL);
>> +     iommu_device_sysfs_remove(&qcom_iommu->iommu);
>> +     iommu_device_unregister(&qcom_iommu->iommu);
>> +
>> +     return 0;
>> +}
>> +
>> +#ifdef CONFIG_PM
>> +static int qcom_iommu_resume(struct device *dev)
>> +{
>> +     struct platform_device *pdev = to_platform_device(dev);
>> +     struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
>> +
>> +     return qcom_iommu_enable_clocks(qcom_iommu);
>> +}
>> +
>> +static int qcom_iommu_suspend(struct device *dev)
>> +{
>> +     struct platform_device *pdev = to_platform_device(dev);
>> +     struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
>> +
>> +     qcom_iommu_disable_clocks(qcom_iommu);
>> +
>> +     return 0;
>> +}
>> +#endif
>> +
>> +static const struct dev_pm_ops qcom_iommu_pm_ops = {
>> +     SET_RUNTIME_PM_OPS(qcom_iommu_suspend, qcom_iommu_resume, NULL)
>> +     SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
>> +                             pm_runtime_force_resume)
>> +};
>> +
>> +static const struct of_device_id qcom_iommu_of_match[] = {
>> +     { .compatible = "qcom,msm-iommu-v1" },
>> +     { /* sentinel */ }
>> +};
>> +MODULE_DEVICE_TABLE(of, qcom_iommu_of_match);
>> +
>> +static struct platform_driver qcom_iommu_driver = {
>> +     .driver = {
>> +             .name           = "qcom-iommu",
>> +             .of_match_table = of_match_ptr(qcom_iommu_of_match),
>> +             .pm             = &qcom_iommu_pm_ops,
>> +     },
>> +     .probe  = qcom_iommu_device_probe,
>> +     .remove = qcom_iommu_device_remove,
>> +};
>> +
>> +static int __init qcom_iommu_init(void)
>> +{
>> +     int ret;
>> +
>> +     ret = platform_driver_register(&qcom_iommu_ctx_driver);
>> +     if (ret)
>> +             return ret;
>> +
>> +     ret = platform_driver_register(&qcom_iommu_driver);
>> +     if (ret)
>> +             platform_driver_unregister(&qcom_iommu_ctx_driver);
>> +
>> +     return ret;
>> +}
>> +
>> +static void __exit qcom_iommu_exit(void)
>> +{
>> +     platform_driver_unregister(&qcom_iommu_driver);
>> +     platform_driver_unregister(&qcom_iommu_ctx_driver);
>> +}
>> +
>> +module_init(qcom_iommu_init);
>> +module_exit(qcom_iommu_exit);
>> +
>> +IOMMU_OF_DECLARE(qcom_iommu_dev, "qcom,msm-iommu-v1", NULL);
>> +
>> +MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
>> +MODULE_LICENSE("GPL v2");
>>
>
> --
> "QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] iommu: add qcom_iommu
       [not found]     ` <20170509142310.10535-1-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-05-11 15:08       ` Sricharan R
  2017-05-11 16:50         ` Rob Clark
  0 siblings, 1 reply; 24+ messages in thread
From: Sricharan R @ 2017-05-11 15:08 UTC (permalink / raw)
  To: Rob Clark, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: Mark Rutland, Rob Herring, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	Will Deacon, Stanimir Varbanov

Hi Rob,

<snip..>

> +static irqreturn_t qcom_iommu_fault(int irq, void *dev)
> +{
> +	struct qcom_iommu_ctx *ctx = dev;
> +	u32 fsr, fsynr;
> +	unsigned long iova;
> +
> +	fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
> +
> +	if (!(fsr & FSR_FAULT))
> +		return IRQ_NONE;
> +
> +	fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
> +	iova = iommu_readq(ctx, ARM_SMMU_CB_FAR);
> +
> +	dev_err_ratelimited(ctx->dev,
> +			    "Unhandled context fault: fsr=0x%x, "
> +			    "iova=0x%08lx, fsynr=0x%x, cb=%d\n",
> +			    fsr, iova, fsynr, ctx->asid);
> +
> +	iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);

Just wondering if the clocks should be enabled in the fault handler,
to cover faults that happen outside of a master's context.  Global
faults are one case, though those are anyway handled in the secure
world here.  Another case would be the bootloader having used the
iommu and left a fault unhandled, so that we take the fault the
moment we enable the ctx.  At least the downstream driver seems to
enable the clocks explicitly in its fault handler.
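
Roughly along these lines (an untested sketch: it assumes the clocks
are left prepare()d at probe so that clk_enable()/clk_disable() are
safe in hard-irq context, and that the parent qcom_iommu_dev is
reachable via the ctx device's parent drvdata, as in
qcom_iommu_ctx_probe()):

static irqreturn_t qcom_iommu_fault(int irq, void *dev)
{
	struct qcom_iommu_ctx *ctx = dev;
	struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(ctx->dev->parent);
	irqreturn_t ret = IRQ_NONE;
	u32 fsr;

	/* bracket all register accesses with the clocks enabled */
	clk_enable(qcom_iommu->iface_clk);
	clk_enable(qcom_iommu->bus_clk);

	fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
	if (fsr & FSR_FAULT) {
		/* ... existing fault reporting ... */
		iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
		ret = IRQ_HANDLED;
	}

	clk_disable(qcom_iommu->bus_clk);
	clk_disable(qcom_iommu->iface_clk);

	return ret;
}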

Regards,
 Sricharan


> +
> +	return IRQ_HANDLED;
> +}
> +
> +static int qcom_iommu_init_domain(struct iommu_domain *domain,
> +				  struct qcom_iommu_dev *qcom_iommu,
> +				  struct iommu_fwspec *fwspec)
> +{
> +	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +	struct io_pgtable_ops *pgtbl_ops;
> +	struct io_pgtable_cfg pgtbl_cfg;
> +	int i, ret = 0;
> +	u32 reg;
> +
> +	mutex_lock(&qcom_domain->init_mutex);
> +	if (qcom_domain->iommu)
> +		goto out_unlock;
> +
> +	pgtbl_cfg = (struct io_pgtable_cfg) {
> +		.pgsize_bitmap	= qcom_iommu_ops.pgsize_bitmap,
> +		.ias		= 32,
> +		.oas		= 40,
> +		.tlb		= &qcom_gather_ops,
> +		.iommu_dev	= qcom_iommu->dev,
> +	};
> +
> +	qcom_domain->iommu = qcom_iommu;
> +	pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, fwspec);
> +	if (!pgtbl_ops) {
> +		dev_err(qcom_iommu->dev, "failed to allocate pagetable ops\n");
> +		ret = -ENOMEM;
> +		goto out_clear_iommu;
> +	}
> +
> +	/* Update the domain's page sizes to reflect the page table format */
> +	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> +	domain->geometry.aperture_end = (1ULL << pgtbl_cfg.ias) - 1;
> +	domain->geometry.force_aperture = true;
> +
> +	for (i = 0; i < fwspec->num_ids; i++) {
> +		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +
> +		if (!ctx->secure_init) {
> +			ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, ctx->asid);
> +			if (ret) {
> +				dev_err(qcom_iommu->dev, "secure init failed: %d\n", ret);
> +				goto out_clear_iommu;
> +			}
> +			ctx->secure_init = true;
> +		}
> +
> +		/* TTBRs */
> +		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
> +				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
> +				((u64)ctx->asid << TTBRn_ASID_SHIFT));
> +		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
> +				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
> +				((u64)ctx->asid << TTBRn_ASID_SHIFT));
> +
> +		/* TTBCR */
> +		iommu_writel(ctx, ARM_SMMU_CB_TTBCR2,
> +				(pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
> +				TTBCR2_SEP_UPSTREAM);
> +		iommu_writel(ctx, ARM_SMMU_CB_TTBCR,
> +				pgtbl_cfg.arm_lpae_s1_cfg.tcr);
> +
> +		/* MAIRs (stage-1 only) */
> +		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
> +				pgtbl_cfg.arm_lpae_s1_cfg.mair[0]);
> +		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR1,
> +				pgtbl_cfg.arm_lpae_s1_cfg.mair[1]);
> +
> +		/* SCTLR */
> +		reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
> +			SCTLR_M | SCTLR_S1_ASIDPNE;
> +
> +		if (IS_ENABLED(CONFIG_BIG_ENDIAN))
> +			reg |= SCTLR_E;
> +
> +		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
> +	}
> +
> +	mutex_unlock(&qcom_domain->init_mutex);
> +
> +	/* Publish page table ops for map/unmap */
> +	qcom_domain->pgtbl_ops = pgtbl_ops;
> +
> +	return 0;
> +
> +out_clear_iommu:
> +	qcom_domain->iommu = NULL;
> +out_unlock:
> +	mutex_unlock(&qcom_domain->init_mutex);
> +	return ret;
> +}
> +
> +static struct iommu_domain *qcom_iommu_domain_alloc(unsigned type)
> +{
> +	struct qcom_iommu_domain *qcom_domain;
> +
> +	if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
> +		return NULL;
> +	/*
> +	 * Allocate the domain and initialise some of its data structures.
> +	 * We can't really do anything meaningful until we've added a
> +	 * master.
> +	 */
> +	qcom_domain = kzalloc(sizeof(*qcom_domain), GFP_KERNEL);
> +	if (!qcom_domain)
> +		return NULL;
> +
> +	if (type == IOMMU_DOMAIN_DMA &&
> +	    iommu_get_dma_cookie(&qcom_domain->domain)) {
> +		kfree(qcom_domain);
> +		return NULL;
> +	}
> +
> +	mutex_init(&qcom_domain->init_mutex);
> +	spin_lock_init(&qcom_domain->pgtbl_lock);
> +
> +	return &qcom_domain->domain;
> +}
> +
> +static void qcom_iommu_domain_free(struct iommu_domain *domain)
> +{
> +	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +
> +	if (WARN_ON(qcom_domain->iommu))    /* forgot to detach? */
> +		return;
> +
> +	iommu_put_dma_cookie(domain);
> +
> +	free_io_pgtable_ops(qcom_domain->pgtbl_ops);
> +
> +	kfree(qcom_domain);
> +}
> +
> +static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
> +{
> +	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
> +	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +	int ret;
> +
> +	if (!qcom_iommu) {
> +		dev_err(dev, "cannot attach to IOMMU, is it on the same bus?\n");
> +		return -ENXIO;
> +	}
> +
> +	/* Ensure that the domain is finalized */
> +	pm_runtime_get_sync(qcom_iommu->dev);
> +	ret = qcom_iommu_init_domain(domain, qcom_iommu, dev->iommu_fwspec);
> +	pm_runtime_put_sync(qcom_iommu->dev);
> +	if (ret < 0)
> +		return ret;
> +
> +	/*
> +	 * Sanity check the domain. We don't support domains across
> +	 * different IOMMUs.
> +	 */
> +	if (qcom_domain->iommu != qcom_iommu) {
> +		dev_err(dev, "cannot attach to IOMMU %s while already "
> +			"attached to domain on IOMMU %s\n",
> +			dev_name(qcom_domain->iommu->dev),
> +			dev_name(qcom_iommu->dev));
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *dev)
> +{
> +	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
> +	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
> +	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +	unsigned i;
> +
> +	if (!qcom_domain->iommu)
> +		return;
> +
> +	pm_runtime_get_sync(qcom_iommu->dev);
> +	for (i = 0; i < fwspec->num_ids; i++) {
> +		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +
> +		/* Disable the context bank: */
> +		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, 0);
> +	}
> +	pm_runtime_put_sync(qcom_iommu->dev);
> +
> +	qcom_domain->iommu = NULL;
> +}
> +
> +static int qcom_iommu_map(struct iommu_domain *domain, unsigned long iova,
> +			  phys_addr_t paddr, size_t size, int prot)
> +{
> +	int ret;
> +	unsigned long flags;
> +	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
> +
> +	if (!ops)
> +		return -ENODEV;
> +
> +	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
> +	ret = ops->map(ops, iova, paddr, size, prot);
> +	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
> +	return ret;
> +}
> +
> +static size_t qcom_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
> +			       size_t size)
> +{
> +	size_t ret;
> +	unsigned long flags;
> +	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
> +
> +	if (!ops)
> +		return 0;
> +
> +	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
> +	ret = ops->unmap(ops, iova, size);
> +	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
> +	return ret;
> +}
> +
> +static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
> +					   dma_addr_t iova)
> +{
> +	phys_addr_t ret;
> +	unsigned long flags;
> +	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
> +
> +	if (!ops)
> +		return 0;
> +
> +	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
> +	ret = ops->iova_to_phys(ops, iova);
> +	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
> +
> +	return ret;
> +}
> +
> +static bool qcom_iommu_capable(enum iommu_cap cap)
> +{
> +	switch (cap) {
> +	case IOMMU_CAP_CACHE_COHERENCY:
> +		/*
> +		 * Return true here as the SMMU can always send out coherent
> +		 * requests.
> +		 */
> +		return true;
> +	case IOMMU_CAP_NOEXEC:
> +		return true;
> +	default:
> +		return false;
> +	}
> +}
> +
> +static int qcom_iommu_add_device(struct device *dev)
> +{
> +	struct qcom_iommu_dev *qcom_iommu = __to_iommu(dev->iommu_fwspec);
> +	struct iommu_group *group;
> +	struct device_link *link;
> +
> +	if (!qcom_iommu)
> +		return -ENODEV;
> +
> +	/*
> +	 * Establish the link between iommu and master, so that the
> +	 * iommu gets runtime enabled/disabled as per the master's
> +	 * needs.
> +	 */
> +	link = device_link_add(dev, qcom_iommu->dev, DL_FLAG_PM_RUNTIME);
> +	if (!link) {
> +		dev_err(qcom_iommu->dev, "Unable to create device link between %s and %s\n",
> +			dev_name(qcom_iommu->dev), dev_name(dev));
> +		return -ENODEV;
> +	}
> +
> +	group = iommu_group_get_for_dev(dev);
> +	if (IS_ERR_OR_NULL(group))
> +		return PTR_ERR_OR_ZERO(group);
> +
> +	iommu_group_put(group);
> +	iommu_device_link(&qcom_iommu->iommu, dev);
> +
> +	return 0;
> +}
> +
> +static void qcom_iommu_remove_device(struct device *dev)
> +{
> +	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
> +
> +	if (!qcom_iommu)
> +		return;
> +
> +	iommu_device_unlink(&qcom_iommu->iommu, dev);
> +	iommu_group_remove_device(dev);
> +	iommu_fwspec_free(dev);
> +}
> +
> +static struct iommu_group *qcom_iommu_device_group(struct device *dev)
> +{
> +	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
> +	struct iommu_group *group = NULL;
> +	unsigned i;
> +
> +	for (i = 0; i < fwspec->num_ids; i++) {
> +		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +
> +		if (group && ctx->group && group != ctx->group)
> +			return ERR_PTR(-EINVAL);
> +
> +		group = ctx->group;
> +	}
> +
> +	if (group)
> +		return iommu_group_ref_get(group);
> +
> +	group = generic_device_group(dev);
> +
> +	for (i = 0; i < fwspec->num_ids; i++) {
> +		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +		ctx->group = iommu_group_ref_get(group);
> +	}
> +
> +	return group;
> +}
> +
> +static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
> +{
> +	struct platform_device *iommu_pdev;
> +
> +	if (args->args_count != 1) {
> +		dev_err(dev, "incorrect number of iommu params found for %s "
> +			"(found %d, expected 1)\n",
> +			args->np->full_name, args->args_count);
> +		return -EINVAL;
> +	}
> +
> +	if (!dev->iommu_fwspec->iommu_priv) {
> +		iommu_pdev = of_find_device_by_node(args->np);
> +		if (WARN_ON(!iommu_pdev))
> +			return -EINVAL;
> +
> +		dev->iommu_fwspec->iommu_priv = platform_get_drvdata(iommu_pdev);
> +	}
> +
> +	return iommu_fwspec_add_ids(dev, &args->args[0], 1);
> +}
> +
> +static const struct iommu_ops qcom_iommu_ops = {
> +	.capable	= qcom_iommu_capable,
> +	.domain_alloc	= qcom_iommu_domain_alloc,
> +	.domain_free	= qcom_iommu_domain_free,
> +	.attach_dev	= qcom_iommu_attach_dev,
> +	.detach_dev	= qcom_iommu_detach_dev,
> +	.map		= qcom_iommu_map,
> +	.unmap		= qcom_iommu_unmap,
> +	.map_sg		= default_iommu_map_sg,
> +	.iova_to_phys	= qcom_iommu_iova_to_phys,
> +	.add_device	= qcom_iommu_add_device,
> +	.remove_device	= qcom_iommu_remove_device,
> +	.device_group	= qcom_iommu_device_group,
> +	.of_xlate	= qcom_iommu_of_xlate,
> +	.pgsize_bitmap	= SZ_4K | SZ_64K | SZ_1M | SZ_16M,
> +};
> +
> +static int qcom_iommu_enable_clocks(struct qcom_iommu_dev *qcom_iommu)
> +{
> +	int ret;
> +
> +	ret = clk_prepare_enable(qcom_iommu->iface_clk);
> +	if (ret) {
> +		dev_err(qcom_iommu->dev, "Couldn't enable iface_clk\n");
> +		return ret;
> +	}
> +
> +	ret = clk_prepare_enable(qcom_iommu->bus_clk);
> +	if (ret) {
> +		dev_err(qcom_iommu->dev, "Couldn't enable bus_clk\n");
> +		clk_disable_unprepare(qcom_iommu->iface_clk);
> +		return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
> +{
> +	clk_disable_unprepare(qcom_iommu->bus_clk);
> +	clk_disable_unprepare(qcom_iommu->iface_clk);
> +}
> +
> +static int qcom_iommu_ctx_probe(struct platform_device *pdev)
> +{
> +	struct qcom_iommu_ctx *ctx;
> +	struct device *dev = &pdev->dev;
> +	struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(dev->parent);
> +	struct resource *res;
> +	int ret;
> +	u32 reg;
> +
> +	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
> +	if (!ctx)
> +		return -ENOMEM;
> +
> +	ctx->dev = dev;
> +	platform_set_drvdata(pdev, ctx);
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	ctx->base = devm_ioremap_resource(dev, res);
> +	if (IS_ERR(ctx->base))
> +		return PTR_ERR(ctx->base);
> +
> +	ctx->irq = platform_get_irq(pdev, 0);
> +	if (ctx->irq < 0) {
> +		dev_err(dev, "failed to get irq\n");
> +		return -ENODEV;
> +	}
> +
> +	ret = devm_request_irq(dev, ctx->irq,
> +			       qcom_iommu_fault,
> +			       IRQF_SHARED,
> +			       "qcom-iommu-fault",
> +			       ctx);
> +	if (ret) {
> +		dev_err(dev, "failed to request IRQ %u\n", ctx->irq);
> +		return ret;
> +	}
> +
> +	/* read the "reg" property directly to get the relative address
> +	 * of the context bank, and calculate the asid from that:
> +	 */
> +	if (of_property_read_u32_index(dev->of_node, "reg", 0, &reg)) {
> +		dev_err(dev, "missing reg property\n");
> +		return -ENODEV;
> +	}
> +
> +	ctx->asid = reg / 0x1000;      /* context banks are 0x1000 apart */
> +
> +	dev_dbg(dev, "found asid %u\n", ctx->asid);
> +
> +	list_add_tail(&ctx->node, &qcom_iommu->context_list);
> +
> +	return 0;
> +}
> +
> +static int qcom_iommu_ctx_remove(struct platform_device *pdev)
> +{
> +	struct qcom_iommu_ctx *ctx = platform_get_drvdata(pdev);
> +
> +	iommu_group_put(ctx->group);
> +	platform_set_drvdata(pdev, NULL);
> +
> +	list_del(&ctx->node);
> +
> +	return 0;
> +}
> +
> +static const struct of_device_id ctx_of_match[] = {
> +	{ .compatible = "qcom,msm-iommu-v1-ns" },
> +	{ .compatible = "qcom,msm-iommu-v1-sec" },
> +	{ /* sentinel */ }
> +};
> +
> +static struct platform_driver qcom_iommu_ctx_driver = {
> +	.driver	= {
> +		.name		= "qcom-iommu-ctx",
> +		.of_match_table	= of_match_ptr(ctx_of_match),
> +	},
> +	.probe	= qcom_iommu_ctx_probe,
> +	.remove = qcom_iommu_ctx_remove,
> +};
> +
> +static int qcom_iommu_device_probe(struct platform_device *pdev)
> +{
> +	struct qcom_iommu_dev *qcom_iommu;
> +	struct device *dev = &pdev->dev;
> +	struct resource *res;
> +	int ret;
> +
> +	qcom_iommu = devm_kzalloc(dev, sizeof(*qcom_iommu), GFP_KERNEL);
> +	if (!qcom_iommu)
> +		return -ENOMEM;
> +	qcom_iommu->dev = dev;
> +
> +	INIT_LIST_HEAD(&qcom_iommu->context_list);
> +
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	if (res)
> +		qcom_iommu->local_base = devm_ioremap_resource(dev, res);
> +
> +	qcom_iommu->iface_clk = devm_clk_get(dev, "iface");
> +	if (IS_ERR(qcom_iommu->iface_clk)) {
> +		dev_err(dev, "failed to get iface clock\n");
> +		return PTR_ERR(qcom_iommu->iface_clk);
> +	}
> +
> +	qcom_iommu->bus_clk = devm_clk_get(dev, "bus");
> +	if (IS_ERR(qcom_iommu->bus_clk)) {
> +		dev_err(dev, "failed to get bus clock\n");
> +		return PTR_ERR(qcom_iommu->bus_clk);
> +	}
> +
> +	if (of_property_read_u32(dev->of_node, "qcom,iommu-secure-id",
> +				 &qcom_iommu->sec_id)) {
> +		dev_err(dev, "missing qcom,iommu-secure-id property\n");
> +		return -ENODEV;
> +	}
> +
> +	platform_set_drvdata(pdev, qcom_iommu);
> +
> +	/* register context bank devices, which are child nodes: */
> +	ret = devm_of_platform_populate(dev);
> +	if (ret) {
> +		dev_err(dev, "Failed to populate iommu contexts\n");
> +		return ret;
> +	}
> +
> +	ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,
> +				     dev_name(dev));
> +	if (ret) {
> +		dev_err(dev, "Failed to register iommu in sysfs\n");
> +		return ret;
> +	}
> +
> +	iommu_device_set_ops(&qcom_iommu->iommu, &qcom_iommu_ops);
> +	iommu_device_set_fwnode(&qcom_iommu->iommu, dev->fwnode);
> +
> +	ret = iommu_device_register(&qcom_iommu->iommu);
> +	if (ret) {
> +		dev_err(dev, "Failed to register iommu\n");
> +		return ret;
> +	}
> +
> +	pm_runtime_enable(dev);
> +	bus_set_iommu(&platform_bus_type, &qcom_iommu_ops);
> +
> +	if (qcom_iommu->local_base) {
> +		pm_runtime_get_sync(dev);
> +		writel_relaxed(0xffffffff, qcom_iommu->local_base + SMMU_INTR_SEL_NS);
> +		pm_runtime_put_sync(dev);
> +	}
> +
> +	return 0;
> +}
> +
> +static int qcom_iommu_device_remove(struct platform_device *pdev)
> +{
> +	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
> +
> +	bus_set_iommu(&platform_bus_type, NULL);
> +
> +	pm_runtime_force_suspend(&pdev->dev);
> +	platform_set_drvdata(pdev, NULL);
> +	iommu_device_sysfs_remove(&qcom_iommu->iommu);
> +	iommu_device_unregister(&qcom_iommu->iommu);
> +
> +	return 0;
> +}
> +
> +#ifdef CONFIG_PM
> +static int qcom_iommu_resume(struct device *dev)
> +{
> +	struct platform_device *pdev = to_platform_device(dev);
> +	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
> +
> +	return qcom_iommu_enable_clocks(qcom_iommu);
> +}
> +
> +static int qcom_iommu_suspend(struct device *dev)
> +{
> +	struct platform_device *pdev = to_platform_device(dev);
> +	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
> +
> +	qcom_iommu_disable_clocks(qcom_iommu);
> +
> +	return 0;
> +}
> +#endif
> +
> +static const struct dev_pm_ops qcom_iommu_pm_ops = {
> +	SET_RUNTIME_PM_OPS(qcom_iommu_suspend, qcom_iommu_resume, NULL)
> +	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
> +				pm_runtime_force_resume)
> +};
> +
> +static const struct of_device_id qcom_iommu_of_match[] = {
> +	{ .compatible = "qcom,msm-iommu-v1" },
> +	{ /* sentinel */ }
> +};
> +MODULE_DEVICE_TABLE(of, qcom_iommu_of_match);
> +
> +static struct platform_driver qcom_iommu_driver = {
> +	.driver	= {
> +		.name		= "qcom-iommu",
> +		.of_match_table	= of_match_ptr(qcom_iommu_of_match),
> +		.pm		= &qcom_iommu_pm_ops,
> +	},
> +	.probe	= qcom_iommu_device_probe,
> +	.remove	= qcom_iommu_device_remove,
> +};
> +
> +static int __init qcom_iommu_init(void)
> +{
> +	int ret;
> +
> +	ret = platform_driver_register(&qcom_iommu_ctx_driver);
> +	if (ret)
> +		return ret;
> +
> +	ret = platform_driver_register(&qcom_iommu_driver);
> +	if (ret)
> +		platform_driver_unregister(&qcom_iommu_ctx_driver);
> +
> +	return ret;
> +}
> +
> +static void __exit qcom_iommu_exit(void)
> +{
> +	platform_driver_unregister(&qcom_iommu_driver);
> +	platform_driver_unregister(&qcom_iommu_ctx_driver);
> +}
> +
> +module_init(qcom_iommu_init);
> +module_exit(qcom_iommu_exit);
> +
> +IOMMU_OF_DECLARE(qcom_iommu_dev, "qcom,msm-iommu-v1", NULL);
> +
> +MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
> +MODULE_LICENSE("GPL v2");
> 

-- 
"QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 3/4] iommu: add qcom_iommu
       [not found] ` <20170505182151.22931-1-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-05-09 14:23   ` Rob Clark
       [not found]     ` <20170509142310.10535-1-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Rob Clark @ 2017-05-09 14:23 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: Mark Rutland, Rob Herring, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	Will Deacon, Stanimir Varbanov

An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have a context-bank register
layout similar to that of the ARM SMMU, but no global register space (or
at least not one that is accessible).

Signed-off-by: Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
v1: original
v2: bindings cleanups and kconfig issues that kbuild robot pointed out
v3: fix issues pointed out by Rob H. and actually make device removal
    work
v4: fix WARN_ON() splats reported by Archit
v5: some fixes to build as a module.. note that it cannot actually
    be built as a module yet (at minimum a bunch of other iommu syms
    that are needed are not exported, but there may be more to it
    than that), but at least qcom_iommu is ready should it become
    possible to build iommu drivers as modules.

 drivers/iommu/Kconfig      |  10 +
 drivers/iommu/Makefile     |   1 +
 drivers/iommu/qcom_iommu.c | 855 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 866 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 37e204f..55d68c9 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -359,4 +359,14 @@ config MTK_IOMMU_V1
 
 	  if unsure, say N here.
 
+config QCOM_IOMMU
+	# Note: iommu drivers cannot (yet?) be built as modules
+	bool "Qualcomm IOMMU Support"
+	depends on ARCH_QCOM || COMPILE_TEST
+	select IOMMU_API
+	select IOMMU_IO_PGTABLE_LPAE
+	select ARM_DMA_USE_IOMMU
+	help
+	  Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 0000000..85fe364
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,855 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include <linux/atomic.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/dma-iommu.h>
+#include <linux/dma-mapping.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/io-64-nonatomic-hi-lo.h>
+#include <linux/iommu.h>
+#include <linux/iopoll.h>
+#include <linux/kconfig.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/of_iommu.h>
+#include <linux/platform_device.h>
+#include <linux/pm.h>
+#include <linux/pm_runtime.h>
+#include <linux/qcom_scm.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS     0x2000
+
+struct qcom_iommu_dev {
+	/* IOMMU core code handle */
+	struct iommu_device	 iommu;
+	struct device		*dev;
+	struct clk		*iface_clk;
+	struct clk		*bus_clk;
+	void __iomem		*local_base;
+	u32			 sec_id;
+	struct list_head	 context_list;   /* list of qcom_iommu_context */
+};
+
+struct qcom_iommu_ctx {
+	struct device		*dev;
+	void __iomem		*base;
+	unsigned int		 irq;
+	bool			 secure_init;
+	u32			 asid;      /* asid and ctx bank # are 1:1 */
+	struct iommu_group	*group;
+	struct list_head	 node;      /* head in qcom_iommu_device::context_list */
+};
+
+struct qcom_iommu_domain {
+	struct io_pgtable_ops	*pgtbl_ops;
+	spinlock_t		 pgtbl_lock;
+	struct mutex		 init_mutex; /* Protects iommu pointer */
+	struct iommu_domain	 domain;
+	struct qcom_iommu_dev	*iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev * __to_iommu(struct iommu_fwspec *fwspec)
+{
+	if (!fwspec || fwspec->ops != &qcom_iommu_ops)
+		return NULL;
+	return fwspec->iommu_priv;
+}
+
+static struct qcom_iommu_dev * to_iommu(struct iommu_fwspec *fwspec)
+{
+	struct qcom_iommu_dev *qcom_iommu = __to_iommu(fwspec);
+	WARN_ON(!qcom_iommu);
+	return qcom_iommu;
+}
+
+static struct qcom_iommu_ctx * to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	struct qcom_iommu_ctx *ctx;
+
+	if (!qcom_iommu)
+		return NULL;
+
+	list_for_each_entry(ctx, &qcom_iommu->context_list, node)
+		if (ctx->asid == asid)
+			return ctx;
+
+	WARN(1, "no ctx for asid %u\n", asid);
+	return NULL;
+}
+
+static inline void
+iommu_writel(struct qcom_iommu_ctx *ctx, unsigned reg, u32 val)
+{
+	writel_relaxed(val, ctx->base + reg);
+}
+
+static inline void
+iommu_writeq(struct qcom_iommu_ctx *ctx, unsigned reg, u64 val)
+{
+	writeq_relaxed(val, ctx->base + reg);
+}
+
+static inline u32
+iommu_readl(struct qcom_iommu_ctx *ctx, unsigned reg)
+{
+	return readl_relaxed(ctx->base + reg);
+}
+
+static inline u64
+iommu_readq(struct qcom_iommu_ctx *ctx, unsigned reg)
+{
+	return readq_relaxed(ctx->base + reg);
+}
+
+static void __sync_tlb(struct qcom_iommu_ctx *ctx)
+{
+	unsigned int val;
+	unsigned int ret;
+
+	iommu_writel(ctx, ARM_SMMU_CB_TLBSYNC, 0);
+
+	ret = readl_poll_timeout(ctx->base + ARM_SMMU_CB_TLBSTATUS, val,
+				 (val & 0x1) == 0, 0, 5000000);
+	if (ret)
+		dev_err(ctx->dev, "timeout waiting for TLB SYNC\n");
+}
+
+static void qcom_iommu_tlb_sync(void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++)
+		__sync_tlb(to_ctx(fwspec, fwspec->ids[i]));
+}
+
+static void qcom_iommu_tlb_inv_context(void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		iommu_writel(ctx, ARM_SMMU_CB_S1_TLBIASID, ctx->asid);
+		__sync_tlb(ctx);
+	}
+}
+
+static void qcom_iommu_tlb_inv_range_nosync(unsigned long iova, size_t size,
+					    size_t granule, bool leaf, void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i, reg;
+
+	reg = leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		size_t s = size;
+
+		iova &= ~12UL;
+		iova |= ctx->asid;
+		do {
+			iommu_writel(ctx, reg, iova);
+			iova += granule;
+		} while (s -= granule);
+	}
+}
+
+static const struct iommu_gather_ops qcom_gather_ops = {
+	.tlb_flush_all	= qcom_iommu_tlb_inv_context,
+	.tlb_add_flush	= qcom_iommu_tlb_inv_range_nosync,
+	.tlb_sync	= qcom_iommu_tlb_sync,
+};
+
+static irqreturn_t qcom_iommu_fault(int irq, void *dev)
+{
+	struct qcom_iommu_ctx *ctx = dev;
+	u32 fsr, fsynr;
+	unsigned long iova;
+
+	fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
+
+	if (!(fsr & FSR_FAULT))
+		return IRQ_NONE;
+
+	fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
+	iova = iommu_readq(ctx, ARM_SMMU_CB_FAR);
+
+	dev_err_ratelimited(ctx->dev,
+			    "Unhandled context fault: fsr=0x%x, "
+			    "iova=0x%08lx, fsynr=0x%x, cb=%d\n",
+			    fsr, iova, fsynr, ctx->asid);
+
+	iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
+
+	return IRQ_HANDLED;
+}
+
+static int qcom_iommu_init_domain(struct iommu_domain *domain,
+				  struct qcom_iommu_dev *qcom_iommu,
+				  struct iommu_fwspec *fwspec)
+{
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *pgtbl_ops;
+	struct io_pgtable_cfg pgtbl_cfg;
+	int i, ret = 0;
+	u32 reg;
+
+	mutex_lock(&qcom_domain->init_mutex);
+	if (qcom_domain->iommu)
+		goto out_unlock;
+
+	pgtbl_cfg = (struct io_pgtable_cfg) {
+		.pgsize_bitmap	= qcom_iommu_ops.pgsize_bitmap,
+		.ias		= 32,
+		.oas		= 40,
+		.tlb		= &qcom_gather_ops,
+		.iommu_dev	= qcom_iommu->dev,
+	};
+
+	qcom_domain->iommu = qcom_iommu;
+	pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, fwspec);
+	if (!pgtbl_ops) {
+		dev_err(qcom_iommu->dev, "failed to allocate pagetable ops\n");
+		ret = -ENOMEM;
+		goto out_clear_iommu;
+	}
+
+	/* Update the domain's page sizes to reflect the page table format */
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	domain->geometry.aperture_end = (1ULL << pgtbl_cfg.ias) - 1;
+	domain->geometry.force_aperture = true;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		if (!ctx->secure_init) {
+			ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, ctx->asid);
+			if (ret) {
+				dev_err(qcom_iommu->dev, "secure init failed: %d\n", ret);
+				goto out_clear_iommu;
+			}
+			ctx->secure_init = true;
+		}
+
+		/* TTBRs */
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
+				((u64)ctx->asid << TTBRn_ASID_SHIFT));
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
+				((u64)ctx->asid << TTBRn_ASID_SHIFT));
+
+		/* TTBCR */
+		iommu_writel(ctx, ARM_SMMU_CB_TTBCR2,
+				(pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
+				TTBCR2_SEP_UPSTREAM);
+		iommu_writel(ctx, ARM_SMMU_CB_TTBCR,
+				pgtbl_cfg.arm_lpae_s1_cfg.tcr);
+
+		/* MAIRs (stage-1 only) */
+		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
+				pgtbl_cfg.arm_lpae_s1_cfg.mair[0]);
+		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR1,
+				pgtbl_cfg.arm_lpae_s1_cfg.mair[1]);
+
+		/* SCTLR */
+		reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
+			SCTLR_M | SCTLR_S1_ASIDPNE;
+
+		if (IS_ENABLED(CONFIG_BIG_ENDIAN))
+			reg |= SCTLR_E;
+
+		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
+	}
+
+	mutex_unlock(&qcom_domain->init_mutex);
+
+	/* Publish page table ops for map/unmap */
+	qcom_domain->pgtbl_ops = pgtbl_ops;
+
+	return 0;
+
+out_clear_iommu:
+	qcom_domain->iommu = NULL;
+out_unlock:
+	mutex_unlock(&qcom_domain->init_mutex);
+	return ret;
+}
+
+static struct iommu_domain *qcom_iommu_domain_alloc(unsigned type)
+{
+	struct qcom_iommu_domain *qcom_domain;
+
+	if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
+		return NULL;
+	/*
+	 * Allocate the domain and initialise some of its data structures.
+	 * We can't really do anything meaningful until we've added a
+	 * master.
+	 */
+	qcom_domain = kzalloc(sizeof(*qcom_domain), GFP_KERNEL);
+	if (!qcom_domain)
+		return NULL;
+
+	if (type == IOMMU_DOMAIN_DMA &&
+	    iommu_get_dma_cookie(&qcom_domain->domain)) {
+		kfree(qcom_domain);
+		return NULL;
+	}
+
+	mutex_init(&qcom_domain->init_mutex);
+	spin_lock_init(&qcom_domain->pgtbl_lock);
+
+	return &qcom_domain->domain;
+}
+
+static void qcom_iommu_domain_free(struct iommu_domain *domain)
+{
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+
+	if (WARN_ON(qcom_domain->iommu))    /* forgot to detach? */
+		return;
+
+	iommu_put_dma_cookie(domain);
+
+	free_io_pgtable_ops(qcom_domain->pgtbl_ops);
+
+	kfree(qcom_domain);
+}
+
+static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	int ret;
+
+	if (!qcom_iommu) {
+		dev_err(dev, "cannot attach to IOMMU, is it on the same bus?\n");
+		return -ENXIO;
+	}
+
+	/* Ensure that the domain is finalized */
+	pm_runtime_get_sync(qcom_iommu->dev);
+	ret = qcom_iommu_init_domain(domain, qcom_iommu, dev->iommu_fwspec);
+	pm_runtime_put_sync(qcom_iommu->dev);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Sanity check the domain. We don't support domains across
+	 * different IOMMUs.
+	 */
+	if (qcom_domain->iommu != qcom_iommu) {
+		dev_err(dev, "cannot attach to IOMMU %s while already "
+			"attached to domain on IOMMU %s\n",
+			dev_name(qcom_domain->iommu->dev),
+			dev_name(qcom_iommu->dev));
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	unsigned i;
+
+	if (!qcom_domain->iommu)
+		return;
+
+	pm_runtime_get_sync(qcom_iommu->dev);
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		/* Disable the context bank: */
+		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, 0);
+	}
+	pm_runtime_put_sync(qcom_iommu->dev);
+
+	qcom_domain->iommu = NULL;
+}
+
+static int qcom_iommu_map(struct iommu_domain *domain, unsigned long iova,
+			  phys_addr_t paddr, size_t size, int prot)
+{
+	int ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return -ENODEV;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->map(ops, iova, paddr, size, prot);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+	return ret;
+}
+
+static size_t qcom_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
+			       size_t size)
+{
+	size_t ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->unmap(ops, iova, size);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+	return ret;
+}
+
+static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
+					   dma_addr_t iova)
+{
+	phys_addr_t ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->iova_to_phys(ops, iova);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+
+	return ret;
+}
+
+static bool qcom_iommu_capable(enum iommu_cap cap)
+{
+	switch (cap) {
+	case IOMMU_CAP_CACHE_COHERENCY:
+		/*
+		 * Return true here as the SMMU can always send out coherent
+		 * requests.
+		 */
+		return true;
+	case IOMMU_CAP_NOEXEC:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static int qcom_iommu_add_device(struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = __to_iommu(dev->iommu_fwspec);
+	struct iommu_group *group;
+	struct device_link *link;
+
+	if (!qcom_iommu)
+		return -ENODEV;
+
+	/*
+	 * Establish the link between iommu and master, so that the
+	 * iommu gets runtime enabled/disabled as per the master's
+	 * needs.
+	 */
+	link = device_link_add(dev, qcom_iommu->dev, DL_FLAG_PM_RUNTIME);
+	if (!link) {
+		dev_err(qcom_iommu->dev, "Unable to create device link between %s and %s\n",
+			dev_name(qcom_iommu->dev), dev_name(dev));
+		return -ENODEV;
+	}
+
+	group = iommu_group_get_for_dev(dev);
+	if (IS_ERR_OR_NULL(group))
+		return PTR_ERR_OR_ZERO(group);
+
+	iommu_group_put(group);
+	iommu_device_link(&qcom_iommu->iommu, dev);
+
+	return 0;
+}
+
+static void qcom_iommu_remove_device(struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+
+	if (!qcom_iommu)
+		return;
+
+	iommu_device_unlink(&qcom_iommu->iommu, dev);
+	iommu_group_remove_device(dev);
+	iommu_fwspec_free(dev);
+}
+
+static struct iommu_group *qcom_iommu_device_group(struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_group *group = NULL;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		if (group && ctx->group && group != ctx->group)
+			return ERR_PTR(-EINVAL);
+
+		group = ctx->group;
+	}
+
+	if (group)
+		return iommu_group_ref_get(group);
+
+	group = generic_device_group(dev);
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		ctx->group = iommu_group_ref_get(group);
+	}
+
+	return group;
+}
+
+static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
+{
+	struct platform_device *iommu_pdev;
+
+	if (args->args_count != 1) {
+		dev_err(dev, "incorrect number of iommu params found for %s "
+			"(found %d, expected 1)\n",
+			args->np->full_name, args->args_count);
+		return -EINVAL;
+	}
+
+	if (!dev->iommu_fwspec->iommu_priv) {
+		iommu_pdev = of_find_device_by_node(args->np);
+		if (WARN_ON(!iommu_pdev))
+			return -EINVAL;
+
+		dev->iommu_fwspec->iommu_priv = platform_get_drvdata(iommu_pdev);
+	}
+
+	return iommu_fwspec_add_ids(dev, &args->args[0], 1);
+}
+
+static const struct iommu_ops qcom_iommu_ops = {
+	.capable	= qcom_iommu_capable,
+	.domain_alloc	= qcom_iommu_domain_alloc,
+	.domain_free	= qcom_iommu_domain_free,
+	.attach_dev	= qcom_iommu_attach_dev,
+	.detach_dev	= qcom_iommu_detach_dev,
+	.map		= qcom_iommu_map,
+	.unmap		= qcom_iommu_unmap,
+	.map_sg		= default_iommu_map_sg,
+	.iova_to_phys	= qcom_iommu_iova_to_phys,
+	.add_device	= qcom_iommu_add_device,
+	.remove_device	= qcom_iommu_remove_device,
+	.device_group	= qcom_iommu_device_group,
+	.of_xlate	= qcom_iommu_of_xlate,
+	.pgsize_bitmap	= SZ_4K | SZ_64K | SZ_1M | SZ_16M,
+};
+
+static int qcom_iommu_enable_clocks(struct qcom_iommu_dev *qcom_iommu)
+{
+	int ret;
+
+	ret = clk_prepare_enable(qcom_iommu->iface_clk);
+	if (ret) {
+		dev_err(qcom_iommu->dev, "Couldn't enable iface_clk\n");
+		return ret;
+	}
+
+	ret = clk_prepare_enable(qcom_iommu->bus_clk);
+	if (ret) {
+		dev_err(qcom_iommu->dev, "Couldn't enable bus_clk\n");
+		clk_disable_unprepare(qcom_iommu->iface_clk);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
+{
+	clk_disable_unprepare(qcom_iommu->bus_clk);
+	clk_disable_unprepare(qcom_iommu->iface_clk);
+}
+
+static int qcom_iommu_ctx_probe(struct platform_device *pdev)
+{
+	struct qcom_iommu_ctx *ctx;
+	struct device *dev = &pdev->dev;
+	struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(dev->parent);
+	struct resource *res;
+	int ret;
+	u32 reg;
+
+	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	ctx->dev = dev;
+	platform_set_drvdata(pdev, ctx);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	ctx->base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(ctx->base))
+		return PTR_ERR(ctx->base);
+
+	ctx->irq = platform_get_irq(pdev, 0);
+	if (ctx->irq < 0) {
+		dev_err(dev, "failed to get irq\n");
+		return -ENODEV;
+	}
+
+	ret = devm_request_irq(dev, ctx->irq,
+			       qcom_iommu_fault,
+			       IRQF_SHARED,
+			       "qcom-iommu-fault",
+			       ctx);
+	if (ret) {
+		dev_err(dev, "failed to request IRQ %u\n", ctx->irq);
+		return ret;
+	}
+
+	/* read the "reg" property directly to get the relative address
+	 * of the context bank, and calculate the asid from that:
+	 */
+	if (of_property_read_u32_index(dev->of_node, "reg", 0, &reg)) {
+		dev_err(dev, "missing reg property\n");
+		return -ENODEV;
+	}
+
+	ctx->asid = reg / 0x1000;      /* context banks are 0x1000 apart */
+
+	dev_dbg(dev, "found asid %u\n", ctx->asid);
+
+	list_add_tail(&ctx->node, &qcom_iommu->context_list);
+
+	return 0;
+}
+
+static int qcom_iommu_ctx_remove(struct platform_device *pdev)
+{
+	struct qcom_iommu_ctx *ctx = platform_get_drvdata(pdev);
+
+	iommu_group_put(ctx->group);
+	platform_set_drvdata(pdev, NULL);
+
+	list_del(&ctx->node);
+
+	return 0;
+}
+
+static const struct of_device_id ctx_of_match[] = {
+	{ .compatible = "qcom,msm-iommu-v1-ns" },
+	{ .compatible = "qcom,msm-iommu-v1-sec" },
+	{ /* sentinel */ }
+};
+
+static struct platform_driver qcom_iommu_ctx_driver = {
+	.driver	= {
+		.name		= "qcom-iommu-ctx",
+		.of_match_table	= of_match_ptr(ctx_of_match),
+	},
+	.probe	= qcom_iommu_ctx_probe,
+	.remove = qcom_iommu_ctx_remove,
+};
+
+static int qcom_iommu_device_probe(struct platform_device *pdev)
+{
+	struct qcom_iommu_dev *qcom_iommu;
+	struct device *dev = &pdev->dev;
+	struct resource *res;
+	int ret;
+
+	qcom_iommu = devm_kzalloc(dev, sizeof(*qcom_iommu), GFP_KERNEL);
+	if (!qcom_iommu)
+		return -ENOMEM;
+	qcom_iommu->dev = dev;
+
+	INIT_LIST_HEAD(&qcom_iommu->context_list);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (res)
+		qcom_iommu->local_base = devm_ioremap_resource(dev, res);
+
+	qcom_iommu->iface_clk = devm_clk_get(dev, "iface");
+	if (IS_ERR(qcom_iommu->iface_clk)) {
+		dev_err(dev, "failed to get iface clock\n");
+		return PTR_ERR(qcom_iommu->iface_clk);
+	}
+
+	qcom_iommu->bus_clk = devm_clk_get(dev, "bus");
+	if (IS_ERR(qcom_iommu->bus_clk)) {
+		dev_err(dev, "failed to get bus clock\n");
+		return PTR_ERR(qcom_iommu->bus_clk);
+	}
+
+	if (of_property_read_u32(dev->of_node, "qcom,iommu-secure-id",
+				 &qcom_iommu->sec_id)) {
+		dev_err(dev, "missing qcom,iommu-secure-id property\n");
+		return -ENODEV;
+	}
+
+	platform_set_drvdata(pdev, qcom_iommu);
+
+	/* register context bank devices, which are child nodes: */
+	ret = devm_of_platform_populate(dev);
+	if (ret) {
+		dev_err(dev, "Failed to populate iommu contexts\n");
+		return ret;
+	}
+
+	ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,
+				     dev_name(dev));
+	if (ret) {
+		dev_err(dev, "Failed to register iommu in sysfs\n");
+		return ret;
+	}
+
+	iommu_device_set_ops(&qcom_iommu->iommu, &qcom_iommu_ops);
+	iommu_device_set_fwnode(&qcom_iommu->iommu, dev->fwnode);
+
+	ret = iommu_device_register(&qcom_iommu->iommu);
+	if (ret) {
+		dev_err(dev, "Failed to register iommu\n");
+		return ret;
+	}
+
+	pm_runtime_enable(dev);
+	bus_set_iommu(&platform_bus_type, &qcom_iommu_ops);
+
+	if (qcom_iommu->local_base) {
+		pm_runtime_get_sync(dev);
+		writel_relaxed(0xffffffff, qcom_iommu->local_base + SMMU_INTR_SEL_NS);
+		pm_runtime_put_sync(dev);
+	}
+
+	return 0;
+}
+
+static int qcom_iommu_device_remove(struct platform_device *pdev)
+{
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	bus_set_iommu(&platform_bus_type, NULL);
+
+	pm_runtime_force_suspend(&pdev->dev);
+	platform_set_drvdata(pdev, NULL);
+	iommu_device_sysfs_remove(&qcom_iommu->iommu);
+	iommu_device_unregister(&qcom_iommu->iommu);
+
+	return 0;
+}
+
+#ifdef CONFIG_PM
+static int qcom_iommu_resume(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	return qcom_iommu_enable_clocks(qcom_iommu);
+}
+
+static int qcom_iommu_suspend(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	qcom_iommu_disable_clocks(qcom_iommu);
+
+	return 0;
+}
+#endif
+
+static const struct dev_pm_ops qcom_iommu_pm_ops = {
+	SET_RUNTIME_PM_OPS(qcom_iommu_suspend, qcom_iommu_resume, NULL)
+	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+				pm_runtime_force_resume)
+};
+
+static const struct of_device_id qcom_iommu_of_match[] = {
+	{ .compatible = "qcom,msm-iommu-v1" },
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, qcom_iommu_of_match);
+
+static struct platform_driver qcom_iommu_driver = {
+	.driver	= {
+		.name		= "qcom-iommu",
+		.of_match_table	= of_match_ptr(qcom_iommu_of_match),
+		.pm		= &qcom_iommu_pm_ops,
+	},
+	.probe	= qcom_iommu_device_probe,
+	.remove	= qcom_iommu_device_remove,
+};
+
+static int __init qcom_iommu_init(void)
+{
+	int ret;
+
+	ret = platform_driver_register(&qcom_iommu_ctx_driver);
+	if (ret)
+		return ret;
+
+	ret = platform_driver_register(&qcom_iommu_driver);
+	if (ret)
+		platform_driver_unregister(&qcom_iommu_ctx_driver);
+
+	return ret;
+}
+
+static void __exit qcom_iommu_exit(void)
+{
+	platform_driver_unregister(&qcom_iommu_driver);
+	platform_driver_unregister(&qcom_iommu_ctx_driver);
+}
+
+module_init(qcom_iommu_init);
+module_exit(qcom_iommu_exit);
+
+IOMMU_OF_DECLARE(qcom_iommu_dev, "qcom,msm-iommu-v1", NULL);
+
+MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
+MODULE_LICENSE("GPL v2");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] iommu: add qcom_iommu
  2017-05-04 14:31       ` Rob Herring
@ 2017-05-05 12:31         ` Sricharan R
  0 siblings, 0 replies; 24+ messages in thread
From: Sricharan R @ 2017-05-05 12:31 UTC (permalink / raw)
  To: Rob Herring, Rob Clark
  Cc: Linux IOMMU, linux-arm-msm, Robin Murphy, Will Deacon,
	Mark Rutland, Stanimir Varbanov, Archit Taneja

< snip ..>
>> +
>> +static struct platform_driver qcom_iommu_driver = {
>> +       .driver = {
>> +               .name           = "qcom-iommu",
>> +               .of_match_table = of_match_ptr(qcom_iommu_of_match),
>> +               .pm             = &qcom_iommu_pm_ops,
>> +       },
>> +       .probe  = qcom_iommu_device_probe,
>> +       .remove = qcom_iommu_device_remove,
>> +};
>> +module_platform_driver(qcom_iommu_driver);
>> +
>> +IOMMU_OF_DECLARE(qcom_iommu_dev, "qcom,msm-iommu-v1", NULL);
> 
> Is this needed any more with deferred probe now?

Yes, because the __iommu_of_table is still used to find out
whether the driver is present.
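
For reference, the 4.12-era of_iommu.c consults that table roughly
like this (paraphrased from memory, so treat it as a sketch rather
than the exact upstream code):

static bool of_iommu_driver_present(struct device_node *np)
{
	/*
	 * Past early boot the linker table may already have been
	 * freed, so don't defer probing forever waiting for it.
	 */
	if (system_state > SYSTEM_BOOTING)
		return false;

	/* matches entries registered with IOMMU_OF_DECLARE() */
	return of_match_node(&__iommu_of_table, np);
}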

Regards,
 Sricharan

> 
>> +
>> +MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
>> +MODULE_LICENSE("GPL v2");
>> --
>> 2.9.3
>>

-- 
"QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] iommu: add qcom_iommu
       [not found]     ` <20170504133436.24288-4-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-05-04 14:31       ` Rob Herring
  2017-05-05 12:31         ` Sricharan R
  0 siblings, 1 reply; 24+ messages in thread
From: Rob Herring @ 2017-05-04 14:31 UTC (permalink / raw)
  To: Rob Clark
  Cc: Mark Rutland, linux-arm-msm, Will Deacon, Stanimir Varbanov, Linux IOMMU

On Thu, May 4, 2017 at 8:34 AM, Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> An iommu driver for Qualcomm "B" family devices which do not completely
> implement the ARM SMMU spec.  These devices have a context-bank register
> layout similar to that of the ARM SMMU, but no global register space (or
> at least not one that is accessible).
>
> Signed-off-by: Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Signed-off-by: Stanimir Varbanov <stanimir.varbanov-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> ---
>  drivers/iommu/Kconfig      |  10 +
>  drivers/iommu/Makefile     |   1 +
>  drivers/iommu/qcom_iommu.c | 825 +++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 836 insertions(+)
>  create mode 100644 drivers/iommu/qcom_iommu.c
>
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 37e204f..400a404 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -359,4 +359,14 @@ config MTK_IOMMU_V1
>
>           if unsure, say N here.
>
> +config QCOM_IOMMU
> +       bool "Qualcomm IOMMU Support"
> +       depends on ARM || ARM64

This is redundant, as you already have ARCH_QCOM below.
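
i.e. with that line dropped the stanza reduces to something like
(sketch; this matches what the later revision of this patch has):

config QCOM_IOMMU
	bool "Qualcomm IOMMU Support"
	depends on ARCH_QCOM || COMPILE_TEST
	select IOMMU_API
	select IOMMU_IO_PGTABLE_LPAE
	select ARM_DMA_USE_IOMMU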

> +       depends on ARCH_QCOM || COMPILE_TEST
> +       select IOMMU_API
> +       select IOMMU_IO_PGTABLE_LPAE
> +       select ARM_DMA_USE_IOMMU
> +       help
> +         Support for IOMMU on certain Qualcomm SoCs.
> +
>  endif # IOMMU_SUPPORT
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 195f7b9..b910aea 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
>  obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
>  obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
>  obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
> +obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
> diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
> new file mode 100644
> index 0000000..1cf7c8e
> --- /dev/null
> +++ b/drivers/iommu/qcom_iommu.c
> @@ -0,0 +1,825 @@
> +/*
> + * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

Don't put FSF address in.

> + *
> + * Copyright (C) 2013 ARM Limited
> + * Copyright (C) 2017 Red Hat
> + */
> +
> +#define pr_fmt(fmt) "qcom-iommu: " fmt

Unused as dev_* prints are used?

> +
> +#include <linux/atomic.h>
> +#include <linux/clk.h>
> +#include <linux/delay.h>
> +#include <linux/dma-iommu.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/err.h>
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/io-64-nonatomic-hi-lo.h>
> +#include <linux/iommu.h>
> +#include <linux/iopoll.h>
> +#include <linux/module.h>

This driver is boolean and not a module.

> +#include <linux/mutex.h>
> +#include <linux/of.h>
> +#include <linux/of_address.h>
> +#include <linux/of_device.h>
> +#include <linux/of_iommu.h>
> +#include <linux/platform_device.h>
> +#include <linux/pm_runtime.h>
> +#include <linux/qcom_scm.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +
> +#include "io-pgtable.h"
> +#include "arm-smmu-regs.h"
> +
> +#define SMMU_INTR_SEL_NS     0x2000
> +
> +struct qcom_iommu_dev {
> +       /* IOMMU core code handle */
> +       struct iommu_device      iommu;
> +       struct device           *dev;
> +       struct clk              *iface_clk;
> +       struct clk              *bus_clk;
> +       void __iomem            *local_base;
> +       u32                      sec_id;
> +       struct list_head         context_list;   /* list of qcom_iommu_context */
> +};
> +
> +struct qcom_iommu_ctx {
> +       struct device           *dev;
> +       void __iomem            *base;
> +       unsigned int             irq;
> +       bool                     secure_init;
> +       u32                      asid;      /* asid and ctx bank # are 1:1 */
> +       struct iommu_group      *group;
> +       struct list_head         node;      /* head in qcom_iommu_device::context_list */
> +};
> +
> +struct qcom_iommu_domain {
> +       struct io_pgtable_ops   *pgtbl_ops;
> +       spinlock_t               pgtbl_lock;
> +       struct mutex             init_mutex; /* Protects iommu pointer */
> +       struct iommu_domain      domain;
> +       struct qcom_iommu_dev   *iommu;
> +};
> +
> +static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
> +{
> +       return container_of(dom, struct qcom_iommu_domain, domain);
> +}
> +
> +static const struct iommu_ops qcom_iommu_ops;
> +
> +static struct qcom_iommu_dev * __to_iommu(struct iommu_fwspec *fwspec)
> +{
> +       if (!fwspec || fwspec->ops != &qcom_iommu_ops)
> +               return NULL;
> +       return fwspec->iommu_priv;
> +}
> +
> +static struct qcom_iommu_dev * to_iommu(struct iommu_fwspec *fwspec)
> +{
> +       struct qcom_iommu_dev *qcom_iommu = __to_iommu(fwspec);
> +       WARN_ON(!qcom_iommu);
> +       return qcom_iommu;
> +}
> +
> +static struct qcom_iommu_ctx * to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
> +{
> +       struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
> +       struct qcom_iommu_ctx *ctx;
> +
> +       if (!qcom_iommu)
> +               return NULL;
> +
> +       list_for_each_entry(ctx, &qcom_iommu->context_list, node)
> +               if (ctx->asid == asid)
> +                       return ctx;
> +
> +       WARN(1, "no ctx for asid %u\n", asid);
> +       return NULL;
> +}
> +
> +static inline void
> +iommu_writel(struct qcom_iommu_ctx *ctx, unsigned reg, u32 val)
> +{
> +       writel_relaxed(val, ctx->base + reg);
> +}
> +
> +static inline void
> +iommu_writeq(struct qcom_iommu_ctx *ctx, unsigned reg, u64 val)
> +{
> +       writeq_relaxed(val, ctx->base + reg);
> +}
> +
> +static inline u32
> +iommu_readl(struct qcom_iommu_ctx *ctx, unsigned reg)
> +{
> +       return readl_relaxed(ctx->base + reg);
> +}
> +
> +static inline u32

u64? readq_relaxed() returns a 64-bit value, so a u32 return type here
silently truncates the upper half of 64-bit registers such as
ARM_SMMU_CB_FAR.
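Presumably just:

	static inline u64
	iommu_readq(struct qcom_iommu_ctx *ctx, unsigned reg)
	{
		/* 64-bit return type, so the upper half of e.g.
		 * ARM_SMMU_CB_FAR is not lost */
		return readq_relaxed(ctx->base + reg);
	}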

> +iommu_readq(struct qcom_iommu_ctx *ctx, unsigned reg)
> +{
> +       return readq_relaxed(ctx->base + reg);
> +}
> +
> +static void __sync_tlb(struct qcom_iommu_ctx *ctx)
> +{
> +       unsigned int val;
> +       unsigned int ret;
> +
> +       iommu_writel(ctx, ARM_SMMU_CB_TLBSYNC, 0);
> +
> +       ret = readl_poll_timeout(ctx->base + ARM_SMMU_CB_TLBSTATUS, val,
> +                                (val & 0x1) == 0, 0, 5000000);
> +       if (ret)
> +               dev_err(ctx->dev, "timeout waiting for TLB SYNC\n");
> +}
> +
> +static void qcom_iommu_tlb_sync(void *cookie)
> +{
> +       struct iommu_fwspec *fwspec = cookie;
> +       unsigned i;
> +
> +       for (i = 0; i < fwspec->num_ids; i++)
> +               __sync_tlb(to_ctx(fwspec, fwspec->ids[i]));
> +}
> +
> +static void qcom_iommu_tlb_inv_context(void *cookie)
> +{
> +       struct iommu_fwspec *fwspec = cookie;
> +       unsigned i;
> +
> +       for (i = 0; i < fwspec->num_ids; i++) {
> +               struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +
> +               iommu_writel(ctx, ARM_SMMU_CB_S1_TLBIASID, ctx->asid);
> +               __sync_tlb(ctx);
> +       }
> +}
> +
> +static void qcom_iommu_tlb_inv_range_nosync(unsigned long iova, size_t size,
> +                                           size_t granule, bool leaf, void *cookie)
> +{
> +       struct iommu_fwspec *fwspec = cookie;
> +       unsigned i, reg;
> +
> +       reg = leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA;
> +
> +       for (i = 0; i < fwspec->num_ids; i++) {
> +               struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +               size_t s = size;
> +
> +               iova &= ~12UL;
> +               iova |= ctx->asid;
> +               do {
> +                       iommu_writel(ctx, reg, iova);
> +                       iova += granule;
> +               } while (s -= granule);
> +       }
> +}
> +
> +static const struct iommu_gather_ops qcom_gather_ops = {
> +       .tlb_flush_all  = qcom_iommu_tlb_inv_context,
> +       .tlb_add_flush  = qcom_iommu_tlb_inv_range_nosync,
> +       .tlb_sync       = qcom_iommu_tlb_sync,
> +};
> +
> +static irqreturn_t qcom_iommu_fault(int irq, void *dev)
> +{
> +       struct qcom_iommu_ctx *ctx = dev;
> +       u32 fsr, fsynr;
> +       unsigned long iova;
> +
> +       fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
> +
> +       if (!(fsr & FSR_FAULT))
> +               return IRQ_NONE;
> +
> +       fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
> +       iova = iommu_readq(ctx, ARM_SMMU_CB_FAR);
> +
> +       dev_err_ratelimited(ctx->dev,
> +                           "Unhandled context fault: fsr=0x%x, "
> +                           "iova=0x%08lx, fsynr=0x%x, cb=%d\n",
> +                           fsr, iova, fsynr, ctx->asid);
> +
> +       iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
> +
> +       return IRQ_HANDLED;
> +}
> +
> +static int qcom_iommu_init_domain(struct iommu_domain *domain,
> +                                 struct qcom_iommu_dev *qcom_iommu,
> +                                 struct iommu_fwspec *fwspec)
> +{
> +       struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +       struct io_pgtable_ops *pgtbl_ops;
> +       struct io_pgtable_cfg pgtbl_cfg;
> +       int i, ret = 0;
> +       u32 reg;
> +
> +       mutex_lock(&qcom_domain->init_mutex);
> +       if (qcom_domain->iommu)
> +               goto out_unlock;
> +
> +       pgtbl_cfg = (struct io_pgtable_cfg) {
> +               .pgsize_bitmap  = qcom_iommu_ops.pgsize_bitmap,
> +               .ias            = 32,
> +               .oas            = 40,
> +               .tlb            = &qcom_gather_ops,
> +               .iommu_dev      = qcom_iommu->dev,
> +       };
> +
> +       qcom_domain->iommu = qcom_iommu;
> +       pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, fwspec);
> +       if (!pgtbl_ops) {
> +               dev_err(qcom_iommu->dev, "failed to allocate pagetable ops\n");
> +               ret = -ENOMEM;
> +               goto out_clear_iommu;
> +       }
> +
> +       /* Update the domain's page sizes to reflect the page table format */
> +       domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> +       domain->geometry.aperture_end = (1ULL << pgtbl_cfg.ias) - 1;
> +       domain->geometry.force_aperture = true;
> +
> +       for (i = 0; i < fwspec->num_ids; i++) {
> +               struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +
> +               if (!ctx->secure_init) {
> +                       ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, ctx->asid);
> +                       if (ret) {
> +                               dev_err(qcom_iommu->dev, "secure init failed: %d\n", ret);
> +                               goto out_clear_iommu;
> +                       }
> +                       ctx->secure_init = true;
> +               }
> +
> +               /* TTBRs */
> +               iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
> +                               pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
> +                               ((u64)ctx->asid << TTBRn_ASID_SHIFT));
> +               iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
> +                               pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
> +                               ((u64)ctx->asid << TTBRn_ASID_SHIFT));
> +
> +               /* TTBCR */
> +               iommu_writel(ctx, ARM_SMMU_CB_TTBCR2,
> +                               (pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
> +                               TTBCR2_SEP_UPSTREAM);
> +               iommu_writel(ctx, ARM_SMMU_CB_TTBCR,
> +                               pgtbl_cfg.arm_lpae_s1_cfg.tcr);
> +
> +               /* MAIRs (stage-1 only) */
> +               iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
> +                               pgtbl_cfg.arm_lpae_s1_cfg.mair[0]);
> +               iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR1,
> +                               pgtbl_cfg.arm_lpae_s1_cfg.mair[1]);
> +
> +               /* SCTLR */
> +               reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
> +                       SCTLR_M | SCTLR_S1_ASIDPNE;
> +#ifdef __BIG_ENDIAN

Probably want to test the kconfig symbol here instead, with
"if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))", rather than relying on the
compiler/header-defined __BIG_ENDIAN.
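For example (untested sketch):

	reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
		SCTLR_M | SCTLR_S1_ASIDPNE;
	if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))
		reg |= SCTLR_E;
	iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);

which also keeps the endian branch visible to the compiler in both
configurations instead of hiding it behind the preprocessor.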

> +               reg |= SCTLR_E;
> +#endif
> +               iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
> +       }
> +
> +       mutex_unlock(&qcom_domain->init_mutex);
> +
> +       /* Publish page table ops for map/unmap */
> +       qcom_domain->pgtbl_ops = pgtbl_ops;
> +
> +       return 0;
> +
> +out_clear_iommu:
> +       qcom_domain->iommu = NULL;
> +out_unlock:
> +       mutex_unlock(&qcom_domain->init_mutex);
> +       return ret;
> +}
> +
> +static struct iommu_domain *qcom_iommu_domain_alloc(unsigned type)
> +{
> +       struct qcom_iommu_domain *qcom_domain;
> +
> +       if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
> +               return NULL;
> +       /*
> +        * Allocate the domain and initialise some of its data structures.
> +        * We can't really do anything meaningful until we've added a
> +        * master.
> +        */
> +       qcom_domain = kzalloc(sizeof(*qcom_domain), GFP_KERNEL);
> +       if (!qcom_domain)
> +               return NULL;
> +
> +       if (type == IOMMU_DOMAIN_DMA &&
> +           iommu_get_dma_cookie(&qcom_domain->domain)) {
> +               kfree(qcom_domain);
> +               return NULL;
> +       }
> +
> +       mutex_init(&qcom_domain->init_mutex);
> +       spin_lock_init(&qcom_domain->pgtbl_lock);
> +
> +       return &qcom_domain->domain;
> +}
> +
> +static void qcom_iommu_domain_free(struct iommu_domain *domain)
> +{
> +       struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +
> +       if (WARN_ON(qcom_domain->iommu))    /* forgot to detach? */
> +               return;
> +
> +       iommu_put_dma_cookie(domain);
> +
> +       free_io_pgtable_ops(qcom_domain->pgtbl_ops);
> +
> +       kfree(qcom_domain);
> +}
> +
> +static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
> +{
> +       struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
> +       struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +       int ret;
> +
> +       if (!qcom_iommu) {
> +               dev_err(dev, "cannot attach to IOMMU, is it on the same bus?\n");
> +               return -ENXIO;
> +       }
> +
> +       /* Ensure that the domain is finalized */
> +       pm_runtime_get_sync(qcom_iommu->dev);
> +       ret = qcom_iommu_init_domain(domain, qcom_iommu, dev->iommu_fwspec);
> +       pm_runtime_put_sync(qcom_iommu->dev);
> +       if (ret < 0)
> +               return ret;
> +
> +       /*
> +        * Sanity check the domain. We don't support domains across
> +        * different IOMMUs.
> +        */
> +       if (qcom_domain->iommu != qcom_iommu) {
> +               dev_err(dev, "cannot attach to IOMMU %s while already "
> +                       "attached to domain on IOMMU %s\n",
> +                       dev_name(qcom_domain->iommu->dev),
> +                       dev_name(qcom_iommu->dev));
> +               return -EINVAL;
> +       }
> +
> +       return 0;
> +}
> +
> +static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *dev)
> +{
> +       struct iommu_fwspec *fwspec = dev->iommu_fwspec;
> +       struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
> +       struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +       unsigned i;
> +
> +       if (!qcom_domain->iommu)
> +               return;
> +
> +       pm_runtime_get_sync(qcom_iommu->dev);
> +       for (i = 0; i < fwspec->num_ids; i++) {
> +               struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +
> +               /* Disable the context bank: */
> +               iommu_writel(ctx, ARM_SMMU_CB_SCTLR, 0);
> +       }
> +       pm_runtime_put_sync(qcom_iommu->dev);
> +
> +       qcom_domain->iommu = NULL;
> +}
> +
> +static int qcom_iommu_map(struct iommu_domain *domain, unsigned long iova,
> +                         phys_addr_t paddr, size_t size, int prot)
> +{
> +       int ret;
> +       unsigned long flags;
> +       struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +       struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
> +
> +       if (!ops)
> +               return -ENODEV;
> +
> +       spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
> +       ret = ops->map(ops, iova, paddr, size, prot);
> +       spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
> +       return ret;
> +}
> +
> +static size_t qcom_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
> +                              size_t size)
> +{
> +       size_t ret;
> +       unsigned long flags;
> +       struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +       struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
> +
> +       if (!ops)
> +               return 0;
> +
> +       spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
> +       ret = ops->unmap(ops, iova, size);
> +       spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
> +       return ret;
> +}
> +
> +static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
> +                                          dma_addr_t iova)
> +{
> +       phys_addr_t ret;
> +       unsigned long flags;
> +       struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
> +       struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
> +
> +       if (!ops)
> +               return 0;
> +
> +       spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
> +       ret = ops->iova_to_phys(ops, iova);
> +       spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
> +
> +       return ret;
> +}
> +
> +static bool qcom_iommu_capable(enum iommu_cap cap)
> +{
> +       switch (cap) {
> +       case IOMMU_CAP_CACHE_COHERENCY:
> +               /*
> +                * Return true here as the SMMU can always send out coherent
> +                * requests.
> +                */
> +               return true;
> +       case IOMMU_CAP_NOEXEC:
> +               return true;
> +       default:
> +               return false;
> +       }
> +}
> +
> +static int qcom_iommu_add_device(struct device *dev)
> +{
> +       struct qcom_iommu_dev *qcom_iommu = __to_iommu(dev->iommu_fwspec);
> +       struct iommu_group *group;
> +       struct device_link *link;
> +
> +       if (!qcom_iommu)
> +               return -ENODEV;
> +
> +       /*
> +        * Establish the link between iommu and master, so that the
> +        * iommu gets runtime enabled/disabled as per the master's
> +        * needs.
> +        */
> +       link = device_link_add(dev, qcom_iommu->dev, DL_FLAG_PM_RUNTIME);
> +       if (!link) {
> +               dev_err(qcom_iommu->dev, "Unable to create device link between %s and %s\n",
> +                       dev_name(qcom_iommu->dev), dev_name(dev));
> +               return -ENODEV;
> +       }
> +
> +       group = iommu_group_get_for_dev(dev);
> +       if (IS_ERR_OR_NULL(group))
> +               return PTR_ERR_OR_ZERO(group);
> +
> +       iommu_group_put(group);
> +       iommu_device_link(&qcom_iommu->iommu, dev);
> +
> +       return 0;
> +}
> +
> +static void qcom_iommu_remove_device(struct device *dev)
> +{
> +       struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
> +
> +       if (!qcom_iommu)
> +               return;
> +
> +       iommu_group_remove_device(dev);
> +       iommu_device_unlink(&qcom_iommu->iommu, dev);
> +       iommu_fwspec_free(dev);
> +}
> +
> +static struct iommu_group *qcom_iommu_device_group(struct device *dev)
> +{
> +       struct iommu_fwspec *fwspec = dev->iommu_fwspec;
> +       struct iommu_group *group = NULL;
> +       unsigned i;
> +
> +       for (i = 0; i < fwspec->num_ids; i++) {
> +               struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +
> +               if (group && ctx->group && group != ctx->group)
> +                       return ERR_PTR(-EINVAL);
> +
> +               group = ctx->group;
> +       }
> +
> +       if (group)
> +               return iommu_group_ref_get(group);
> +
> +       group = generic_device_group(dev);
> +
> +       for (i = 0; i < fwspec->num_ids; i++) {
> +               struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
> +               ctx->group = iommu_group_ref_get(group);
> +       }
> +
> +       return group;
> +}
> +
> +static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
> +{
> +       struct platform_device *iommu_pdev;
> +
> +       if (args->args_count != 1) {
> +               dev_err(dev, "incorrect number of iommu params found for %s "
> +                       "(found %d, expected 1)\n",
> +                       args->np->full_name, args->args_count);
> +               return -EINVAL;
> +       }
> +
> +       if (!dev->iommu_fwspec->iommu_priv) {
> +               iommu_pdev = of_find_device_by_node(args->np);
> +               if (WARN_ON(!iommu_pdev))
> +                       return -EINVAL;
> +
> +               dev->iommu_fwspec->iommu_priv = platform_get_drvdata(iommu_pdev);
> +       }
> +
> +       return iommu_fwspec_add_ids(dev, &args->args[0], 1);
> +}
> +
> +static const struct iommu_ops qcom_iommu_ops = {
> +       .capable        = qcom_iommu_capable,
> +       .domain_alloc   = qcom_iommu_domain_alloc,
> +       .domain_free    = qcom_iommu_domain_free,
> +       .attach_dev     = qcom_iommu_attach_dev,
> +       .detach_dev     = qcom_iommu_detach_dev,
> +       .map            = qcom_iommu_map,
> +       .unmap          = qcom_iommu_unmap,
> +       .map_sg         = default_iommu_map_sg,
> +       .iova_to_phys   = qcom_iommu_iova_to_phys,
> +       .add_device     = qcom_iommu_add_device,
> +       .remove_device  = qcom_iommu_remove_device,
> +       .device_group   = qcom_iommu_device_group,
> +       .of_xlate       = qcom_iommu_of_xlate,
> +       .pgsize_bitmap  = SZ_4K | SZ_64K | SZ_1M | SZ_16M,
> +};
> +
> +static int qcom_iommu_enable_clocks(struct qcom_iommu_dev *qcom_iommu)
> +{
> +       int ret;
> +
> +       ret = clk_prepare_enable(qcom_iommu->iface_clk);
> +       if (ret) {
> +               dev_err(qcom_iommu->dev, "Couldn't enable iface_clk\n");
> +               return ret;
> +       }
> +
> +       ret = clk_prepare_enable(qcom_iommu->bus_clk);
> +       if (ret) {
> +               dev_err(qcom_iommu->dev, "Couldn't enable bus_clk\n");
> +               clk_disable_unprepare(qcom_iommu->iface_clk);
> +               return ret;
> +       }
> +
> +       return 0;
> +}
> +
> +static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
> +{
> +       clk_disable_unprepare(qcom_iommu->bus_clk);
> +       clk_disable_unprepare(qcom_iommu->iface_clk);
> +}
> +
> +static int qcom_iommu_ctx_probe(struct platform_device *pdev)
> +{
> +       struct qcom_iommu_ctx *ctx;
> +       struct device *dev = &pdev->dev;
> +       struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(dev->parent);
> +       struct resource *res;
> +       int ret;
> +       u32 reg;
> +
> +       ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
> +       if (!ctx)
> +               return -ENOMEM;
> +
> +       ctx->dev = dev;
> +       platform_set_drvdata(pdev, ctx);
> +
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       ctx->base = devm_ioremap_resource(dev, res);
> +       if (IS_ERR(ctx->base))
> +               return PTR_ERR(ctx->base);
> +
> +       ctx->irq = platform_get_irq(pdev, 0);
> +       if (ctx->irq < 0) {
> +               dev_err(dev, "failed to get irq\n");
> +               return -ENODEV;
> +       }
> +
> +       ret = devm_request_irq(dev, ctx->irq,
> +                              qcom_iommu_fault,
> +                              IRQF_SHARED,
> +                              "qcom-iommu-fault",
> +                              ctx);
> +       if (ret) {
> +               dev_err(dev, "failed to request IRQ %u\n", ctx->irq);
> +               return ret;
> +       }
> +
> +       /* read the "reg" property directly to get the relative address
> +        * of the context bank, and calculate the asid from that:
> +        */
> +       if (of_property_read_u32_index(dev->of_node, "reg", 0, &reg)) {
> +               dev_err(dev, "missing reg property\n");
> +               return -ENODEV;
> +       }
> +
> +       ctx->asid = reg / 0x1000;      /* context banks are 0x1000 apart */
> +
> +       dev_dbg(dev, "found asid %u\n", ctx->asid);
> +
> +       list_add_tail(&ctx->node, &qcom_iommu->context_list);
> +
> +       return 0;
> +}
> +
> +static int qcom_iommu_ctx_remove(struct platform_device *pdev)
> +{
> +       struct qcom_iommu_ctx *ctx = platform_get_drvdata(pdev);
> +
> +       iommu_group_put(ctx->group);
> +       platform_set_drvdata(pdev, NULL);
> +
> +       return 0;
> +}
> +
> +static const struct of_device_id ctx_of_match[] = {
> +       { .compatible = "qcom,msm-iommu-v1-ns" },
> +       { .compatible = "qcom,msm-iommu-v1-sec" },
> +       { /* sentinel */ }
> +};
> +
> +static struct platform_driver qcom_iommu_ctx_driver = {
> +       .driver = {
> +               .name           = "qcom-iommu-ctx",
> +               .of_match_table = of_match_ptr(ctx_of_match),
> +       },
> +       .probe  = qcom_iommu_ctx_probe,
> +       .remove = qcom_iommu_ctx_remove,
> +};
> +module_platform_driver(qcom_iommu_ctx_driver);
> +
> +static int qcom_iommu_device_probe(struct platform_device *pdev)
> +{
> +       struct qcom_iommu_dev *qcom_iommu;
> +       struct device *dev = &pdev->dev;
> +       struct resource *res;
> +       int ret;
> +
> +       qcom_iommu = devm_kzalloc(dev, sizeof(*qcom_iommu), GFP_KERNEL);
> +       if (!qcom_iommu)
> +               return -ENOMEM;
> +       qcom_iommu->dev = dev;
> +
> +       INIT_LIST_HEAD(&qcom_iommu->context_list);
> +
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       if (res)
> +               qcom_iommu->local_base = devm_ioremap_resource(dev, res);
> +
> +       qcom_iommu->iface_clk = devm_clk_get(dev, "iface");
> +       if (IS_ERR(qcom_iommu->iface_clk)) {
> +               dev_err(dev, "failed to get iface clock\n");
> +               return PTR_ERR(qcom_iommu->iface_clk);
> +       }
> +
> +       qcom_iommu->bus_clk = devm_clk_get(dev, "bus");
> +       if (IS_ERR(qcom_iommu->bus_clk)) {
> +               dev_err(dev, "failed to get bus clock\n");
> +               return PTR_ERR(qcom_iommu->bus_clk);
> +       }
> +
> +       if (of_property_read_u32(dev->of_node, "qcom,iommu-secure-id",
> +                                &qcom_iommu->sec_id)) {
> +               dev_err(dev, "missing qcom,iommu-secure-id property\n");
> +               return -ENODEV;
> +       }
> +
> +       platform_set_drvdata(pdev, qcom_iommu);
> +
> +       /* register context bank devices, which are child nodes: */
> +       ret = of_platform_populate(dev->of_node, ctx_of_match, NULL, dev);
> +       if (ret) {
> +               dev_err(dev, "Failed to populate iommu contexts\n");
> +               return ret;
> +       }
> +
> +       ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,
> +                                    "smmu.%pa", &res->start);
> +       if (ret) {
> +               dev_err(dev, "Failed to register iommu in sysfs\n");
> +               return ret;
> +       }
> +
> +       iommu_device_set_ops(&qcom_iommu->iommu, &qcom_iommu_ops);
> +       iommu_device_set_fwnode(&qcom_iommu->iommu, dev->fwnode);
> +
> +       ret = iommu_device_register(&qcom_iommu->iommu);
> +       if (ret) {
> +               dev_err(dev, "Failed to register iommu\n");
> +               return ret;
> +       }
> +
> +       pm_runtime_enable(dev);
> +       bus_set_iommu(&platform_bus_type, &qcom_iommu_ops);
> +
> +       if (qcom_iommu->local_base) {
> +               pm_runtime_get_sync(dev);
> +               writel_relaxed(0xffffffff, qcom_iommu->local_base + SMMU_INTR_SEL_NS);
> +               pm_runtime_put_sync(dev);
> +       }
> +
> +       return 0;
> +}
> +
> +static int qcom_iommu_device_remove(struct platform_device *pdev)
> +{
> +       pm_runtime_force_suspend(&pdev->dev);
> +       platform_set_drvdata(pdev, NULL);

Missing a lot of teardown, like needing to remove the child context-bank
devices and unregister the iommu? Though I'm not sure you'd be doing much
after removing the IOMMU.
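Something along these lines, perhaps (a rough sketch, untested):

	static int qcom_iommu_device_remove(struct platform_device *pdev)
	{
		struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);

		/* remove the child context-bank devices created by
		 * of_platform_populate() at probe time */
		of_platform_depopulate(&pdev->dev);

		iommu_device_unregister(&qcom_iommu->iommu);
		iommu_device_sysfs_remove(&qcom_iommu->iommu);

		pm_runtime_force_suspend(&pdev->dev);
		platform_set_drvdata(pdev, NULL);

		return 0;
	}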

> +
> +       return 0;
> +}
> +
> +#ifdef CONFIG_PM
> +static int qcom_iommu_resume(struct device *dev)
> +{
> +       struct platform_device *pdev = to_platform_device(dev);
> +       struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
> +
> +       return qcom_iommu_enable_clocks(qcom_iommu);
> +}
> +
> +static int qcom_iommu_suspend(struct device *dev)
> +{
> +       struct platform_device *pdev = to_platform_device(dev);
> +       struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
> +
> +       qcom_iommu_disable_clocks(qcom_iommu);
> +
> +       return 0;
> +}
> +#endif
> +
> +static const struct dev_pm_ops qcom_iommu_pm_ops = {
> +       SET_RUNTIME_PM_OPS(qcom_iommu_suspend, qcom_iommu_resume, NULL)
> +       SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
> +                               pm_runtime_force_resume)
> +};
> +
> +static const struct of_device_id qcom_iommu_of_match[] = {
> +       { .compatible = "qcom,msm-iommu-v1" },
> +       { /* sentinel */ }
> +};
> +MODULE_DEVICE_TABLE(of, qcom_iommu_of_match);
> +
> +static struct platform_driver qcom_iommu_driver = {
> +       .driver = {
> +               .name           = "qcom-iommu",
> +               .of_match_table = of_match_ptr(qcom_iommu_of_match),
> +               .pm             = &qcom_iommu_pm_ops,
> +       },
> +       .probe  = qcom_iommu_device_probe,
> +       .remove = qcom_iommu_device_remove,
> +};
> +module_platform_driver(qcom_iommu_driver);
> +
> +IOMMU_OF_DECLARE(qcom_iommu_dev, "qcom,msm-iommu-v1", NULL);

Is this IOMMU_OF_DECLARE() still needed now that of_iommu handles
deferred probe?

> +
> +MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
> +MODULE_LICENSE("GPL v2");
> --
> 2.9.3
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 3/4] iommu: add qcom_iommu
       [not found] ` <20170504133436.24288-1-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-05-04 13:34   ` Rob Clark
       [not found]     ` <20170504133436.24288-4-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 24+ messages in thread
From: Rob Clark @ 2017-05-04 13:34 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
  Cc: Mark Rutland, Rob Herring, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	Will Deacon, Stanimir Varbanov

An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have a context-bank register
layout that is similar to that of the ARM SMMU, but with no global
register space (or at least not one that is accessible).

Signed-off-by: Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 drivers/iommu/Kconfig      |  10 +
 drivers/iommu/Makefile     |   1 +
 drivers/iommu/qcom_iommu.c | 825 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 836 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 37e204f..400a404 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -359,4 +359,14 @@ config MTK_IOMMU_V1
 
 	  if unsure, say N here.
 
+config QCOM_IOMMU
+	bool "Qualcomm IOMMU Support"
+	depends on ARM || ARM64
+	depends on ARCH_QCOM || COMPILE_TEST
+	select IOMMU_API
+	select IOMMU_IO_PGTABLE_LPAE
+	select ARM_DMA_USE_IOMMU
+	help
+	  Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 0000000..1cf7c8e
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,825 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#define pr_fmt(fmt) "qcom-iommu: " fmt
+
+#include <linux/atomic.h>
+#include <linux/clk.h>
+#include <linux/delay.h>
+#include <linux/dma-iommu.h>
+#include <linux/dma-mapping.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/io-64-nonatomic-hi-lo.h>
+#include <linux/iommu.h>
+#include <linux/iopoll.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/of_iommu.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+#include <linux/qcom_scm.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS     0x2000
+
+struct qcom_iommu_dev {
+	/* IOMMU core code handle */
+	struct iommu_device	 iommu;
+	struct device		*dev;
+	struct clk		*iface_clk;
+	struct clk		*bus_clk;
+	void __iomem		*local_base;
+	u32			 sec_id;
+	struct list_head	 context_list;   /* list of qcom_iommu_context */
+};
+
+struct qcom_iommu_ctx {
+	struct device		*dev;
+	void __iomem		*base;
+	unsigned int		 irq;
+	bool			 secure_init;
+	u32			 asid;      /* asid and ctx bank # are 1:1 */
+	struct iommu_group	*group;
+	struct list_head	 node;      /* head in qcom_iommu_device::context_list */
+};
+
+struct qcom_iommu_domain {
+	struct io_pgtable_ops	*pgtbl_ops;
+	spinlock_t		 pgtbl_lock;
+	struct mutex		 init_mutex; /* Protects iommu pointer */
+	struct iommu_domain	 domain;
+	struct qcom_iommu_dev	*iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev * __to_iommu(struct iommu_fwspec *fwspec)
+{
+	if (!fwspec || fwspec->ops != &qcom_iommu_ops)
+		return NULL;
+	return fwspec->iommu_priv;
+}
+
+static struct qcom_iommu_dev * to_iommu(struct iommu_fwspec *fwspec)
+{
+	struct qcom_iommu_dev *qcom_iommu = __to_iommu(fwspec);
+	WARN_ON(!qcom_iommu);
+	return qcom_iommu;
+}
+
+static struct qcom_iommu_ctx * to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	struct qcom_iommu_ctx *ctx;
+
+	if (!qcom_iommu)
+		return NULL;
+
+	list_for_each_entry(ctx, &qcom_iommu->context_list, node)
+		if (ctx->asid == asid)
+			return ctx;
+
+	WARN(1, "no ctx for asid %u\n", asid);
+	return NULL;
+}
+
+static inline void
+iommu_writel(struct qcom_iommu_ctx *ctx, unsigned reg, u32 val)
+{
+	writel_relaxed(val, ctx->base + reg);
+}
+
+static inline void
+iommu_writeq(struct qcom_iommu_ctx *ctx, unsigned reg, u64 val)
+{
+	writeq_relaxed(val, ctx->base + reg);
+}
+
+static inline u32
+iommu_readl(struct qcom_iommu_ctx *ctx, unsigned reg)
+{
+	return readl_relaxed(ctx->base + reg);
+}
+
+static inline u32
+iommu_readq(struct qcom_iommu_ctx *ctx, unsigned reg)
+{
+	return readq_relaxed(ctx->base + reg);
+}
+
+static void __sync_tlb(struct qcom_iommu_ctx *ctx)
+{
+	unsigned int val;
+	unsigned int ret;
+
+	iommu_writel(ctx, ARM_SMMU_CB_TLBSYNC, 0);
+
+	ret = readl_poll_timeout(ctx->base + ARM_SMMU_CB_TLBSTATUS, val,
+				 (val & 0x1) == 0, 0, 5000000);
+	if (ret)
+		dev_err(ctx->dev, "timeout waiting for TLB SYNC\n");
+}
+
+static void qcom_iommu_tlb_sync(void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++)
+		__sync_tlb(to_ctx(fwspec, fwspec->ids[i]));
+}
+
+static void qcom_iommu_tlb_inv_context(void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		iommu_writel(ctx, ARM_SMMU_CB_S1_TLBIASID, ctx->asid);
+		__sync_tlb(ctx);
+	}
+}
+
+static void qcom_iommu_tlb_inv_range_nosync(unsigned long iova, size_t size,
+					    size_t granule, bool leaf, void *cookie)
+{
+	struct iommu_fwspec *fwspec = cookie;
+	unsigned i, reg;
+
+	reg = leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		size_t s = size;
+
+		iova &= ~12UL;
+		iova |= ctx->asid;
+		do {
+			iommu_writel(ctx, reg, iova);
+			iova += granule;
+		} while (s -= granule);
+	}
+}
+
+static const struct iommu_gather_ops qcom_gather_ops = {
+	.tlb_flush_all	= qcom_iommu_tlb_inv_context,
+	.tlb_add_flush	= qcom_iommu_tlb_inv_range_nosync,
+	.tlb_sync	= qcom_iommu_tlb_sync,
+};
+
+static irqreturn_t qcom_iommu_fault(int irq, void *dev)
+{
+	struct qcom_iommu_ctx *ctx = dev;
+	u32 fsr, fsynr;
+	unsigned long iova;
+
+	fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
+
+	if (!(fsr & FSR_FAULT))
+		return IRQ_NONE;
+
+	fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
+	iova = iommu_readq(ctx, ARM_SMMU_CB_FAR);
+
+	dev_err_ratelimited(ctx->dev,
+			    "Unhandled context fault: fsr=0x%x, "
+			    "iova=0x%08lx, fsynr=0x%x, cb=%d\n",
+			    fsr, iova, fsynr, ctx->asid);
+
+	iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
+
+	return IRQ_HANDLED;
+}
+
+static int qcom_iommu_init_domain(struct iommu_domain *domain,
+				  struct qcom_iommu_dev *qcom_iommu,
+				  struct iommu_fwspec *fwspec)
+{
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *pgtbl_ops;
+	struct io_pgtable_cfg pgtbl_cfg;
+	int i, ret = 0;
+	u32 reg;
+
+	mutex_lock(&qcom_domain->init_mutex);
+	if (qcom_domain->iommu)
+		goto out_unlock;
+
+	pgtbl_cfg = (struct io_pgtable_cfg) {
+		.pgsize_bitmap	= qcom_iommu_ops.pgsize_bitmap,
+		.ias		= 32,
+		.oas		= 40,
+		.tlb		= &qcom_gather_ops,
+		.iommu_dev	= qcom_iommu->dev,
+	};
+
+	qcom_domain->iommu = qcom_iommu;
+	pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, fwspec);
+	if (!pgtbl_ops) {
+		dev_err(qcom_iommu->dev, "failed to allocate pagetable ops\n");
+		ret = -ENOMEM;
+		goto out_clear_iommu;
+	}
+
+	/* Update the domain's page sizes to reflect the page table format */
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	domain->geometry.aperture_end = (1ULL << pgtbl_cfg.ias) - 1;
+	domain->geometry.force_aperture = true;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		if (!ctx->secure_init) {
+			ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, ctx->asid);
+			if (ret) {
+				dev_err(qcom_iommu->dev, "secure init failed: %d\n", ret);
+				goto out_clear_iommu;
+			}
+			ctx->secure_init = true;
+		}
+
+		/* TTBRs */
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
+				((u64)ctx->asid << TTBRn_ASID_SHIFT));
+		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
+				((u64)ctx->asid << TTBRn_ASID_SHIFT));
+
+		/* TTBCR */
+		iommu_writel(ctx, ARM_SMMU_CB_TTBCR2,
+				(pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
+				TTBCR2_SEP_UPSTREAM);
+		iommu_writel(ctx, ARM_SMMU_CB_TTBCR,
+				pgtbl_cfg.arm_lpae_s1_cfg.tcr);
+
+		/* MAIRs (stage-1 only) */
+		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
+				pgtbl_cfg.arm_lpae_s1_cfg.mair[0]);
+		iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR1,
+				pgtbl_cfg.arm_lpae_s1_cfg.mair[1]);
+
+		/* SCTLR */
+		reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
+			SCTLR_M | SCTLR_S1_ASIDPNE;
+#ifdef __BIG_ENDIAN
+		reg |= SCTLR_E;
+#endif
+		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
+	}
+
+	mutex_unlock(&qcom_domain->init_mutex);
+
+	/* Publish page table ops for map/unmap */
+	qcom_domain->pgtbl_ops = pgtbl_ops;
+
+	return 0;
+
+out_clear_iommu:
+	qcom_domain->iommu = NULL;
+out_unlock:
+	mutex_unlock(&qcom_domain->init_mutex);
+	return ret;
+}
+
+static struct iommu_domain *qcom_iommu_domain_alloc(unsigned type)
+{
+	struct qcom_iommu_domain *qcom_domain;
+
+	if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
+		return NULL;
+	/*
+	 * Allocate the domain and initialise some of its data structures.
+	 * We can't really do anything meaningful until we've added a
+	 * master.
+	 */
+	qcom_domain = kzalloc(sizeof(*qcom_domain), GFP_KERNEL);
+	if (!qcom_domain)
+		return NULL;
+
+	if (type == IOMMU_DOMAIN_DMA &&
+	    iommu_get_dma_cookie(&qcom_domain->domain)) {
+		kfree(qcom_domain);
+		return NULL;
+	}
+
+	mutex_init(&qcom_domain->init_mutex);
+	spin_lock_init(&qcom_domain->pgtbl_lock);
+
+	return &qcom_domain->domain;
+}
+
+static void qcom_iommu_domain_free(struct iommu_domain *domain)
+{
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+
+	if (WARN_ON(qcom_domain->iommu))    /* forgot to detach? */
+		return;
+
+	iommu_put_dma_cookie(domain);
+
+	free_io_pgtable_ops(qcom_domain->pgtbl_ops);
+
+	kfree(qcom_domain);
+}
+
+static int qcom_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	int ret;
+
+	if (!qcom_iommu) {
+		dev_err(dev, "cannot attach to IOMMU, is it on the same bus?\n");
+		return -ENXIO;
+	}
+
+	/* Ensure that the domain is finalized */
+	pm_runtime_get_sync(qcom_iommu->dev);
+	ret = qcom_iommu_init_domain(domain, qcom_iommu, dev->iommu_fwspec);
+	pm_runtime_put_sync(qcom_iommu->dev);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * Sanity check the domain. We don't support domains across
+	 * different IOMMUs.
+	 */
+	if (qcom_domain->iommu != qcom_iommu) {
+		dev_err(dev, "cannot attach to IOMMU %s while already "
+			"attached to domain on IOMMU %s\n",
+			dev_name(qcom_domain->iommu->dev),
+			dev_name(qcom_iommu->dev));
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	unsigned i;
+
+	if (!qcom_domain->iommu)
+		return;
+
+	pm_runtime_get_sync(qcom_iommu->dev);
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		/* Disable the context bank: */
+		iommu_writel(ctx, ARM_SMMU_CB_SCTLR, 0);
+	}
+	pm_runtime_put_sync(qcom_iommu->dev);
+
+	qcom_domain->iommu = NULL;
+}
+
+static int qcom_iommu_map(struct iommu_domain *domain, unsigned long iova,
+			  phys_addr_t paddr, size_t size, int prot)
+{
+	int ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return -ENODEV;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->map(ops, iova, paddr, size, prot);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+	return ret;
+}
+
+static size_t qcom_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
+			       size_t size)
+{
+	size_t ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->unmap(ops, iova, size);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+	return ret;
+}
+
+static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
+					   dma_addr_t iova)
+{
+	phys_addr_t ret;
+	unsigned long flags;
+	struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
+	struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
+	ret = ops->iova_to_phys(ops, iova);
+	spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
+
+	return ret;
+}
+
+static bool qcom_iommu_capable(enum iommu_cap cap)
+{
+	switch (cap) {
+	case IOMMU_CAP_CACHE_COHERENCY:
+		/*
+		 * Return true here as the SMMU can always send out coherent
+		 * requests.
+		 */
+		return true;
+	case IOMMU_CAP_NOEXEC:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static int qcom_iommu_add_device(struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = __to_iommu(dev->iommu_fwspec);
+	struct iommu_group *group;
+	struct device_link *link;
+
+	if (!qcom_iommu)
+		return -ENODEV;
+
+	/*
+	 * Establish the link between iommu and master, so that the
+	 * iommu gets runtime enabled/disabled as per the master's
+	 * needs.
+	 */
+	link = device_link_add(dev, qcom_iommu->dev, DL_FLAG_PM_RUNTIME);
+	if (!link) {
+		dev_err(qcom_iommu->dev, "Unable to create device link between %s and %s\n",
+			dev_name(qcom_iommu->dev), dev_name(dev));
+		return -ENODEV;
+	}
+
+	group = iommu_group_get_for_dev(dev);
+	if (IS_ERR_OR_NULL(group))
+		return PTR_ERR_OR_ZERO(group);
+
+	iommu_group_put(group);
+	iommu_device_link(&qcom_iommu->iommu, dev);
+
+	return 0;
+}
+
+static void qcom_iommu_remove_device(struct device *dev)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(dev->iommu_fwspec);
+
+	if (!qcom_iommu)
+		return;
+
+	iommu_group_remove_device(dev);
+	iommu_device_unlink(&qcom_iommu->iommu, dev);
+	iommu_fwspec_free(dev);
+}
+
+static struct iommu_group *qcom_iommu_device_group(struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev->iommu_fwspec;
+	struct iommu_group *group = NULL;
+	unsigned i;
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+
+		if (group && ctx->group && group != ctx->group)
+			return ERR_PTR(-EINVAL);
+
+		group = ctx->group;
+	}
+
+	if (group)
+		return iommu_group_ref_get(group);
+
+	group = generic_device_group(dev);
+
+	for (i = 0; i < fwspec->num_ids; i++) {
+		struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
+		ctx->group = iommu_group_ref_get(group);
+	}
+
+	return group;
+}
+
+static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
+{
+	struct platform_device *iommu_pdev;
+
+	if (args->args_count != 1) {
+		dev_err(dev, "incorrect number of iommu params found for %s "
+			"(found %d, expected 1)\n",
+			args->np->full_name, args->args_count);
+		return -EINVAL;
+	}
+
+	if (!dev->iommu_fwspec->iommu_priv) {
+		iommu_pdev = of_find_device_by_node(args->np);
+		if (WARN_ON(!iommu_pdev))
+			return -EINVAL;
+
+		dev->iommu_fwspec->iommu_priv = platform_get_drvdata(iommu_pdev);
+	}
+
+	return iommu_fwspec_add_ids(dev, &args->args[0], 1);
+}
+
+static const struct iommu_ops qcom_iommu_ops = {
+	.capable	= qcom_iommu_capable,
+	.domain_alloc	= qcom_iommu_domain_alloc,
+	.domain_free	= qcom_iommu_domain_free,
+	.attach_dev	= qcom_iommu_attach_dev,
+	.detach_dev	= qcom_iommu_detach_dev,
+	.map		= qcom_iommu_map,
+	.unmap		= qcom_iommu_unmap,
+	.map_sg		= default_iommu_map_sg,
+	.iova_to_phys	= qcom_iommu_iova_to_phys,
+	.add_device	= qcom_iommu_add_device,
+	.remove_device	= qcom_iommu_remove_device,
+	.device_group	= qcom_iommu_device_group,
+	.of_xlate	= qcom_iommu_of_xlate,
+	.pgsize_bitmap	= SZ_4K | SZ_64K | SZ_1M | SZ_16M,
+};
+
+static int qcom_iommu_enable_clocks(struct qcom_iommu_dev *qcom_iommu)
+{
+	int ret;
+
+	ret = clk_prepare_enable(qcom_iommu->iface_clk);
+	if (ret) {
+		dev_err(qcom_iommu->dev, "Couldn't enable iface_clk\n");
+		return ret;
+	}
+
+	ret = clk_prepare_enable(qcom_iommu->bus_clk);
+	if (ret) {
+		dev_err(qcom_iommu->dev, "Couldn't enable bus_clk\n");
+		clk_disable_unprepare(qcom_iommu->iface_clk);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
+{
+	clk_disable_unprepare(qcom_iommu->bus_clk);
+	clk_disable_unprepare(qcom_iommu->iface_clk);
+}
+
+static int qcom_iommu_ctx_probe(struct platform_device *pdev)
+{
+	struct qcom_iommu_ctx *ctx;
+	struct device *dev = &pdev->dev;
+	struct qcom_iommu_dev *qcom_iommu = dev_get_drvdata(dev->parent);
+	struct resource *res;
+	int ret;
+	u32 reg;
+
+	ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	ctx->dev = dev;
+	platform_set_drvdata(pdev, ctx);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	ctx->base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(ctx->base))
+		return PTR_ERR(ctx->base);
+
+	ctx->irq = platform_get_irq(pdev, 0);
+	if (ctx->irq < 0) {
+		dev_err(dev, "failed to get irq\n");
+		return -ENODEV;
+	}
+
+	ret = devm_request_irq(dev, ctx->irq,
+			       qcom_iommu_fault,
+			       IRQF_SHARED,
+			       "qcom-iommu-fault",
+			       ctx);
+	if (ret) {
+		dev_err(dev, "failed to request IRQ %u\n", ctx->irq);
+		return ret;
+	}
+
+	/* read the "reg" property directly to get the relative address
+	 * of the context bank, and calculate the asid from that:
+	 */
+	if (of_property_read_u32_index(dev->of_node, "reg", 0, &reg)) {
+		dev_err(dev, "missing reg property\n");
+		return -ENODEV;
+	}
+
+	ctx->asid = reg / 0x1000;      /* context banks are 0x1000 apart */
+
+	dev_dbg(dev, "found asid %u\n", ctx->asid);
+
+	list_add_tail(&ctx->node, &qcom_iommu->context_list);
+
+	return 0;
+}
+
+static int qcom_iommu_ctx_remove(struct platform_device *pdev)
+{
+	struct qcom_iommu_ctx *ctx = platform_get_drvdata(pdev);
+
+	iommu_group_put(ctx->group);
+	platform_set_drvdata(pdev, NULL);
+
+	return 0;
+}
+
+static const struct of_device_id ctx_of_match[] = {
+	{ .compatible = "qcom,msm-iommu-v1-ns" },
+	{ .compatible = "qcom,msm-iommu-v1-sec" },
+	{ /* sentinel */ }
+};
+
+static struct platform_driver qcom_iommu_ctx_driver = {
+	.driver	= {
+		.name		= "qcom-iommu-ctx",
+		.of_match_table	= of_match_ptr(ctx_of_match),
+	},
+	.probe	= qcom_iommu_ctx_probe,
+	.remove = qcom_iommu_ctx_remove,
+};
+module_platform_driver(qcom_iommu_ctx_driver);
+
+static int qcom_iommu_device_probe(struct platform_device *pdev)
+{
+	struct qcom_iommu_dev *qcom_iommu;
+	struct device *dev = &pdev->dev;
+	struct resource *res;
+	int ret;
+
+	qcom_iommu = devm_kzalloc(dev, sizeof(*qcom_iommu), GFP_KERNEL);
+	if (!qcom_iommu)
+		return -ENOMEM;
+	qcom_iommu->dev = dev;
+
+	INIT_LIST_HEAD(&qcom_iommu->context_list);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (res)
+		qcom_iommu->local_base = devm_ioremap_resource(dev, res);
+
+	qcom_iommu->iface_clk = devm_clk_get(dev, "iface");
+	if (IS_ERR(qcom_iommu->iface_clk)) {
+		dev_err(dev, "failed to get iface clock\n");
+		return PTR_ERR(qcom_iommu->iface_clk);
+	}
+
+	qcom_iommu->bus_clk = devm_clk_get(dev, "bus");
+	if (IS_ERR(qcom_iommu->bus_clk)) {
+		dev_err(dev, "failed to get bus clock\n");
+		return PTR_ERR(qcom_iommu->bus_clk);
+	}
+
+	if (of_property_read_u32(dev->of_node, "qcom,iommu-secure-id",
+				 &qcom_iommu->sec_id)) {
+		dev_err(dev, "missing qcom,iommu-secure-id property\n");
+		return -ENODEV;
+	}
+
+	platform_set_drvdata(pdev, qcom_iommu);
+
+	/* register context bank devices, which are child nodes: */
+	ret = of_platform_populate(dev->of_node, ctx_of_match, NULL, dev);
+	if (ret) {
+		dev_err(dev, "Failed to populate iommu contexts\n");
+		return ret;
+	}
+
+	ret = iommu_device_sysfs_add(&qcom_iommu->iommu, dev, NULL,
+				     "smmu.%pa", &res->start);
+	if (ret) {
+		dev_err(dev, "Failed to register iommu in sysfs\n");
+		return ret;
+	}
+
+	iommu_device_set_ops(&qcom_iommu->iommu, &qcom_iommu_ops);
+	iommu_device_set_fwnode(&qcom_iommu->iommu, dev->fwnode);
+
+	ret = iommu_device_register(&qcom_iommu->iommu);
+	if (ret) {
+		dev_err(dev, "Failed to register iommu\n");
+		return ret;
+	}
+
+	pm_runtime_enable(dev);
+	bus_set_iommu(&platform_bus_type, &qcom_iommu_ops);
+
+	if (qcom_iommu->local_base) {
+		pm_runtime_get_sync(dev);
+		writel_relaxed(0xffffffff, qcom_iommu->local_base + SMMU_INTR_SEL_NS);
+		pm_runtime_put_sync(dev);
+	}
+
+	return 0;
+}
+
+static int qcom_iommu_device_remove(struct platform_device *pdev)
+{
+	pm_runtime_force_suspend(&pdev->dev);
+	platform_set_drvdata(pdev, NULL);
+
+	return 0;
+}
+
+#ifdef CONFIG_PM
+static int qcom_iommu_resume(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	return qcom_iommu_enable_clocks(qcom_iommu);
+}
+
+static int qcom_iommu_suspend(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
+
+	qcom_iommu_disable_clocks(qcom_iommu);
+
+	return 0;
+}
+#endif
+
+static const struct dev_pm_ops qcom_iommu_pm_ops = {
+	SET_RUNTIME_PM_OPS(qcom_iommu_suspend, qcom_iommu_resume, NULL)
+	SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+				pm_runtime_force_resume)
+};
+
+static const struct of_device_id qcom_iommu_of_match[] = {
+	{ .compatible = "qcom,msm-iommu-v1" },
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, qcom_iommu_of_match);
+
+static struct platform_driver qcom_iommu_driver = {
+	.driver	= {
+		.name		= "qcom-iommu",
+		.of_match_table	= of_match_ptr(qcom_iommu_of_match),
+		.pm		= &qcom_iommu_pm_ops,
+	},
+	.probe	= qcom_iommu_device_probe,
+	.remove	= qcom_iommu_device_remove,
+};
+module_platform_driver(qcom_iommu_driver);
+
+IOMMU_OF_DECLARE(qcom_iommu_dev, "qcom,msm-iommu-v1", NULL);
+
+MODULE_DESCRIPTION("IOMMU API for QCOM IOMMU v1 implementations");
+MODULE_LICENSE("GPL v2");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2017-08-03 10:48 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-03 10:47 [PATCH 0/4] iommu: add qcom_iommu for early "B" family devices Rob Clark
     [not found] ` <20170803104800.18624-1-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-08-03 10:47   ` [PATCH 1/4] Docs: dt: document qcom iommu bindings Rob Clark
2017-08-03 10:47     ` Rob Clark
2017-08-03 10:47   ` [PATCH 2/4] iommu: arm-smmu: split out register defines Rob Clark
2017-08-03 10:47     ` Rob Clark
2017-08-03 10:47     ` Rob Clark
2017-08-03 10:47   ` [PATCH 4/4] iommu: qcom: initialize secure page table Rob Clark
2017-08-03 10:47     ` Rob Clark
2017-08-03 10:47 ` [PATCH 3/4] iommu: add qcom_iommu Rob Clark
  -- strict thread matches above, loose matches on Subject: below --
2017-06-26 12:43 [PATCH 0/4] iommu: add qcom_iommu for early "B" family devices Rob Clark
     [not found] ` <20170626124352.21726-1-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-06-26 12:43   ` [PATCH 3/4] iommu: add qcom_iommu Rob Clark
     [not found] <Message-ID: <CAF6AEGsGCWCASL=L6Z8_0TGWV6b1ozBND3tZHLn=y5AAJ=1JEA@mail.gmail.com>
2017-06-13 12:17 ` Rob Clark
2017-06-16 13:29   ` Riku Voipio
2017-06-01 13:58 [PATCH 0/4] iommu: add qcom_iommu for early "B" family devices Rob Clark
2017-06-01 13:58 ` [PATCH 3/4] iommu: add qcom_iommu Rob Clark
2017-05-25 17:33 [PATCH 0/4] iommu: add qcom_iommu for early "B" family devices Rob Clark
     [not found] ` <20170525173340.26904-1-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-05-25 17:33   ` [PATCH 3/4] iommu: add qcom_iommu Rob Clark
     [not found]     ` <20170525173340.26904-4-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-05-26 12:56       ` Robin Murphy
     [not found]         ` <47a738b1-7da5-7043-c16c-4159c6211f7e-5wv7dgnIgG8@public.gmane.org>
2017-05-26 19:12           ` Rob Clark
2017-06-12 13:25           ` Rob Clark
2017-05-05 18:21 [PATCH] " Rob Clark
     [not found] ` <20170505182151.22931-1-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-05-09 14:23   ` [PATCH 3/4] " Rob Clark
     [not found]     ` <20170509142310.10535-1-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-05-11 15:08       ` Sricharan R
2017-05-11 16:50         ` Rob Clark
2017-05-12  3:52           ` Sricharan R
2017-05-04 13:34 [PATCH 0/4] iommu: add qcom_iommu for early "B" family devices (v3) Rob Clark
     [not found] ` <20170504133436.24288-1-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-05-04 13:34   ` [PATCH 3/4] iommu: add qcom_iommu Rob Clark
     [not found]     ` <20170504133436.24288-4-robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-05-04 14:31       ` Rob Herring
2017-05-05 12:31         ` Sricharan R
