* [PATCH v2 0/3] Apple M1 DART IOMMU driver
@ 2021-03-28  7:40 Sven Peter via iommu
  2021-03-28  7:40 ` [PATCH v2 1/3] iommu: io-pgtable: add DART pagetable format Sven Peter via iommu
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Sven Peter via iommu @ 2021-03-28  7:40 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel, Rob Herring
  Cc: Arnd Bergmann, devicetree, Marc Zyngier, Hector Martin,
	linux-kernel, iommu, Sven Peter, Mohamed Mediouni, Mark Kettenis,
	linux-arm-kernel, Stan Skowronek

Hi,

Here's v2 of my Apple M1 DART IOMMU driver series as a follow-up to the original
version [1].

Short summary: this series adds support for the IOMMU found in Apple's new M1
SoC, which is required to use DMA on most peripherals. So far this code has been
tested with dwc3 in host and device mode on an M1 Mac Mini on top of the latest
version of Hector's bringup series [2,3] together with my m1n1 bootloader
branch to bring up USB [4]. It will also apply (but not be very useful) on
top of iommu/next and v5.12-rc3.

Thanks everyone for the suggestions and discussions so far. I believe they
have already significantly improved the state of this driver and our
understanding of the DART iommu!

The part I'm most unsure about is the way I keep track of the multiple
iommu nodes attached to a device. I would especially appreciate feedback
there.


Changes for v2:
 - fixed devicetree binding linting issues pointed out by Rob Herring and
   reworked that file.
 - made DART-specific code in io-pgtable.c unconditional and removed flag from
   Kconfig as proposed by Robin Murphy.
 - allowed multiple DART nodes in the "iommus" property as proposed by
   Rob Herring and Robin Murphy. This resulted in significant changes
   to apple-dart-iommu.c.
 - the domain aperture is now forced to 32 bits if translation is enabled,
   following the original suggestion to limit the aperture by Mark Kettenis
   and the follow-up discussion and investigation with Mark Kettenis,
   Arnd Bergmann, Robin Murphy and Rob Herring. This change also simplified
   the code in io-pgtable.c and made some of the improvements suggested
   during review no longer applicable.
 - added support for bypassed and isolated domain modes.
 - reject IOMMU_MMIO and IOMMU_NOEXEC since it's unknown how to set these up
   for now or if the hardware even supports these flags.
 - renamed some registers to be less confusing (mainly s/DOMAIN/STREAM/ to
   prevent confusion with Linux's iommu domain concept).
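
The forced 32-bit aperture combined with the 16K page size can be pictured
with a small user-space sketch. The helper name and constants below are
hypothetical stand-ins (they only mirror SZ_16K and DMA_BIT_MASK(32) from
the series) and are not part of the driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for SZ_16K and DMA_BIT_MASK(32); names are hypothetical. */
#define SKETCH_DART_PAGE_SIZE    (UINT64_C(1) << 14)
#define SKETCH_DART_APERTURE_END ((UINT64_C(1) << 32) - 1)

/*
 * Hypothetical check: true if [iova, iova + size) is 16K-aligned and
 * lies completely inside the 32-bit aperture.
 */
static bool sketch_dart_iova_range_ok(uint64_t iova, uint64_t size)
{
	assert(size != 0);

	/* both start and length must be multiples of the page size */
	if ((iova | size) & (SKETCH_DART_PAGE_SIZE - 1))
		return false;
	if (iova > SKETCH_DART_APERTURE_END)
		return false;

	/* written this way so that iova + size cannot overflow */
	return size - 1 <= SKETCH_DART_APERTURE_END - iova;
}
```

Under these assumptions the last valid page starts at 0xffffc000; any
range reaching past 4 GiB is rejected.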

I have also fixed my email provider so this time the series should actually
be a single thread and not contain any HTML by accident anymore...

Best,


Sven


[1] https://lore.kernel.org/linux-iommu/20210320151903.60759-1-sven@svenpeter.dev/
[2] https://lore.kernel.org/linux-arch/20210304213902.83903-1-marcan@marcan.st/
[3] https://github.com/AsahiLinux/linux/tree/upstream-bringup-v4
[4] https://github.com/svenpeter42/m1n1/tree/usb-dwc3-serial-wip

Sven Peter (3):
  iommu: io-pgtable: add DART pagetable format
  dt-bindings: iommu: add DART iommu bindings
  iommu: dart: Add DART iommu driver

 .../devicetree/bindings/iommu/apple,dart.yaml |  81 ++
 MAINTAINERS                                   |   7 +
 drivers/iommu/Kconfig                         |  14 +
 drivers/iommu/Makefile                        |   1 +
 drivers/iommu/apple-dart-iommu.c              | 858 ++++++++++++++++++
 drivers/iommu/io-pgtable-arm.c                |  59 ++
 drivers/iommu/io-pgtable.c                    |   1 +
 include/linux/io-pgtable.h                    |   6 +
 8 files changed, 1027 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/apple,dart.yaml
 create mode 100644 drivers/iommu/apple-dart-iommu.c

-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


* [PATCH v2 1/3] iommu: io-pgtable: add DART pagetable format
  2021-03-28  7:40 [PATCH v2 0/3] Apple M1 DART IOMMU driver Sven Peter via iommu
@ 2021-03-28  7:40 ` Sven Peter via iommu
  2021-04-07 10:44   ` Will Deacon
  2021-03-28  7:40 ` [PATCH v2 2/3] dt-bindings: iommu: add DART iommu bindings Sven Peter via iommu
  2021-03-28  7:40 ` [PATCH v2 3/3] iommu: dart: Add DART iommu driver Sven Peter via iommu
  2 siblings, 1 reply; 12+ messages in thread
From: Sven Peter via iommu @ 2021-03-28  7:40 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel, Rob Herring
  Cc: Arnd Bergmann, devicetree, Marc Zyngier, Hector Martin,
	linux-kernel, iommu, Sven Peter, Mohamed Mediouni, Mark Kettenis,
	linux-arm-kernel, Stan Skowronek

Apple's DART iommu uses a pagetable format that shares some
similarities with the ones already implemented by io-pgtable-arm.c.
Add a new format variant to support the required differences
so that we don't have to duplicate the pagetable handling code.

Signed-off-by: Sven Peter <sven@svenpeter.dev>
---
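
Note on the PTE encoding: the DART format stores permissions inverted, so
a set bit removes an access right. A minimal user-space sketch of that
mapping follows. The two PTE bit positions are taken from this patch; the
SKETCH_IOMMU_* flags are stand-ins for the kernel's IOMMU_READ and
IOMMU_WRITE, and the function name is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the kernel's IOMMU_READ / IOMMU_WRITE prot flags. */
#define SKETCH_IOMMU_READ  (1u << 0)
#define SKETCH_IOMMU_WRITE (1u << 1)

/* Bit positions as defined in this patch. */
#define APPLE_DART_PTE_PROT_NO_WRITE (UINT64_C(1) << 7)
#define APPLE_DART_PTE_PROT_NO_READ  (UINT64_C(1) << 8)

/* Translate requested permissions into the "denied access" PTE bits. */
static uint64_t sketch_dart_prot_to_pte(unsigned int prot)
{
	uint64_t pte = 0;

	/* the sketch only models the two flags the format can express */
	assert(!(prot & ~(SKETCH_IOMMU_READ | SKETCH_IOMMU_WRITE)));

	if (!(prot & SKETCH_IOMMU_WRITE))
		pte |= APPLE_DART_PTE_PROT_NO_WRITE;
	if (!(prot & SKETCH_IOMMU_READ))
		pte |= APPLE_DART_PTE_PROT_NO_READ;
	return pte;
}
```

A read/write mapping therefore carries no protection bits at all, while a
mapping with neither flag sets both.
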
 drivers/iommu/io-pgtable-arm.c | 59 ++++++++++++++++++++++++++++++++++
 drivers/iommu/io-pgtable.c     |  1 +
 include/linux/io-pgtable.h     |  6 ++++
 3 files changed, 66 insertions(+)
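
Note on the table geometry: with 16K pages, start_level = 2 and 11 bits
per level, two table levels cover 14 + 11 + 11 = 36 address bits, which
matches the ias/oas limit checked in apple_dart_alloc_pgtable(). The
sketch below splits an IOVA accordingly. It assumes the indices are
extracted the same way as for the existing LPAE formats; the struct and
function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT     14 /* SZ_16K */
#define SKETCH_BITS_PER_LEVEL 11 /* data->bits_per_level in this patch */
#define SKETCH_IDX_MASK       ((UINT64_C(1) << SKETCH_BITS_PER_LEVEL) - 1)

struct sketch_iova_split {
	unsigned int l1;     /* index into the table the TTBR points to */
	unsigned int l2;     /* index into the second-level table */
	unsigned int offset; /* byte offset inside the 16K page */
};

static struct sketch_iova_split sketch_split_iova(uint64_t iova)
{
	struct sketch_iova_split s;

	/* two 11-bit levels plus a 14-bit page offset: 36 bits in total */
	assert(iova < (UINT64_C(1) << 36));

	s.offset = iova & ((UINT64_C(1) << SKETCH_PAGE_SHIFT) - 1);
	s.l2 = (iova >> SKETCH_PAGE_SHIFT) & SKETCH_IDX_MASK;
	s.l1 = (iova >> (SKETCH_PAGE_SHIFT + SKETCH_BITS_PER_LEVEL)) &
	       SKETCH_IDX_MASK;
	return s;
}
```

For example, under these assumptions the IOVA 0x12345678 splits into
first-level index 9, second-level index 0xd1 and page offset 0x1678.
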

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58e79b5..2f63443fd115 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -127,6 +127,9 @@
 #define ARM_MALI_LPAE_MEMATTR_IMP_DEF	0x88ULL
 #define ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC 0x8DULL
 
+#define APPLE_DART_PTE_PROT_NO_WRITE (1<<7)
+#define APPLE_DART_PTE_PROT_NO_READ (1<<8)
+
 /* IOPTE accessors */
 #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))
 
@@ -381,6 +384,15 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
 {
 	arm_lpae_iopte pte;
 
+	if (data->iop.fmt == ARM_APPLE_DART) {
+		pte = 0;
+		if (!(prot & IOMMU_WRITE))
+			pte |= APPLE_DART_PTE_PROT_NO_WRITE;
+		if (!(prot & IOMMU_READ))
+			pte |= APPLE_DART_PTE_PROT_NO_READ;
+		return pte;
+	}
+
 	if (data->iop.fmt == ARM_64_LPAE_S1 ||
 	    data->iop.fmt == ARM_32_LPAE_S1) {
 		pte = ARM_LPAE_PTE_nG;
@@ -1043,6 +1055,48 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
 	return NULL;
 }
 
+static struct io_pgtable *
+apple_dart_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
+{
+	struct arm_lpae_io_pgtable *data;
+
+	if (cfg->ias > 36)
+		return NULL;
+	if (cfg->oas > 36)
+		return NULL;
+
+	if (!cfg->coherent_walk)
+		return NULL;
+
+	cfg->pgsize_bitmap &= SZ_16K;
+	if (!cfg->pgsize_bitmap)
+		return NULL;
+
+	if (cfg->quirks)
+		return NULL;
+
+	data = arm_lpae_alloc_pgtable(cfg);
+	if (!data)
+		return NULL;
+
+	data->start_level = 2;
+	data->pgd_bits = 11;
+	data->bits_per_level = 11;
+
+	data->pgd = __arm_lpae_alloc_pages(ARM_LPAE_PGD_SIZE(data), GFP_KERNEL,
+					   cfg);
+	if (!data->pgd)
+		goto out_free_data;
+
+	cfg->apple_dart_cfg.ttbr = virt_to_phys(data->pgd);
+
+	return &data->iop;
+
+out_free_data:
+	kfree(data);
+	return NULL;
+}
+
 struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s1_init_fns = {
 	.alloc	= arm_64_lpae_alloc_pgtable_s1,
 	.free	= arm_lpae_free_pgtable,
@@ -1068,6 +1122,11 @@ struct io_pgtable_init_fns io_pgtable_arm_mali_lpae_init_fns = {
 	.free	= arm_lpae_free_pgtable,
 };
 
+struct io_pgtable_init_fns io_pgtable_apple_dart_init_fns = {
+	.alloc	= apple_dart_alloc_pgtable,
+	.free	= arm_lpae_free_pgtable,
+};
+
 #ifdef CONFIG_IOMMU_IO_PGTABLE_LPAE_SELFTEST
 
 static struct io_pgtable_cfg *cfg_cookie __initdata;
diff --git a/drivers/iommu/io-pgtable.c b/drivers/iommu/io-pgtable.c
index 6e9917ce980f..6ec75f3e9c3b 100644
--- a/drivers/iommu/io-pgtable.c
+++ b/drivers/iommu/io-pgtable.c
@@ -27,6 +27,7 @@ io_pgtable_init_table[IO_PGTABLE_NUM_FMTS] = {
 #ifdef CONFIG_AMD_IOMMU
 	[AMD_IOMMU_V1] = &io_pgtable_amd_iommu_v1_init_fns,
 #endif
+	[ARM_APPLE_DART] = &io_pgtable_apple_dart_init_fns,
 };
 
 struct io_pgtable_ops *alloc_io_pgtable_ops(enum io_pgtable_fmt fmt,
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index a4c9ca2c31f1..b06925eb20d3 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -16,6 +16,7 @@ enum io_pgtable_fmt {
 	ARM_V7S,
 	ARM_MALI_LPAE,
 	AMD_IOMMU_V1,
+	ARM_APPLE_DART,
 	IO_PGTABLE_NUM_FMTS,
 };
 
@@ -136,6 +137,10 @@ struct io_pgtable_cfg {
 			u64	transtab;
 			u64	memattr;
 		} arm_mali_lpae_cfg;
+
+		struct {
+			u64 ttbr;
+		} apple_dart_cfg;
 	};
 };
 
@@ -250,5 +255,6 @@ extern struct io_pgtable_init_fns io_pgtable_arm_64_lpae_s2_init_fns;
 extern struct io_pgtable_init_fns io_pgtable_arm_v7s_init_fns;
 extern struct io_pgtable_init_fns io_pgtable_arm_mali_lpae_init_fns;
 extern struct io_pgtable_init_fns io_pgtable_amd_iommu_v1_init_fns;
+extern struct io_pgtable_init_fns io_pgtable_apple_dart_init_fns;
 
 #endif /* __IO_PGTABLE_H */
-- 
2.25.1


* [PATCH v2 2/3] dt-bindings: iommu: add DART iommu bindings
  2021-03-28  7:40 [PATCH v2 0/3] Apple M1 DART IOMMU driver Sven Peter via iommu
  2021-03-28  7:40 ` [PATCH v2 1/3] iommu: io-pgtable: add DART pagetable format Sven Peter via iommu
@ 2021-03-28  7:40 ` Sven Peter via iommu
  2021-03-28  8:16   ` Arnd Bergmann
  2021-03-28  7:40 ` [PATCH v2 3/3] iommu: dart: Add DART iommu driver Sven Peter via iommu
  2 siblings, 1 reply; 12+ messages in thread
From: Sven Peter via iommu @ 2021-03-28  7:40 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel, Rob Herring
  Cc: Arnd Bergmann, devicetree, Marc Zyngier, Hector Martin,
	linux-kernel, iommu, Sven Peter, Mohamed Mediouni, Mark Kettenis,
	linux-arm-kernel, Stan Skowronek

DART (Device Address Resolution Table) is the iommu found on Apple
ARM SoCs such as the M1.

Signed-off-by: Sven Peter <sven@svenpeter.dev>
---
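
Note on multiple "iommus" entries: each entry pairs one DART instance with
one stream id, and a master may list several such pairs. The user-space
sketch below models collecting those pairs in a growable array. All names
are hypothetical, the DART is reduced to an integer id, and the limit of
16 streams comes from the binding description; this is not the driver's
of_xlate code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_DART_MAX_STREAMS 16 /* streams per DART, per the binding */

struct sketch_stream {
	int dart_id;  /* stands in for the phandle of the DART node */
	uint32_t sid; /* the single #iommu-cells argument */
};

struct sketch_master_cfg {
	int num_streams;
	struct sketch_stream streams[];
};

/*
 * Append one (dart, sid) pair. Returns the possibly moved cfg, or NULL
 * on error, in which case the old cfg is left untouched.
 */
static struct sketch_master_cfg *
sketch_add_stream(struct sketch_master_cfg *cfg, int dart_id, uint32_t sid)
{
	int n = cfg ? cfg->num_streams : 0;
	struct sketch_master_cfg *grown;

	assert(dart_id >= 0);
	if (sid >= SKETCH_DART_MAX_STREAMS)
		return NULL;

	grown = realloc(cfg, sizeof(*grown) +
			     (size_t)(n + 1) * sizeof(grown->streams[0]));
	if (!grown)
		return NULL;

	grown->streams[n].dart_id = dart_id;
	grown->streams[n].sid = sid;
	grown->num_streams = n + 1;
	return grown;
}
```

Adding (0, 0) and then (1, 1) corresponds to the second example in the
binding, where master2 references two DART nodes.
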
 .../devicetree/bindings/iommu/apple,dart.yaml | 81 +++++++++++++++++++
 MAINTAINERS                                   |  6 ++
 2 files changed, 87 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/apple,dart.yaml

diff --git a/Documentation/devicetree/bindings/iommu/apple,dart.yaml b/Documentation/devicetree/bindings/iommu/apple,dart.yaml
new file mode 100644
index 000000000000..c0b43d90c157
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/apple,dart.yaml
@@ -0,0 +1,81 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/iommu/apple,dart.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Apple DART IOMMU
+
+maintainers:
+  - Sven Peter <sven@svenpeter.dev>
+
+description: |+
+  Apple SoCs may contain an implementation of their Device Address
+  Resolution Table which provides a mandatory layer of address
+  translations for various masters.
+
+  Each DART instance is capable of handling up to 16 different streams
+  with individual pagetables and page-level read/write protection flags.
+
+  This DART IOMMU also raises interrupts in response to various
+  fault conditions.
+
+properties:
+  compatible:
+    const: apple,t8103-dart
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  clocks:
+    description:
+      Reference to the gate clock phandle if required for this IOMMU.
+      Optional since not all IOMMUs are attached to a clock gate.
+
+  '#iommu-cells':
+    const: 1
+    description:
+      Has to be one. The single cell describes the stream id emitted by
+      a master to the IOMMU.
+
+required:
+  - compatible
+  - reg
+  - '#iommu-cells'
+  - interrupts
+
+additionalProperties: false
+
+examples:
+  - |+
+    dart1: dart1@82f80000 {
+      compatible = "apple,t8103-dart";
+      reg = <0x82f80000 0x4000>;
+      interrupts = <1 781 4>;
+      #iommu-cells = <1>;
+    };
+
+    master1 {
+      iommus = <&{/dart1} 0>;
+    };
+
+  - |+
+    dart2a: dart2a@82f00000 {
+      compatible = "apple,t8103-dart";
+      reg = <0x82f00000 0x4000>;
+      interrupts = <1 781 4>;
+      #iommu-cells = <1>;
+    };
+    dart2b: dart2b@82f80000 {
+      compatible = "apple,t8103-dart";
+      reg = <0x82f80000 0x4000>;
+      interrupts = <1 781 4>;
+      #iommu-cells = <1>;
+    };
+
+    master2 {
+      iommus = <&{/dart2a} 0>, <&{/dart2b} 1>;
+    };
diff --git a/MAINTAINERS b/MAINTAINERS
index 9ac46317840b..f5397328fa1f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1236,6 +1236,12 @@ L:	linux-input@vger.kernel.org
 S:	Odd fixes
 F:	drivers/input/mouse/bcm5974.c
 
+APPLE DART IOMMU DRIVER
+M:	Sven Peter <sven@svenpeter.dev>
+L:	iommu@lists.linux-foundation.org
+S:	Maintained
+F:	Documentation/devicetree/bindings/iommu/apple,t8103-dart.yaml
+
 APPLE SMC DRIVER
 M:	Henrik Rydberg <rydberg@bitmath.org>
 L:	linux-hwmon@vger.kernel.org
-- 
2.25.1


* [PATCH v2 3/3] iommu: dart: Add DART iommu driver
  2021-03-28  7:40 [PATCH v2 0/3] Apple M1 DART IOMMU driver Sven Peter via iommu
  2021-03-28  7:40 ` [PATCH v2 1/3] iommu: io-pgtable: add DART pagetable format Sven Peter via iommu
  2021-03-28  7:40 ` [PATCH v2 2/3] dt-bindings: iommu: add DART iommu bindings Sven Peter via iommu
@ 2021-03-28  7:40 ` Sven Peter via iommu
  2021-04-07 10:42   ` Will Deacon
  2 siblings, 1 reply; 12+ messages in thread
From: Sven Peter via iommu @ 2021-03-28  7:40 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Joerg Roedel, Rob Herring
  Cc: Arnd Bergmann, devicetree, Marc Zyngier, Hector Martin,
	linux-kernel, iommu, Sven Peter, Mohamed Mediouni, Mark Kettenis,
	linux-arm-kernel, Stan Skowronek

Apple's new SoCs use iommus for almost all peripherals. These Device
Address Resolution Tables must be set up before these peripherals can
act as DMA masters.

Signed-off-by: Sven Peter <sven@svenpeter.dev>
---
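
Note on the register layout: the per-stream register offsets and the TTBR
encoding used by the driver can be checked in isolation. The macros below
are copied from this patch (with BIT() expanded); the helper functions are
hypothetical, and the way the error register is decoded is an assumption
derived from the names of the shift and mask definitions:

```c
#include <assert.h>
#include <stdint.h>

#define DART_TCR(sid)           (0x100 + 4 * (sid))
#define DART_TTBR(sid, idx)     (0x200 + 16 * (sid) + 4 * (idx))
#define DART_TTBR_VALID         (UINT32_C(1) << 31)
#define DART_TTBR_SHIFT         12
#define DART_ERROR_STREAM_SHIFT 24
#define DART_ERROR_STREAM_MASK  0xf
#define DART_ERROR_CODE_MASK    0xffffff

/* Value written to a TTBR register for a table at physical address paddr. */
static uint32_t sketch_dart_ttbr_value(uint64_t paddr)
{
	/* the table is 16K-aligned and must fit into 36 address bits */
	assert((paddr & ((UINT64_C(1) << 14) - 1)) == 0);
	assert(paddr < (UINT64_C(1) << 36));

	return DART_TTBR_VALID | (uint32_t)(paddr >> DART_TTBR_SHIFT);
}

/* Assumed decoding: stream id in bits 27:24 of the error register. */
static unsigned int sketch_dart_error_stream(uint32_t error)
{
	return (error >> DART_ERROR_STREAM_SHIFT) & DART_ERROR_STREAM_MASK;
}

/* Assumed decoding: fault code in the low 24 bits. */
static uint32_t sketch_dart_error_code(uint32_t error)
{
	return error & DART_ERROR_CODE_MASK;
}
```

With 16 streams and 4 TTBRs per stream the TTBR block spans 0x200 to
0x2fc, and the TCR block spans 0x100 to 0x13c.
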
 MAINTAINERS                      |   1 +
 drivers/iommu/Kconfig            |  14 +
 drivers/iommu/Makefile           |   1 +
 drivers/iommu/apple-dart-iommu.c | 858 +++++++++++++++++++++++++++++++
 4 files changed, 874 insertions(+)
 create mode 100644 drivers/iommu/apple-dart-iommu.c
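
Note on stream reference counting: a stream is programmed when the first
device attaches and disabled again when the last device detaches. The
user-space sketch below models only that counting. It simplifies the
per-domain stream list of the driver into a per-DART counter array, leaves
out all locking and hardware access, and uses hypothetical names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_DART_MAX_STREAMS 16

struct sketch_dart {
	uint32_t used_sids; /* bitmap of streams with a device attached */
	unsigned int num_devices[SKETCH_DART_MAX_STREAMS];
};

/* Returns true if this was the first attach, i.e. hardware setup is due. */
static bool sketch_attach_stream(struct sketch_dart *dart, uint32_t sid)
{
	assert(sid < SKETCH_DART_MAX_STREAMS);

	if (dart->num_devices[sid]++ > 0)
		return false;

	dart->used_sids |= UINT32_C(1) << sid;
	return true;
}

/* Returns true if the last device left, i.e. the stream gets disabled. */
static bool sketch_detach_stream(struct sketch_dart *dart, uint32_t sid)
{
	assert(sid < SKETCH_DART_MAX_STREAMS);
	assert(dart->num_devices[sid] > 0);

	if (--dart->num_devices[sid] > 0)
		return false;

	dart->used_sids &= ~(UINT32_C(1) << sid);
	return true;
}
```

Two devices sharing stream 3 trigger one setup and one teardown; the bit
in used_sids stays set until the second detach.
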

diff --git a/MAINTAINERS b/MAINTAINERS
index f5397328fa1f..70747b8ac0ee 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1241,6 +1241,7 @@ M:	Sven Peter <sven@svenpeter.dev>
 L:	iommu@lists.linux-foundation.org
 S:	Maintained
 F:	Documentation/devicetree/bindings/iommu/apple,t8103-dart.yaml
+F:	drivers/iommu/apple-dart-iommu.c
 
 APPLE SMC DRIVER
 M:	Henrik Rydberg <rydberg@bitmath.org>
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 192ef8f61310..a1b239147dbc 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -249,6 +249,20 @@ config SPAPR_TCE_IOMMU
 	  Enables bits of IOMMU API required by VFIO. The iommu_ops
 	  is not implemented as it is not necessary for VFIO.
 
+config IOMMU_APPLE_DART
+	tristate "Apple DART IOMMU Support"
+	depends on ARM64 || (COMPILE_TEST && !GENERIC_ATOMIC64)
+	select IOMMU_API
+	select IOMMU_IO_PGTABLE
+	select IOMMU_IO_PGTABLE_LPAE
+	help
+	  Support for Apple DART (Device Address Resolution Table) IOMMUs
+	  found in Apple ARM SoCs like the M1.
+	  This IOMMU is required for most peripherals using DMA to access
+	  the main memory.
+
+	  Say Y here if you are using an Apple SoC with a DART IOMMU.
+
 # ARM IOMMU support
 config ARM_SMMU
 	tristate "ARM Ltd. System MMU (SMMU) Support"
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 61bd30cd8369..5f21f0dfec6a 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -28,3 +28,4 @@ obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
 obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
 obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
 obj-$(CONFIG_IOMMU_SVA_LIB) += iommu-sva-lib.o
+obj-$(CONFIG_IOMMU_APPLE_DART) += apple-dart-iommu.o
diff --git a/drivers/iommu/apple-dart-iommu.c b/drivers/iommu/apple-dart-iommu.c
new file mode 100644
index 000000000000..05fb8ca44843
--- /dev/null
+++ b/drivers/iommu/apple-dart-iommu.c
@@ -0,0 +1,858 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Apple DART (Device Address Resolution Table) IOMMU driver
+ *
+ * Copyright (C) 2021 The Asahi Linux Contributors
+ *
+ * Based on arm/arm-smmu/arm-smmu.c and arm/arm-smmu-v3/arm-smmu-v3.c
+ *  Copyright (C) 2013 ARM Limited
+ *  Copyright (C) 2015 ARM Limited
+ * and on exynos-iommu.c
+ *  Copyright (c) 2011,2016 Samsung Electronics Co., Ltd.
+ */
+
+#include <linux/clk.h>
+#include <linux/dma-iommu.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/io-pgtable.h>
+#include <linux/iopoll.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_iommu.h>
+#include <linux/of_platform.h>
+#include <linux/platform_device.h>
+#include <linux/ratelimit.h>
+
+#define DART_MAX_STREAMS 16
+#define DART_MAX_TTBR 4
+
+#define DART_STREAM_ALL 0xffff
+
+#define DART_CONFIG 0x60
+#define DART_CONFIG_LOCK BIT(15)
+
+#define DART_ERROR 0x40
+#define DART_ERROR_STREAM_SHIFT 24
+#define DART_ERROR_STREAM_MASK 0xf
+#define DART_ERROR_CODE_MASK 0xffffff
+#define DART_ERROR_FLAG BIT(31)
+#define DART_ERROR_READ_FAULT BIT(4)
+#define DART_ERROR_WRITE_FAULT BIT(3)
+#define DART_ERROR_NO_PTE BIT(2)
+#define DART_ERROR_NO_PMD BIT(1)
+#define DART_ERROR_NO_TTBR BIT(0)
+
+#define DART_STREAM_SELECT 0x34
+
+#define DART_STREAM_COMMAND 0x20
+#define DART_STREAM_COMMAND_BUSY BIT(2)
+#define DART_STREAM_COMMAND_INVALIDATE BIT(20)
+
+#define DART_STREAM_COMMAND_BUSY_TIMEOUT 100
+
+#define DART_STREAM_REMAP 0x80
+
+#define DART_ERROR_ADDR_HI 0x54
+#define DART_ERROR_ADDR_LO 0x50
+
+#define DART_TCR(sid) (0x100 + 4 * (sid))
+#define DART_TCR_TRANSLATE_ENABLE BIT(7)
+#define DART_TCR_BYPASS0_ENABLE BIT(8)
+#define DART_TCR_BYPASS1_ENABLE BIT(12)
+
+#define DART_TTBR(sid, idx) (0x200 + 16 * (sid) + 4 * (idx))
+#define DART_TTBR_VALID BIT(31)
+#define DART_TTBR_SHIFT 12
+
+/*
+ * Private structure associated with each DART device.
+ *
+ * @dev: device struct
+ * @regs: mapped MMIO region
+ * @irq: interrupt number, can be shared with other DARTs
+ * @clks: clocks associated with this DART
+ * @num_clks: number of @clks
+ * @lock: lock for @used_sids and hardware operations involving this dart
+ * @used_sids: bitmap of streams attached to a domain
+ * @iommu: iommu core device
+ */
+struct apple_dart {
+	struct device *dev;
+
+	void __iomem *regs;
+
+	int irq;
+	struct clk_bulk_data *clks;
+	int num_clks;
+
+	spinlock_t lock;
+
+	u32 used_sids;
+
+	struct iommu_device iommu;
+};
+
+/*
+ * This structure is used to identify a single stream attached to a domain.
+ * It's used as a list inside that domain to be able to attach multiple
+ * streams to a single domain. Since multiple devices can use a single stream
+ * it additionally keeps track of how many devices are represented by this
+ * stream. Once that number reaches zero it is detached from the IOMMU domain
+ * and all translations from this stream are disabled.
+ *
+ * @dart: DART instance to which this stream belongs
+ * @sid: stream id within the DART instance
+ * @num_devices: count of devices attached to this stream
+ * @stream_head: list head for the next stream
+ */
+struct apple_dart_stream {
+	struct apple_dart *dart;
+	u32 sid;
+
+	u32 num_devices;
+
+	struct list_head stream_head;
+};
+
+/*
+ * This structure is attached to each iommu domain handled by a DART.
+ * A single domain is used to represent a single virtual address space.
+ * It is always allocated together with a page table.
+ *
+ * Streams are the smallest units the DART hardware can differentiate.
+ * These are pointed to the page table of a domain whenever a device is
+ * attached to it. A single stream can only be assigned to a single domain.
+ *
+ * Devices are assigned to at least a single and sometimes multiple individual
+ * streams (using the iommus property in the device tree). Multiple devices
+ * can theoretically be represented by the same stream, though this is usually
+ * not the case.
+ *
+ * We only keep track of streams here and just count how many devices are
+ * represented by each stream. When the last device is removed the whole stream
+ * is removed from the domain.
+ *
+ * @dart: pointer to the DART instance
+ * @pgtbl_ops: pagetable ops allocated by io-pgtable
+ * @type: domain type IOMMU_DOMAIN_{IDENTITY,DMA,UNMANAGED,BLOCKED}
+ * @streams: list of streams attached to this domain
+ * @lock: spinlock for operations involving the list of streams
+ * @domain: core iommu domain pointer
+ */
+struct apple_dart_domain {
+	struct apple_dart *dart;
+	struct io_pgtable_ops *pgtbl_ops;
+
+	unsigned int type;
+
+	struct list_head streams;
+
+	spinlock_t lock;
+
+	struct iommu_domain domain;
+};
+
+/*
+ * This structure is attached to devices with dev_iommu_priv_set() on of_xlate
+ * and contains a list of streams bound to this device as defined in the
+ * device tree. Multiple DART instances can be attached to a single device
+ * and each stream is identified by its stream id.
+ * It's usually referenced by a pointer called *cfg.
+ *
+ * A dynamic array instead of a linked list is used here since in almost
+ * all cases a device will just be attached to a single stream and streams
+ * are never removed after they have been added.
+ *
+ * @num_streams: number of streams attached
+ * @streams: array of structs to identify attached streams and the device link
+ *           to the iommu
+ */
+struct apple_dart_master_cfg {
+	int num_streams;
+	struct {
+		struct apple_dart *dart;
+		u32 sid;
+
+		struct device_link *link;
+	} streams[];
+};
+
+static struct platform_driver apple_dart_driver;
+static const struct iommu_ops apple_dart_iommu_ops;
+
+static struct apple_dart_domain *to_dart_domain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct apple_dart_domain, domain);
+}
+
+static void apple_dart_hw_enable_translation(struct apple_dart *dart, u16 sid)
+{
+	writel(DART_TCR_TRANSLATE_ENABLE, dart->regs + DART_TCR(sid));
+}
+
+static void apple_dart_hw_enable_isolation(struct apple_dart *dart, u16 sid)
+{
+	writel(0, dart->regs + DART_TCR(sid));
+}
+
+static void apple_dart_hw_enable_bypass(struct apple_dart *dart, u16 sid)
+{
+	writel(DART_TCR_BYPASS0_ENABLE | DART_TCR_BYPASS1_ENABLE,
+	       dart->regs + DART_TCR(sid));
+}
+
+static void apple_dart_hw_set_ttbr(struct apple_dart *dart, u16 sid, u16 idx,
+				   phys_addr_t paddr)
+{
+	writel(DART_TTBR_VALID | (paddr >> DART_TTBR_SHIFT),
+	       dart->regs + DART_TTBR(sid, idx));
+}
+
+static void apple_dart_hw_clear_ttbr(struct apple_dart *dart, u16 sid, u16 idx)
+{
+	writel(0, dart->regs + DART_TTBR(sid, idx));
+}
+
+static void apple_dart_hw_clear_all_ttbrs(struct apple_dart *dart, u16 sid)
+{
+	int i;
+
+	for (i = 0; i < DART_MAX_TTBR; ++i)
+		apple_dart_hw_clear_ttbr(dart, sid, i);
+}
+
+static int apple_dart_hw_stream_command(struct apple_dart *dart, u16 sid_bitmap,
+					u32 command)
+{
+	unsigned long flags;
+	int ret;
+	u32 command_reg;
+
+	spin_lock_irqsave(&dart->lock, flags);
+
+	writel(sid_bitmap, dart->regs + DART_STREAM_SELECT);
+	writel(command, dart->regs + DART_STREAM_COMMAND);
+
+	ret = readl_poll_timeout(dart->regs + DART_STREAM_COMMAND, command_reg,
+				 !(command_reg & DART_STREAM_COMMAND_BUSY), 1,
+				 DART_STREAM_COMMAND_BUSY_TIMEOUT);
+
+	spin_unlock_irqrestore(&dart->lock, flags);
+
+	if (ret) {
+		dev_err(dart->dev,
+			"busy bit did not clear after command %x for streams %x\n",
+			command, sid_bitmap);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int apple_dart_hw_invalidate_tlb_global(struct apple_dart *dart)
+{
+	return apple_dart_hw_stream_command(dart, DART_STREAM_ALL,
+					    DART_STREAM_COMMAND_INVALIDATE);
+}
+
+static int apple_dart_hw_invalidate_tlb_stream(struct apple_dart *dart, u16 sid)
+{
+	return apple_dart_hw_stream_command(dart, 1 << sid,
+					    DART_STREAM_COMMAND_INVALIDATE);
+}
+
+static int apple_dart_hw_reset(struct apple_dart *dart)
+{
+	int sid;
+	u32 config;
+
+	config = readl(dart->regs + DART_CONFIG);
+	if (config & DART_CONFIG_LOCK) {
+		dev_err(dart->dev, "DART is locked down until reboot: %08x\n",
+			config);
+		return -EINVAL;
+	}
+
+	for (sid = 0; sid < DART_MAX_STREAMS; ++sid) {
+		apple_dart_hw_enable_isolation(dart, sid);
+		apple_dart_hw_clear_all_ttbrs(dart, sid);
+	}
+
+	/* restore stream identity map */
+	writel(0x03020100, dart->regs + DART_STREAM_REMAP);
+	writel(0x07060504, dart->regs + DART_STREAM_REMAP + 4);
+	writel(0x0b0a0908, dart->regs + DART_STREAM_REMAP + 8);
+	writel(0x0f0e0d0c, dart->regs + DART_STREAM_REMAP + 12);
+
+	/* clear any pending errors before the interrupt is unmasked */
+	writel(readl(dart->regs + DART_ERROR), dart->regs + DART_ERROR);
+
+	return apple_dart_hw_invalidate_tlb_global(dart);
+}
+
+static void apple_dart_domain_flush_tlb(struct apple_dart_domain *domain)
+{
+	unsigned long flags;
+	struct apple_dart_stream *stream;
+	struct apple_dart *dart = domain->dart;
+
+	if (!dart)
+		return;
+
+	spin_lock_irqsave(&domain->lock, flags);
+	list_for_each_entry(stream, &domain->streams, stream_head) {
+		apple_dart_hw_invalidate_tlb_stream(stream->dart, stream->sid);
+	}
+	spin_unlock_irqrestore(&domain->lock, flags);
+}
+
+static void apple_dart_flush_iotlb_all(struct iommu_domain *domain)
+{
+	struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+
+	apple_dart_domain_flush_tlb(dart_domain);
+}
+
+static void apple_dart_iotlb_sync(struct iommu_domain *domain,
+				  struct iommu_iotlb_gather *gather)
+{
+	struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+
+	apple_dart_domain_flush_tlb(dart_domain);
+}
+
+static void apple_dart_iotlb_sync_map(struct iommu_domain *domain,
+				      unsigned long iova, size_t size)
+{
+	struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+
+	apple_dart_domain_flush_tlb(dart_domain);
+}
+
+static void apple_dart_tlb_flush_all(void *cookie)
+{
+	struct apple_dart_domain *domain = cookie;
+
+	apple_dart_domain_flush_tlb(domain);
+}
+
+static void apple_dart_tlb_flush_walk(unsigned long iova, size_t size,
+				      size_t granule, void *cookie)
+{
+	struct apple_dart_domain *domain = cookie;
+
+	apple_dart_domain_flush_tlb(domain);
+}
+
+static const struct iommu_flush_ops apple_dart_tlb_ops = {
+	.tlb_flush_all = apple_dart_tlb_flush_all,
+	.tlb_flush_walk = apple_dart_tlb_flush_walk,
+	.tlb_add_page = NULL,
+};
+
+static phys_addr_t apple_dart_iova_to_phys(struct iommu_domain *domain,
+					   dma_addr_t iova)
+{
+	struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+	struct io_pgtable_ops *ops = dart_domain->pgtbl_ops;
+
+	if (domain->type == IOMMU_DOMAIN_IDENTITY)
+		return iova;
+	if (!ops)
+		return 0;
+
+	return ops->iova_to_phys(ops, iova);
+}
+
+static int apple_dart_map(struct iommu_domain *domain, unsigned long iova,
+			  phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
+{
+	struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+	struct io_pgtable_ops *ops = dart_domain->pgtbl_ops;
+
+	if (!ops)
+		return -ENODEV;
+	if (prot & IOMMU_MMIO)
+		return -EINVAL;
+	if (prot & IOMMU_NOEXEC)
+		return -EINVAL;
+
+	return ops->map(ops, iova, paddr, size, prot, gfp);
+}
+
+static size_t apple_dart_unmap(struct iommu_domain *domain, unsigned long iova,
+			       size_t size, struct iommu_iotlb_gather *gather)
+{
+	struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+	struct io_pgtable_ops *ops = dart_domain->pgtbl_ops;
+
+	if (!ops)
+		return 0;
+
+	return ops->unmap(ops, iova, size, gather);
+}
+
+/* must be called with dart_domain->lock held */
+static int apple_dart_finalize_domain(struct iommu_domain *domain)
+{
+	struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+	struct apple_dart *dart = dart_domain->dart;
+	struct io_pgtable_cfg pgtbl_cfg;
+
+	if (dart_domain->pgtbl_ops)
+		return 0;
+	if (dart_domain->type != IOMMU_DOMAIN_DMA &&
+	    dart_domain->type != IOMMU_DOMAIN_UNMANAGED)
+		return 0;
+
+	pgtbl_cfg = (struct io_pgtable_cfg){
+		.pgsize_bitmap = SZ_16K,
+		.ias = 32,
+		.oas = 36,
+		.coherent_walk = 1,
+		.tlb = &apple_dart_tlb_ops,
+		.iommu_dev = dart->dev,
+	};
+
+	dart_domain->pgtbl_ops =
+		alloc_io_pgtable_ops(ARM_APPLE_DART, &pgtbl_cfg, domain);
+	if (!dart_domain->pgtbl_ops)
+		return -ENOMEM;
+
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	domain->geometry.aperture_start = 0;
+	domain->geometry.aperture_end = DMA_BIT_MASK(32);
+	domain->geometry.force_aperture = true;
+
+	return 0;
+}
+
+/* must be called with domain->lock held */
+static int apple_dart_attach_stream(struct apple_dart_domain *domain,
+				    struct apple_dart *dart, u32 sid)
+{
+	unsigned long flags;
+	struct apple_dart_stream *stream;
+	struct io_pgtable_cfg *pgtbl_cfg;
+	int ret;
+
+	list_for_each_entry(stream, &domain->streams, stream_head) {
+		if (stream->dart == dart && stream->sid == sid) {
+			stream->num_devices++;
+			return 0;
+		}
+	}
+
+	spin_lock_irqsave(&dart->lock, flags);
+
+	if (WARN_ON(dart->used_sids & BIT(sid))) {
+		ret = -EINVAL;
+		goto error;
+	}
+
+	stream = kzalloc(sizeof(*stream), GFP_ATOMIC);
+	if (!stream) {
+		ret = -ENOMEM;
+		goto error;
+	}
+
+	stream->dart = dart;
+	stream->sid = sid;
+	stream->num_devices = 1;
+	list_add(&stream->stream_head, &domain->streams);
+
+	dart->used_sids |= BIT(sid);
+	spin_unlock_irqrestore(&dart->lock, flags);
+
+	apple_dart_hw_clear_all_ttbrs(stream->dart, stream->sid);
+
+	switch (domain->type) {
+	case IOMMU_DOMAIN_IDENTITY:
+		apple_dart_hw_enable_bypass(stream->dart, stream->sid);
+		break;
+	case IOMMU_DOMAIN_BLOCKED:
+		apple_dart_hw_enable_isolation(stream->dart, stream->sid);
+		break;
+	case IOMMU_DOMAIN_UNMANAGED:
+	case IOMMU_DOMAIN_DMA:
+		pgtbl_cfg = &io_pgtable_ops_to_pgtable(domain->pgtbl_ops)->cfg;
+		apple_dart_hw_set_ttbr(stream->dart, stream->sid, 0,
+				       pgtbl_cfg->apple_dart_cfg.ttbr);
+
+		apple_dart_hw_enable_translation(stream->dart, stream->sid);
+		apple_dart_hw_invalidate_tlb_stream(stream->dart, stream->sid);
+		break;
+	}
+
+	return 0;
+
+error:
+	spin_unlock_irqrestore(&dart->lock, flags);
+	return ret;
+}
+
+static void apple_dart_disable_stream(struct apple_dart *dart, u32 sid)
+{
+	unsigned long flags;
+
+	apple_dart_hw_enable_isolation(dart, sid);
+	apple_dart_hw_clear_all_ttbrs(dart, sid);
+	apple_dart_hw_invalidate_tlb_stream(dart, sid);
+
+	spin_lock_irqsave(&dart->lock, flags);
+	dart->used_sids &= ~BIT(sid);
+	spin_unlock_irqrestore(&dart->lock, flags);
+}
+
+/* must be called with domain->lock held */
+static void apple_dart_detach_stream(struct apple_dart_domain *domain,
+				     struct apple_dart *dart, u32 sid)
+{
+	struct apple_dart_stream *stream;
+
+	list_for_each_entry(stream, &domain->streams, stream_head) {
+		if (stream->dart == dart && stream->sid == sid) {
+			stream->num_devices--;
+
+			if (stream->num_devices == 0) {
+				apple_dart_disable_stream(dart, sid);
+				list_del(&stream->stream_head);
+				kfree(stream);
+			}
+			return;
+		}
+	}
+}
+
+static int apple_dart_attach_dev(struct iommu_domain *domain,
+				 struct device *dev)
+{
+	int ret;
+	int i, j;
+	unsigned long flags;
+	struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
+	struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+	struct apple_dart *dart = cfg->streams[0].dart;
+
+	spin_lock_irqsave(&dart_domain->lock, flags);
+
+	if (!dart_domain->dart)
+		dart_domain->dart = dart;
+
+	ret = apple_dart_finalize_domain(domain);
+	if (ret)
+		goto out;
+
+	for (i = 0; i < cfg->num_streams; ++i) {
+		ret = apple_dart_attach_stream(
+			dart_domain, cfg->streams[i].dart, cfg->streams[i].sid);
+		if (ret) {
+			/* try to undo what we did before returning */
+			for (j = 0; j < i; ++j)
+				apple_dart_detach_stream(dart_domain,
+							 cfg->streams[j].dart,
+							 cfg->streams[j].sid);
+
+			goto out;
+		}
+	}
+
+	ret = 0;
+
+out:
+	spin_unlock_irqrestore(&dart_domain->lock, flags);
+	return ret;
+}
+
+static void apple_dart_detach_dev(struct iommu_domain *domain,
+				  struct device *dev)
+{
+	int i;
+	unsigned long flags;
+	struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
+	struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+
+	spin_lock_irqsave(&dart_domain->lock, flags);
+
+	for (i = 0; i < cfg->num_streams; ++i)
+		apple_dart_detach_stream(dart_domain, cfg->streams[i].dart,
+					 cfg->streams[i].sid);
+
+	spin_unlock_irqrestore(&dart_domain->lock, flags);
+}
+
+static struct iommu_device *apple_dart_probe_device(struct device *dev)
+{
+	struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
+	int i;
+
+	if (!cfg)
+		return ERR_PTR(-ENODEV);
+
+	for (i = 0; i < cfg->num_streams; ++i) {
+		cfg->streams[i].link =
+			device_link_add(dev, cfg->streams[i].dart->dev,
+					DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+	}
+
+	return &cfg->streams[0].dart->iommu;
+}
+
+static void apple_dart_release_device(struct device *dev)
+{
+	struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
+	int i;
+
+	if (!cfg)
+		return;
+
+	for (i = 0; i < cfg->num_streams; ++i)
+		device_link_del(cfg->streams[i].link);
+
+	dev_iommu_priv_set(dev, NULL);
+	kfree(cfg);
+}
+
+static struct iommu_domain *apple_dart_domain_alloc(unsigned int type)
+{
+	struct apple_dart_domain *dart_domain;
+
+	if (type != IOMMU_DOMAIN_DMA && type != IOMMU_DOMAIN_UNMANAGED &&
+	    type != IOMMU_DOMAIN_IDENTITY && type != IOMMU_DOMAIN_BLOCKED)
+		return NULL;
+
+	dart_domain = kzalloc(sizeof(*dart_domain), GFP_KERNEL);
+	if (!dart_domain)
+		return NULL;
+
+	INIT_LIST_HEAD(&dart_domain->streams);
+	spin_lock_init(&dart_domain->lock);
+	iommu_get_dma_cookie(&dart_domain->domain);
+	dart_domain->type = type;
+
+	return &dart_domain->domain;
+}
+
+static void apple_dart_domain_free(struct iommu_domain *domain)
+{
+	struct apple_dart_domain *dart_domain = to_dart_domain(domain);
+
+	WARN_ON(!list_empty(&dart_domain->streams));
+
+	kfree(dart_domain);
+}
+
+static int apple_dart_of_xlate(struct device *dev, struct of_phandle_args *args)
+{
+	struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
+	struct apple_dart_master_cfg *cfg = dev_iommu_priv_get(dev);
+	struct apple_dart_master_cfg *cfg_new;
+	struct apple_dart *dart = platform_get_drvdata(iommu_pdev);
+
+	if (args->args_count != 1)
+		return -EINVAL;
+
+	if (cfg == NULL) {
+		cfg = kzalloc(struct_size(cfg, streams, 1), GFP_KERNEL);
+		if (!cfg)
+			return -ENOMEM;
+	} else {
+		cfg_new = krealloc(
+			cfg, struct_size(cfg, streams, cfg->num_streams + 1),
+			GFP_KERNEL);
+		if (!cfg_new)
+			return -ENOMEM;
+
+		cfg = cfg_new;
+	}
+
+	dev_iommu_priv_set(dev, cfg);
+
+	cfg->streams[cfg->num_streams].dart = dart;
+	cfg->streams[cfg->num_streams].sid = args->args[0];
+	cfg->num_streams++;
+
+	return 0;
+}
+
+static struct iommu_group *apple_dart_device_group(struct device *dev)
+{
+	/* once we have PCI support this needs to use pci_device_group conditionally */
+	return generic_device_group(dev);
+}
+
+static const struct iommu_ops apple_dart_iommu_ops = {
+	.domain_alloc = apple_dart_domain_alloc,
+	.domain_free = apple_dart_domain_free,
+	.attach_dev = apple_dart_attach_dev,
+	.detach_dev = apple_dart_detach_dev,
+	.map = apple_dart_map,
+	.unmap = apple_dart_unmap,
+	.flush_iotlb_all = apple_dart_flush_iotlb_all,
+	.iotlb_sync = apple_dart_iotlb_sync,
+	.iotlb_sync_map = apple_dart_iotlb_sync_map,
+	.iova_to_phys = apple_dart_iova_to_phys,
+	.probe_device = apple_dart_probe_device,
+	.release_device = apple_dart_release_device,
+	.device_group = apple_dart_device_group,
+	.of_xlate = apple_dart_of_xlate,
+	.pgsize_bitmap = SZ_16K,
+};
+
+static irqreturn_t apple_dart_irq(int irq, void *dev)
+{
+	struct apple_dart *dart = dev;
+	static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
+				      DEFAULT_RATELIMIT_BURST);
+	const char *fault_name = NULL;
+	u32 error = readl(dart->regs + DART_ERROR);
+	u32 error_code = error & DART_ERROR_CODE_MASK;
+	u32 addr_lo = readl(dart->regs + DART_ERROR_ADDR_LO);
+	u32 addr_hi = readl(dart->regs + DART_ERROR_ADDR_HI);
+	u64 addr = addr_lo | (((u64)addr_hi) << 32);
+	u8 domain_idx =
+		(error >> DART_ERROR_STREAM_SHIFT) & DART_ERROR_STREAM_MASK;
+
+	if (!(error & DART_ERROR_FLAG))
+		return IRQ_NONE;
+
+	if (error_code & DART_ERROR_READ_FAULT)
+		fault_name = "READ FAULT";
+	else if (error_code & DART_ERROR_WRITE_FAULT)
+		fault_name = "WRITE FAULT";
+	else if (error_code & DART_ERROR_NO_PTE)
+		fault_name = "NO PTE FOR IOVA";
+	else if (error_code & DART_ERROR_NO_PMD)
+		fault_name = "NO PMD FOR IOVA";
+	else if (error_code & DART_ERROR_NO_TTBR)
+		fault_name = "NO TTBR FOR IOVA";
+
+	if (WARN_ON(fault_name == NULL))
+		fault_name = "unknown";
+
+	if (__ratelimit(&rs)) {
+		dev_err(dart->dev,
+			"translation fault: status:0x%x stream:%d code:0x%x (%s) at 0x%llx\n",
+			error, domain_idx, error_code, fault_name, addr);
+	}
+
+	writel(error, dart->regs + DART_ERROR);
+	return IRQ_HANDLED;
+}
+
+static int apple_dart_probe(struct platform_device *pdev)
+{
+	int ret;
+	int i;
+	struct resource *res;
+	struct apple_dart *dart;
+	struct device *dev = &pdev->dev;
+
+	dart = devm_kzalloc(dev, sizeof(*dart), GFP_KERNEL);
+	if (!dart)
+		return -ENOMEM;
+
+	dart->dev = dev;
+	spin_lock_init(&dart->lock);
+
+	if (pdev->num_resources < 1)
+		return -ENODEV;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (resource_size(res) < 0x4000) {
+		dev_err(dev, "MMIO region too small (%pr)\n", res);
+		return -EINVAL;
+	}
+
+	dart->regs = devm_ioremap_resource(dev, res);
+	if (IS_ERR(dart->regs))
+		return PTR_ERR(dart->regs);
+
+	ret = devm_clk_bulk_get_all(dev, &dart->clks);
+	if (ret < 0)
+		return ret;
+	dart->num_clks = ret;
+
+	ret = clk_bulk_prepare_enable(dart->num_clks, dart->clks);
+	if (ret)
+		return ret;
+
+	ret = apple_dart_hw_reset(dart);
+	if (ret)
+		return ret;
+
+	dart->irq = platform_get_irq(pdev, 0);
+	if (dart->irq < 0)
+		return -ENODEV;
+
+	ret = devm_request_irq(dart->dev, dart->irq, apple_dart_irq,
+			       IRQF_SHARED, "apple-dart fault handler", dart);
+	if (ret)
+		return ret;
+
+	platform_set_drvdata(pdev, dart);
+
+	ret = iommu_device_sysfs_add(&dart->iommu, dev, NULL, "apple-dart.%s",
+				     dev_name(&pdev->dev));
+	if (ret)
+		return ret;
+
+	iommu_device_set_ops(&dart->iommu, &apple_dart_iommu_ops);
+	iommu_device_set_fwnode(&dart->iommu, dev->fwnode);
+
+	ret = iommu_device_register(&dart->iommu);
+	if (ret)
+		return ret;
+
+	if (dev->bus->iommu_ops != &apple_dart_iommu_ops) {
+		ret = bus_set_iommu(dev->bus, &apple_dart_iommu_ops);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int apple_dart_remove(struct platform_device *pdev)
+{
+	struct apple_dart *dart = platform_get_drvdata(pdev);
+
+	devm_free_irq(dart->dev, dart->irq, dart);
+
+	iommu_device_unregister(&dart->iommu);
+	iommu_device_sysfs_remove(&dart->iommu);
+
+	clk_bulk_disable(dart->num_clks, dart->clks);
+	clk_bulk_unprepare(dart->num_clks, dart->clks);
+
+	return 0;
+}
+
+static void apple_dart_shutdown(struct platform_device *pdev)
+{
+	apple_dart_remove(pdev);
+}
+
+static const struct of_device_id apple_dart_of_match[] = {
+	{ .compatible = "apple,t8103-dart", .data = NULL },
+	{},
+};
+MODULE_DEVICE_TABLE(of, apple_dart_of_match);
+
+static struct platform_driver apple_dart_driver = {
+	.driver	= {
+		.name			= "apple-dart",
+		.of_match_table		= apple_dart_of_match,
+	},
+	.probe	= apple_dart_probe,
+	.remove	= apple_dart_remove,
+	.shutdown = apple_dart_shutdown,
+};
+module_platform_driver(apple_dart_driver);
+
+MODULE_DESCRIPTION("IOMMU API for Apple's DART");
+MODULE_AUTHOR("Sven Peter <sven@svenpeter.dev>");
+MODULE_LICENSE("GPL v2");
-- 
2.25.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] dt-bindings: iommu: add DART iommu bindings
  2021-03-28  7:40 ` [PATCH v2 2/3] dt-bindings: iommu: add DART iommu bindings Sven Peter via iommu
@ 2021-03-28  8:16   ` Arnd Bergmann
  2021-03-28  9:22     ` Sven Peter via iommu
  0 siblings, 1 reply; 12+ messages in thread
From: Arnd Bergmann @ 2021-03-28  8:16 UTC (permalink / raw)
  To: Sven Peter
  Cc: DTML, Will Deacon, Hector Martin, Linux Kernel Mailing List,
	open list:IOMMU DRIVERS, Rob Herring, Marc Zyngier,
	Mohamed Mediouni, Mark Kettenis, Robin Murphy, Linux ARM,
	Stan Skowronek

On Sun, Mar 28, 2021 at 9:40 AM Sven Peter <sven@svenpeter.dev> wrote:

I noticed only one detail here:

> +  - |+
> +    dart2a: dart2a@82f00000 {
> +      compatible = "apple,t8103-dart";
> +      reg = <0x82f00000 0x4000>;
> +      interrupts = <1 781 4>;
> +      #iommu-cells = <1>;
> +    };

The name of the iommu should be iommu@82f00000, not dart2a@82f00000.

       Arnd

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/3] dt-bindings: iommu: add DART iommu bindings
  2021-03-28  8:16   ` Arnd Bergmann
@ 2021-03-28  9:22     ` Sven Peter via iommu
  0 siblings, 0 replies; 12+ messages in thread
From: Sven Peter via iommu @ 2021-03-28  9:22 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: DTML, Will Deacon, Hector Martin, Linux Kernel Mailing List,
	open list:IOMMU DRIVERS, Rob Herring, Marc Zyngier,
	Mohamed Mediouni, Mark Kettenis, Robin Murphy, Linux ARM,
	Stan Skowronek



On Sun, Mar 28, 2021, at 10:16, Arnd Bergmann wrote:
> On Sun, Mar 28, 2021 at 9:40 AM Sven Peter <sven@svenpeter.dev> wrote:
> 
> I noticed only one detail here:
> 
> > +  - |+
> > +    dart2a: dart2a@82f00000 {
> > +      compatible = "apple,t8103-dart";
> > +      reg = <0x82f00000 0x4000>;
> > +      interrupts = <1 781 4>;
> > +      #iommu-cells = <1>;
> > +    };
> 
> The name of the iommu should be iommu@82f00000, not dart2a@82f00000.
> 
>        Arnd
>

Thanks, fixed for v3. I've also just noticed that I forgot to update
the filename in MAINTAINERS after I renamed it from apple,t8103-dart.yaml
which I've fixed as well.


Sven


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/3] iommu: dart: Add DART iommu driver
  2021-03-28  7:40 ` [PATCH v2 3/3] iommu: dart: Add DART iommu driver Sven Peter via iommu
@ 2021-04-07 10:42   ` Will Deacon
  2021-04-09 16:50     ` Sven Peter via iommu
  0 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2021-04-07 10:42 UTC (permalink / raw)
  To: Sven Peter
  Cc: Arnd Bergmann, devicetree, Marc Zyngier, Hector Martin,
	linux-kernel, iommu, Rob Herring, Mohamed Mediouni,
	Mark Kettenis, Robin Murphy, linux-arm-kernel, Stan Skowronek

On Sun, Mar 28, 2021 at 09:40:09AM +0200, Sven Peter wrote:
> Apple's new SoCs use iommus for almost all peripherals. These Device
> Address Resolution Tables must be setup before these peripherals can
> act as DMA masters.
> 
> Signed-off-by: Sven Peter <sven@svenpeter.dev>
> ---
>  MAINTAINERS                      |   1 +
>  drivers/iommu/Kconfig            |  14 +
>  drivers/iommu/Makefile           |   1 +
>  drivers/iommu/apple-dart-iommu.c | 858 +++++++++++++++++++++++++++++++
>  4 files changed, 874 insertions(+)
>  create mode 100644 drivers/iommu/apple-dart-iommu.c

[...]

> +/* must be called with held domain->lock */
> +static int apple_dart_attach_stream(struct apple_dart_domain *domain,
> +				    struct apple_dart *dart, u32 sid)
> +{
> +	unsigned long flags;
> +	struct apple_dart_stream *stream;
> +	struct io_pgtable_cfg *pgtbl_cfg;
> +	int ret;
> +
> +	list_for_each_entry(stream, &domain->streams, stream_head) {
> +		if (stream->dart == dart && stream->sid == sid) {
> +			stream->num_devices++;
> +			return 0;
> +		}
> +	}
> +
> +	spin_lock_irqsave(&dart->lock, flags);
> +
> +	if (WARN_ON(dart->used_sids & BIT(sid))) {
> +		ret = -EINVAL;
> +		goto error;
> +	}
> +
> +	stream = kzalloc(sizeof(*stream), GFP_KERNEL);
> +	if (!stream) {
> +		ret = -ENOMEM;
> +		goto error;
> +	}

Just in case you missed it, a cocci bot noticed that you're using GFP_KERNEL
to allocate while holding a spinlock here:

https://lore.kernel.org/r/alpine.DEB.2.22.394.2104041724340.2958@hadrien
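
The usual way out of this is to allocate before taking the lock and to
free the object again if the check under the lock fails. What follows is
a minimal userspace sketch of that ordering, not the driver's code: the
model_* names are hypothetical, the lock is a stand-in that only records
whether it is held, and calloc() stands in for a sleeping allocation.
Note that in the patch the caller already holds domain->lock, which is
also a spinlock, so a real fix would have to move the allocation out of
both critical sections (or use an atomic allocation instead).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for a spinlock: it only records whether it is held. */
struct model_lock {
	int held;
};

static void model_lock_acquire(struct model_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void model_lock_release(struct model_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Hypothetical, simplified versions of the driver's structures. */
struct model_dart {
	struct model_lock lock;
	uint32_t used_sids;
};

struct model_stream {
	struct model_dart *dart;
	uint32_t sid;
	int num_devices;
};

/* Models an allocation that may sleep: it must not run under the lock. */
static void *model_alloc_may_sleep(const struct model_lock *l, size_t size)
{
	assert(!l->held);
	return calloc(1, size);
}

/* Assumes sid < 32 so that it fits the used_sids bitmap. */
static int model_attach_stream(struct model_dart *dart, uint32_t sid,
			       struct model_stream **out)
{
	struct model_stream *stream;

	/* Allocate first, while sleeping is still allowed. */
	stream = model_alloc_may_sleep(&dart->lock, sizeof(*stream));
	if (!stream)
		return -ENOMEM;

	model_lock_acquire(&dart->lock);
	if (dart->used_sids & (1u << sid)) {
		/* The stream id is already taken: undo the allocation. */
		model_lock_release(&dart->lock);
		free(stream);
		return -EINVAL;
	}
	dart->used_sids |= 1u << sid;
	model_lock_release(&dart->lock);

	stream->dart = dart;
	stream->sid = sid;
	stream->num_devices = 1;
	*out = stream;
	return 0;
}
```

The cost of this ordering is one wasted allocation on the error path;
the benefit is that nothing that can sleep runs inside the critical
section.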

Will

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/3] iommu: io-pgtable: add DART pagetable format
  2021-03-28  7:40 ` [PATCH v2 1/3] iommu: io-pgtable: add DART pagetable format Sven Peter via iommu
@ 2021-04-07 10:44   ` Will Deacon
  2021-04-09 16:55     ` Sven Peter via iommu
  0 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2021-04-07 10:44 UTC (permalink / raw)
  To: Sven Peter
  Cc: Arnd Bergmann, devicetree, Marc Zyngier, Hector Martin,
	linux-kernel, iommu, Rob Herring, Mohamed Mediouni,
	Mark Kettenis, Robin Murphy, linux-arm-kernel, Stan Skowronek

On Sun, Mar 28, 2021 at 09:40:07AM +0200, Sven Peter wrote:
> Apple's DART iommu uses a pagetable format that shares some
> similarities with the ones already implemented by io-pgtable.c.
> Add a new format variant to support the required differences
> so that we don't have to duplicate the pagetable handling code.
> 
> Signed-off-by: Sven Peter <sven@svenpeter.dev>
> ---
>  drivers/iommu/io-pgtable-arm.c | 59 ++++++++++++++++++++++++++++++++++
>  drivers/iommu/io-pgtable.c     |  1 +
>  include/linux/io-pgtable.h     |  6 ++++
>  3 files changed, 66 insertions(+)
> 
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 87def58e79b5..2f63443fd115 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -127,6 +127,9 @@
>  #define ARM_MALI_LPAE_MEMATTR_IMP_DEF	0x88ULL
>  #define ARM_MALI_LPAE_MEMATTR_WRITE_ALLOC 0x8DULL
>  
> +#define APPLE_DART_PTE_PROT_NO_WRITE (1<<7)
> +#define APPLE_DART_PTE_PROT_NO_READ (1<<8)
> +
>  /* IOPTE accessors */
>  #define iopte_deref(pte,d) __va(iopte_to_paddr(pte, d))
>  
> @@ -381,6 +384,15 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
>  {
>  	arm_lpae_iopte pte;
>  
> +	if (data->iop.fmt == ARM_APPLE_DART) {
> +		pte = 0;
> +		if (!(prot & IOMMU_WRITE))
> +			pte |= APPLE_DART_PTE_PROT_NO_WRITE;
> +		if (!(prot & IOMMU_READ))
> +			pte |= APPLE_DART_PTE_PROT_NO_READ;
> +		return pte;
> +	}
> +
>  	if (data->iop.fmt == ARM_64_LPAE_S1 ||
>  	    data->iop.fmt == ARM_32_LPAE_S1) {
>  		pte = ARM_LPAE_PTE_nG;
> @@ -1043,6 +1055,48 @@ arm_mali_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
>  	return NULL;
>  }
>  
> +static struct io_pgtable *
> +apple_dart_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
> +{
> +	struct arm_lpae_io_pgtable *data;
> +
> +	if (cfg->ias > 36)
> +		return NULL;
> +	if (cfg->oas > 36)
> +		return NULL;
> +
> +	if (!cfg->coherent_walk)
> +		return NULL;

This all feels like IOMMU-specific limitations leaking into the page-table
code here; it doesn't feel so unlikely that future implementations of this
IP might have greater addressing capabilities, for example, and so I don't
see why the page-table code needs to police this.

> +	cfg->pgsize_bitmap &= SZ_16K;
> +	if (!cfg->pgsize_bitmap)
> +		return NULL;

This is worrying (and again, I don't think this belongs here). How is this
thing supposed to work if the CPU is using 4k pages?

Will

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/3] iommu: dart: Add DART iommu driver
  2021-04-07 10:42   ` Will Deacon
@ 2021-04-09 16:50     ` Sven Peter via iommu
  0 siblings, 0 replies; 12+ messages in thread
From: Sven Peter via iommu @ 2021-04-09 16:50 UTC (permalink / raw)
  To: Will Deacon
  Cc: Arnd Bergmann, devicetree, Marc Zyngier, Hector Martin,
	linux-kernel, Petr Mladek via iommu, Rob Herring,
	Mohamed Mediouni, Mark Kettenis, Robin Murphy, linux-arm-kernel,
	Stan Skowronek



On Wed, Apr 7, 2021, at 12:42, Will Deacon wrote:
> On Sun, Mar 28, 2021 at 09:40:09AM +0200, Sven Peter wrote:
> > Apple's new SoCs use iommus for almost all peripherals. These Device
> > Address Resolution Tables must be setup before these peripherals can
> > act as DMA masters.
> > 
> > Signed-off-by: Sven Peter <sven@svenpeter.dev>
> > ---
> >  MAINTAINERS                      |   1 +
> >  drivers/iommu/Kconfig            |  14 +
> >  drivers/iommu/Makefile           |   1 +
> >  drivers/iommu/apple-dart-iommu.c | 858 +++++++++++++++++++++++++++++++
> >  4 files changed, 874 insertions(+)
> >  create mode 100644 drivers/iommu/apple-dart-iommu.c
> 
> [...]
> 
> > +/* must be called with held domain->lock */
> > +static int apple_dart_attach_stream(struct apple_dart_domain *domain,
> > +				    struct apple_dart *dart, u32 sid)
> > +{
> > +	unsigned long flags;
> > +	struct apple_dart_stream *stream;
> > +	struct io_pgtable_cfg *pgtbl_cfg;
> > +	int ret;
> > +
> > +	list_for_each_entry(stream, &domain->streams, stream_head) {
> > +		if (stream->dart == dart && stream->sid == sid) {
> > +			stream->num_devices++;
> > +			return 0;
> > +		}
> > +	}
> > +
> > +	spin_lock_irqsave(&dart->lock, flags);
> > +
> > +	if (WARN_ON(dart->used_sids & BIT(sid))) {
> > +		ret = -EINVAL;
> > +		goto error;
> > +	}
> > +
> > +	stream = kzalloc(sizeof(*stream), GFP_KERNEL);
> > +	if (!stream) {
> > +		ret = -ENOMEM;
> > +		goto error;
> > +	}
> 
> Just in case you missed it, a cocci bot noticed that you're using GFP_KERNEL
> to allocate while holding a spinlock here:
> 
> https://lore.kernel.org/r/alpine.DEB.2.22.394.2104041724340.2958@hadrien
> 

Thanks for the reminder!
I haven't replied yet because that one was found later, when the bot picked up
a (slightly earlier) version that I believe Marc was using to bring up PCIe.
I'll fix it for the next version.


Sven

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/3] iommu: io-pgtable: add DART pagetable format
  2021-04-07 10:44   ` Will Deacon
@ 2021-04-09 16:55     ` Sven Peter via iommu
  2021-04-09 19:38       ` Arnd Bergmann
  0 siblings, 1 reply; 12+ messages in thread
From: Sven Peter via iommu @ 2021-04-09 16:55 UTC (permalink / raw)
  To: Will Deacon, Robin Murphy, Arnd Bergmann, Hector Martin
  Cc: devicetree, Marc Zyngier, linux-kernel, Petr Mladek via iommu,
	Rob Herring, Mohamed Mediouni, Mark Kettenis, linux-arm-kernel,
	Stan Skowronek



On Wed, Apr 7, 2021, at 12:44, Will Deacon wrote:
> On Sun, Mar 28, 2021 at 09:40:07AM +0200, Sven Peter wrote:
[...]
> >  
> > +static struct io_pgtable *
> > +apple_dart_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
> > +{
> > +	struct arm_lpae_io_pgtable *data;
> > +
> > +	if (cfg->ias > 36)
> > +		return NULL;
> > +	if (cfg->oas > 36)
> > +		return NULL;
> > +
> > +	if (!cfg->coherent_walk)
> > +		return NULL;
> 
> This all feels like IOMMU-specific limitations leaking into the page-table
> code here; it doesn't feel so unlikely that future implementations of this
> IP might have greater addressing capabilities, for example, and so I don't
> see why the page-table code needs to police this.

That's true, this really doesn't belong here.
I'll fix it for the next version and make sure to keep iommu-specific
limitations inside the driver itself.


> 
> > +	cfg->pgsize_bitmap &= SZ_16K;
> > +	if (!cfg->pgsize_bitmap)
> > +		return NULL;
> 
> This is worrying (and again, I don't think this belongs here). How is this
> thing supposed to work if the CPU is using 4k pages?

This SoC is just full of fun surprises!
I didn't even think about that case since I've always been using 16k pages so far.

I've checked again and wasn't able to find any way to configure the pagesize
of the IOMMU. There seem to be variants of this IP in older iPhones which
support a 4k pagesize but to the best of my knowledge this is hard wired
and not configurable in software.

When booting with 4k pages I hit the BUG_ON in iova.c that ensures that the
iommu pagesize has to be <= the cpu page size.
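
The constraint described above can be modelled in a few lines. This is a
minimal userspace sketch, assuming the IOVA granule is the smallest page
size advertised in the domain's pgsize_bitmap; the helper names are
hypothetical and are not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K  0x1000UL
#define SZ_16K 0x4000UL

/* Smallest supported IOMMU page size: the lowest set bit of the bitmap. */
static unsigned long iommu_min_granule(unsigned long pgsize_bitmap)
{
	return pgsize_bitmap & -pgsize_bitmap;
}

/* True if the IOVA allocator's granule requirement is satisfied. */
static bool granule_fits_cpu_page(unsigned long pgsize_bitmap,
				  unsigned long cpu_page_size)
{
	unsigned long granule = iommu_min_granule(pgsize_bitmap);

	return granule != 0 && granule <= cpu_page_size;
}
```

With a DART that only offers SZ_16K, granule_fits_cpu_page(SZ_16K, SZ_4K)
is false, which corresponds to the failing configuration; the same
bitmap with 16k CPU pages passes.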

I see two options here and I'm not sure I like either of them:

1) Just don't support 4k CPU pages together with IOMMU translations and only
   allow full bypass mode there.
   This would however mean that PCIe (i.e. ethernet, usb ports on the Mac
   mini) and possibly Thunderbolt support would not be possible since these
   devices don't seem to like iommu bypass mode at all.

2) I've had a brief discussion on IRC with Arnd about this [1] and he pointed
   out that the dma_map_sg API doesn't make any guarantees about the returned
   iovas and that it might be possible to make this work at least for devices
   that go through the normal DMA API.

   I've then replaced the page size check with a WARN_ON in iova.c just to see
   what happens. At least normal devices that go through the DMA API seem to
   work with my configuration. iommu_dma_alloc took the iommu_dma_alloc_remap
   path which was called with the cpu page size but then used
   domain->pgsize_bitmap to increase that to 16k. So this kinda works out, but
   there are other functions in dma-iommu.c that I believe rely on the fact that
   the iommu can map single cpu pages. This feels very fragile right now and
   would probably require some rather invasive changes.

   Any driver that tries to use the iommu API directly could be trouble
   as well if they make similar assumptions.

   Is this something you would even want to support in the iommu subsystem
   and is it even possible to do this in a sane way?


Best,


Sven


[1] https://freenode.irclog.whitequark.org/asahi/2021-04-07#29609786;

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/3] iommu: io-pgtable: add DART pagetable format
  2021-04-09 16:55     ` Sven Peter via iommu
@ 2021-04-09 19:38       ` Arnd Bergmann
  2021-04-19 16:31         ` Will Deacon
  0 siblings, 1 reply; 12+ messages in thread
From: Arnd Bergmann @ 2021-04-09 19:38 UTC (permalink / raw)
  To: Sven Peter
  Cc: DTML, Will Deacon, Hector Martin, Linux Kernel Mailing List,
	Petr Mladek via iommu, Rob Herring, Marc Zyngier,
	Mohamed Mediouni, Mark Kettenis, Robin Murphy, Linux ARM,
	Stan Skowronek

On Fri, Apr 9, 2021 at 6:56 PM Sven Peter <sven@svenpeter.dev> wrote:
> On Wed, Apr 7, 2021, at 12:44, Will Deacon wrote:
> > On Sun, Mar 28, 2021 at 09:40:07AM +0200, Sven Peter wrote:
> >
> > > +   cfg->pgsize_bitmap &= SZ_16K;
> > > +   if (!cfg->pgsize_bitmap)
> > > +           return NULL;
> >
> > This is worrying (and again, I don't think this belongs here). How is this
> > thing supposed to work if the CPU is using 4k pages?
>
> This SoC is just full of fun surprises!
> I didn't even think about that case since I've always been using 16k pages so far.
>
> I've checked again and wasn't able to find any way to configure the pagesize
> of the IOMMU. There seem to be variants of this IP in older iPhones which
> support a 4k pagesize but to the best of my knowledge this is hard wired
> and not configurable in software.
>
> When booting with 4k pages I hit the BUG_ON in iova.c that ensures that the
> iommu pagesize has to be <= the cpu page size.
>
> I see two options here and I'm not sure I like either of them:
>
> 1) Just don't support 4k CPU pages together with IOMMU translations and only
>    allow full bypass mode there.
>    This would however mean that PCIe (i.e. ethernet, usb ports on the Mac
>    mini) and possibly Thunderbolt support would not be possible since these
>    devices don't seem to like iommu bypass mode at all.

It should be possible to do a fake bypass mode by just programming a
static page table for as much address space as you can, and then
use swiotlb to address any memory beyond that. This won't perform
well because it requires bounce buffers for any high memory, but it
can be a last resort if a dart instance cannot do normal bypass mode.
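
As a rough illustration of that idea, here is a minimal userspace sketch
of the per-buffer decision: anything inside the statically mapped window
gets a fixed device address, anything else would need a bounce buffer.
The struct, the function names and the window values used in the note
below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one statically programmed linear mapping. */
struct linear_window {
	uint64_t phys_base; /* first physical address covered */
	uint64_t iova_base; /* device address that maps to phys_base */
	uint64_t size;      /* bytes covered by the static page table */
};

/* True if [phys, phys + len) lies entirely inside the window. */
static bool window_covers(const struct linear_window *w, uint64_t phys,
			  uint64_t len)
{
	uint64_t off;

	if (len == 0 || phys < w->phys_base)
		return false;
	off = phys - w->phys_base;
	return off < w->size && len <= w->size - off;
}

/*
 * Returns true and fills *dma_addr when the buffer can be used in place.
 * Returns false when the caller would have to fall back to bouncing.
 */
static bool map_direct(const struct linear_window *w, uint64_t phys,
		       uint64_t len, uint64_t *dma_addr)
{
	if (!window_covers(w, phys, len))
		return false;
	*dma_addr = w->iova_base + (phys - w->phys_base);
	return true;
}
```

For example, with a 4 GiB window starting at an assumed physical base of
0x800000000 and an IOVA base of 0, a buffer at 0x800001000 maps to device
address 0x1000, while a buffer that crosses or lies above the end of the
window is rejected and would be bounced.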

> 2) I've had a brief discussion on IRC with Arnd about this [1] and he pointed
>    out that the dma_map_sg API doesn't make any guarantees about the returned
>    iovas and that it might be possible to make this work at least for devices
>    that go through the normal DMA API.
>
>    I've then replaced the page size check with a WARN_ON in iova.c just to see
>    what happens. At least normal devices that go through the DMA API seem to
>    work with my configuration. iommu_dma_alloc took the iommu_dma_alloc_remap
>    path which was called with the cpu page size but then used
>    domain->pgsize_bitmap to increase that to 16k. So this kinda works out, but
>    there are other functions in dma-iommu.c that I believe rely on the fact that
>    the iommu can map single cpu pages. This feels very fragile right now and
>    would probably require some rather invasive changes.

The other second-to-last resort here would be to duplicate the code from
the dma-iommu code and implement the dma-mapping API directly on
top of the dart hardware instead of the iommu layer. This would probably
be much faster than the swiotlb on top of a bypass or a linear map,
but it's a really awful abstraction that would require adding special cases
into a lot of generic code.

>    Any driver that tries to use the iommu API directly could be trouble
>    as well if they make similar assumptions.

I think pretty much all drivers using the iommu API directly already
depend on having a matching page size.  I don't see any way to use
e.g. PCI device assignment using vfio, or a GPU driver with per-process
contexts when the iotlb page size is larger than the CPU's.

>    Is this something you would even want to support in the iommu subsystem
>    and is it even possible to do this in a sane way?

I don't know how hard it is to adjust the dma-iommu implementation
to allow this, but as long as we can work out the DT binding to support
both normal dma-iommu mode with 16KB pages and some kind of
passthrough mode (emulated or not) with 4KB pages, it can be left
as a possible optimization for later.

        Arnd

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/3] iommu: io-pgtable: add DART pagetable format
  2021-04-09 19:38       ` Arnd Bergmann
@ 2021-04-19 16:31         ` Will Deacon
  0 siblings, 0 replies; 12+ messages in thread
From: Will Deacon @ 2021-04-19 16:31 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: DTML, Marc Zyngier, Hector Martin, Linux Kernel Mailing List,
	Petr Mladek via iommu, Rob Herring, Mohamed Mediouni,
	Mark Kettenis, Robin Murphy, Linux ARM, Stan Skowronek

On Fri, Apr 09, 2021 at 09:38:15PM +0200, Arnd Bergmann wrote:
> On Fri, Apr 9, 2021 at 6:56 PM Sven Peter <sven@svenpeter.dev> wrote:
> > On Wed, Apr 7, 2021, at 12:44, Will Deacon wrote:
> > > On Sun, Mar 28, 2021 at 09:40:07AM +0200, Sven Peter wrote:
> > >
> > > > +   cfg->pgsize_bitmap &= SZ_16K;
> > > > +   if (!cfg->pgsize_bitmap)
> > > > +           return NULL;
> > >
> > > This is worrying (and again, I don't think this belongs here). How is this
> > > thing supposed to work if the CPU is using 4k pages?
> >
> > This SoC is just full of fun surprises!
> > I didn't even think about that case since I've always been using 16k pages so far.
> >
> > I've checked again and wasn't able to find any way to configure the pagesize
> > of the IOMMU. There seem to be variants of this IP in older iPhones which
> > support a 4k pagesize but to the best of my knowledge this is hard wired
> > and not configurable in software.
> >
> > When booting with 4k pages I hit the BUG_ON in iova.c that ensures that the
> > iommu pagesize has to be <= the cpu page size.
> >
> > I see two options here and I'm not sure I like either of them:
> >
> > 1) Just don't support 4k CPU pages together with IOMMU translations and only
> >    allow full bypass mode there.
> >    This would however mean that PCIe (i.e. ethernet, usb ports on the Mac
> >    mini) and possibly Thunderbolt support would not be possible since these
> >    devices don't seem to like iommu bypass mode at all.
> 
> It should be possible to do a fake bypass mode by just programming a
> static page table for as much address space as you can, and then
> use swiotlb to address any memory beyond that. This won't perform
> well because it requires bounce buffers for any high memory, but it
> can be a last resort if a dart instance cannot do normal bypass mode.
> 
> > 2) I've had a brief discussion on IRC with Arnd about this [1] and he pointed
> >    out that the dma_map_sg API doesn't make any guarantees about the returned
> >    iovas and that it might be possible to make this work at least for devices
> >    that go through the normal DMA API.
> >
> >    I've then replaced the page size check with a WARN_ON in iova.c just to see
> >    what happens. At least normal devices that go through the DMA API seem to
> >    work with my configuration. iommu_dma_alloc took the iommu_dma_alloc_remap
> >    path which was called with the cpu page size but then used
> >    domain->pgsize_bitmap to increase that to 16k. So this kinda works out, but
> >    there are other functions in dma-iommu.c that I believe rely on the fact that
> >    the iommu can map single cpu pages. This feels very fragile right now and
> >    would probably require some rather invasive changes.
> 
> The other second-to-last resort here would be to duplicate the code from
> the dma-iommu code and implement the dma-mapping API directly on
> top of the dart hardware instead of the iommu layer. This would probably
> be much faster than the swiotlb on top of a bypass or a linear map,
> but it's a really awful abstraction that would require adding special cases
> into a lot of generic code.
> 
> >    Any driver that tries to use the iommu API directly could be trouble
> >    as well if they make similar assumptions.
> 
> I think pretty much all drivers using the iommu API directly already
> depend on having a matching page size.  I don't see any way to use
> e.g. PCI device assignment using vfio, or a GPU driver with per-process
> contexts when the iotlb page size is larger than the CPU's.
> 
> >    Is this something you would even want to support in the iommu subsystem
> >    and is it even possible to do this in a sane way?
> 
> I don't know how hard it is to adjust the dma-iommu implementation
> to allow this, but as long as we can work out the DT binding to support
> both normal dma-iommu mode with 16KB pages and some kind of
> passthrough mode (emulated or not) with 4KB pages, it can be left
> as a possible optimization for later.

I think one of the main things to modify is the IOVA allocator
(drivers/iommu/iova.c). Once that is happy with pages bigger than the CPU
page size, then you could probably fake everything else in the DMA layer by
offsetting the returned DMA addresses into the 16K page which got mapped.
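
The offsetting step can be sketched as plain address arithmetic. This is
a minimal userspace example, assuming a power-of-two IOMMU granule and an
IOVA base that the allocator has already aligned to that granule; the
struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Result of widening a CPU buffer to whole IOMMU granules. */
struct granule_mapping {
	uint64_t phys_start; /* buffer start rounded down to the granule */
	uint64_t map_size;   /* bytes to map, a multiple of the granule */
	uint64_t offset;     /* buffer offset inside the first granule */
};

static struct granule_mapping granule_align(uint64_t phys, uint64_t len,
					    uint64_t granule)
{
	struct granule_mapping m;

	/* The mask arithmetic below only works for a power-of-two granule. */
	assert(granule != 0 && (granule & (granule - 1)) == 0);

	m.offset = phys & (granule - 1);
	m.phys_start = phys - m.offset;
	m.map_size = (m.offset + len + granule - 1) & ~(granule - 1);
	return m;
}

/* DMA address handed to the device: the granule's IOVA plus the offset. */
static uint64_t dma_addr_for(uint64_t iova_base,
			     const struct granule_mapping *m)
{
	return iova_base + m->offset;
}
```

For a 4k buffer at physical address 0x800003000 and a 16k granule, this
maps 0x4000 bytes starting at 0x800000000, and with an IOVA base of
0x10000 the device is given 0x13000. One consequence worth keeping in
mind is that the device can then also reach the other three 4k pages
inside the same 16k granule.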

Will

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-04-19 16:31 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-28  7:40 [PATCH v2 0/3] Apple M1 DART IOMMU driver Sven Peter via iommu
2021-03-28  7:40 ` [PATCH v2 1/3] iommu: io-pgtable: add DART pagetable format Sven Peter via iommu
2021-04-07 10:44   ` Will Deacon
2021-04-09 16:55     ` Sven Peter via iommu
2021-04-09 19:38       ` Arnd Bergmann
2021-04-19 16:31         ` Will Deacon
2021-03-28  7:40 ` [PATCH v2 2/3] dt-bindings: iommu: add DART iommu bindings Sven Peter via iommu
2021-03-28  8:16   ` Arnd Bergmann
2021-03-28  9:22     ` Sven Peter via iommu
2021-03-28  7:40 ` [PATCH v2 3/3] iommu: dart: Add DART iommu driver Sven Peter via iommu
2021-04-07 10:42   ` Will Deacon
2021-04-09 16:50     ` Sven Peter via iommu
