devicetree.vger.kernel.org archive mirror
* [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions
@ 2021-04-23 16:32 Thierry Reding
  2021-04-23 16:32 ` [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier Thierry Reding
                   ` (8 more replies)
  0 siblings, 9 replies; 41+ messages in thread
From: Thierry Reding @ 2021-04-23 16:32 UTC (permalink / raw)
  To: Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	Dmitry Osipenko, devicetree, iommu, linux-tegra

From: Thierry Reding <treding@nvidia.com>

Hi,

this is an updated proposal to solve the problem of passing memory
regions that are actively being accessed during boot. The particular
use-case that I need this for is when the bootloader has set up the
display controller to scan out a boot splash screen. During boot the
DMA/IOMMU glue code will attach devices to an IOMMU domain and by
doing so enable IOMMU translations. Typically this will be before a
device driver has had a chance to either disable the display
controller or set up a new framebuffer and map it to the IOMMU.

In that case, the IOMMU will start to fault because the accesses of
the display controller will be for memory addresses that are not mapped
in the IOMMU. The solution is obviously to create identity mappings for
such memory regions. From a device tree point of view, these memory
regions can be described using the reserved-memory device tree bindings
and hooked up to the consumer devices using the "memory-region"
property. On the kernel side, the IOMMU framework already supports the
concept of reserved regions, as well as a way of marking these regions
as requiring identity (a.k.a. direct) mappings.
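
To make the failure mode concrete, here is a minimal userspace sketch of
what an identity mapping buys us. Everything in it (the toy_* names, the
single-level table, the 4 MiB IOVA space) is made up for illustration and
has nothing to do with the real SMMU page table format or the kernel's
IOMMU API:

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1u << TOY_PAGE_SHIFT)
#define TOY_NUM_PAGES  1024u	/* toy IOVA space: 4 MiB */

/* Hypothetical single-level table, not the Tegra SMMU layout. */
struct toy_iommu {
	bool enabled;
	uint32_t pte[TOY_NUM_PAGES];	/* physical page number + 1, 0 if unmapped */
};

/* Translate an IOVA; returns false on a (simulated) translation fault. */
static bool toy_translate(const struct toy_iommu *mmu, uint32_t iova,
			  uint32_t *phys)
{
	uint32_t index = iova >> TOY_PAGE_SHIFT;

	/* With translation disabled, the device sees physical addresses. */
	if (!mmu->enabled) {
		*phys = iova;
		return true;
	}

	if (index >= TOY_NUM_PAGES || mmu->pte[index] == 0)
		return false;

	*phys = ((mmu->pte[index] - 1) << TOY_PAGE_SHIFT) |
		(iova & (TOY_PAGE_SIZE - 1));
	return true;
}

/* Identity map [start, start + size): every IOVA maps to itself. */
static void toy_identity_map(struct toy_iommu *mmu, uint32_t start,
			     uint32_t size)
{
	uint32_t first = start >> TOY_PAGE_SHIFT;
	uint32_t last = (start + size - 1) >> TOY_PAGE_SHIFT;

	for (uint32_t i = first; i <= last && i < TOY_NUM_PAGES; i++)
		mmu->pte[i] = i + 1;
}
```

With translation enabled and an empty table, toy_translate() fails for
every address the device was already fetching from. Calling
toy_identity_map() on the region before enabling translation keeps those
same addresses working unchanged.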

Unfortunately, the current reserved-memory region bindings only allow
properties of the regions themselves to be described (such as whether a
kernel virtual mapping of the region is needed or not), but they don't
provide a way of associating extra information with any particular
reference to these regions. However, that's exactly what's needed for
this case because a given region may need to be identity mapped for a
specific device (such as the display controller scanning out from the
region) but referenced by multiple devices (e.g. if the memory is some
special carveout memory reserved for display purposes).

This series of patches proposes a simple solution: extend memory-region
properties to use an optional specifier, such as the ones already
commonly used for things like GPIOs or interrupts. The specifier needs
to be provided if the reserved-memory region has a non-zero
#memory-region-cells property (if the property is not present, zero is
the assumed default value). The specifier contains flags that specify
how the reference is to be treated. This series of patches introduces
the MEMORY_REGION_IDENTITY_MAPPING flag (value: 0x1) that marks the
specific reference to the memory region to require an identity mapping.

In practice, a device tree would look like this:

	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;

		fb: framebuffer@92cb2000 {
			reg = <0 0x92cb2000 0 0x00800000>;
			#memory-region-cells = <1>;
		};
	};

	...

	display@52400000 {
		...
		memory-region = <&fb MEMORY_REGION_IDENTITY_MAPPING>;
		...
	};

Note: While the above would be valid DTS content, it's more likely that
in practice this content would be dynamically generated by the
bootloader using runtime information (such as the framebuffer memory
location).

An operating system can derive from that <phandle, specifier> pair that
the 8 MiB of memory at physical address 0x92cb2000 need to be identity
mapped to the same IO virtual address if the device is attached to an
IOMMU. If no IOMMU is enabled in the system, obviously no identity
mapping needs to be created, but the operating system may still use the
reference to transition to its own framebuffer using the existing memory
region.
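
As a rough illustration of how a consumer might walk such <phandle,
specifier> pairs, here is a self-contained userspace sketch. It is not the
kernel implementation (patch 2 uses of_for_each_phandle() for that). The
toy_* names and the flat-array representation of the property are
assumptions made for the example. Only the flag value comes from this
series:

```c
#include <stddef.h>
#include <stdint.h>

#define MEMORY_REGION_IDENTITY_MAPPING 0x1	/* value proposed in this series */

/* Hypothetical, simplified view of a reserved-memory child node. */
struct toy_region {
	uint32_t phandle;
	uint32_t cells;		/* #memory-region-cells, 0 if absent */
	uint64_t base;
	uint64_t size;
};

static const struct toy_region *
toy_find_region(const struct toy_region *regions, size_t num_regions,
		uint32_t phandle)
{
	for (size_t i = 0; i < num_regions; i++)
		if (regions[i].phandle == phandle)
			return &regions[i];

	return NULL;
}

/*
 * Walk a "memory-region" property given as a flat array of cells. Each
 * entry is a phandle followed by as many specifier cells as the target
 * region's #memory-region-cells says. Regions referenced with the
 * identity mapping flag set are stored in @out. Returns the number of
 * such regions, or -1 if the property is malformed or @out is too small.
 */
static int toy_collect_identity(const uint32_t *prop, size_t num_cells,
				const struct toy_region *regions,
				size_t num_regions,
				const struct toy_region **out, size_t max_out)
{
	size_t pos = 0;
	int found = 0;

	while (pos < num_cells) {
		const struct toy_region *region =
			toy_find_region(regions, num_regions, prop[pos++]);

		/* Dangling phandle or truncated specifier. */
		if (!region || region->cells > num_cells - pos)
			return -1;

		/* Without a specifier, only the phandle is conveyed. */
		if (region->cells > 0 &&
		    (prop[pos] & MEMORY_REGION_IDENTITY_MAPPING)) {
			if ((size_t)found == max_out)
				return -1;

			out[found++] = region;
		}

		pos += region->cells;
	}

	return found;
}
```

The flag is looked up per reference rather than per region, so two devices
can point at the same carveout while only one of them asks for the
identity mapping.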

Note that an earlier proposal was to use the existing simple-framebuffer
device tree bindings to transport this information. Unfortunately there
are cases where this is not enough. On Tegra SoCs, for example, the
bootloader will also set up a color space correction lookup table in the
system memory that the display controller will access during boot,
alongside the framebuffer. The simple-framebuffer DT bindings have no
way of describing this (and I guess one could argue that this particular
setup no longer is a "simple" framebuffer), so the above, more flexible
proposal was implemented.

I've made corresponding changes in the proprietary bootloader, added a
compatibility shim in U-Boot (which forwards information created by the
proprietary bootloader to the kernel) and the attached patches to test
this on Jetson TX1, Jetson TX2 and Jetson AGX Xavier.

Note that there will be no new releases of the bootloader for earlier
devices, so adding support for these new DT bindings will not be
practical. The bootloaders on those devices do pass information about
the active framebuffer via the kernel command-line, so we may want to
add code to create reserved regions in the IOMMU based on that.

Thierry

Navneet Kumar (1):
  iommu/tegra-smmu: Support managed domains

Thierry Reding (4):
  dt-bindings: reserved-memory: Document memory region specifier
  iommu: Implement of_iommu_get_resv_regions()
  iommu: dma: Use of_iommu_get_resv_regions()
  iommu/tegra-smmu: Add support for reserved regions

 .../reserved-memory/reserved-memory.txt       |  21 +++
 drivers/iommu/dma-iommu.c                     |   3 +
 drivers/iommu/of_iommu.c                      |  54 ++++++++
 drivers/iommu/tegra-smmu.c                    | 121 +++++++++++++++---
 include/dt-bindings/reserved-memory.h         |   8 ++
 include/linux/of_iommu.h                      |   8 ++
 6 files changed, 199 insertions(+), 16 deletions(-)
 create mode 100644 include/dt-bindings/reserved-memory.h

-- 
2.30.2



* [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-04-23 16:32 [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions Thierry Reding
@ 2021-04-23 16:32 ` Thierry Reding
  2021-05-20 22:03   ` Rob Herring
  2021-04-23 16:32 ` [PATCH v2 2/5] iommu: Implement of_iommu_get_resv_regions() Thierry Reding
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 41+ messages in thread
From: Thierry Reding @ 2021-04-23 16:32 UTC (permalink / raw)
  To: Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	Dmitry Osipenko, devicetree, iommu, linux-tegra

From: Thierry Reding <treding@nvidia.com>

Reserved memory region phandle references can be accompanied by a
specifier that provides additional information about how that specific
reference should be treated.

One use-case is to mark a memory region as needing an identity mapping
in the system's IOMMU for the device that references the region. This is
needed for example when the bootloader has set up hardware (such as a
display controller) to actively access a memory region (e.g. a boot
splash screen framebuffer) during boot. The operating system can use the
identity mapping flag from the specifier to make sure an IOMMU identity
mapping is set up for the framebuffer before IOMMU translations are
enabled for the display controller.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
 include/dt-bindings/reserved-memory.h         |  8 +++++++
 2 files changed, 29 insertions(+)
 create mode 100644 include/dt-bindings/reserved-memory.h

diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
index e8d3096d922c..e9c2f80b441f 100644
--- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
+++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
@@ -52,6 +52,11 @@ compatible (optional) - standard definition
           be used by an operating system to instantiate the necessary pool
           management subsystem if necessary.
         - vendor specific string in the form <vendor>,[<device>-]<usage>
+#memory-region-cells (optional) -
+    - Defines how many cells are used to form the memory region specifier.
+      The memory region specifier contains additional information on how a
+      reserved memory region referenced by the corresponding phandle will
+      be used in a specific context.
 no-map (optional) - empty property
     - Indicates the operating system must not create a virtual mapping
       of the region as part of its standard mapping of system memory,
@@ -83,6 +88,22 @@ memory-region (optional) - phandle, specifier pairs to children of /reserved-mem
 memory-region-names (optional) - a list of names, one for each corresponding
   entry in the memory-region property
 
+Reserved memory region references can be accompanied by a memory region
+specifier, which provides additional information about how the memory region
+will be used in that specific context. If a reserved memory region does not
+have the #memory-region-cells property, 0 is implied and no information
+besides the phandle is conveyed. For reserved memory regions that contain
+#memory-region-cells = <1>, the following encoding applies if not otherwise
+overridden by the bindings selected by the region's compatible string:
+
+  - bit 0: If set, requests that the region be identity mapped if the system
+    uses an IOMMU for I/O virtual address translations. This is used, for
+    example, when a bootloader has configured a display controller to display
+    a boot splash. Once the OS takes over and enables the IOMMU for the given
+    display controller, the IOMMU may fault if the framebuffer hasn't been
+    mapped to the IOMMU at the address that the display controller tries to
+    access.
+
 Example
 -------
 This example defines 3 contiguous regions are defined for Linux kernel:
diff --git a/include/dt-bindings/reserved-memory.h b/include/dt-bindings/reserved-memory.h
new file mode 100644
index 000000000000..174ca3448342
--- /dev/null
+++ b/include/dt-bindings/reserved-memory.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: (GPL-2.0+ OR MIT) */
+
+#ifndef _DT_BINDINGS_RESERVED_MEMORY_H
+#define _DT_BINDINGS_RESERVED_MEMORY_H
+
+#define MEMORY_REGION_IDENTITY_MAPPING 0x1
+
+#endif
-- 
2.30.2



* [PATCH v2 2/5] iommu: Implement of_iommu_get_resv_regions()
  2021-04-23 16:32 [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions Thierry Reding
  2021-04-23 16:32 ` [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier Thierry Reding
@ 2021-04-23 16:32 ` Thierry Reding
  2021-07-02 14:05   ` Dmitry Osipenko
  2021-04-23 16:32 ` [PATCH v2 3/5] iommu: dma: Use of_iommu_get_resv_regions() Thierry Reding
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 41+ messages in thread
From: Thierry Reding @ 2021-04-23 16:32 UTC (permalink / raw)
  To: Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	Dmitry Osipenko, devicetree, iommu, linux-tegra, Frank Rowand,
	Rob Herring

From: Thierry Reding <treding@nvidia.com>

This is an implementation that IOMMU drivers can use to obtain reserved
memory regions from a device tree node. It uses the reserved-memory DT
bindings to find the regions associated with a given device. If these
regions are marked accordingly, identity mappings will be created for
them in the IOMMU domain that the devices will be attached to.

Cc: Frank Rowand <frowand.list@gmail.com>
Cc: devicetree@vger.kernel.org
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
Changes in v3:
- change "active" property to identity mapping flag that is part of the
  memory region specifier (as defined by #memory-region-cells) to allow
  per-reference flags to be used

Changes in v2:
- use "active" property to determine whether direct mappings are needed
---
 drivers/iommu/of_iommu.c | 54 ++++++++++++++++++++++++++++++++++++++++
 include/linux/of_iommu.h |  8 ++++++
 2 files changed, 62 insertions(+)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index a9d2df001149..321ebd5fdaba 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -11,12 +11,15 @@
 #include <linux/module.h>
 #include <linux/msi.h>
 #include <linux/of.h>
+#include <linux/of_address.h>
 #include <linux/of_iommu.h>
 #include <linux/of_pci.h>
 #include <linux/pci.h>
 #include <linux/slab.h>
 #include <linux/fsl/mc.h>
 
+#include <dt-bindings/reserved-memory.h>
+
 #define NO_IOMMU	1
 
 /**
@@ -240,3 +243,54 @@ const struct iommu_ops *of_iommu_configure(struct device *dev,
 
 	return ops;
 }
+
+/**
+ * of_iommu_get_resv_regions - reserved region driver helper for device tree
+ * @dev: device for which to get reserved regions
+ * @list: reserved region list
+ *
+ * IOMMU drivers can use this to implement their .get_resv_regions() callback
+ * for memory regions attached to a device tree node. See the reserved-memory
+ * device tree bindings on how to use these:
+ *
+ *   Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
+ */
+void of_iommu_get_resv_regions(struct device *dev, struct list_head *list)
+{
+	struct of_phandle_iterator it;
+	int err;
+
+	of_for_each_phandle(&it, err, dev->of_node, "memory-region", "#memory-region-cells", 0) {
+		struct iommu_resv_region *region;
+		struct of_phandle_args args;
+		struct resource res;
+
+		args.args_count = of_phandle_iterator_args(&it, args.args, MAX_PHANDLE_ARGS);
+
+		err = of_address_to_resource(it.node, 0, &res);
+		if (err < 0) {
+			dev_err(dev, "failed to parse memory region %pOF: %d\n",
+				it.node, err);
+			continue;
+		}
+
+		if (args.args_count > 0) {
+			/*
+			 * Active memory regions are expected to be accessed by hardware during
+			 * boot and must therefore have an identity mapping created prior to the
+			 * driver taking control of the hardware. This ensures that non-quiescent
+			 * hardware doesn't cause IOMMU faults during boot.
+			 */
+			if (args.args[0] & MEMORY_REGION_IDENTITY_MAPPING) {
+				region = iommu_alloc_resv_region(res.start, resource_size(&res),
+								 IOMMU_READ | IOMMU_WRITE,
+								 IOMMU_RESV_DIRECT_RELAXABLE);
+				if (!region)
+					continue;
+
+				list_add_tail(&region->list, list);
+			}
+		}
+	}
+}
+EXPORT_SYMBOL(of_iommu_get_resv_regions);
diff --git a/include/linux/of_iommu.h b/include/linux/of_iommu.h
index 16f4b3e87f20..8412437acaac 100644
--- a/include/linux/of_iommu.h
+++ b/include/linux/of_iommu.h
@@ -16,6 +16,9 @@ extern const struct iommu_ops *of_iommu_configure(struct device *dev,
 					struct device_node *master_np,
 					const u32 *id);
 
+extern void of_iommu_get_resv_regions(struct device *dev,
+				      struct list_head *list);
+
 #else
 
 static inline int of_get_dma_window(struct device_node *dn, const char *prefix,
@@ -32,6 +35,11 @@ static inline const struct iommu_ops *of_iommu_configure(struct device *dev,
 	return NULL;
 }
 
+static inline void of_iommu_get_resv_regions(struct device *dev,
+					     struct list_head *list)
+{
+}
+
 #endif	/* CONFIG_OF_IOMMU */
 
 #endif /* __OF_IOMMU_H */
-- 
2.30.2



* [PATCH v2 3/5] iommu: dma: Use of_iommu_get_resv_regions()
  2021-04-23 16:32 [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions Thierry Reding
  2021-04-23 16:32 ` [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier Thierry Reding
  2021-04-23 16:32 ` [PATCH v2 2/5] iommu: Implement of_iommu_get_resv_regions() Thierry Reding
@ 2021-04-23 16:32 ` Thierry Reding
  2021-04-23 16:32 ` [PATCH v2 4/5] iommu/tegra-smmu: Add support for reserved regions Thierry Reding
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 41+ messages in thread
From: Thierry Reding @ 2021-04-23 16:32 UTC (permalink / raw)
  To: Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	Dmitry Osipenko, devicetree, iommu, linux-tegra, Frank Rowand

From: Thierry Reding <treding@nvidia.com>

For device tree nodes, use the standard of_iommu_get_resv_regions()
implementation to obtain the reserved memory regions associated with a
device.

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: devicetree@vger.kernel.org
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/iommu/dma-iommu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 7bcdd1205535..52b424176241 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -19,6 +19,7 @@
 #include <linux/irq.h>
 #include <linux/mm.h>
 #include <linux/mutex.h>
+#include <linux/of_iommu.h>
 #include <linux/pci.h>
 #include <linux/swiotlb.h>
 #include <linux/scatterlist.h>
@@ -190,6 +191,8 @@ void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
 	if (!is_of_node(dev_iommu_fwspec_get(dev)->iommu_fwnode))
 		iort_iommu_msi_get_resv_regions(dev, list);
 
+	if (dev->of_node)
+		of_iommu_get_resv_regions(dev, list);
 }
 EXPORT_SYMBOL(iommu_dma_get_resv_regions);
 
-- 
2.30.2



* [PATCH v2 4/5] iommu/tegra-smmu: Add support for reserved regions
  2021-04-23 16:32 [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions Thierry Reding
                   ` (2 preceding siblings ...)
  2021-04-23 16:32 ` [PATCH v2 3/5] iommu: dma: Use of_iommu_get_resv_regions() Thierry Reding
@ 2021-04-23 16:32 ` Thierry Reding
  2021-04-23 16:32 ` [PATCH v2 5/5] iommu/tegra-smmu: Support managed domains Thierry Reding
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 41+ messages in thread
From: Thierry Reding @ 2021-04-23 16:32 UTC (permalink / raw)
  To: Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	Dmitry Osipenko, devicetree, iommu, linux-tegra

From: Thierry Reding <treding@nvidia.com>

The Tegra DRM driver currently uses the IOMMU API explicitly. This means
that it has fine-grained control over when exactly the translation
through the IOMMU is enabled. This currently happens after the driver
probes, so the driver is in a DMA quiesced state when the IOMMU
translation is enabled.

During the transition of the Tegra DRM driver to use the DMA API instead
of the IOMMU API explicitly, it was observed that on certain platforms
the display controllers were still actively fetching from memory. When a
DMA IOMMU domain is created as part of the DMA/IOMMU API setup during
boot, the IOMMU translation for the display controllers can be enabled a
significant amount of time before the driver has had a chance to reset
the hardware into a sane state. This causes the SMMU to detect faults on
the addresses that the display controller is trying to fetch.

To avoid this, and as a byproduct paving the way for seamless transition
of display from the bootloader to the kernel, add support for reserved
regions in the Tegra SMMU driver. This is implemented using the standard
reserved memory device tree bindings, which let us describe regions of
memory which the kernel is forbidden from using for regular allocations.
The Tegra SMMU driver will parse the nodes associated with each device
via the "memory-region" property and return reserved regions that the
IOMMU core will then create direct mappings for prior to attaching the
IOMMU domains to the devices. This ensures that a 1:1 mapping is in
place when IOMMU translation starts and prevents the SMMU from detecting
any faults.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/iommu/tegra-smmu.c | 76 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 0a281833f611..6bf7654371c5 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -10,6 +10,7 @@
 #include <linux/kernel.h>
 #include <linux/of.h>
 #include <linux/of_device.h>
+#include <linux/of_iommu.h>
 #include <linux/pci.h>
 #include <linux/platform_device.h>
 #include <linux/slab.h>
@@ -539,6 +540,38 @@ static void tegra_smmu_set_pde(struct tegra_smmu_as *as, unsigned long iova,
 	struct tegra_smmu *smmu = as->smmu;
 	u32 *pd = page_address(as->pd);
 	unsigned long offset = pd_index * sizeof(*pd);
+	bool unmap = false;
+
+	/*
+	 * XXX Move this outside of this function. Perhaps add a struct
+	 * iommu_domain parameter to ->{get,put}_resv_regions() so that
+	 * the mapping can be done there.
+	 *
+	 * The problem here is that as->smmu is only known once we attach
+	 * the domain to a device (because then we look up the right SMMU
+	 * instance via the dev->archdata.iommu pointer). When the direct
+	 * mappings are created for reserved regions, the domain has not
+	 * been attached to a device yet, so we don't know. We currently
+	 * fix that up in ->apply_resv_regions() because that is the first
+	 * time where we have access to a struct device that will be used
+	 * with the IOMMU domain. However, that's asymmetric and doesn't
+	 * take care of the page directory mapping either, so we need to
+	 * come up with something better.
+	 */
+	if (as->pd_dma == 0) {
+		as->pd_dma = dma_map_page(smmu->dev, as->pd, 0, SMMU_SIZE_PD,
+					  DMA_TO_DEVICE);
+		if (dma_mapping_error(smmu->dev, as->pd_dma))
+			return;
+
+		if (!smmu_dma_addr_valid(smmu, as->pd_dma)) {
+			dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD,
+				       DMA_TO_DEVICE);
+			return;
+		}
+
+		unmap = true;
+	}
 
 	/* Set the page directory entry first */
 	pd[pd_index] = value;
@@ -551,6 +584,12 @@ static void tegra_smmu_set_pde(struct tegra_smmu_as *as, unsigned long iova,
 	smmu_flush_ptc(smmu, as->pd_dma, offset);
 	smmu_flush_tlb_section(smmu, as->id, iova);
 	smmu_flush(smmu);
+
+	if (unmap) {
+		dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD,
+			       DMA_TO_DEVICE);
+		as->pd_dma = 0;
+	}
 }
 
 static u32 *tegra_smmu_pte_offset(struct page *pt_page, unsigned long iova)
@@ -945,6 +984,40 @@ static struct iommu_group *tegra_smmu_device_group(struct device *dev)
 	return group->group;
 }
 
+static void tegra_smmu_apply_resv_region(struct device *dev,
+					 struct iommu_domain *domain,
+					 struct iommu_resv_region *region)
+{
+	struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
+	struct tegra_smmu_as *as = to_smmu_as(domain);
+
+	/*
+	 * ->attach_dev() may not have been called yet at this point, so the
+	 * address space may not have been associated with an SMMU instance.
+	 * Set up the association here to make sure subsequent code can rely
+	 * on the SMMU instance being known.
+	 *
+	 * Also make sure that the SMMU instance doesn't conflict if an SMMU
+	 * has been associated with the address space already. This can happen
+	 * if a domain is shared between multiple devices.
+	 *
+	 * Note that this is purely theoretic because there are no known SoCs
+	 * with multiple instances of this SMMU.
+	 *
+	 * XXX Deal with this elsewhere. One possibility would be to pass the
+	 * struct iommu_domain that we're operating on to ->get_resv_regions()
+	 * and ->put_resv_regions() so that the connection between it and the
+	 * struct device (in order to find the SMMU instance) can already be
+	 * established at that time. This would be nicely symmetric because a
+	 * ->put_resv_regions() could undo that again so that ->attach_dev()
+	 * could start from a clean slate.
+	 */
+	if (as->smmu && as->smmu != smmu)
+		WARN(1, "conflicting SMMU instances\n");
+
+	as->smmu = smmu;
+}
+
 static int tegra_smmu_of_xlate(struct device *dev,
 			       struct of_phandle_args *args)
 {
@@ -978,6 +1051,9 @@ static const struct iommu_ops tegra_smmu_ops = {
 	.map = tegra_smmu_map,
 	.unmap = tegra_smmu_unmap,
 	.iova_to_phys = tegra_smmu_iova_to_phys,
+	.get_resv_regions = of_iommu_get_resv_regions,
+	.put_resv_regions = generic_iommu_put_resv_regions,
+	.apply_resv_region = tegra_smmu_apply_resv_region,
 	.of_xlate = tegra_smmu_of_xlate,
 	.pgsize_bitmap = SZ_4K,
 };
-- 
2.30.2



* [PATCH v2 5/5] iommu/tegra-smmu: Support managed domains
  2021-04-23 16:32 [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions Thierry Reding
                   ` (3 preceding siblings ...)
  2021-04-23 16:32 ` [PATCH v2 4/5] iommu/tegra-smmu: Add support for reserved regions Thierry Reding
@ 2021-04-23 16:32 ` Thierry Reding
  2021-10-11 23:25   ` Dmitry Osipenko
  2021-04-24  7:26 ` [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions Dmitry Osipenko
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 41+ messages in thread
From: Thierry Reding @ 2021-04-23 16:32 UTC (permalink / raw)
  To: Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	Dmitry Osipenko, devicetree, iommu, linux-tegra

From: Navneet Kumar <navneetk@nvidia.com>

Allow creating identity and DMA API compatible IOMMU domains. When
creating a DMA API compatible domain, make sure to also create the
required cookie.

Signed-off-by: Navneet Kumar <navneetk@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/iommu/tegra-smmu.c | 47 ++++++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 6bf7654371c5..40647e1f03ae 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -16,6 +16,7 @@
 #include <linux/slab.h>
 #include <linux/spinlock.h>
 #include <linux/dma-mapping.h>
+#include <linux/dma-iommu.h>
 
 #include <soc/tegra/ahb.h>
 #include <soc/tegra/mc.h>
@@ -281,8 +282,11 @@ static bool tegra_smmu_capable(enum iommu_cap cap)
 static struct iommu_domain *tegra_smmu_domain_alloc(unsigned type)
 {
 	struct tegra_smmu_as *as;
+	int ret;
 
-	if (type != IOMMU_DOMAIN_UNMANAGED)
+	if (type != IOMMU_DOMAIN_UNMANAGED &&
+	    type != IOMMU_DOMAIN_DMA &&
+	    type != IOMMU_DOMAIN_IDENTITY)
 		return NULL;
 
 	as = kzalloc(sizeof(*as), GFP_KERNEL);
@@ -291,26 +295,23 @@ static struct iommu_domain *tegra_smmu_domain_alloc(unsigned type)
 
 	as->attr = SMMU_PD_READABLE | SMMU_PD_WRITABLE | SMMU_PD_NONSECURE;
 
-	as->pd = alloc_page(GFP_KERNEL | __GFP_DMA | __GFP_ZERO);
-	if (!as->pd) {
-		kfree(as);
-		return NULL;
+	if (type == IOMMU_DOMAIN_DMA) {
+		ret = iommu_get_dma_cookie(&as->domain);
+		if (ret)
+			goto free_as;
 	}
 
+	as->pd = alloc_page(GFP_KERNEL | __GFP_DMA | __GFP_ZERO);
+	if (!as->pd)
+		goto put_dma_cookie;
+
 	as->count = kcalloc(SMMU_NUM_PDE, sizeof(u32), GFP_KERNEL);
-	if (!as->count) {
-		__free_page(as->pd);
-		kfree(as);
-		return NULL;
-	}
+	if (!as->count)
+		goto free_pd_range;
 
 	as->pts = kcalloc(SMMU_NUM_PDE, sizeof(*as->pts), GFP_KERNEL);
-	if (!as->pts) {
-		kfree(as->count);
-		__free_page(as->pd);
-		kfree(as);
-		return NULL;
-	}
+	if (!as->pts)
+		goto free_count;
 
 	spin_lock_init(&as->lock);
 
@@ -320,6 +321,18 @@ static struct iommu_domain *tegra_smmu_domain_alloc(unsigned type)
 	as->domain.geometry.force_aperture = true;
 
 	return &as->domain;
+
+free_count:
+	kfree(as->count);
+free_pd_range:
+	__free_page(as->pd);
+put_dma_cookie:
+	if (type == IOMMU_DOMAIN_DMA)
+		iommu_put_dma_cookie(&as->domain);
+free_as:
+	kfree(as);
+
+	return NULL;
 }
 
 static void tegra_smmu_domain_free(struct iommu_domain *domain)
@@ -1051,7 +1064,7 @@ static const struct iommu_ops tegra_smmu_ops = {
 	.map = tegra_smmu_map,
 	.unmap = tegra_smmu_unmap,
 	.iova_to_phys = tegra_smmu_iova_to_phys,
-	.get_resv_regions = of_iommu_get_resv_regions,
+	.get_resv_regions = iommu_dma_get_resv_regions,
 	.put_resv_regions = generic_iommu_put_resv_regions,
 	.apply_resv_region = tegra_smmu_apply_resv_region,
 	.of_xlate = tegra_smmu_of_xlate,
-- 
2.30.2



* Re: [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions
  2021-04-23 16:32 [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions Thierry Reding
                   ` (4 preceding siblings ...)
  2021-04-23 16:32 ` [PATCH v2 5/5] iommu/tegra-smmu: Support managed domains Thierry Reding
@ 2021-04-24  7:26 ` Dmitry Osipenko
  2021-04-27 18:30   ` Krishna Reddy
  2021-04-28  5:51 ` Dmitry Osipenko
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 41+ messages in thread
From: Dmitry Osipenko @ 2021-04-24  7:26 UTC (permalink / raw)
  To: Thierry Reding, Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, iommu, linux-tegra

23.04.2021 19:32, Thierry Reding wrote:
> Hi,
> 
> this is an updated proposal to solve the problem of passing memory
> regions that are actively being accessed during boot. The particular
> use-case that I need this for is when the bootloader has set up the
> display controller to scan out a boot splash screen. During boot the
> DMA/IOMMU glue code will attach devices to an IOMMU domain and by
> doing so enable IOMMU translations. Typically this will be before a
> device driver has had a chance to either disable the display
> controller or set up a new framebuffer and map it to the IOMMU.

Hello Thierry,

Is it always safe to enable SMMU ASID in the middle of a DMA request made
by a memory client?

The memory controller supports blocking DMA requests, which we are
already using for the memory hot-resetting. A block could be needed
before ASID is toggled. This needs to be clarified.


* RE: [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions
  2021-04-24  7:26 ` [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions Dmitry Osipenko
@ 2021-04-27 18:30   ` Krishna Reddy
  2021-04-28  5:44     ` Dmitry Osipenko
  0 siblings, 1 reply; 41+ messages in thread
From: Krishna Reddy @ 2021-04-27 18:30 UTC (permalink / raw)
  To: Dmitry Osipenko, Thierry Reding, Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, devicetree, iommu, linux-tegra

> Is it always safe to enable SMMU ASID in the middle of a DMA request made by a
> memory client?

From the MC's point of view, it is safe to enable, and it has been done this way for many years in downstream code for the display engine.
It doesn't impact transactions that have already bypassed the SMMU before the SMMU ASID is enabled.
Transactions that have yet to pass the SMMU stage go through the SMMU once the SMMU ASID is enabled and visible.

-KR



* Re: [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions
  2021-04-27 18:30   ` Krishna Reddy
@ 2021-04-28  5:44     ` Dmitry Osipenko
  2021-04-29  5:51       ` Krishna Reddy
  0 siblings, 1 reply; 41+ messages in thread
From: Dmitry Osipenko @ 2021-04-28  5:44 UTC (permalink / raw)
  To: Krishna Reddy, Thierry Reding, Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, devicetree, iommu, linux-tegra

27.04.2021 21:30, Krishna Reddy wrote:
>> Is it always safe to enable SMMU ASID in the middle of a DMA request made by a
>> memory client?
> 
> From the MC's point of view, it is safe to enable, and it has been done this way for many years in downstream code for the display engine.
> It doesn't impact transactions that have already bypassed the SMMU before the SMMU ASID is enabled.
> Transactions that have yet to pass the SMMU stage go through the SMMU once the SMMU ASID is enabled and visible.

Hello,

Thank you for the answer. Could you please give more information about:

1) Are you on the software or hardware team, or both?

2) Is the SMMU third-party IP or developed in-house?

3) Do you have direct access to the HDL sources? Are you 100% sure that
the hardware does what you say?

4) What happens when the CPU writes to the ASID register? Does the SMMU state
machine latch the ASID status (making it visible) only at a single "safe" point?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions
  2021-04-23 16:32 [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions Thierry Reding
                   ` (5 preceding siblings ...)
  2021-04-24  7:26 ` [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions Dmitry Osipenko
@ 2021-04-28  5:51 ` Dmitry Osipenko
  2021-04-28  5:57   ` Mikko Perttunen
  2021-04-28  5:59 ` Dmitry Osipenko
  2021-10-03  1:09 ` Dmitry Osipenko
  8 siblings, 1 reply; 41+ messages in thread
From: Dmitry Osipenko @ 2021-04-28  5:51 UTC (permalink / raw)
  To: Thierry Reding, Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, iommu, linux-tegra

23.04.2021 19:32, Thierry Reding wrote:
> Note that there will be no new releases of the bootloader for earlier
> devices, so adding support for these new DT bindings will not be
> practical. The bootloaders on those devices do pass information about
> the active framebuffer via the kernel command-line, so we may want to
> add code to create reserved regions in the IOMMU based on that.

Since this change requires a bootloader update anyway, why isn't it
possible to fix the bootloader properly, making it stop all DMA
activity before jumping into the kernel?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions
  2021-04-28  5:51 ` Dmitry Osipenko
@ 2021-04-28  5:57   ` Mikko Perttunen
  2021-04-28  7:55     ` Dmitry Osipenko
  0 siblings, 1 reply; 41+ messages in thread
From: Mikko Perttunen @ 2021-04-28  5:57 UTC (permalink / raw)
  To: Dmitry Osipenko, Thierry Reding, Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, iommu, linux-tegra

On 4/28/21 8:51 AM, Dmitry Osipenko wrote:
> 23.04.2021 19:32, Thierry Reding wrote:
>> Note that there will be no new releases of the bootloader for earlier
>> devices, so adding support for these new DT bindings will not be
>> practical. The bootloaders on those devices do pass information about
>> the active framebuffer via the kernel command-line, so we may want to
>> add code to create reserved regions in the IOMMU based on that.
> 
> Since this change requires a bootloader update anyway, why isn't it
> possible to fix the bootloader properly, making it stop all DMA
> activity before jumping into the kernel?
> 

That is not desirable, as then we couldn't have seamless 
bootloader-kernel boot splash transition.

Mikko

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions
  2021-04-23 16:32 [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions Thierry Reding
                   ` (6 preceding siblings ...)
  2021-04-28  5:51 ` Dmitry Osipenko
@ 2021-04-28  5:59 ` Dmitry Osipenko
  2021-10-03  1:09 ` Dmitry Osipenko
  8 siblings, 0 replies; 41+ messages in thread
From: Dmitry Osipenko @ 2021-04-28  5:59 UTC (permalink / raw)
  To: Thierry Reding, Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, iommu, linux-tegra

23.04.2021 19:32, Thierry Reding wrote:
> Note that an earlier proposal was to use the existing simple-framebuffer
> device tree bindings to transport this information. Unfortunately there
> are cases where this is not enough. On Tegra SoCs, for example, the
> bootloader will also set up a color space correction lookup table in the
> system memory that the display controller will access during boot,
> alongside the framebuffer. The simple-framebuffer DT bindings have no
> way of describing this (and I guess one could argue that this particular
> setup no longer is a "simple" framebuffer), so the above, more flexible
> proposal was implemented.

Will simple-framebuffer be able to use that reserved region
transparently? Or will it require a custom simple-framebuffer driver?

Could we make simple-framebuffer support a part of this series?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions
  2021-04-28  5:57   ` Mikko Perttunen
@ 2021-04-28  7:55     ` Dmitry Osipenko
  0 siblings, 0 replies; 41+ messages in thread
From: Dmitry Osipenko @ 2021-04-28  7:55 UTC (permalink / raw)
  To: Mikko Perttunen, Thierry Reding, Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, iommu, linux-tegra

28.04.2021 08:57, Mikko Perttunen wrote:
> On 4/28/21 8:51 AM, Dmitry Osipenko wrote:
>> 23.04.2021 19:32, Thierry Reding wrote:
>>> Note that there will be no new releases of the bootloader for earlier
>>> devices, so adding support for these new DT bindings will not be
>>> practical. The bootloaders on those devices do pass information about
>>> the active framebuffer via the kernel command-line, so we may want to
>>> add code to create reserved regions in the IOMMU based on that.
>>
>> Since this change requires a bootloader update anyway, why isn't it
>> possible to fix the bootloader properly, making it stop all DMA
>> activity before jumping into the kernel?
>>
> 
> That is not desirable, as then we couldn't have seamless
> bootloader-kernel boot splash transition.

A seamless transition is more complicated, since it requires reading out
the hardware state in order to convert it into DRM state, and the display
panel needs to stay on. It's a bit questionable whether this is really
needed; so far this is not achievable in mainline.

Nevertheless, it will be good to have an early simple-framebuffer, which
I realized only after sending out the message.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions
  2021-04-28  5:44     ` Dmitry Osipenko
@ 2021-04-29  5:51       ` Krishna Reddy
  2021-04-29 12:43         ` Dmitry Osipenko
  0 siblings, 1 reply; 41+ messages in thread
From: Krishna Reddy @ 2021-04-29  5:51 UTC (permalink / raw)
  To: Dmitry Osipenko, Thierry Reding, Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, devicetree, iommu, linux-tegra

Hi Dmitry,

> Thank you for the answer. Could you please give more information about:
> 1) Are you on the software or hardware team, or both?

I am on the software team and contributed to the initial downstream Tegra SMMU driver, along with earlier team member Hiroshi Doyu.

> 2) Is the SMMU third-party IP or developed in-house?

Tegra SMMU is developed in-house.

> 3) Do you have direct access to the HDL sources? Are you 100% sure that
> the hardware does what you say?

This was discussed with the hardware team before, and again today.
Enabling the ASID for the display engine while it continues to access the buffer memory is a safe operation.
According to the HW team, the only possible side effect is additional transaction latency while the SMMU caches warm up.

> 4) What happens when the CPU writes to the ASID register? Does the SMMU state
> machine latch the ASID status (making it visible) only at a single "safe" point?

The MC decides whether to route a transaction through the SMMU page tables or to bypass
them based on the ASID register value. It checks the ASID register only once per transaction in the pipeline.
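
As a rough illustration of that answer, here is a toy userspace model, not
a description of the actual hardware. The names mc_model, route_transaction
and count_bypassed are hypothetical, and a single boolean stands in for the
per-client ASID enable bit. It only shows that a decision sampled once per
transaction leaves already-routed transactions unaffected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one flag stands in for the per-client SMMU ASID enable bit. */
struct mc_model {
	bool asid_enabled;
};

enum route { ROUTE_BYPASS, ROUTE_SMMU };

/* The routing decision is sampled exactly once, when the transaction reaches this stage. */
enum route route_transaction(const struct mc_model *mc)
{
	assert(mc != NULL);
	return mc->asid_enabled ? ROUTE_SMMU : ROUTE_BYPASS;
}

/*
 * Route `count` transactions, enabling the ASID just before the transaction
 * with index `enable_at`. Returns how many transactions bypassed the SMMU.
 */
size_t count_bypassed(struct mc_model *mc, size_t count, size_t enable_at)
{
	size_t bypassed = 0;

	for (size_t i = 0; i < count; i++) {
		if (i == enable_at)
			mc->asid_enabled = true;
		if (route_transaction(mc) == ROUTE_BYPASS)
			bypassed++;
	}
	return bypassed;
}
```

In this model, the transactions routed before the enable point keep their
bypass decision, and every later one goes through the SMMU.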

-KR

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions
  2021-04-29  5:51       ` Krishna Reddy
@ 2021-04-29 12:43         ` Dmitry Osipenko
  0 siblings, 0 replies; 41+ messages in thread
From: Dmitry Osipenko @ 2021-04-29 12:43 UTC (permalink / raw)
  To: Krishna Reddy, Thierry Reding, Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, devicetree, iommu, linux-tegra

29.04.2021 08:51, Krishna Reddy wrote:
> Hi Dmitry,
> 
>> Thank you for the answer. Could you please give more information about:
>> 1) Are you on the software or hardware team, or both?
> 
> I am on the software team and contributed to the initial downstream Tegra SMMU driver, along with earlier team member Hiroshi Doyu.
> 
>> 2) Is the SMMU third-party IP or developed in-house?
> 
> Tegra SMMU is developed in-house.
> 
>> 3) Do you have direct access to the HDL sources? Are you 100% sure that
>> the hardware does what you say?
> 
> This was discussed with the hardware team before, and again today.
> Enabling the ASID for the display engine while it continues to access the buffer memory is a safe operation.
> According to the HW team, the only possible side effect is additional transaction latency while the SMMU caches warm up.
> 
>> 4) What happens when the CPU writes to the ASID register? Does the SMMU state
>> machine latch the ASID status (making it visible) only at a single "safe" point?
> 
> The MC decides whether to route a transaction through the SMMU page tables or to bypass
> them based on the ASID register value. It checks the ASID register only once per transaction in the pipeline.

Thank you very much for the clarification.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-04-23 16:32 ` [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier Thierry Reding
@ 2021-05-20 22:03   ` Rob Herring
  2021-05-28 16:54     ` Thierry Reding
  0 siblings, 1 reply; 41+ messages in thread
From: Rob Herring @ 2021-05-20 22:03 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Joerg Roedel, Will Deacon, Robin Murphy, Nicolin Chen,
	Krishna Reddy, Dmitry Osipenko, devicetree, iommu, linux-tegra

On Fri, Apr 23, 2021 at 06:32:30PM +0200, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Reserved memory region phandle references can be accompanied by a
> specifier that provides additional information about how that specific
> reference should be treated.
> 
> One use-case is to mark a memory region as needing an identity mapping
> in the system's IOMMU for the device that references the region. This is
> needed for example when the bootloader has set up hardware (such as a
> display controller) to actively access a memory region (e.g. a boot
> splash screen framebuffer) during boot. The operating system can use the
> identity mapping flag from the specifier to make sure an IOMMU identity
> mapping is set up for the framebuffer before IOMMU translations are
> enabled for the display controller.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
>  include/dt-bindings/reserved-memory.h         |  8 +++++++
>  2 files changed, 29 insertions(+)
>  create mode 100644 include/dt-bindings/reserved-memory.h

Sorry for being slow on this. I have 2 concerns.

First, this creates an ABI issue. A DT with cells in 'memory-region' 
will not be understood by an existing OS. I'm less concerned about this 
if we address that with a stable fix. (Though I'm pretty sure we've 
naively added #?-cells in the past ignoring this issue.)
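
To make the ABI concern concrete, here is a minimal userspace sketch (not
kernel code). It assumes a flattened property of host-endian 32-bit cells;
nth_phandle and the sample phandle values are hypothetical. A parser that
assumes zero specifier cells, as a bare phandle list implies, misreads a
flag cell as the next phandle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the phandle of entry `index` in a property laid out as
 * <phandle specifier...> tuples with `spec_cells` specifier cells each,
 * or 0 (never a valid phandle) if the entry does not exist.
 */
uint32_t nth_phandle(const uint32_t *prop, size_t ncells,
		     size_t spec_cells, size_t index)
{
	size_t stride = 1 + spec_cells;
	size_t offset = index * stride;

	assert(prop != NULL);
	if (offset >= ncells)
		return 0;
	return prop[offset];
}
```

With the cells { 0x5, 0x1, 0x6, 0x0 }, a parser that knows about one
specifier cell finds phandle 0x6 at index 1, whereas one that assumes none
returns the flag value 0x1 instead.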

Second, it could be the bootloader setting up the reserved region. If a 
node already has 'memory-region', then adding more regions is more 
complicated compared to adding new properties. And defining what each 
memory-region entry is or how many in schemas is impossible.

Both could be addressed with a new property. Perhaps something like 
'iommu-memory-region = <&phandle>;'. I think the 'iommu' prefix is 
appropriate given this is entirely because of the IOMMU being in the 
mix. I might feel differently if we had other uses for cells, but I 
don't really see it in this case. 

Rob

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-05-20 22:03   ` Rob Herring
@ 2021-05-28 16:54     ` Thierry Reding
  2021-06-08 16:51       ` Thierry Reding
  0 siblings, 1 reply; 41+ messages in thread
From: Thierry Reding @ 2021-05-28 16:54 UTC (permalink / raw)
  To: Rob Herring
  Cc: Joerg Roedel, Will Deacon, Robin Murphy, Nicolin Chen,
	Krishna Reddy, Dmitry Osipenko, devicetree, iommu, linux-tegra

On Thu, May 20, 2021 at 05:03:06PM -0500, Rob Herring wrote:
> On Fri, Apr 23, 2021 at 06:32:30PM +0200, Thierry Reding wrote:
> > From: Thierry Reding <treding@nvidia.com>
> > 
> > Reserved memory region phandle references can be accompanied by a
> > specifier that provides additional information about how that specific
> > reference should be treated.
> > 
> > One use-case is to mark a memory region as needing an identity mapping
> > in the system's IOMMU for the device that references the region. This is
> > needed for example when the bootloader has set up hardware (such as a
> > display controller) to actively access a memory region (e.g. a boot
> > splash screen framebuffer) during boot. The operating system can use the
> > identity mapping flag from the specifier to make sure an IOMMU identity
> > mapping is set up for the framebuffer before IOMMU translations are
> > enabled for the display controller.
> > 
> > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > ---
> >  .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
> >  include/dt-bindings/reserved-memory.h         |  8 +++++++
> >  2 files changed, 29 insertions(+)
> >  create mode 100644 include/dt-bindings/reserved-memory.h
> 
> Sorry for being slow on this. I have 2 concerns.
> 
> First, this creates an ABI issue. A DT with cells in 'memory-region' 
> will not be understood by an existing OS. I'm less concerned about this 
> if we address that with a stable fix. (Though I'm pretty sure we've 
> naively added #?-cells in the past ignoring this issue.)

A while ago I had proposed adding memory-region*s* as an alternative
name for memory-region to make the naming more consistent with other
types of properties (think clocks, resets, gpios, ...). If we added
that, we could easily differentiate between the "legacy" cases where
no #memory-region-cells was allowed and the new cases where it was.

> Second, it could be the bootloader setting up the reserved region. If a 
> node already has 'memory-region', then adding more regions is more 
> complicated compared to adding new properties. And defining what each 
> memory-region entry is or how many in schemas is impossible.

It's true that updating the property gets a bit complicated, but it's
not exactly rocket science. We really just need to splice the array. I
have a working implementation for this in U-Boot.
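
For illustration only, splicing an extra <phandle specifier> tuple into an
existing cell array could look like the sketch below. This is not the
U-Boot implementation mentioned above: splice_cells is a hypothetical
helper, and it works on host-endian cells, whereas real flattened device
tree properties are big-endian and would be written back through libfdt.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of `cells` with `extra` spliced in at cell
 * index `pos`. The caller frees the result. Returns NULL if the allocation
 * fails.
 */
uint32_t *splice_cells(const uint32_t *cells, size_t count, size_t pos,
		       const uint32_t *extra, size_t extra_count)
{
	uint32_t *out;

	assert(cells != NULL && extra != NULL);
	assert(pos <= count);

	out = malloc((count + extra_count) * sizeof(*out));
	if (!out)
		return NULL;

	/* Cells before the splice point, the new tuple, then the remainder. */
	memcpy(out, cells, pos * sizeof(*out));
	memcpy(out + pos, extra, extra_count * sizeof(*out));
	memcpy(out + pos + extra_count, cells + pos,
	       (count - pos) * sizeof(*out));
	return out;
}
```

Appending is the case pos == count. Inserting at any other position only
works if pos falls on a tuple boundary, which the caller has to ensure.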

For what it's worth, we could run into the same issue with any new
property that we add. Even if we renamed this to iommu-memory-region,
it's still possible that a bootloader may have to update this property
if it already exists (it could be hard-coded in DT, or it could have
been added by some earlier bootloader or firmware).

> Both could be addressed with a new property. Perhaps something like 
> 'iommu-memory-region = <&phandle>;'. I think the 'iommu' prefix is 
> appropriate given this is entirely because of the IOMMU being in the 
> mix. I might feel differently if we had other uses for cells, but I 
> don't really see it in this case. 

I'm afraid that down the road we'll end up with other cases and then we
might proliferate a number of *-memory-region properties with varying
prefixes.

I am aware of one other case where we might need something like this: on
some Tegra SoCs we have audio processors that will access memory buffers
using a DMA engine. These processors are booted from early firmware
using firmware from system memory. In order to avoid trashing the
firmware, we need to reserve memory. We can do this using reserved
memory nodes. However, the audio DMA engine also uses the SMMU, so we
need to make sure that the firmware memory is marked as reserved within
the SMMU. This is similar to the identity mapping case, but not exactly
the same. Instead of creating a 1:1 mapping, we just want that IOVA
region to be reserved (i.e. IOMMU_RESV_RESERVED instead of
IOMMU_RESV_DIRECT{,_RELAXABLE}).

That would also fall into the IOMMU domain, but we can't reuse the
iommu-memory-region property for that because then we don't have enough
information to decide which type of reservation we need.

We could obviously make iommu-memory-region take a specifier, but we
could just as well use memory-regions in that case since we have
something more generic anyway.

With the #memory-region-cells proposal, we can easily extend the cell in
the specifier with an additional MEMORY_REGION_IOMMU_RESERVE flag to
take that other use case into account. If we then also change to the new
memory-regions property name, we avoid the ABI issue (and we gain a bit
of consistency while at it).
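
A minimal userspace sketch of how such a specifier cell might be decoded.
The flag values, the local resv_kind enum (standing in for the kernel's
IOMMU_RESV_* types) and the helper name are assumptions for illustration;
the precedence when both flags are set is arbitrary here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values: this message names the flags but does not define them. */
#define MEMORY_REGION_IDENTITY_MAPPING	(1u << 0)
#define MEMORY_REGION_IOMMU_RESERVE	(1u << 1)	/* hypothetical second flag */

/* Local stand-ins for IOMMU_RESV_DIRECT_RELAXABLE and IOMMU_RESV_RESERVED. */
enum resv_kind {
	RESV_NONE,
	RESV_DIRECT_RELAXABLE,
	RESV_RESERVED,
};

/* Pick a reservation type from the first specifier cell, if there is one. */
enum resv_kind resv_kind_from_specifier(const uint32_t *args, int args_count)
{
	assert(args_count >= 0);

	if (args_count == 0)
		return RESV_NONE;
	if (args[0] & MEMORY_REGION_IDENTITY_MAPPING)
		return RESV_DIRECT_RELAXABLE;
	if (args[0] & MEMORY_REGION_IOMMU_RESERVE)
		return RESV_RESERVED;
	return RESV_NONE;
}
```

A reference without specifier cells yields RESV_NONE, which corresponds to
the legacy behaviour of not creating any IOMMU reservation.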

Thierry

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-05-28 16:54     ` Thierry Reding
@ 2021-06-08 16:51       ` Thierry Reding
  2021-07-01 18:14         ` Thierry Reding
  0 siblings, 1 reply; 41+ messages in thread
From: Thierry Reding @ 2021-06-08 16:51 UTC (permalink / raw)
  To: Rob Herring
  Cc: Joerg Roedel, Will Deacon, Robin Murphy, Nicolin Chen,
	Krishna Reddy, Dmitry Osipenko, devicetree, iommu, linux-tegra

On Fri, May 28, 2021 at 06:54:55PM +0200, Thierry Reding wrote:
> On Thu, May 20, 2021 at 05:03:06PM -0500, Rob Herring wrote:
> > On Fri, Apr 23, 2021 at 06:32:30PM +0200, Thierry Reding wrote:
> > > From: Thierry Reding <treding@nvidia.com>
> > > 
> > > Reserved memory region phandle references can be accompanied by a
> > > specifier that provides additional information about how that specific
> > > reference should be treated.
> > > 
> > > One use-case is to mark a memory region as needing an identity mapping
> > > in the system's IOMMU for the device that references the region. This is
> > > needed for example when the bootloader has set up hardware (such as a
> > > display controller) to actively access a memory region (e.g. a boot
> > > splash screen framebuffer) during boot. The operating system can use the
> > > identity mapping flag from the specifier to make sure an IOMMU identity
> > > mapping is set up for the framebuffer before IOMMU translations are
> > > enabled for the display controller.
> > > 
> > > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > ---
> > >  .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
> > >  include/dt-bindings/reserved-memory.h         |  8 +++++++
> > >  2 files changed, 29 insertions(+)
> > >  create mode 100644 include/dt-bindings/reserved-memory.h
> > 
> > Sorry for being slow on this. I have 2 concerns.
> > 
> > First, this creates an ABI issue. A DT with cells in 'memory-region' 
> > will not be understood by an existing OS. I'm less concerned about this 
> > if we address that with a stable fix. (Though I'm pretty sure we've 
> > naively added #?-cells in the past ignoring this issue.)
> 
> A while ago I had proposed adding memory-region*s* as an alternative
> name for memory-region to make the naming more consistent with other
> types of properties (think clocks, resets, gpios, ...). If we added
> that, we could easily differentiate between the "legacy" cases where
> no #memory-region-cells was allowed and the new cases where it was.
> 
> > Second, it could be the bootloader setting up the reserved region. If a 
> > node already has 'memory-region', then adding more regions is more 
> > complicated compared to adding new properties. And defining what each 
> > memory-region entry is or how many in schemas is impossible.
> 
> It's true that updating the property gets a bit complicated, but it's
> not exactly rocket science. We really just need to splice the array. I
> have a working implementation for this in U-Boot.
> 
> For what it's worth, we could run into the same issue with any new
> property that we add. Even if we renamed this to iommu-memory-region,
> it's still possible that a bootloader may have to update this property
> if it already exists (it could be hard-coded in DT, or it could have
> been added by some earlier bootloader or firmware).
> 
> > Both could be addressed with a new property. Perhaps something like 
> > 'iommu-memory-region = <&phandle>;'. I think the 'iommu' prefix is 
> > appropriate given this is entirely because of the IOMMU being in the 
> > mix. I might feel differently if we had other uses for cells, but I 
> > don't really see it in this case. 
> 
> I'm afraid that down the road we'll end up with other cases and then we
> might proliferate a number of *-memory-region properties with varying
> prefixes.
> 
> I am aware of one other case where we might need something like this: on
> some Tegra SoCs we have audio processors that will access memory buffers
> using a DMA engine. These processors are booted from early firmware
> using firmware from system memory. In order to avoid trashing the
> firmware, we need to reserve memory. We can do this using reserved
> memory nodes. However, the audio DMA engine also uses the SMMU, so we
> need to make sure that the firmware memory is marked as reserved within
> the SMMU. This is similar to the identity mapping case, but not exactly
> the same. Instead of creating a 1:1 mapping, we just want that IOVA
> region to be reserved (i.e. IOMMU_RESV_RESERVED instead of
> IOMMU_RESV_DIRECT{,_RELAXABLE}).
> 
> That would also fall into the IOMMU domain, but we can't reuse the
> iommu-memory-region property for that because then we don't have enough
> information to decide which type of reservation we need.
> 
> We could obviously make iommu-memory-region take a specifier, but we
> could just as well use memory-regions in that case since we have
> something more generic anyway.
> 
> With the #memory-region-cells proposal, we can easily extend the cell in
> the specifier with an additional MEMORY_REGION_IOMMU_RESERVE flag to
> take that other use case into account. If we then also change to the new
> memory-regions property name, we avoid the ABI issue (and we gain a bit
> of consistency while at it).

Ping? Rob, do you want me to add this second use-case to the patch
series to make it more obvious that this isn't just a one-off thing? Or
how do we proceed?

Thierry

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-06-08 16:51       ` Thierry Reding
@ 2021-07-01 18:14         ` Thierry Reding
  2021-07-02 14:16           ` Dmitry Osipenko
  0 siblings, 1 reply; 41+ messages in thread
From: Thierry Reding @ 2021-07-01 18:14 UTC (permalink / raw)
  To: Rob Herring
  Cc: Joerg Roedel, Will Deacon, Robin Murphy, Nicolin Chen,
	Krishna Reddy, Dmitry Osipenko, devicetree, iommu, linux-tegra

On Tue, Jun 08, 2021 at 06:51:40PM +0200, Thierry Reding wrote:
> On Fri, May 28, 2021 at 06:54:55PM +0200, Thierry Reding wrote:
> > On Thu, May 20, 2021 at 05:03:06PM -0500, Rob Herring wrote:
> > > On Fri, Apr 23, 2021 at 06:32:30PM +0200, Thierry Reding wrote:
> > > > From: Thierry Reding <treding@nvidia.com>
> > > > 
> > > > Reserved memory region phandle references can be accompanied by a
> > > > specifier that provides additional information about how that specific
> > > > reference should be treated.
> > > > 
> > > > One use-case is to mark a memory region as needing an identity mapping
> > > > in the system's IOMMU for the device that references the region. This is
> > > > needed for example when the bootloader has set up hardware (such as a
> > > > display controller) to actively access a memory region (e.g. a boot
> > > > splash screen framebuffer) during boot. The operating system can use the
> > > > identity mapping flag from the specifier to make sure an IOMMU identity
> > > > mapping is set up for the framebuffer before IOMMU translations are
> > > > enabled for the display controller.
> > > > 
> > > > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > > ---
> > > >  .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
> > > >  include/dt-bindings/reserved-memory.h         |  8 +++++++
> > > >  2 files changed, 29 insertions(+)
> > > >  create mode 100644 include/dt-bindings/reserved-memory.h
> > > 
> > > Sorry for being slow on this. I have 2 concerns.
> > > 
> > > First, this creates an ABI issue. A DT with cells in 'memory-region' 
> > > will not be understood by an existing OS. I'm less concerned about this 
> > > if we address that with a stable fix. (Though I'm pretty sure we've 
> > > naively added #?-cells in the past ignoring this issue.)
> > 
> > A while ago I had proposed adding memory-region*s* as an alternative
> > name for memory-region to make the naming more consistent with other
> > types of properties (think clocks, resets, gpios, ...). If we added
> > that, we could easily differentiate between the "legacy" cases where
> > no #memory-region-cells was allowed and the new cases where it was.
> > 
> > > Second, it could be the bootloader setting up the reserved region. If a 
> > > node already has 'memory-region', then adding more regions is more 
> > > complicated compared to adding new properties. And defining what each 
> > > memory-region entry is or how many in schemas is impossible.
> > 
> > It's true that updating the property gets a bit complicated, but it's
> > not exactly rocket science. We really just need to splice the array. I
> > have a working implementation for this in U-Boot.
> > 
> > For what it's worth, we could run into the same issue with any new
> > property that we add. Even if we renamed this to iommu-memory-region,
> > it's still possible that a bootloader may have to update this property
> > if it already exists (it could be hard-coded in DT, or it could have
> > been added by some earlier bootloader or firmware).
> > 
> > > Both could be addressed with a new property. Perhaps something like 
> > > 'iommu-memory-region = <&phandle>;'. I think the 'iommu' prefix is 
> > > appropriate given this is entirely because of the IOMMU being in the 
> > > mix. I might feel differently if we had other uses for cells, but I 
> > > don't really see it in this case. 
> > 
> > I'm afraid that down the road we'll end up with other cases and then we
> > might proliferate a number of *-memory-region properties with varying
> > prefixes.
> > 
> > I am aware of one other case where we might need something like this: on
> > some Tegra SoCs we have audio processors that will access memory buffers
> > using a DMA engine. These processors are booted from early firmware
> > using firmware from system memory. In order to avoid trashing the
> > firmware, we need to reserve memory. We can do this using reserved
> > memory nodes. However, the audio DMA engine also uses the SMMU, so we
> > need to make sure that the firmware memory is marked as reserved within
> > the SMMU. This is similar to the identity mapping case, but not exactly
> > the same. Instead of creating a 1:1 mapping, we just want that IOVA
> > region to be reserved (i.e. IOMMU_RESV_RESERVED instead of
> > IOMMU_RESV_DIRECT{,_RELAXABLE}).
> > 
> > That would also fall into the IOMMU domain, but we can't reuse the
> > iommu-memory-region property for that because then we don't have enough
> > information to decide which type of reservation we need.
> > 
> > We could obviously make iommu-memory-region take a specifier, but we
> > could just as well use memory-regions in that case since we have
> > something more generic anyway.
> > 
> > With the #memory-region-cells proposal, we can easily extend the cell in
> > the specifier with an additional MEMORY_REGION_IOMMU_RESERVE flag to
> > take that other use case into account. If we then also change to the new
> > memory-regions property name, we avoid the ABI issue (and we gain a bit
> > of consistency while at it).
> 
> Ping? Rob, do you want me to add this second use-case to the patch
> series to make it more obvious that this isn't just a one-off thing? Or
> how do we proceed?

Rob, given that additional use-case, do you want me to run with this
proposal and send out an updated series?

Thierry

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 2/5] iommu: Implement of_iommu_get_resv_regions()
  2021-04-23 16:32 ` [PATCH v2 2/5] iommu: Implement of_iommu_get_resv_regions() Thierry Reding
@ 2021-07-02 14:05   ` Dmitry Osipenko
  2021-07-16 14:41     ` Rob Herring
  0 siblings, 1 reply; 41+ messages in thread
From: Dmitry Osipenko @ 2021-07-02 14:05 UTC (permalink / raw)
  To: Thierry Reding, Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, iommu, linux-tegra, Frank Rowand, Rob Herring

23.04.2021 19:32, Thierry Reding wrote:
> +void of_iommu_get_resv_regions(struct device *dev, struct list_head *list)
> +{
> +	struct of_phandle_iterator it;
> +	int err;
> +
> +	of_for_each_phandle(&it, err, dev->of_node, "memory-region", "#memory-region-cells", 0) {
> +		struct iommu_resv_region *region;
> +		struct of_phandle_args args;
> +		struct resource res;
> +
> +		args.args_count = of_phandle_iterator_args(&it, args.args, MAX_PHANDLE_ARGS);
> +
> +		err = of_address_to_resource(it.node, 0, &res);
> +		if (err < 0) {
> +			dev_err(dev, "failed to parse memory region %pOF: %d\n",
> +				it.node, err);
> +			continue;
> +		}
> +
> +		if (args.args_count > 0) {
> +			/*
> +			 * Active memory regions are expected to be accessed by hardware during
> +			 * boot and must therefore have an identity mapping created prior to the
> +			 * driver taking control of the hardware. This ensures that non-quiescent
> +			 * hardware doesn't cause IOMMU faults during boot.
> +			 */
> +			if (args.args[0] & MEMORY_REGION_IDENTITY_MAPPING) {
> +				region = iommu_alloc_resv_region(res.start, resource_size(&res),
> +								 IOMMU_READ | IOMMU_WRITE,
> +								 IOMMU_RESV_DIRECT_RELAXABLE);
> +				if (!region)
> +					continue;
> +
> +				list_add_tail(&region->list, list);
> +			}
> +		}
> +	}
> +}
> +EXPORT_SYMBOL(of_iommu_get_resv_regions);

Any reason why this is not EXPORT_SYMBOL_GPL? I'm curious what the
logic is behind the OF symbols in general, since it looks like half of them
are GPL.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-07-01 18:14         ` Thierry Reding
@ 2021-07-02 14:16           ` Dmitry Osipenko
  2021-09-01 14:13             ` Thierry Reding
  0 siblings, 1 reply; 41+ messages in thread
From: Dmitry Osipenko @ 2021-07-02 14:16 UTC (permalink / raw)
  To: Thierry Reding, Rob Herring
  Cc: Joerg Roedel, Will Deacon, Robin Murphy, Nicolin Chen,
	Krishna Reddy, devicetree, iommu, linux-tegra

01.07.2021 21:14, Thierry Reding wrote:
> On Tue, Jun 08, 2021 at 06:51:40PM +0200, Thierry Reding wrote:
>> On Fri, May 28, 2021 at 06:54:55PM +0200, Thierry Reding wrote:
>>> On Thu, May 20, 2021 at 05:03:06PM -0500, Rob Herring wrote:
>>>> On Fri, Apr 23, 2021 at 06:32:30PM +0200, Thierry Reding wrote:
>>>>> From: Thierry Reding <treding@nvidia.com>
>>>>>
>>>>> Reserved memory region phandle references can be accompanied by a
>>>>> specifier that provides additional information about how that specific
>>>>> reference should be treated.
>>>>>
>>>>> One use-case is to mark a memory region as needing an identity mapping
>>>>> in the system's IOMMU for the device that references the region. This is
>>>>> needed for example when the bootloader has set up hardware (such as a
>>>>> display controller) to actively access a memory region (e.g. a boot
>>>>> splash screen framebuffer) during boot. The operating system can use the
>>>>> identity mapping flag from the specifier to make sure an IOMMU identity
>>>>> mapping is set up for the framebuffer before IOMMU translations are
>>>>> enabled for the display controller.
>>>>>
>>>>> Signed-off-by: Thierry Reding <treding@nvidia.com>
>>>>> ---
>>>>>  .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
>>>>>  include/dt-bindings/reserved-memory.h         |  8 +++++++
>>>>>  2 files changed, 29 insertions(+)
>>>>>  create mode 100644 include/dt-bindings/reserved-memory.h
>>>>
>>>> Sorry for being slow on this. I have 2 concerns.
>>>>
>>>> First, this creates an ABI issue. A DT with cells in 'memory-region' 
>>>> will not be understood by an existing OS. I'm less concerned about this 
>>>> if we address that with a stable fix. (Though I'm pretty sure we've 
>>>> naively added #?-cells in the past ignoring this issue.)
>>>
>>> A while ago I had proposed adding memory-region*s* as an alternative
>>> name for memory-region to make the naming more consistent with other
>>> types of properties (think clocks, resets, gpios, ...). If we added
>>> that, we could easily differentiate between the "legacy" cases where
>>> no #memory-region-cells was allowed and the new cases where it was.
>>>
>>>> Second, it could be the bootloader setting up the reserved region. If a 
>>>> node already has 'memory-region', then adding more regions is more 
>>>> complicated compared to adding new properties. And defining what each 
>>>> memory-region entry is or how many in schemas is impossible.
>>>
>>> It's true that updating the property gets a bit complicated, but it's
>>> not exactly rocket science. We really just need to splice the array. I
>>> have a working implementation for this in U-Boot.
>>>
>>> For what it's worth, we could run into the same issue with any new
>>> property that we add. Even if we renamed this to iommu-memory-region,
>>> it's still possible that a bootloader may have to update this property
>>> if it already exists (it could be hard-coded in DT, or it could have
>>> been added by some earlier bootloader or firmware).
>>>
>>>> Both could be addressed with a new property. Perhaps something like 
>>>> 'iommu-memory-region = <&phandle>;'. I think the 'iommu' prefix is 
>>>> appropriate given this is entirely because of the IOMMU being in the 
>>>> mix. I might feel differently if we had other uses for cells, but I 
>>>> don't really see it in this case. 
>>>
>>> I'm afraid that down the road we'll end up with other cases and then we
>>> might proliferate a number of *-memory-region properties with varying
>>> prefixes.
>>>
>>> I am aware of one other case where we might need something like this: on
>>> some Tegra SoCs we have audio processors that will access memory buffers
>>> using a DMA engine. These processors are booted from early firmware
>>> using firmware from system memory. In order to avoid trashing the
>>> firmware, we need to reserve memory. We can do this using reserved
>>> memory nodes. However, the audio DMA engine also uses the SMMU, so we
>>> need to make sure that the firmware memory is marked as reserved within
>>> the SMMU. This is similar to the identity mapping case, but not exactly
>>> the same. Instead of creating a 1:1 mapping, we just want that IOVA
>>> region to be reserved (i.e. IOMMU_RESV_RESERVED instead of
>>> IOMMU_RESV_DIRECT{,_RELAXABLE}).
>>>
>>> That would also fall into the IOMMU domain, but we can't reuse the
>>> iommu-memory-region property for that because then we don't have enough
>>> information to decide which type of reservation we need.
>>>
>>> We could obviously make iommu-memory-region take a specifier, but we
>>> could just as well use memory-regions in that case since we have
>>> something more generic anyway.
>>>
>>> With the #memory-region-cells proposal, we can easily extend the cell in
>>> the specifier with an additional MEMORY_REGION_IOMMU_RESERVE flag to
>>> take that other use case into account. If we then also change to the new
>>> memory-regions property name, we avoid the ABI issue (and we gain a bit
>>> of consistency while at it).
>>
>> Ping? Rob, do you want me to add this second use-case to the patch
>> series to make it more obvious that this isn't just a one-off thing? Or
>> how do we proceed?
> 
> Rob, given that additional use-case, do you want me to run with this
> proposal and send out an updated series?


What about a variant with "descriptor" properties that describe
each region:

fb_desc: display-framebuffer-memory-descriptor {
	needs-identity-mapping;
};

display@52400000 {
	memory-region = <&fb ...>;
	memory-region-descriptor = <&fb_desc ...>;
};

It could be a more flexible/extendible variant.

* Re: [PATCH v2 2/5] iommu: Implement of_iommu_get_resv_regions()
  2021-07-02 14:05   ` Dmitry Osipenko
@ 2021-07-16 14:41     ` Rob Herring
  2021-07-17 11:07       ` Dmitry Osipenko
  0 siblings, 1 reply; 41+ messages in thread
From: Rob Herring @ 2021-07-16 14:41 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Thierry Reding, Joerg Roedel, Will Deacon, Robin Murphy,
	Nicolin Chen, Krishna Reddy, devicetree, Linux IOMMU,
	linux-tegra, Frank Rowand

On Fri, Jul 2, 2021 at 8:05 AM Dmitry Osipenko <digetx@gmail.com> wrote:
>
> 23.04.2021 19:32, Thierry Reding wrote:
> > +void of_iommu_get_resv_regions(struct device *dev, struct list_head *list)
> > +{
> > +     struct of_phandle_iterator it;
> > +     int err;
> > +
> > +     of_for_each_phandle(&it, err, dev->of_node, "memory-region", "#memory-region-cells", 0) {
> > +             struct iommu_resv_region *region;
> > +             struct of_phandle_args args;
> > +             struct resource res;
> > +
> > +             args.args_count = of_phandle_iterator_args(&it, args.args, MAX_PHANDLE_ARGS);
> > +
> > +             err = of_address_to_resource(it.node, 0, &res);
> > +             if (err < 0) {
> > +                     dev_err(dev, "failed to parse memory region %pOF: %d\n",
> > +                             it.node, err);
> > +                     continue;
> > +             }
> > +
> > +             if (args.args_count > 0) {
> > +                     /*
> > +                      * Active memory regions are expected to be accessed by hardware during
> > +                      * boot and must therefore have an identity mapping created prior to the
> > +                      * driver taking control of the hardware. This ensures that non-quiescent
> > +                      * hardware doesn't cause IOMMU faults during boot.
> > +                      */
> > +                     if (args.args[0] & MEMORY_REGION_IDENTITY_MAPPING) {
> > +                             region = iommu_alloc_resv_region(res.start, resource_size(&res),
> > +                                                              IOMMU_READ | IOMMU_WRITE,
> > +                                                              IOMMU_RESV_DIRECT_RELAXABLE);
> > +                             if (!region)
> > +                                     continue;
> > +
> > +                             list_add_tail(&region->list, list);
> > +                     }
> > +             }
> > +     }
> > +}
> > +EXPORT_SYMBOL(of_iommu_get_resv_regions);
>
> Any reason why this is not EXPORT_SYMBOL_GPL? I'm curious what is the
> logic behind the OF symbols in general since it looks like half of them
> are GPL.

Generally, new ones are _GPL. Old ones probably predate _GPL.

This one is up to the IOMMU maintainers.

Rob

* Re: [PATCH v2 2/5] iommu: Implement of_iommu_get_resv_regions()
  2021-07-16 14:41     ` Rob Herring
@ 2021-07-17 11:07       ` Dmitry Osipenko
  2021-07-30 12:18         ` Will Deacon
  0 siblings, 1 reply; 41+ messages in thread
From: Dmitry Osipenko @ 2021-07-17 11:07 UTC (permalink / raw)
  To: Rob Herring
  Cc: Thierry Reding, Joerg Roedel, Will Deacon, Robin Murphy,
	Nicolin Chen, Krishna Reddy, devicetree, Linux IOMMU,
	linux-tegra, Frank Rowand

16.07.2021 17:41, Rob Herring wrote:
> On Fri, Jul 2, 2021 at 8:05 AM Dmitry Osipenko <digetx@gmail.com> wrote:
>>
>> 23.04.2021 19:32, Thierry Reding wrote:
>>> +void of_iommu_get_resv_regions(struct device *dev, struct list_head *list)
>>> +{
>>> +     struct of_phandle_iterator it;
>>> +     int err;
>>> +
>>> +     of_for_each_phandle(&it, err, dev->of_node, "memory-region", "#memory-region-cells", 0) {
>>> +             struct iommu_resv_region *region;
>>> +             struct of_phandle_args args;
>>> +             struct resource res;
>>> +
>>> +             args.args_count = of_phandle_iterator_args(&it, args.args, MAX_PHANDLE_ARGS);
>>> +
>>> +             err = of_address_to_resource(it.node, 0, &res);
>>> +             if (err < 0) {
>>> +                     dev_err(dev, "failed to parse memory region %pOF: %d\n",
>>> +                             it.node, err);
>>> +                     continue;
>>> +             }
>>> +
>>> +             if (args.args_count > 0) {
>>> +                     /*
>>> +                      * Active memory regions are expected to be accessed by hardware during
>>> +                      * boot and must therefore have an identity mapping created prior to the
>>> +                      * driver taking control of the hardware. This ensures that non-quiescent
>>> +                      * hardware doesn't cause IOMMU faults during boot.
>>> +                      */
>>> +                     if (args.args[0] & MEMORY_REGION_IDENTITY_MAPPING) {
>>> +                             region = iommu_alloc_resv_region(res.start, resource_size(&res),
>>> +                                                              IOMMU_READ | IOMMU_WRITE,
>>> +                                                              IOMMU_RESV_DIRECT_RELAXABLE);
>>> +                             if (!region)
>>> +                                     continue;
>>> +
>>> +                             list_add_tail(&region->list, list);
>>> +                     }
>>> +             }
>>> +     }
>>> +}
>>> +EXPORT_SYMBOL(of_iommu_get_resv_regions);
>>
>> Any reason why this is not EXPORT_SYMBOL_GPL? I'm curious what is the
>> logic behind the OF symbols in general since it looks like half of them
>> are GPL.
> 
> Generally, new ones are _GPL. Old ones probably predate _GPL.
> 
> This one is up to the IOMMU maintainers.

Thank you.


* Re: [PATCH v2 2/5] iommu: Implement of_iommu_get_resv_regions()
  2021-07-17 11:07       ` Dmitry Osipenko
@ 2021-07-30 12:18         ` Will Deacon
  0 siblings, 0 replies; 41+ messages in thread
From: Will Deacon @ 2021-07-30 12:18 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Rob Herring, Thierry Reding, Joerg Roedel, Robin Murphy,
	Nicolin Chen, Krishna Reddy, devicetree, Linux IOMMU,
	linux-tegra, Frank Rowand

On Sat, Jul 17, 2021 at 02:07:12PM +0300, Dmitry Osipenko wrote:
> 16.07.2021 17:41, Rob Herring wrote:
> > On Fri, Jul 2, 2021 at 8:05 AM Dmitry Osipenko <digetx@gmail.com> wrote:
> >>
> >> 23.04.2021 19:32, Thierry Reding wrote:
> >>> +void of_iommu_get_resv_regions(struct device *dev, struct list_head *list)
> >>> +{
> >>> +     struct of_phandle_iterator it;
> >>> +     int err;
> >>> +
> >>> +     of_for_each_phandle(&it, err, dev->of_node, "memory-region", "#memory-region-cells", 0) {
> >>> +             struct iommu_resv_region *region;
> >>> +             struct of_phandle_args args;
> >>> +             struct resource res;
> >>> +
> >>> +             args.args_count = of_phandle_iterator_args(&it, args.args, MAX_PHANDLE_ARGS);
> >>> +
> >>> +             err = of_address_to_resource(it.node, 0, &res);
> >>> +             if (err < 0) {
> >>> +                     dev_err(dev, "failed to parse memory region %pOF: %d\n",
> >>> +                             it.node, err);
> >>> +                     continue;
> >>> +             }
> >>> +
> >>> +             if (args.args_count > 0) {
> >>> +                     /*
> >>> +                      * Active memory regions are expected to be accessed by hardware during
> >>> +                      * boot and must therefore have an identity mapping created prior to the
> >>> +                      * driver taking control of the hardware. This ensures that non-quiescent
> >>> +                      * hardware doesn't cause IOMMU faults during boot.
> >>> +                      */
> >>> +                     if (args.args[0] & MEMORY_REGION_IDENTITY_MAPPING) {
> >>> +                             region = iommu_alloc_resv_region(res.start, resource_size(&res),
> >>> +                                                              IOMMU_READ | IOMMU_WRITE,
> >>> +                                                              IOMMU_RESV_DIRECT_RELAXABLE);
> >>> +                             if (!region)
> >>> +                                     continue;
> >>> +
> >>> +                             list_add_tail(&region->list, list);
> >>> +                     }
> >>> +             }
> >>> +     }
> >>> +}
> >>> +EXPORT_SYMBOL(of_iommu_get_resv_regions);
> >>
> >> Any reason why this is not EXPORT_SYMBOL_GPL? I'm curious what is the
> >> logic behind the OF symbols in general since it looks like half of them
> >> are GPL.
> > 
> > Generally, new ones are _GPL. Old ones probably predate _GPL.
> > 
> > This one is up to the IOMMU maintainers.
> 
> Thank you.

I prefer EXPORT_SYMBOL_GPL(). That's aligned with the symbols exported by
iommu.c, with the *single* exception of generic_iommu_put_resv_regions(),
which I think should be changed to _GPL() as well.

Will

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-07-02 14:16           ` Dmitry Osipenko
@ 2021-09-01 14:13             ` Thierry Reding
  2021-09-03 13:20               ` Rob Herring
  0 siblings, 1 reply; 41+ messages in thread
From: Thierry Reding @ 2021-09-01 14:13 UTC (permalink / raw)
  To: Rob Herring, Alyssa Rosenzweig, Sven Peter
  Cc: Dmitry Osipenko, Joerg Roedel, Will Deacon, Robin Murphy,
	Nicolin Chen, Krishna Reddy, devicetree, iommu, linux-tegra,
	dri-devel

On Fri, Jul 02, 2021 at 05:16:25PM +0300, Dmitry Osipenko wrote:
> 01.07.2021 21:14, Thierry Reding wrote:
> > On Tue, Jun 08, 2021 at 06:51:40PM +0200, Thierry Reding wrote:
> >> On Fri, May 28, 2021 at 06:54:55PM +0200, Thierry Reding wrote:
> >>> On Thu, May 20, 2021 at 05:03:06PM -0500, Rob Herring wrote:
> >>>> On Fri, Apr 23, 2021 at 06:32:30PM +0200, Thierry Reding wrote:
> >>>>> From: Thierry Reding <treding@nvidia.com>
> >>>>>
> >>>>> Reserved memory region phandle references can be accompanied by a
> >>>>> specifier that provides additional information about how that specific
> >>>>> reference should be treated.
> >>>>>
> >>>>> One use-case is to mark a memory region as needing an identity mapping
> >>>>> in the system's IOMMU for the device that references the region. This is
> >>>>> needed for example when the bootloader has set up hardware (such as a
> >>>>> display controller) to actively access a memory region (e.g. a boot
> >>>>> splash screen framebuffer) during boot. The operating system can use the
> >>>>> identity mapping flag from the specifier to make sure an IOMMU identity
> >>>>> mapping is set up for the framebuffer before IOMMU translations are
> >>>>> enabled for the display controller.
> >>>>>
> >>>>> Signed-off-by: Thierry Reding <treding@nvidia.com>
> >>>>> ---
> >>>>>  .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
> >>>>>  include/dt-bindings/reserved-memory.h         |  8 +++++++
> >>>>>  2 files changed, 29 insertions(+)
> >>>>>  create mode 100644 include/dt-bindings/reserved-memory.h
> >>>>
> >>>> Sorry for being slow on this. I have 2 concerns.
> >>>>
> >>>> First, this creates an ABI issue. A DT with cells in 'memory-region' 
> >>>> will not be understood by an existing OS. I'm less concerned about this 
> >>>> if we address that with a stable fix. (Though I'm pretty sure we've 
> >>>> naively added #?-cells in the past ignoring this issue.)
> >>>
> >>> A while ago I had proposed adding memory-region*s* as an alternative
> >>> name for memory-region to make the naming more consistent with other
> >>> types of properties (think clocks, resets, gpios, ...). If we added
> >>> that, we could easily differentiate between the "legacy" cases where
> >>> no #memory-region-cells was allowed and the new cases where it was.
> >>>
> >>>> Second, it could be the bootloader setting up the reserved region. If a 
> >>>> node already has 'memory-region', then adding more regions is more 
> >>>> complicated compared to adding new properties. And defining what each 
> >>>> memory-region entry is or how many in schemas is impossible.
> >>>
> >>> It's true that updating the property gets a bit complicated, but it's
> >>> not exactly rocket science. We really just need to splice the array. I
> >>> have a working implementation for this in U-Boot.
> >>>
> >>> For what it's worth, we could run into the same issue with any new
> >>> property that we add. Even if we renamed this to iommu-memory-region,
> >>> it's still possible that a bootloader may have to update this property
> >>> if it already exists (it could be hard-coded in DT, or it could have
> >>> been added by some earlier bootloader or firmware).
> >>>
> >>>> Both could be addressed with a new property. Perhaps something like 
> >>>> 'iommu-memory-region = <&phandle>;'. I think the 'iommu' prefix is 
> >>>> appropriate given this is entirely because of the IOMMU being in the 
> >>>> mix. I might feel differently if we had other uses for cells, but I 
> >>>> don't really see it in this case. 
> >>>
> >>> I'm afraid that down the road we'll end up with other cases and then we
> >>> might proliferate a number of *-memory-region properties with varying
> >>> prefixes.
> >>>
> >>> I am aware of one other case where we might need something like this: on
> >>> some Tegra SoCs we have audio processors that will access memory buffers
> >>> using a DMA engine. These processors are booted from early firmware
> >>> using firmware from system memory. In order to avoid trashing the
> >>> firmware, we need to reserve memory. We can do this using reserved
> >>> memory nodes. However, the audio DMA engine also uses the SMMU, so we
> >>> need to make sure that the firmware memory is marked as reserved within
> >>> the SMMU. This is similar to the identity mapping case, but not exactly
> >>> the same. Instead of creating a 1:1 mapping, we just want that IOVA
> >>> region to be reserved (i.e. IOMMU_RESV_RESERVED instead of
> >>> IOMMU_RESV_DIRECT{,_RELAXABLE}).
> >>>
> >>> That would also fall into the IOMMU domain, but we can't reuse the
> >>> iommu-memory-region property for that because then we don't have enough
> >>> information to decide which type of reservation we need.
> >>>
> >>> We could obviously make iommu-memory-region take a specifier, but we
> >>> could just as well use memory-regions in that case since we have
> >>> something more generic anyway.
> >>>
> >>> With the #memory-region-cells proposal, we can easily extend the cell in
> >>> the specifier with an additional MEMORY_REGION_IOMMU_RESERVE flag to
> >>> take that other use case into account. If we then also change to the new
> >>> memory-regions property name, we avoid the ABI issue (and we gain a bit
> >>> of consistency while at it).
> >>
> >> Ping? Rob, do you want me to add this second use-case to the patch
> >> series to make it more obvious that this isn't just a one-off thing? Or
> >> how do we proceed?
> > 
> > Rob, given that additional use-case, do you want me to run with this
> > proposal and send out an updated series?
> 
> 
> What about a variant with "descriptor" properties that describe
> each region:
> 
> fb_desc: display-framebuffer-memory-descriptor {
> 	needs-identity-mapping;
> };
> 
> display@52400000 {
> 	memory-region = <&fb ...>;
> 	memory-region-descriptor = <&fb_desc ...>;
> };
> 
> It could be a more flexible/extendible variant.

This problem recently came up on #dri-devel again. Adding Alyssa and
Sven who are facing a similar challenge on their work on Apple M1 (if I
understood correctly). Also adding dri-devel for visibility since this
is a very common problem for display in particular.

On M1 the situation is slightly more complicated: the firmware will
allocate a couple of buffers (including the framebuffer) in high memory
(> 4 GiB) and use the IOMMU to map that into an IOVA region below 4 GiB
so that the display hardware can access it. This makes it impossible to
bypass the IOMMU like we do on other chips (in particular to work around
the fault-by-default policy of the ARM SMMU driver). It also means that
in addition to the simple reserved regions that we need for the audio
use-case I mentioned and the identity mappings that we need for display
on Tegra, we now also need to be able to convey physical-to-IOVA mappings.

Fitting the latter into the original proposal sounds difficult. A quick
fix would've been to generate a mapping table in memory and pass that to
the kernel using a reserved-memory node (similar to what's done for
example on Tegra for the EMC frequency table on Tegra210) and mark it as
such using a special flag. But that then involves two layers of parsing,
which seems a bit suboptimal. Another way to shoehorn that into the
original proposal would've been to add flags for physical and virtual
address regions and use pairs to pass them using special flags. Again,
this is a bit wonky because it needs these to be carefully parsed and
matched up.
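
To make that concrete, here is roughly what the pairing variant could
look like. This is only a sketch: the MEMORY_REGION_PHYSICAL and
MEMORY_REGION_VIRTUAL flag names are made up for illustration and are
not part of the posted series, the placement of #memory-region-cells in
the region nodes is an assumption, and the addresses are arbitrary:

	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;

		/* physical location of the framebuffer (above 4 GiB) */
		fb_phys: framebuffer@1,ff000000 {
			reg = <0x1 0xff000000 0 0x01000000>;
			#memory-region-cells = <1>;
			no-map;
		};

		/* IOVA range that the firmware mapped the framebuffer to */
		fb_iova: framebuffer@ff000000 {
			reg = <0 0xff000000 0 0x01000000>;
			#memory-region-cells = <1>;
		};
	};

	display@40000000 {
		...
		/* hypothetical flags: entries only make sense as pairs */
		memory-region = <&fb_phys MEMORY_REGION_PHYSICAL>,
				<&fb_iova MEMORY_REGION_VIRTUAL>;
		...
	};

The consumer would have to walk the list and match each physical entry
with the virtual entry that follows it, which is the fragile part.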

Another downside is that we now have a situation where some of these
regions are no longer "reserved-memory regions" in the traditional
sense. This would require an additional flag in the reserved-memory
region nodes to prevent the IOVA regions from being reserved. By the
way, this is something that would also be needed for the audio use-case
I mentioned before, because the physical memory at that address can
still be used by an operating system.

A more general solution would be to draw a bit from Dmitry's proposal
and introduce a new top-level "iov-reserved-memory" node. This could be
modelled on the existing reserved-memory node, except that the physical
memory pages for regions represented by child nodes would not be marked
as reserved. Only the IOVA range described by the region would be
reserved subsequently by the IOMMU framework and/or IOMMU driver.

The simplest case where we just want to reserve some IOVA region could
then be done like this:

	iov-reserved-memory {
		/*
		 * Probably safest to default to <2>, <2> here given
		 * that most IOMMUs support either > 32 bits of IAS
		 * or OAS.
		 */
		#address-cells = <2>;
		#size-cells = <2>;

		firmware: firmware@80000000 {
			reg = <0 0x80000000 0 0x01000000>;
		};
	};

	audio@30000000 {
		...
		iov-memory-regions = <&firmware>;
		...
	};

Mappings could be represented by an IOV reserved region taking a
reference to the reserved-region that they map:

	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;

		/* 16 MiB of framebuffer at top-of-memory */
		framebuffer: framebuffer@1,ff000000 {
			reg = <0x1 0xff000000 0 0x01000000>;
			no-map;
		};
	};

	iov-reserved-memory {
		/* IOMMU supports only 32-bit output address space */
		#address-cells = <1>;
		#size-cells = <1>;

		/* 16 MiB of framebuffer mapped to top of IOVA */
		fb: fb@ff000000 {
			reg = <0xff000000 0x01000000>;
			memory-region = <&framebuffer>;
		};
	};

	display@40000000 {
		...
		/* optional? */
		memory-region = <&framebuffer>;
		iov-memory-regions = <&fb>;
		...
	};

It's interesting how identity mapped regions now become a trivial
special case of mappings. All that is needed is to make the reg property
of the IOV reserved region correspond to the reg property of the normal
reserved region. Alternatively, as a small optimization for lazy people
like me, we could just allow these cases to omit the reg property and
instead inherit it from the referenced reserved region.
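
A rough sketch of that identity-mapped case, following the same
proposed (not yet existing) iov-reserved-memory binding. The node
names and the 0x92000000 address are assumptions picked purely for
illustration:

	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;

		/* 16 MiB of framebuffer set up by the bootloader */
		framebuffer: framebuffer@92000000 {
			reg = <0 0x92000000 0 0x01000000>;
			no-map;
		};
	};

	iov-reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;

		/* same address and size as the physical region: 1:1 mapping */
		fb: fb@92000000 {
			reg = <0 0x92000000 0 0x01000000>;
			memory-region = <&framebuffer>;
		};
	};

	display@40000000 {
		...
		iov-memory-regions = <&fb>;
		...
	};

With the shorthand, the IOV region would shrink to just the reference
and the range would be taken from the region it points at:

		fb: fb {
			memory-region = <&framebuffer>;
		};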

As the second example shows it might be convenient if memory-region
could be derived from iov-memory-regions. This could be useful for cases
where the driver wants to do something with the physical pages of the
reserved region (such as mapping them and copying out the framebuffer
data to another buffer so that the reserved memory can be recycled). If
we have the IOV reserved region, we could provide an API to extract the
physical reserved region (if it exists). That way we could avoid
referencing it twice in DT. Then again, there's something elegant about
the explicit second reference, too. It indicates the intent that we may
want to use the region for something other than just the IOV mapping.

Anyway, this has been long enough. Let me know what you think. Alyssa,
Sven, it'd be interesting to hear if you think this could work as a
solution to the problem on M1.

Rob, I think you might like this alternative because it basically gets
rid of all the points in the original proposal that you were concerned
about. Let me know what you think.

Thierry

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-09-01 14:13             ` Thierry Reding
@ 2021-09-03 13:20               ` Rob Herring
  2021-09-03 13:52                 ` Thierry Reding
  0 siblings, 1 reply; 41+ messages in thread
From: Rob Herring @ 2021-09-03 13:20 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Alyssa Rosenzweig, Sven Peter, Dmitry Osipenko, Joerg Roedel,
	Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, Linux IOMMU, linux-tegra, dri-devel

On Wed, Sep 1, 2021 at 9:13 AM Thierry Reding <thierry.reding@gmail.com> wrote:
>
> On Fri, Jul 02, 2021 at 05:16:25PM +0300, Dmitry Osipenko wrote:
> > 01.07.2021 21:14, Thierry Reding wrote:
> > > On Tue, Jun 08, 2021 at 06:51:40PM +0200, Thierry Reding wrote:
> > >> On Fri, May 28, 2021 at 06:54:55PM +0200, Thierry Reding wrote:
> > >>> On Thu, May 20, 2021 at 05:03:06PM -0500, Rob Herring wrote:
> > >>>> On Fri, Apr 23, 2021 at 06:32:30PM +0200, Thierry Reding wrote:
> > >>>>> From: Thierry Reding <treding@nvidia.com>
> > >>>>>
> > >>>>> Reserved memory region phandle references can be accompanied by a
> > >>>>> specifier that provides additional information about how that specific
> > >>>>> reference should be treated.
> > >>>>>
> > >>>>> One use-case is to mark a memory region as needing an identity mapping
> > >>>>> in the system's IOMMU for the device that references the region. This is
> > >>>>> needed for example when the bootloader has set up hardware (such as a
> > >>>>> display controller) to actively access a memory region (e.g. a boot
> > >>>>> splash screen framebuffer) during boot. The operating system can use the
> > >>>>> identity mapping flag from the specifier to make sure an IOMMU identity
> > >>>>> mapping is set up for the framebuffer before IOMMU translations are
> > >>>>> enabled for the display controller.
> > >>>>>
> > >>>>> Signed-off-by: Thierry Reding <treding@nvidia.com>
> > >>>>> ---
> > >>>>>  .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
> > >>>>>  include/dt-bindings/reserved-memory.h         |  8 +++++++
> > >>>>>  2 files changed, 29 insertions(+)
> > >>>>>  create mode 100644 include/dt-bindings/reserved-memory.h
> > >>>>
> > >>>> Sorry for being slow on this. I have 2 concerns.
> > >>>>
> > >>>> First, this creates an ABI issue. A DT with cells in 'memory-region'
> > >>>> will not be understood by an existing OS. I'm less concerned about this
> > >>>> if we address that with a stable fix. (Though I'm pretty sure we've
> > >>>> naively added #?-cells in the past ignoring this issue.)
> > >>>
> > >>> A while ago I had proposed adding memory-region*s* as an alternative
> > >>> name for memory-region to make the naming more consistent with other
> > >>> types of properties (think clocks, resets, gpios, ...). If we added
> > >>> that, we could easily differentiate between the "legacy" cases where
> > >>> no #memory-region-cells was allowed and the new cases where it was.
> > >>>
> > >>>> Second, it could be the bootloader setting up the reserved region. If a
> > >>>> node already has 'memory-region', then adding more regions is more
> > >>>> complicated compared to adding new properties. And defining what each
> > >>>> memory-region entry is or how many in schemas is impossible.
> > >>>
> > >>> It's true that updating the property gets a bit complicated, but it's
> > >>> not exactly rocket science. We really just need to splice the array. I
> > >>> have a working implementation for this in U-Boot.
> > >>>
> > >>> For what it's worth, we could run into the same issue with any new
> > >>> property that we add. Even if we renamed this to iommu-memory-region,
> > >>> it's still possible that a bootloader may have to update this property
> > >>> if it already exists (it could be hard-coded in DT, or it could have
> > >>> been added by some earlier bootloader or firmware).
> > >>>
> > >>>> Both could be addressed with a new property. Perhaps something like
> > >>>> 'iommu-memory-region = <&phandle>;'. I think the 'iommu' prefix is
> > >>>> appropriate given this is entirely because of the IOMMU being in the
> > >>>> mix. I might feel differently if we had other uses for cells, but I
> > >>>> don't really see it in this case.
> > >>>
> > >>> I'm afraid that down the road we'll end up with other cases and then we
> > >>> might proliferate a number of *-memory-region properties with varying
> > >>> prefixes.
> > >>>
> > >>> I am aware of one other case where we might need something like this: on
> > >>> some Tegra SoCs we have audio processors that will access memory buffers
> > >>> using a DMA engine. These processors are booted from early firmware
> > >>> using firmware from system memory. In order to avoid trashing the
> > >>> firmware, we need to reserve memory. We can do this using reserved
> > >>> memory nodes. However, the audio DMA engine also uses the SMMU, so we
> > >>> need to make sure that the firmware memory is marked as reserved within
> > >>> the SMMU. This is similar to the identity mapping case, but not exactly
> > >>> the same. Instead of creating a 1:1 mapping, we just want that IOVA
> > >>> region to be reserved (i.e. IOMMU_RESV_RESERVED instead of
> > >>> IOMMU_RESV_DIRECT{,_RELAXABLE}).
> > >>>
> > >>> That would also fall into the IOMMU domain, but we can't reuse the
> > >>> iommu-memory-region property for that because then we don't have enough
> > >>> information to decide which type of reservation we need.
> > >>>
> > >>> We could obviously make iommu-memory-region take a specifier, but we
> > >>> could just as well use memory-regions in that case since we have
> > >>> something more generic anyway.
> > >>>
> > >>> With the #memory-region-cells proposal, we can easily extend the cell in
> > >>> the specifier with an additional MEMORY_REGION_IOMMU_RESERVE flag to
> > >>> take that other use case into account. If we then also change to the new
> > >>> memory-regions property name, we avoid the ABI issue (and we gain a bit
> > >>> of consistency while at it).
> > >>
> > >> Ping? Rob, do you want me to add this second use-case to the patch
> > >> series to make it more obvious that this isn't just a one-off thing? Or
> > >> how do we proceed?
> > >
> > > Rob, given that additional use-case, do you want me to run with this
> > > proposal and send out an updated series?
> >
> >
> > What about a variant with "descriptor" properties that describe
> > each region:
> >
> > fb_desc: display-framebuffer-memory-descriptor {
> >       needs-identity-mapping;
> > };
> >
> > display@52400000 {
> >       memory-region = <&fb ...>;
> >       memory-region-descriptor = <&fb_desc ...>;
> > };
> >
> > It could be a more flexible/extendible variant.
>
> This problem recently came up on #dri-devel again. Adding Alyssa and
> Sven who are facing a similar challenge on their work on Apple M1 (if I
> understood correctly). Also adding dri-devel for visibility since this
> is a very common problem for display in particular.
>
> On M1 the situation is slightly more complicated: the firmware will
> allocate a couple of buffers (including the framebuffer) in high memory
> (> 4 GiB) and use the IOMMU to map that into an IOVA region below 4 GiB
> so that the display hardware can access it. This makes it impossible to
> bypass the IOMMU like we do on other chips (in particular to work around
> the fault-by-default policy of the ARM SMMU driver). It also means that
> in addition to the simple reserved regions I mentioned we need for audio
> use-cases and identity mapping use-cases we need for display on Tegra,
> we now also need to be able to convey physical to IOVA mappings.
>
> Fitting the latter into the original proposal sounds difficult. A quick
> fix would've been to generate a mapping table in memory and pass that to
> the kernel using a reserved-memory node (similar to what's done for
> example on Tegra for the EMC frequency table on Tegra210) and mark it as
> such using a special flag. But that then involves two layers of parsing,
> which seems a bit suboptimal. Another way to shoehorn that into the
> original proposal would've been to add flags for physical and virtual
> address regions and use pairs to pass them using special flags. Again,
> this is a bit wonky because it needs these to be carefully parsed and
> matched up.
>
> Another downside is that we now have a situation where some of these
> regions are no longer "reserved-memory regions" in the traditional
> sense. This would require an additional flag in the reserved-memory
> region nodes to prevent the IOVA regions from being reserved. By the
> way, this is something that would also be needed for the audio use-case
> I mentioned before, because the physical memory at that address can
> still be used by an operating system.
>
> A more general solution would be to draw a bit from Dmitry's proposal
> and introduce a new top-level "iov-reserved-memory" node. This could be
> modelled on the existing reserved-memory node, except that the physical
> memory pages for regions represented by child nodes would not be marked
> as reserved. Only the IOVA range described by the region would be
> reserved subsequently by the IOMMU framework and/or IOMMU driver.
>
> The simplest case where we just want to reserve some IOVA region could
> then be done like this:
>
>         iov-reserved-memory {
>                 /*
>                  * Probably safest to default to <2>, <2> here given
>                  * that most IOMMUs support either > 32 bits of IAS
>                  * or OAS.
>                  */
>                 #address-cells = <2>;
>                 #size-cells = <2>;
>
>                 firmware: firmware@80000000 {
>                         reg = <0 0x80000000 0 0x01000000>;
>                 };
>         };
>
>         audio@30000000 {
>                 ...
>                 iov-memory-regions = <&firmware>;
>                 ...
>         };
>
> Mappings could be represented by an IOV reserved region taking a
> reference to the reserved-region that they map:
>
>         reserved-memory {
>                 #address-cells = <2>;
>                 #size-cells = <2>;
>
>                 /* 16 MiB of framebuffer at top-of-memory */
>                 framebuffer: framebuffer@1,ff000000 {
>                         reg = <0x1 0xff000000 0 0x01000000>;
>                         no-map;
>                 };
>         };
>
>         iov-reserved-memory {
>                 /* IOMMU supports only 32-bit output address space */
>                 #address-cells = <1>;
>                 #size-cells = <1>;
>
>                 /* 16 MiB of framebuffer mapped to top of IOVA */
>                 fb: fb@ff000000 {
>                         reg = <0xff000000 0x01000000>;
>                         memory-region = <&framebuffer>;
>                 };
>         };
>
>         display@40000000 {
>                 ...
>                 /* optional? */
>                 memory-region = <&framebuffer>;
>                 iov-memory-regions = <&fb>;
>                 ...
>         };
>
> It's interesting how identity mapped regions now become a trivial
> special case of mappings. All that is needed is to make the reg property
> of the IOV reserved region correspond to the reg property of the normal
> reserved region. Alternatively, as a small optimization for lazy people
> like me, we could just allow these cases to omit the reg property and
> instead inherit it from the referenced reserved region.
>
> As the second example shows it might be convenient if memory-region
> could be derived from iov-memory-regions. This could be useful for cases
> where the driver wants to do something with the physical pages of the
> reserved region (such as mapping them and copying out the framebuffer
> data to another buffer so that the reserved memory can be recycled). If
> we have the IOV reserved region, we could provide an API to extract the
> physical reserved region (if it exists). That way we could avoid
> referencing it twice in DT. Then again, there's something elegant about
> the explicit second reference too. It indicates the intent that we may
> want to use the region for something other than just the IOV mapping.
>
> Anyway, this has been long enough. Let me know what you think. Alyssa,
> Sven, it'd be interesting to hear if you think this could work as a
> solution to the problem on M1.
>
> Rob, I think you might like this alternative because it basically gets
> rid of all the points in the original proposal that you were concerned
> about. Let me know what you think.

Couldn't we keep this all in /reserved-memory? Just add an iova
version of reg. Perhaps abuse 'assigned-addresses' for this purpose. The
issue I see would be handling reserved iova areas without a physical
area. That can be handled with just an iova and no reg. We already have
a no reg case.
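
As a rough sketch of that idea (illustration only: the "iova-reg" property
name below is hypothetical and not part of any existing binding, and the
addresses are borrowed from the framebuffer example quoted above):

```dts
/* Sketch only: "iova-reg" is a hypothetical property name. */
reserved-memory {
	#address-cells = <2>;
	#size-cells = <2>;
	ranges;

	/* 16 MiB of framebuffer above 4 GiB, visible to the device below 4 GiB */
	framebuffer: framebuffer@1,ff000000 {
		/* physical address and size */
		reg = <0x1 0xff000000 0x0 0x01000000>;
		/* assumed IOVA counterpart of reg, same cell layout */
		iova-reg = <0x0 0xff000000 0x0 0x01000000>;
		no-map;
	};
};

display@40000000 {
	memory-region = <&framebuffer>;
};
```

A region that only needs an IOVA reservation would then carry "iova-reg"
alone, with no "reg".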

Rob

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-09-03 13:20               ` Rob Herring
@ 2021-09-03 13:52                 ` Thierry Reding
  2021-09-03 14:36                   ` Rob Herring
  0 siblings, 1 reply; 41+ messages in thread
From: Thierry Reding @ 2021-09-03 13:52 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alyssa Rosenzweig, Sven Peter, Dmitry Osipenko, Joerg Roedel,
	Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, Linux IOMMU, linux-tegra, dri-devel

On Fri, Sep 03, 2021 at 08:20:55AM -0500, Rob Herring wrote:
> On Wed, Sep 1, 2021 at 9:13 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> >
> > On Fri, Jul 02, 2021 at 05:16:25PM +0300, Dmitry Osipenko wrote:
> > > 01.07.2021 21:14, Thierry Reding wrote:
> > > > On Tue, Jun 08, 2021 at 06:51:40PM +0200, Thierry Reding wrote:
> > > >> On Fri, May 28, 2021 at 06:54:55PM +0200, Thierry Reding wrote:
> > > >>> On Thu, May 20, 2021 at 05:03:06PM -0500, Rob Herring wrote:
> > > >>>> On Fri, Apr 23, 2021 at 06:32:30PM +0200, Thierry Reding wrote:
> > > >>>>> From: Thierry Reding <treding@nvidia.com>
> > > >>>>>
> > > >>>>> Reserved memory region phandle references can be accompanied by a
> > > >>>>> specifier that provides additional information about how that specific
> > > >>>>> reference should be treated.
> > > >>>>>
> > > >>>>> One use-case is to mark a memory region as needing an identity mapping
> > > >>>>> in the system's IOMMU for the device that references the region. This is
> > > >>>>> needed for example when the bootloader has set up hardware (such as a
> > > >>>>> display controller) to actively access a memory region (e.g. a boot
> > > >>>>> splash screen framebuffer) during boot. The operating system can use the
> > > >>>>> identity mapping flag from the specifier to make sure an IOMMU identity
> > > >>>>> mapping is set up for the framebuffer before IOMMU translations are
> > > >>>>> enabled for the display controller.
> > > >>>>>
> > > >>>>> Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > >>>>> ---
> > > >>>>>  .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
> > > >>>>>  include/dt-bindings/reserved-memory.h         |  8 +++++++
> > > >>>>>  2 files changed, 29 insertions(+)
> > > >>>>>  create mode 100644 include/dt-bindings/reserved-memory.h
> > > >>>>
> > > >>>> Sorry for being slow on this. I have 2 concerns.
> > > >>>>
> > > >>>> First, this creates an ABI issue. A DT with cells in 'memory-region'
> > > >>>> will not be understood by an existing OS. I'm less concerned about this
> > > >>>> if we address that with a stable fix. (Though I'm pretty sure we've
> > > >>>> naively added #?-cells in the past ignoring this issue.)
> > > >>>
> > > >>> A while ago I had proposed adding memory-region*s* as an alternative
> > > >>> name for memory-region to make the naming more consistent with other
> > > >>> types of properties (think clocks, resets, gpios, ...). If we added
> > > >>> that, we could easily differentiate between the "legacy" cases where
> > > >>> no #memory-region-cells was allowed and the new cases where it was.
> > > >>>
> > > >>>> Second, it could be the bootloader setting up the reserved region. If a
> > > >>>> node already has 'memory-region', then adding more regions is more
> > > >>>> complicated compared to adding new properties. And defining what each
> > > >>>> memory-region entry is or how many in schemas is impossible.
> > > >>>
> > > >>> It's true that updating the property gets a bit complicated, but it's
> > > >>> not exactly rocket science. We really just need to splice the array. I
> > > >>> have a working implementation for this in U-Boot.
> > > >>>
> > > >>> For what it's worth, we could run into the same issue with any new
> > > >>> property that we add. Even if we renamed this to iommu-memory-region,
> > > >>> it's still possible that a bootloader may have to update this property
> > > >>> if it already exists (it could be hard-coded in DT, or it could have
> > > >>> been added by some earlier bootloader or firmware).
> > > >>>
> > > >>>> Both could be addressed with a new property. Perhaps something like
> > > >>>> 'iommu-memory-region = <&phandle>;'. I think the 'iommu' prefix is
> > > >>>> appropriate given this is entirely because of the IOMMU being in the
> > > >>>> mix. I might feel differently if we had other uses for cells, but I
> > > >>>> don't really see it in this case.
> > > >>>
> > > >>> I'm afraid that down the road we'll end up with other cases and then we
> > > >>> might proliferate a number of *-memory-region properties with varying
> > > >>> prefixes.
> > > >>>
> > > >>> I am aware of one other case where we might need something like this: on
> > > >>> some Tegra SoCs we have audio processors that will access memory buffers
> > > >>> using a DMA engine. These processors are booted from early firmware
> > > >>> using firmware from system memory. In order to avoid trashing the
> > > >>> firmware, we need to reserve memory. We can do this using reserved
> > > >>> memory nodes. However, the audio DMA engine also uses the SMMU, so we
> > > >>> need to make sure that the firmware memory is marked as reserved within
> > > >>> the SMMU. This is similar to the identity mapping case, but not exactly
> > > >>> the same. Instead of creating a 1:1 mapping, we just want that IOVA
> > > >>> region to be reserved (i.e. IOMMU_RESV_RESERVED instead of
> > > >>> IOMMU_RESV_DIRECT{,_RELAXABLE}).
> > > >>>
> > > >>> That would also fall into the IOMMU domain, but we can't reuse the
> > > >>> iommu-memory-region property for that because then we don't have enough
> > > >>> information to decide which type of reservation we need.
> > > >>>
> > > >>> We could obviously make iommu-memory-region take a specifier, but we
> > > >>> could just as well use memory-regions in that case since we have
> > > >>> something more generic anyway.
> > > >>>
> > > >>> With the #memory-region-cells proposal, we can easily extend the cell in
> > > >>> the specifier with an additional MEMORY_REGION_IOMMU_RESERVE flag to
> > > >>> take that other use case into account. If we then also change to the new
> > > >>> memory-regions property name, we avoid the ABI issue (and we gain a bit
> > > >>> of consistency while at it).
> > > >>
> > > >> Ping? Rob, do you want me to add this second use-case to the patch
> > > >> series to make it more obvious that this isn't just a one-off thing? Or
> > > >> how do we proceed?
> > > >
> > > > Rob, given that additional use-case, do you want me to run with this
> > > > proposal and send out an updated series?
> > >
> > >
> > > What about a variant with "descriptor" properties that describe
> > > each region:
> > >
> > > fb_desc: display-framebuffer-memory-descriptor {
> > >       needs-identity-mapping;
> > > };
> > >
> > > display@52400000 {
> > >       memory-region = <&fb ...>;
> > >       memory-region-descriptor = <&fb_desc ...>;
> > > };
> > >
> > > It could be a more flexible/extendible variant.
> >
> > This problem recently came up on #dri-devel again. Adding Alyssa and
> > Sven who are facing a similar challenge on their work on Apple M1 (if I
> > understood correctly). Also adding dri-devel for visibility since this
> > is a very common problem for display in particular.
> >
> > On M1 the situation is slightly more complicated: the firmware will
> > allocate a couple of buffers (including the framebuffer) in high memory
> > (> 4 GiB) and use the IOMMU to map that into an IOVA region below 4 GiB
> > so that the display hardware can access it. This makes it impossible to
> > bypass the IOMMU like we do on other chips (in particular to work around
> > the fault-by-default policy of the ARM SMMU driver). It also means that
> > in addition to the simple reserved regions I mentioned we need for audio
> > use-cases and identity mapping use-cases we need for display on Tegra,
> > we now also need to be able to convey physical to IOVA mappings.
> >
> > Fitting the latter into the original proposal sounds difficult. A quick
> > fix would've been to generate a mapping table in memory and pass that to
> > the kernel using a reserved-memory node (similar to what's done for
> > example on Tegra for the EMC frequency table on Tegra210) and mark it as
> > such using a special flag. But that then involves two layers of parsing,
> > which seems a bit suboptimal. Another way to shoehorn that into the
> > original proposal would've been to add flags for physical and virtual
> > address regions and use pairs to pass them using special flags. Again,
> > this is a bit wonky because it needs these to be carefully parsed and
> > matched up.
> >
> > Another downside is that we now have a situation where some of these
> > regions are no longer "reserved-memory regions" in the traditional
> > sense. This would require an additional flag in the reserved-memory
> > region nodes to prevent the IOVA regions from being reserved. By the
> > way, this is something that would also be needed for the audio use-case
> > I mentioned before, because the physical memory at that address can
> > still be used by an operating system.
> >
> > A more general solution would be to draw a bit from Dmitry's proposal
> > and introduce a new top-level "iov-reserved-memory" node. This could be
> > modelled on the existing reserved-memory node, except that the physical
> > memory pages for regions represented by child nodes would not be marked
> > as reserved. Only the IOVA range described by the region would be
> > reserved subsequently by the IOMMU framework and/or IOMMU driver.
> >
> > The simplest case where we just want to reserve some IOVA region could
> > then be done like this:
> >
> >         iov-reserved-memory {
> >                 /*
> >                  * Probably safest to default to <2>, <2> here given
> >                  * that most IOMMUs support either > 32 bits of IAS
> >                  * or OAS.
> >                  */
> >                 #address-cells = <2>;
> >                 #size-cells = <2>;
> >
> >                 firmware: firmware@80000000 {
> >                         reg = <0 0x80000000 0 0x01000000>;
> >                 };
> >         };
> >
> >         audio@30000000 {
> >                 ...
> >                 iov-memory-regions = <&firmware>;
> >                 ...
> >         };
> >
> > Mappings could be represented by an IOV reserved region taking a
> > reference to the reserved-region that they map:
> >
> >         reserved-memory {
> >                 #address-cells = <2>;
> >                 #size-cells = <2>;
> >
> >                 /* 16 MiB of framebuffer at top-of-memory */
> >                 framebuffer: framebuffer@1,ff000000 {
> >                         reg = <0x1 0xff000000 0 0x01000000>;
> >                         no-map;
> >                 };
> >         };
> >
> >         iov-reserved-memory {
> >                 /* IOMMU supports only 32-bit output address space */
> >                 #address-cells = <1>;
> >                 #size-cells = <1>;
> >
> >                 /* 16 MiB of framebuffer mapped to top of IOVA */
> >                 fb: fb@ff000000 {
> >                         reg = <0xff000000 0x01000000>;
> >                         memory-region = <&framebuffer>;
> >                 };
> >         };
> >
> >         display@40000000 {
> >                 ...
> >                 /* optional? */
> >                 memory-region = <&framebuffer>;
> >                 iov-memory-regions = <&fb>;
> >                 ...
> >         };
> >
> > It's interesting how identity mapped regions now become a trivial
> > special case of mappings. All that is needed is to make the reg property
> > of the IOV reserved region correspond to the reg property of the normal
> > reserved region. Alternatively, as a small optimization for lazy people
> > like me, we could just allow these cases to omit the reg property and
> > instead inherit it from the referenced reserved region.
> >
> > As the second example shows it might be convenient if memory-region
> > could be derived from iov-memory-regions. This could be useful for cases
> > where the driver wants to do something with the physical pages of the
> > reserved region (such as mapping them and copying out the framebuffer
> > data to another buffer so that the reserved memory can be recycled). If
> > we have the IOV reserved region, we could provide an API to extract the
> > physical reserved region (if it exists). That way we could avoid
> > referencing it twice in DT. Then again, there's something elegant about
> > the explicit second reference too. It indicates the intent that we may
> > want to use the region for something other than just the IOV mapping.
> >
> > Anyway, this has been long enough. Let me know what you think. Alyssa,
> > Sven, it'd be interesting to hear if you think this could work as a
> > solution to the problem on M1.
> >
> > Rob, I think you might like this alternative because it basically gets
> > rid of all the points in the original proposal that you were concerned
> > about. Let me know what you think.
> 
> Couldn't we keep this all in /reserved-memory? Just add an iova
> version of reg. Perhaps abuse 'assigned-addresses' for this purpose. The
> issue I see would be handling reserved iova areas without a physical
> area. That can be handled with just an iova and no reg. We already have
> a no reg case.

I had thought about that initially. One thing I'm worried about is that
every child node in /reserved-memory will effectively cause the memory
that it describes to be reserved. But we don't want that for regions
that are "virtual only" (i.e. IOMMU reservations).

Obviously we can fix that in Linux, but what about other operating
systems? Currently "reg" is a required property for statically allocated
regions (which all of these would be). Do you have an idea of how widely
that's used? What about other OSes, or bootloaders, what if they
encounter these nodes that don't have a "reg" property?

If you don't see any concerns with that, I think we could make it work.
I don't have any strong opinions about the naming for the IOVA regions.
With a tiny stretch of the imagination, "assigned-addresses" even makes
sense in this context.
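
To make the concern concrete, here is a sketch of the two kinds of nodes
without "reg" that a parser might encounter. The first follows the existing
binding for dynamically allocated regions; the second uses a made-up
"iova-reg" property (hypothetical, purely for illustration) to stand in for
whatever the IOVA property ends up being called:

```dts
/* Sketch only: "iova-reg" is a hypothetical property name. */
reserved-memory {
	#address-cells = <2>;
	#size-cells = <2>;
	ranges;

	/* Existing "no reg" case: the OS picks where to place 16 MiB */
	pool {
		size = <0x0 0x01000000>;
		alignment = <0x0 0x00100000>;
	};

	/*
	 * "Virtual only" region: 16 MiB of IOVA space at 0x80000000 that
	 * should be reserved in the IOMMU, with no physical memory set aside.
	 */
	firmware {
		iova-reg = <0x0 0x80000000 0x0 0x01000000>;
	};
};
```

An implementation that only knows the current binding finds neither "reg"
nor "size" in the second node, and what it does then is exactly the open
question.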

Thierry

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-09-03 13:52                 ` Thierry Reding
@ 2021-09-03 14:36                   ` Rob Herring
  2021-09-03 15:35                     ` Thierry Reding
  0 siblings, 1 reply; 41+ messages in thread
From: Rob Herring @ 2021-09-03 14:36 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Alyssa Rosenzweig, Sven Peter, Dmitry Osipenko, Joerg Roedel,
	Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, Linux IOMMU, linux-tegra, dri-devel

On Fri, Sep 3, 2021 at 8:52 AM Thierry Reding <thierry.reding@gmail.com> wrote:
>
> On Fri, Sep 03, 2021 at 08:20:55AM -0500, Rob Herring wrote:
> > On Wed, Sep 1, 2021 at 9:13 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > >
> > > On Fri, Jul 02, 2021 at 05:16:25PM +0300, Dmitry Osipenko wrote:
> > > > 01.07.2021 21:14, Thierry Reding wrote:
> > > > > On Tue, Jun 08, 2021 at 06:51:40PM +0200, Thierry Reding wrote:
> > > > >> On Fri, May 28, 2021 at 06:54:55PM +0200, Thierry Reding wrote:
> > > > >>> On Thu, May 20, 2021 at 05:03:06PM -0500, Rob Herring wrote:
> > > > >>>> On Fri, Apr 23, 2021 at 06:32:30PM +0200, Thierry Reding wrote:
> > > > >>>>> From: Thierry Reding <treding@nvidia.com>
> > > > >>>>>
> > > > >>>>> Reserved memory region phandle references can be accompanied by a
> > > > >>>>> specifier that provides additional information about how that specific
> > > > >>>>> reference should be treated.
> > > > >>>>>
> > > > >>>>> One use-case is to mark a memory region as needing an identity mapping
> > > > >>>>> in the system's IOMMU for the device that references the region. This is
> > > > >>>>> needed for example when the bootloader has set up hardware (such as a
> > > > >>>>> display controller) to actively access a memory region (e.g. a boot
> > > > >>>>> splash screen framebuffer) during boot. The operating system can use the
> > > > >>>>> identity mapping flag from the specifier to make sure an IOMMU identity
> > > > >>>>> mapping is set up for the framebuffer before IOMMU translations are
> > > > >>>>> enabled for the display controller.
> > > > >>>>>
> > > > >>>>> Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > > >>>>> ---
> > > > >>>>>  .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
> > > > >>>>>  include/dt-bindings/reserved-memory.h         |  8 +++++++
> > > > >>>>>  2 files changed, 29 insertions(+)
> > > > >>>>>  create mode 100644 include/dt-bindings/reserved-memory.h
> > > > >>>>
> > > > >>>> Sorry for being slow on this. I have 2 concerns.
> > > > >>>>
> > > > >>>> First, this creates an ABI issue. A DT with cells in 'memory-region'
> > > > >>>> will not be understood by an existing OS. I'm less concerned about this
> > > > >>>> if we address that with a stable fix. (Though I'm pretty sure we've
> > > > >>>> naively added #?-cells in the past ignoring this issue.)
> > > > >>>
> > > > >>> A while ago I had proposed adding memory-region*s* as an alternative
> > > > >>> name for memory-region to make the naming more consistent with other
> > > > >>> types of properties (think clocks, resets, gpios, ...). If we added
> > > > >>> that, we could easily differentiate between the "legacy" cases where
> > > > >>> no #memory-region-cells was allowed and the new cases where it was.
> > > > >>>
> > > > >>>> Second, it could be the bootloader setting up the reserved region. If a
> > > > >>>> node already has 'memory-region', then adding more regions is more
> > > > >>>> complicated compared to adding new properties. And defining what each
> > > > >>>> memory-region entry is or how many in schemas is impossible.
> > > > >>>
> > > > >>> It's true that updating the property gets a bit complicated, but it's
> > > > >>> not exactly rocket science. We really just need to splice the array. I
> > > > >>> have a working implementation for this in U-Boot.
> > > > >>>
> > > > >>> For what it's worth, we could run into the same issue with any new
> > > > >>> property that we add. Even if we renamed this to iommu-memory-region,
> > > > >>> it's still possible that a bootloader may have to update this property
> > > > >>> if it already exists (it could be hard-coded in DT, or it could have
> > > > >>> been added by some earlier bootloader or firmware).
> > > > >>>
> > > > >>>> Both could be addressed with a new property. Perhaps something like
> > > > >>>> 'iommu-memory-region = <&phandle>;'. I think the 'iommu' prefix is
> > > > >>>> appropriate given this is entirely because of the IOMMU being in the
> > > > >>>> mix. I might feel differently if we had other uses for cells, but I
> > > > >>>> don't really see it in this case.
> > > > >>>
> > > > >>> I'm afraid that down the road we'll end up with other cases and then we
> > > > >>> might proliferate a number of *-memory-region properties with varying
> > > > >>> prefixes.
> > > > >>>
> > > > >>> I am aware of one other case where we might need something like this: on
> > > > >>> some Tegra SoCs we have audio processors that will access memory buffers
> > > > >>> using a DMA engine. These processors are booted from early firmware
> > > > >>> using firmware from system memory. In order to avoid trashing the
> > > > >>> firmware, we need to reserve memory. We can do this using reserved
> > > > >>> memory nodes. However, the audio DMA engine also uses the SMMU, so we
> > > > >>> need to make sure that the firmware memory is marked as reserved within
> > > > >>> the SMMU. This is similar to the identity mapping case, but not exactly
> > > > >>> the same. Instead of creating a 1:1 mapping, we just want that IOVA
> > > > >>> region to be reserved (i.e. IOMMU_RESV_RESERVED instead of
> > > > >>> IOMMU_RESV_DIRECT{,_RELAXABLE}).
> > > > >>>
> > > > >>> That would also fall into the IOMMU domain, but we can't reuse the
> > > > >>> iommu-memory-region property for that because then we don't have enough
> > > > >>> information to decide which type of reservation we need.
> > > > >>>
> > > > >>> We could obviously make iommu-memory-region take a specifier, but we
> > > > >>> could just as well use memory-regions in that case since we have
> > > > >>> something more generic anyway.
> > > > >>>
> > > > >>> With the #memory-region-cells proposal, we can easily extend the cell in
> > > > >>> the specifier with an additional MEMORY_REGION_IOMMU_RESERVE flag to
> > > > >>> take that other use case into account. If we then also change to the new
> > > > >>> memory-regions property name, we avoid the ABI issue (and we gain a bit
> > > > >>> of consistency while at it).
> > > > >>
> > > > >> Ping? Rob, do you want me to add this second use-case to the patch
> > > > >> series to make it more obvious that this isn't just a one-off thing? Or
> > > > >> how do we proceed?
> > > > >
> > > > > Rob, given that additional use-case, do you want me to run with this
> > > > > proposal and send out an updated series?
> > > >
> > > >
> > > > What about a variant with "descriptor" properties that describe
> > > > each region:
> > > >
> > > > fb_desc: display-framebuffer-memory-descriptor {
> > > >       needs-identity-mapping;
> > > > };
> > > >
> > > > display@52400000 {
> > > >       memory-region = <&fb ...>;
> > > >       memory-region-descriptor = <&fb_desc ...>;
> > > > };
> > > >
> > > > It could be a more flexible/extendible variant.
> > >
> > > This problem recently came up on #dri-devel again. Adding Alyssa and
> > > Sven who are facing a similar challenge on their work on Apple M1 (if I
> > > understood correctly). Also adding dri-devel for visibility since this
> > > is a very common problem for display in particular.
> > >
> > > On M1 the situation is slightly more complicated: the firmware will
> > > allocate a couple of buffers (including the framebuffer) in high memory
> > > (> 4 GiB) and use the IOMMU to map that into an IOVA region below 4 GiB
> > > so that the display hardware can access it. This makes it impossible to
> > > bypass the IOMMU like we do on other chips (in particular to work around
> > > the fault-by-default policy of the ARM SMMU driver). It also means that
> > > in addition to the simple reserved regions I mentioned we need for audio
> > > use-cases and identity mapping use-cases we need for display on Tegra,
> > > we now also need to be able to convey physical to IOVA mappings.
> > >
> > > Fitting the latter into the original proposal sounds difficult. A quick
> > > fix would've been to generate a mapping table in memory and pass that to
> > > the kernel using a reserved-memory node (similar to what's done for
> > > example on Tegra for the EMC frequency table on Tegra210) and mark it as
> > > such using a special flag. But that then involves two layers of parsing,
> > > which seems a bit suboptimal. Another way to shoehorn that into the
> > > original proposal would've been to add flags for physical and virtual
> > > address regions and use pairs to pass them using special flags. Again,
> > > this is a bit wonky because it needs these to be carefully parsed and
> > > matched up.
> > >
> > > Another downside is that we now have a situation where some of these
> > > regions are no longer "reserved-memory regions" in the traditional
> > > sense. This would require an additional flag in the reserved-memory
> > > region nodes to prevent the IOVA regions from being reserved. By the
> > > way, this is something that would also be needed for the audio use-case
> > > I mentioned before, because the physical memory at that address can
> > > still be used by an operating system.
> > >
> > > A more general solution would be to draw a bit from Dmitry's proposal
> > > and introduce a new top-level "iov-reserved-memory" node. This could be
> > > modelled on the existing reserved-memory node, except that the physical
> > > memory pages for regions represented by child nodes would not be marked
> > > as reserved. Only the IOVA range described by the region would be
> > > reserved subsequently by the IOMMU framework and/or IOMMU driver.
> > >
> > > The simplest case where we just want to reserve some IOVA region could
> > > then be done like this:
> > >
> > >         iov-reserved-memory {
> > >                 /*
> > >                  * Probably safest to default to <2>, <2> here given
> > >                  * that most IOMMUs support either > 32 bits of IAS
> > >                  * or OAS.
> > >                  */
> > >                 #address-cells = <2>;
> > >                 #size-cells = <2>;
> > >
> > >                 firmware: firmware@80000000 {
> > >                         reg = <0 0x80000000 0 0x01000000>;
> > >                 };
> > >         };
> > >
> > >         audio@30000000 {
> > >                 ...
> > >                 iov-memory-regions = <&firmware>;
> > >                 ...
> > >         };
> > >
> > > Mappings could be represented by an IOV reserved region taking a
> > > reference to the reserved-region that they map:
> > >
> > >         reserved-memory {
> > >                 #address-cells = <2>;
> > >                 #size-cells = <2>;
> > >
> > >                 /* 16 MiB of framebuffer at top-of-memory */
> > >                 framebuffer: framebuffer@1,ff000000 {
> > >                         reg = <0x1 0xff000000 0 0x01000000>;
> > >                         no-map;
> > >                 };
> > >         };
> > >
> > >         iov-reserved-memory {
> > >                 /* IOMMU supports only 32-bit output address space */
> > >                 #address-cells = <1>;
> > >                 #size-cells = <1>;
> > >
> > >                 /* 16 MiB of framebuffer mapped to top of IOVA */
> > >                 fb: fb@ff000000 {
> > >                         reg = <0xff000000 0x01000000>;
> > >                         memory-region = <&framebuffer>;
> > >                 };
> > >         };
> > >
> > >         display@40000000 {
> > >                 ...
> > >                 /* optional? */
> > >                 memory-region = <&framebuffer>;
> > >                 iov-memory-regions = <&fb>;
> > >                 ...
> > >         };
> > >
> > > It's interesting how identity mapped regions now become a trivial
> > > special case of mappings. All that is needed is to make the reg property
> > > of the IOV reserved region correspond to the reg property of the normal
> > > reserved region. Alternatively, as a small optimization for lazy people
> > > like me, we could just allow these cases to omit the reg property and
> > > instead inherit it from the referenced reserved region.
> > >
> > > As the second example shows it might be convenient if memory-region
> > > could be derived from iov-memory-regions. This could be useful for cases
> > > where the driver wants to do something with the physical pages of the
> > > reserved region (such as mapping them and copying out the framebuffer
> > > data to another buffer so that the reserved memory can be recycled). If
> > > we have the IOV reserved region, we could provide an API to extract the
> > > physical reserved region (if it exists). That way we could avoid
> > > referencing it twice in DT. Then again, there's something elegant about
> > > the explicit second reference too. It indicates the intent that we may
> > > want to use the region for something other than just the IOV mapping.
> > >
> > > Anyway, this has been long enough. Let me know what you think. Alyssa,
> > > Sven, it'd be interesting to hear if you think this could work as a
> > > solution to the problem on M1.
> > >
> > > Rob, I think you might like this alternative because it basically gets
> > > rid of all the points in the original proposal that you were concerned
> > > about. Let me know what you think.
> >
> > Couldn't we keep this all in /reserved-memory? Just add an iova
> > version of reg. Perhaps abuse 'assigned-addresses' for this purpose. The
> > issue I see would be handling reserved iova areas without a physical
> > area. That can be handled with just an iova and no reg. We already have
> > a no reg case.
>
> I had thought about that initially. One thing I'm worried about is that
> every child node in /reserved-memory will effectively cause the memory
> that it describes to be reserved. But we don't want that for regions
> that are "virtual only" (i.e. IOMMU reservations).

By virtual only, you mean no physical mapping, just a region of
virtual space, right? For that we'd have no 'reg' and therefore no
(physical) reservation by the OS. It's similar to non-static regions.
You need a specific handler for them. We'd probably want a compatible
as well for these virtual reservations.

Are these being global in DT going to be a problem? Presumably we have
a virtual space per IOMMU. We'd know which IOMMU based on a device's
'iommus' and 'memory-region' properties, but within /reserved-memory
we wouldn't be able to distinguish overlapping addresses from separate
address spaces. Or we could have 2 different IOVAs for 1 physical
space. That could be solved with something like this:

iommu-addresses = <&iommu1 <address cells> <size cells>>;
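
To make that concrete, here is a rough sketch of how such a property
might look, assuming two cells each for address and size. The &iommu1
and &iommu2 labels, the node name and all addresses are made up for
illustration and are not part of any existing binding:

```dts
reserved-memory {
	#address-cells = <2>;
	#size-cells = <2>;
	ranges;

	/* hypothetical: one physical region, two IOVAs in separate IOMMUs */
	fb: framebuffer@80000000 {
		reg = <0x0 0x80000000 0x0 0x01000000>;
		iommu-addresses = <&iommu1 0x0 0xa0000000 0x0 0x01000000>,
				  <&iommu2 0x0 0xb0000000 0x0 0x01000000>;
	};
};
```

Each entry pairs an IOMMU phandle with an address/size tuple, which is
what would let overlapping IOVAs from separate address spaces coexist
under /reserved-memory.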

Or the other way to do this is to reuse the 'iommus' property to define
the mapping of each address entry to an IOMMU.

> Obviously we can fix that in Linux, but what about other operating
> systems? Currently "reg" is a required property for statically allocated
> regions (which all of these would be). Do you have an idea of how widely
> that's used? What about other OSes, or bootloaders, what if they
> encounter these nodes that don't have a "reg" property?

Without 'reg', there must be a compatible that the client understands
or the node should be ignored.

My suspicion is that /reserved-memory is abused for all sorts of
things downstream, but that's not really relevant here.

Rob

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-09-03 14:36                   ` Rob Herring
@ 2021-09-03 15:35                     ` Thierry Reding
  2021-09-07 15:33                       ` Rob Herring
  0 siblings, 1 reply; 41+ messages in thread
From: Thierry Reding @ 2021-09-03 15:35 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alyssa Rosenzweig, Sven Peter, Dmitry Osipenko, Joerg Roedel,
	Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, Linux IOMMU, linux-tegra, dri-devel

On Fri, Sep 03, 2021 at 09:36:33AM -0500, Rob Herring wrote:
> On Fri, Sep 3, 2021 at 8:52 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> >
> > On Fri, Sep 03, 2021 at 08:20:55AM -0500, Rob Herring wrote:
> > > On Wed, Sep 1, 2021 at 9:13 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > >
> > > > On Fri, Jul 02, 2021 at 05:16:25PM +0300, Dmitry Osipenko wrote:
> > > > > 01.07.2021 21:14, Thierry Reding пишет:
> > > > > > On Tue, Jun 08, 2021 at 06:51:40PM +0200, Thierry Reding wrote:
> > > > > >> On Fri, May 28, 2021 at 06:54:55PM +0200, Thierry Reding wrote:
> > > > > >>> On Thu, May 20, 2021 at 05:03:06PM -0500, Rob Herring wrote:
> > > > > >>>> On Fri, Apr 23, 2021 at 06:32:30PM +0200, Thierry Reding wrote:
> > > > > >>>>> From: Thierry Reding <treding@nvidia.com>
> > > > > >>>>>
> > > > > >>>>> Reserved memory region phandle references can be accompanied by a
> > > > > >>>>> specifier that provides additional information about how that specific
> > > > > >>>>> reference should be treated.
> > > > > >>>>>
> > > > > >>>>> One use-case is to mark a memory region as needing an identity mapping
> > > > > >>>>> in the system's IOMMU for the device that references the region. This is
> > > > > >>>>> needed for example when the bootloader has set up hardware (such as a
> > > > > >>>>> display controller) to actively access a memory region (e.g. a boot
> > > > > >>>>> splash screen framebuffer) during boot. The operating system can use the
> > > > > >>>>> identity mapping flag from the specifier to make sure an IOMMU identity
> > > > > >>>>> mapping is set up for the framebuffer before IOMMU translations are
> > > > > >>>>> enabled for the display controller.
> > > > > >>>>>
> > > > > >>>>> Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > > > >>>>> ---
> > > > > >>>>>  .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
> > > > > >>>>>  include/dt-bindings/reserved-memory.h         |  8 +++++++
> > > > > >>>>>  2 files changed, 29 insertions(+)
> > > > > >>>>>  create mode 100644 include/dt-bindings/reserved-memory.h
> > > > > >>>>
> > > > > >>>> Sorry for being slow on this. I have 2 concerns.
> > > > > >>>>
> > > > > >>>> First, this creates an ABI issue. A DT with cells in 'memory-region'
> > > > > >>>> will not be understood by an existing OS. I'm less concerned about this
> > > > > >>>> if we address that with a stable fix. (Though I'm pretty sure we've
> > > > > >>>> naively added #?-cells in the past ignoring this issue.)
> > > > > >>>
> > > > > >>> A while ago I had proposed adding memory-region*s* as an alternative
> > > > > >>> name for memory-region to make the naming more consistent with other
> > > > > >>> types of properties (think clocks, resets, gpios, ...). If we added
> > > > > >>> that, we could easily differentiate between the "legacy" cases where
> > > > > >>> no #memory-region-cells was allowed and the new cases where it was.
> > > > > >>>
> > > > > >>>> Second, it could be the bootloader setting up the reserved region. If a
> > > > > >>>> node already has 'memory-region', then adding more regions is more
> > > > > >>>> complicated compared to adding new properties. And defining what each
> > > > > >>>> memory-region entry is or how many in schemas is impossible.
> > > > > >>>
> > > > > >>> It's true that updating the property gets a bit complicated, but it's
> > > > > >>> not exactly rocket science. We really just need to splice the array. I
> > > > > > >>> have a working implementation for this in U-Boot.
> > > > > >>>
> > > > > >>> For what it's worth, we could run into the same issue with any new
> > > > > >>> property that we add. Even if we renamed this to iommu-memory-region,
> > > > > >>> it's still possible that a bootloader may have to update this property
> > > > > >>> if it already exists (it could be hard-coded in DT, or it could have
> > > > > >>> been added by some earlier bootloader or firmware).
> > > > > >>>
> > > > > >>>> Both could be addressed with a new property. Perhaps something like
> > > > > >>>> 'iommu-memory-region = <&phandle>;'. I think the 'iommu' prefix is
> > > > > >>>> appropriate given this is entirely because of the IOMMU being in the
> > > > > >>>> mix. I might feel differently if we had other uses for cells, but I
> > > > > >>>> don't really see it in this case.
> > > > > >>>
> > > > > >>> I'm afraid that down the road we'll end up with other cases and then we
> > > > > >>> might proliferate a number of *-memory-region properties with varying
> > > > > >>> prefixes.
> > > > > >>>
> > > > > >>> I am aware of one other case where we might need something like this: on
> > > > > >>> some Tegra SoCs we have audio processors that will access memory buffers
> > > > > >>> using a DMA engine. These processors are booted from early firmware
> > > > > >>> using firmware from system memory. In order to avoid trashing the
> > > > > >>> firmware, we need to reserve memory. We can do this using reserved
> > > > > >>> memory nodes. However, the audio DMA engine also uses the SMMU, so we
> > > > > >>> need to make sure that the firmware memory is marked as reserved within
> > > > > >>> the SMMU. This is similar to the identity mapping case, but not exactly
> > > > > >>> the same. Instead of creating a 1:1 mapping, we just want that IOVA
> > > > > >>> region to be reserved (i.e. IOMMU_RESV_RESERVED instead of
> > > > > >>> IOMMU_RESV_DIRECT{,_RELAXABLE}).
> > > > > >>>
> > > > > >>> That would also fall into the IOMMU domain, but we can't reuse the
> > > > > >>> iommu-memory-region property for that because then we don't have enough
> > > > > >>> information to decide which type of reservation we need.
> > > > > >>>
> > > > > >>> We could obviously make iommu-memory-region take a specifier, but we
> > > > > >>> could just as well use memory-regions in that case since we have
> > > > > >>> something more generic anyway.
> > > > > >>>
> > > > > >>> With the #memory-region-cells proposal, we can easily extend the cell in
> > > > > >>> the specifier with an additional MEMORY_REGION_IOMMU_RESERVE flag to
> > > > > > >>> take that other use case into account. If we then also change to the new
> > > > > >>> memory-regions property name, we avoid the ABI issue (and we gain a bit
> > > > > >>> of consistency while at it).
> > > > > >>
> > > > > >> Ping? Rob, do you want me to add this second use-case to the patch
> > > > > >> series to make it more obvious that this isn't just a one-off thing? Or
> > > > > >> how do we proceed?
> > > > > >
> > > > > > Rob, given that additional use-case, do you want me to run with this
> > > > > > proposal and send out an updated series?
> > > > >
> > > > >
> > > > > > What about a variant with "descriptor" properties that will describe
> > > > > > each region:
> > > > >
> > > > > fb_desc: display-framebuffer-memory-descriptor {
> > > > >       needs-identity-mapping;
> > > > > }
> > > > >
> > > > > display@52400000 {
> > > > >       memory-region = <&fb ...>;
> > > > >       memory-region-descriptor = <&fb_desc ...>;
> > > > > };
> > > > >
> > > > > It could be a more flexible/extendible variant.
> > > >
> > > > This problem recently came up on #dri-devel again. Adding Alyssa and
> > > > Sven who are facing a similar challenge on their work on Apple M1 (if I
> > > > understood correctly). Also adding dri-devel for visibility since this
> > > > is a very common problem for display in particular.
> > > >
> > > > On M1 the situation is slightly more complicated: the firmware will
> > > > allocate a couple of buffers (including the framebuffer) in high memory
> > > > (> 4 GiB) and use the IOMMU to map that into an IOVA region below 4 GiB
> > > > so that the display hardware can access it. This makes it impossible to
> > > > bypass the IOMMU like we do on other chips (in particular to work around
> > > > the fault-by-default policy of the ARM SMMU driver). It also means that
> > > > in addition to the simple reserved regions I mentioned we need for audio
> > > > use-cases and identity mapping use-cases we need for display on Tegra,
> > > > we now also need to be able to convey physical to IOVA mappings.
> > > >
> > > > Fitting the latter into the original proposal sounds difficult. A quick
> > > > fix would've been to generate a mapping table in memory and pass that to
> > > > the kernel using a reserved-memory node (similar to what's done for
> > > > example on Tegra for the EMC frequency table on Tegra210) and mark it as
> > > > such using a special flag. But that then involves two layers of parsing,
> > > > which seems a bit suboptimal. Another way to shoehorn that into the
> > > > original proposal would've been to add flags for physical and virtual
> > > > address regions and use pairs to pass them using special flags. Again,
> > > > this is a bit wonky because it needs these to be carefully parsed and
> > > > matched up.
> > > >
> > > > Another downside is that we now have a situation where some of these
> > > > regions are no longer "reserved-memory regions" in the traditional
> > > > sense. This would require an additional flag in the reserved-memory
> > > > region nodes to prevent the IOVA regions from being reserved. By the
> > > > way, this is something that would also be needed for the audio use-case
> > > > I mentioned before, because the physical memory at that address can
> > > > still be used by an operating system.
> > > >
> > > > A more general solution would be to draw a bit from Dmitry's proposal
> > > > and introduce a new top-level "iov-reserved-memory" node. This could be
> > > > modelled on the existing reserved-memory node, except that the physical
> > > > memory pages for regions represented by child nodes would not be marked
> > > > as reserved. Only the IOVA range described by the region would be
> > > > reserved subsequently by the IOMMU framework and/or IOMMU driver.
> > > >
> > > > The simplest case where we just want to reserve some IOVA region could
> > > > then be done like this:
> > > >
> > > >         iov-reserved-memory {
> > > >                 /*
> > > >                  * Probably safest to default to <2>, <2> here given
> > > >                  * that most IOMMUs support either > 32 bits of IAS
> > > >                  * or OAS.
> > > >                  */
> > > >                 #address-cells = <2>;
> > > >                 #size-cells = <2>;
> > > >
> > > >                 firmware: firmware@80000000 {
> > > >                         reg = <0 0x80000000 0 0x01000000>;
> > > >                 };
> > > >         };
> > > >
> > > >         audio@30000000 {
> > > >                 ...
> > > >                 iov-memory-regions = <&firmware>;
> > > >                 ...
> > > >         };
> > > >
> > > > Mappings could be represented by an IOV reserved region taking a
> > > > reference to the reserved-region that they map:
> > > >
> > > >         reserved-memory {
> > > >                 #address-cells = <2>;
> > > >                 #size-cells = <2>;
> > > >
> > > >                 /* 16 MiB of framebuffer at top-of-memory */
> > > >                 framebuffer: framebuffer@1,ff000000 {
> > > >                         reg = <0x1 0xff000000 0 0x01000000>;
> > > >                         no-map;
> > > >                 };
> > > >         };
> > > >
> > > >         iov-reserved-memory {
> > > >                 /* IOMMU supports only 32-bit output address space */
> > > >                 #address-cells = <1>;
> > > >                 #size-cells = <1>;
> > > >
> > > >                 /* 16 MiB of framebuffer mapped to top of IOVA */
> > > >                 fb: fb@ff000000 {
> > > >                         reg = <0xff000000 0x01000000>;
> > > >                         memory-region = <&framebuffer>;
> > > >                 };
> > > >         };
> > > >
> > > >         display@40000000 {
> > > >                 ...
> > > >                 /* optional? */
> > > >                 memory-region = <&framebuffer>;
> > > >                 iov-memory-regions = <&fb>;
> > > >                 ...
> > > >         };
> > > >
> > > > It's interesting how identity mapped regions now become a trivial
> > > > special case of mappings. All that is needed is to make the reg property
> > > > of the IOV reserved region correspond to the reg property of the normal
> > > > reserved region. Alternatively, as a small optimization for lazy people
> > > > like me, we could just allow these cases to omit the reg property and
> > > > instead inherit it from the referenced reserved region.
> > > >
> > > > As the second example shows it might be convenient if memory-region
> > > > could be derived from iov-memory-regions. This could be useful for cases
> > > > where the driver wants to do something with the physical pages of the
> > > > reserved region (such as mapping them and copying out the framebuffer
> > > > data to another buffer so that the reserved memory can be recycled). If
> > > > we have the IOV reserved region, we could provide an API to extract the
> > > > physical reserved region (if it exists). That way we could avoid
> > > > referencing it twice in DT. Then again, there's something elegant about
> > > > the explicit second reference too. It indicates the intent that we may
> > > > want to use the region for something other than just the IOV mapping.
> > > >
> > > > Anyway, this has been long enough. Let me know what you think. Alyssa,
> > > > Sven, it'd be interesting to hear if you think this could work as a
> > > > solution to the problem on M1.
> > > >
> > > > Rob, I think you might like this alternative because it basically gets
> > > > rid of all the points in the original proposal that you were concerned
> > > > about. Let me know what you think.
> > >
> > > Couldn't we keep this all in /reserved-memory? Just add an iova
> > > version of reg. Perhaps abuse 'assigned-addresses' for this purpose. The
> > > issue I see would be handling reserved iova areas without a physical
> > > area. That can be handled with just an iova and no reg. We already have
> > > a no reg case.
> >
> > I had thought about that initially. One thing I'm worried about is that
> > every child node in /reserved-memory will effectively cause the memory
> > that it describes to be reserved. But we don't want that for regions
> > that are "virtual only" (i.e. IOMMU reservations).
> 
> By virtual only, you mean no physical mapping, just a region of
> virtual space, right? For that we'd have no 'reg' and therefore no
> (physical) reservation by the OS. It's similar to non-static regions.
> You need a specific handler for them. We'd probably want a compatible
> as well for these virtual reservations.

Yeah, these would be used purely for reserving regions in the IOVA space
so that they won't be handed out by the IOVA allocator. Typically these
would be used for cases where those addresses have some special meaning.

Do we want something like:

	compatible = "iommu-reserved";

for these? Or would that need to be:

	compatible = "linux,iommu-reserved";

? There seems to be a mix of vendor-prefix vs. non-vendor-prefix
compatible strings in the reserved-memory DT bindings directory.

On the other hand, do we actually need the compatible string? Because we
don't really want to associate much extra information with this like we
do for example with "shared-dma-pool". The logic to handle this would
all be within the IOMMU framework. All we really need is for the
standard reservation code to skip nodes that don't have a reg property
so we don't reserve memory for "virtual-only" allocations.
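
As a rough sketch of what such a "virtual-only" node could look like
(the node name, the addresses and the use of the iommu-addresses
property suggested in the quoted text below are all assumptions, not an
existing binding):

```dts
reserved-memory {
	#address-cells = <1>;
	#size-cells = <1>;
	ranges;

	/*
	 * Hypothetical IOVA-only reservation: there is no "reg"
	 * property, so the standard reservation code would have no
	 * physical memory to reserve for this node.
	 */
	audio-firmware {
		iommu-addresses = <0x80000000 0x01000000>;
	};
};
```

Only the IOMMU framework would need to look at a node like this, which
is why a compatible string may not add much.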

> Are these being global in DT going to be a problem? Presumably we have
> a virtual space per IOMMU. We'd know which IOMMU based on a device's
> 'iommus' and 'memory-region' properties, but within /reserved-memory
> we wouldn't be able to distinguish overlapping addresses from separate
> address spaces. Or we could have 2 different IOVAs for 1 physical
> space. That could be solved with something like this:
> 
> iommu-addresses = <&iommu1 <address cells> <size cells>>;

The only case that would be problematic would be if we have overlapping
physical regions, because that will probably trip up the standard code.

But this could also be worked around by looking at iommu-addresses. For
example, if we had something like this:

	reserved-memory {
		fb_dc0: fb@80000000 {
			reg = <0x80000000 0x01000000>;
			iommu-addresses = <0xa0000000 0x01000000>;
		};

		fb_dc1: fb@80000000 {
			reg = <0x80000000 0x01000000>;
			iommu-addresses = <0xb0000000 0x01000000>;
		};
	};

We could make the code identify that this is for the same physical
reservation (maybe make it so that reg needs to match exactly for this
to be recognized) but with different virtual allocations.

On a side-note: do we really need to repeat the size? I'd think if we
want mappings then we'd likely want them for the whole reservation.

I'd like to keep references to IOMMUs out of this because they would be
duplicated. We will only use these nodes if they are referenced by a
device node that also has an iommus property. Also, the IOMMU reference
itself isn't enough. We'd also need to support the complete specifier
because you can have things like SIDs in there to specify the exact
address space that a device uses.

Also, some of these may be reused independently of the IOMMU address
space. For example, the Tegra framebuffer identity mapping can be used
by any of the 2-4 display controllers, each with (at least potentially)
its own address space. But we don't want to have to describe the
identity mapping separately for each display controller.
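
A sketch of that kind of sharing, assuming a single framebuffer node
that is referenced by two display controllers with different stream
IDs. The &smmu label, the stream ID values, the unit addresses and the
idea of expressing the identity mapping as iommu-addresses equal to reg
are purely illustrative:

```dts
reserved-memory {
	#address-cells = <1>;
	#size-cells = <1>;
	ranges;

	/* hypothetical identity mapping: IOVA equals physical address */
	fb: framebuffer@80000000 {
		reg = <0x80000000 0x01000000>;
		iommu-addresses = <0x80000000 0x01000000>;
	};
};

/* other properties of the display controllers omitted */
display@54200000 {
	iommus = <&smmu 0x1>;
	memory-region = <&fb>;
};

display@54240000 {
	iommus = <&smmu 0x2>;
	memory-region = <&fb>;
};
```

The reserved-memory node carries no IOMMU reference of its own; the
address space is implied by the iommus property of whichever device
references the region.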

Another thing to consider is that these nodes will often be added by
firmware (e.g. firmware will allocate the framebuffer and set up the
corresponding reserved memory region in DT). Wiring up references like
this would get very complicated very quickly.

> Or the other way to do this is to reuse the 'iommus' property to define
> the mapping of each address entry to an IOMMU.
> 
> > Obviously we can fix that in Linux, but what about other operating
> > systems? Currently "reg" is a required property for statically allocated
> > regions (which all of these would be). Do you have an idea of how widely
> > that's used? What about other OSes, or bootloaders, what if they
> > encounter these nodes that don't have a "reg" property?
> 
> Without 'reg', there must be a compatible that the client understands
> or the node should be ignored.
> 
> My suspicion is that /reserved-memory is abused for all sorts of
> things downstream, but that's not really relevant here.

Yeah, my only concern was that we might break users of this that are not
sophisticated enough to handle the nuances that we'd introduce here. If
we can assume that nodes without a reg property will be ignored, then I
think that's good enough.

Thierry

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-09-03 15:35                     ` Thierry Reding
@ 2021-09-07 15:33                       ` Rob Herring
  2021-09-07 17:44                         ` Thierry Reding
  0 siblings, 1 reply; 41+ messages in thread
From: Rob Herring @ 2021-09-07 15:33 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Alyssa Rosenzweig, Sven Peter, Dmitry Osipenko, Joerg Roedel,
	Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, Linux IOMMU, linux-tegra, dri-devel

On Fri, Sep 3, 2021 at 10:36 AM Thierry Reding <thierry.reding@gmail.com> wrote:
>
> On Fri, Sep 03, 2021 at 09:36:33AM -0500, Rob Herring wrote:
> > On Fri, Sep 3, 2021 at 8:52 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > >
> > > On Fri, Sep 03, 2021 at 08:20:55AM -0500, Rob Herring wrote:
> > > > On Wed, Sep 1, 2021 at 9:13 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > > >
> > > > > On Fri, Jul 02, 2021 at 05:16:25PM +0300, Dmitry Osipenko wrote:
> > > > > > 01.07.2021 21:14, Thierry Reding пишет:
> > > > > > > On Tue, Jun 08, 2021 at 06:51:40PM +0200, Thierry Reding wrote:
> > > > > > >> On Fri, May 28, 2021 at 06:54:55PM +0200, Thierry Reding wrote:
> > > > > > >>> On Thu, May 20, 2021 at 05:03:06PM -0500, Rob Herring wrote:
> > > > > > >>>> On Fri, Apr 23, 2021 at 06:32:30PM +0200, Thierry Reding wrote:
> > > > > > >>>>> From: Thierry Reding <treding@nvidia.com>
> > > > > > >>>>>
> > > > > > >>>>> Reserved memory region phandle references can be accompanied by a
> > > > > > >>>>> specifier that provides additional information about how that specific
> > > > > > >>>>> reference should be treated.
> > > > > > >>>>>
> > > > > > >>>>> One use-case is to mark a memory region as needing an identity mapping
> > > > > > >>>>> in the system's IOMMU for the device that references the region. This is
> > > > > > >>>>> needed for example when the bootloader has set up hardware (such as a
> > > > > > >>>>> display controller) to actively access a memory region (e.g. a boot
> > > > > > >>>>> splash screen framebuffer) during boot. The operating system can use the
> > > > > > >>>>> identity mapping flag from the specifier to make sure an IOMMU identity
> > > > > > >>>>> mapping is set up for the framebuffer before IOMMU translations are
> > > > > > >>>>> enabled for the display controller.
> > > > > > >>>>>
> > > > > > >>>>> Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > > > > >>>>> ---
> > > > > > >>>>>  .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
> > > > > > >>>>>  include/dt-bindings/reserved-memory.h         |  8 +++++++
> > > > > > >>>>>  2 files changed, 29 insertions(+)
> > > > > > >>>>>  create mode 100644 include/dt-bindings/reserved-memory.h
> > > > > > >>>>
> > > > > > >>>> Sorry for being slow on this. I have 2 concerns.
> > > > > > >>>>
> > > > > > >>>> First, this creates an ABI issue. A DT with cells in 'memory-region'
> > > > > > >>>> will not be understood by an existing OS. I'm less concerned about this
> > > > > > >>>> if we address that with a stable fix. (Though I'm pretty sure we've
> > > > > > >>>> naively added #?-cells in the past ignoring this issue.)
> > > > > > >>>
> > > > > > >>> A while ago I had proposed adding memory-region*s* as an alternative
> > > > > > >>> name for memory-region to make the naming more consistent with other
> > > > > > >>> types of properties (think clocks, resets, gpios, ...). If we added
> > > > > > >>> that, we could easily differentiate between the "legacy" cases where
> > > > > > >>> no #memory-region-cells was allowed and the new cases where it was.
> > > > > > >>>
> > > > > > >>>> Second, it could be the bootloader setting up the reserved region. If a
> > > > > > >>>> node already has 'memory-region', then adding more regions is more
> > > > > > >>>> complicated compared to adding new properties. And defining what each
> > > > > > >>>> memory-region entry is or how many in schemas is impossible.
> > > > > > >>>
> > > > > > >>> It's true that updating the property gets a bit complicated, but it's
> > > > > > >>> not exactly rocket science. We really just need to splice the array. I
> > > > > > > >>> have a working implementation for this in U-Boot.
> > > > > > >>>
> > > > > > >>> For what it's worth, we could run into the same issue with any new
> > > > > > >>> property that we add. Even if we renamed this to iommu-memory-region,
> > > > > > >>> it's still possible that a bootloader may have to update this property
> > > > > > >>> if it already exists (it could be hard-coded in DT, or it could have
> > > > > > >>> been added by some earlier bootloader or firmware).
> > > > > > >>>
> > > > > > >>>> Both could be addressed with a new property. Perhaps something like
> > > > > > >>>> 'iommu-memory-region = <&phandle>;'. I think the 'iommu' prefix is
> > > > > > >>>> appropriate given this is entirely because of the IOMMU being in the
> > > > > > >>>> mix. I might feel differently if we had other uses for cells, but I
> > > > > > >>>> don't really see it in this case.
> > > > > > >>>
> > > > > > >>> I'm afraid that down the road we'll end up with other cases and then we
> > > > > > >>> might proliferate a number of *-memory-region properties with varying
> > > > > > >>> prefixes.
> > > > > > >>>
> > > > > > >>> I am aware of one other case where we might need something like this: on
> > > > > > >>> some Tegra SoCs we have audio processors that will access memory buffers
> > > > > > >>> using a DMA engine. These processors are booted from early firmware
> > > > > > >>> using firmware from system memory. In order to avoid trashing the
> > > > > > >>> firmware, we need to reserve memory. We can do this using reserved
> > > > > > >>> memory nodes. However, the audio DMA engine also uses the SMMU, so we
> > > > > > >>> need to make sure that the firmware memory is marked as reserved within
> > > > > > >>> the SMMU. This is similar to the identity mapping case, but not exactly
> > > > > > >>> the same. Instead of creating a 1:1 mapping, we just want that IOVA
> > > > > > >>> region to be reserved (i.e. IOMMU_RESV_RESERVED instead of
> > > > > > >>> IOMMU_RESV_DIRECT{,_RELAXABLE}).
> > > > > > >>>
> > > > > > >>> That would also fall into the IOMMU domain, but we can't reuse the
> > > > > > >>> iommu-memory-region property for that because then we don't have enough
> > > > > > >>> information to decide which type of reservation we need.
> > > > > > >>>
> > > > > > >>> We could obviously make iommu-memory-region take a specifier, but we
> > > > > > >>> could just as well use memory-regions in that case since we have
> > > > > > >>> something more generic anyway.
> > > > > > >>>
> > > > > > >>> With the #memory-region-cells proposal, we can easily extend the cell in
> > > > > > >>> the specifier with an additional MEMORY_REGION_IOMMU_RESERVE flag to
> > > > > > > >>> take that other use case into account. If we then also change to the new
> > > > > > >>> memory-regions property name, we avoid the ABI issue (and we gain a bit
> > > > > > >>> of consistency while at it).
> > > > > > >>
> > > > > > >> Ping? Rob, do you want me to add this second use-case to the patch
> > > > > > >> series to make it more obvious that this isn't just a one-off thing? Or
> > > > > > >> how do we proceed?
> > > > > > >
> > > > > > > Rob, given that additional use-case, do you want me to run with this
> > > > > > > proposal and send out an updated series?
> > > > > >
> > > > > >
> > > > > > What about a variant with "descriptor" properties that describe
> > > > > > each region:
> > > > > >
> > > > > > fb_desc: display-framebuffer-memory-descriptor {
> > > > > >       needs-identity-mapping;
> > > > > > };
> > > > > >
> > > > > > display@52400000 {
> > > > > >       memory-region = <&fb ...>;
> > > > > >       memory-region-descriptor = <&fb_desc ...>;
> > > > > > };
> > > > > >
> > > > > > It could be a more flexible/extendible variant.
> > > > >
> > > > > This problem recently came up on #dri-devel again. Adding Alyssa and
> > > > > Sven who are facing a similar challenge on their work on Apple M1 (if I
> > > > > understood correctly). Also adding dri-devel for visibility since this
> > > > > is a very common problem for display in particular.
> > > > >
> > > > > On M1 the situation is slightly more complicated: the firmware will
> > > > > allocate a couple of buffers (including the framebuffer) in high memory
> > > > > (> 4 GiB) and use the IOMMU to map that into an IOVA region below 4 GiB
> > > > > so that the display hardware can access it. This makes it impossible to
> > > > > bypass the IOMMU like we do on other chips (in particular to work around
> > > > > the fault-by-default policy of the ARM SMMU driver). It also means that
> > > > > in addition to the simple reserved regions I mentioned we need for audio
> > > > > use-cases and identity mapping use-cases we need for display on Tegra,
> > > > > we now also need to be able to convey physical to IOVA mappings.
> > > > >
> > > > > Fitting the latter into the original proposal sounds difficult. A quick
> > > > > fix would've been to generate a mapping table in memory and pass that to
> > > > > the kernel using a reserved-memory node (similar to what's done for
> > > > > example on Tegra for the EMC frequency table on Tegra210) and mark it as
> > > > > such using a special flag. But that then involves two layers of parsing,
> > > > > which seems a bit suboptimal. Another way to shoehorn that into the
> > > > > original proposal would've been to add flags for physical and virtual
> > > > > address regions and use pairs to pass them using special flags. Again,
> > > > > this is a bit wonky because it needs these to be carefully parsed and
> > > > > matched up.
> > > > >
> > > > > Another downside is that we now have a situation where some of these
> > > > > regions are no longer "reserved-memory regions" in the traditional
> > > > > sense. This would require an additional flag in the reserved-memory
> > > > > region nodes to prevent the IOVA regions from being reserved. By the
> > > > > way, this is something that would also be needed for the audio use-case
> > > > > I mentioned before, because the physical memory at that address can
> > > > > still be used by an operating system.
> > > > >
> > > > > A more general solution would be to draw a bit from Dmitry's proposal
> > > > > and introduce a new top-level "iov-reserved-memory" node. This could be
> > > > > modelled on the existing reserved-memory node, except that the physical
> > > > > memory pages for regions represented by child nodes would not be marked
> > > > > as reserved. Only the IOVA range described by the region would be
> > > > > reserved subsequently by the IOMMU framework and/or IOMMU driver.
> > > > >
> > > > > The simplest case where we just want to reserve some IOVA region could
> > > > > then be done like this:
> > > > >
> > > > >         iov-reserved-memory {
> > > > >                 /*
> > > > >                  * Probably safest to default to <2>, <2> here given
> > > > >                  * that most IOMMUs support either > 32 bits of IAS
> > > > >                  * or OAS.
> > > > >                  */
> > > > >                 #address-cells = <2>;
> > > > >                 #size-cells = <2>;
> > > > >
> > > > >                 firmware: firmware@80000000 {
> > > > >                         reg = <0 0x80000000 0 0x01000000>;
> > > > >                 };
> > > > >         };
> > > > >
> > > > >         audio@30000000 {
> > > > >                 ...
> > > > >                 iov-memory-regions = <&firmware>;
> > > > >                 ...
> > > > >         };
> > > > >
> > > > > Mappings could be represented by an IOV reserved region taking a
> > > > > reference to the reserved-region that they map:
> > > > >
> > > > >         reserved-memory {
> > > > >                 #address-cells = <2>;
> > > > >                 #size-cells = <2>;
> > > > >
> > > > >                 /* 16 MiB of framebuffer at top-of-memory */
> > > > >                 framebuffer: framebuffer@1,ff000000 {
> > > > >                         reg = <0x1 0xff000000 0 0x01000000>;
> > > > >                         no-map;
> > > > >                 };
> > > > >         };
> > > > >
> > > > >         iov-reserved-memory {
> > > > >                 /* IOMMU supports only 32-bit output address space */
> > > > >                 #address-cells = <1>;
> > > > >                 #size-cells = <1>;
> > > > >
> > > > >                 /* 16 MiB of framebuffer mapped to top of IOVA */
> > > > >                 fb: fb@ff000000 {
> > > > > >                         reg = <0xff000000 0x01000000>;
> > > > >                         memory-region = <&framebuffer>;
> > > > >                 };
> > > > >         };
> > > > >
> > > > >         display@40000000 {
> > > > >                 ...
> > > > >                 /* optional? */
> > > > >                 memory-region = <&framebuffer>;
> > > > >                 iov-memory-regions = <&fb>;
> > > > >                 ...
> > > > >         };
> > > > >
> > > > > It's interesting how identity mapped regions now become a trivial
> > > > > special case of mappings. All that is needed is to make the reg property
> > > > > of the IOV reserved region correspond to the reg property of the normal
> > > > > reserved region. Alternatively, as a small optimization for lazy people
> > > > > like me, we could just allow these cases to omit the reg property and
> > > > > instead inherit it from the referenced reserved region.
> > > > >
> > > > > As the second example shows it might be convenient if memory-region
> > > > > could be derived from iov-memory-regions. This could be useful for cases
> > > > > where the driver wants to do something with the physical pages of the
> > > > > reserved region (such as mapping them and copying out the framebuffer
> > > > > data to another buffer so that the reserved memory can be recycled). If
> > > > > we have the IOV reserved region, we could provide an API to extract the
> > > > > physical reserved region (if it exists). That way we could avoid
> > > > > referencing it twice in DT. Then again, there's something elegant about
> > > > > > the explicit second reference too. It indicates the intent that we may
> > > > > want to use the region for something other than just the IOV mapping.
> > > > >
> > > > > Anyway, this has been long enough. Let me know what you think. Alyssa,
> > > > > Sven, it'd be interesting to hear if you think this could work as a
> > > > > solution to the problem on M1.
> > > > >
> > > > > Rob, I think you might like this alternative because it basically gets
> > > > > rid of all the points in the original proposal that you were concerned
> > > > > about. Let me know what you think.
> > > >
> > > > Couldn't we keep this all in /reserved-memory? Just add an iova
> > > > version of reg. Perhaps abuse 'assigned-address' for this purpose. The
> > > > issue I see would be handling reserved iova areas without a physical
> > > > area. That can be handled with just an iova and no reg. We already have
> > > > a no reg case.
> > >
> > > I had thought about that initially. One thing I'm worried about is that
> > > every child node in /reserved-memory will effectively cause the memory
> > > that it describes to be reserved. But we don't want that for regions
> > > that are "virtual only" (i.e. IOMMU reservations).
> >
> > By virtual only, you mean no physical mapping, just a region of
> > virtual space, right? For that we'd have no 'reg' and therefore no
> > (physical) reservation by the OS. It's similar to non-static regions.
> > You need a specific handler for them. We'd probably want a compatible
> > as well for these virtual reservations.
>
> Yeah, these would be purely used for reserving regions in the IOVA so
> that they won't be used by the IOVA allocator. Typically these would be
> used for cases where those addresses have some special meaning.
>
> Do we want something like:
>
>         compatible = "iommu-reserved";
>
> for these? Or would that need to be:
>
>         compatible = "linux,iommu-reserved";
>
> ? There seems to be a mix of vendor-prefix vs. non-vendor-prefix
> compatible strings in the reserved-memory DT bindings directory.

I would not use 'linux,' here.

>
> On the other hand, do we actually need the compatible string? Because we
> don't really want to associate much extra information with this like we
> do for example with "shared-dma-pool". The logic to handle this would
> all be within the IOMMU framework. All we really need is for the
> standard reservation code to skip nodes that don't have a reg property
> so we don't reserve memory for "virtual-only" allocations.

It doesn't hurt to have one and I can imagine we might want to iterate
over all the nodes. It's slightly easier and more common to iterate
over compatible nodes rather than nodes with some property.
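
As a rough sketch of what such a virtual-only node could look like (the
"iommu-reserved" compatible and the iommu-addresses property are only the
names floated in this thread, not an existing binding; the node name and
addresses are made up, and I'm assuming iommu-addresses uses the parent's
#address-cells/#size-cells and omits the IOMMU phandle):

        reserved-memory {
                #address-cells = <2>;
                #size-cells = <2>;
                ranges;

                /* no 'reg': nothing gets reserved in physical memory */
                iova-hole {
                        compatible = "iommu-reserved";
                        /* hypothetical: IOVA 0x80000000, 16 MiB */
                        iommu-addresses = <0x0 0x80000000 0x0 0x01000000>;
                };
        };

A client could then match on the compatible instead of having to check
every child node for the property.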

> > Are these being global in DT going to be a problem? Presumably we have
> > a virtual space per IOMMU. We'd know which IOMMU based on a device's
> > 'iommus' and 'memory-region' properties, but within /reserved-memory
> > we wouldn't be able to distinguish overlapping addresses from separate
> > address spaces. Or we could have 2 different IOVAs for 1 physical
> > space. That could be solved with something like this:
> >
> > iommu-addresses = <&iommu1 <address cells> <size cells>>;
>
> The only case that would be problematic would be if we have overlapping
> physical regions, because that will probably trip up the standard code.
>
> But this could also be worked around by looking at iommu-addresses. For
> example, if we had something like this:
>
>         reserved-memory {
>                 fb_dc0: fb@80000000 {
>                         reg = <0x80000000 0x01000000>;
>                         iommu-addresses = <0xa0000000 0x01000000>;
>                 };
>
>                 fb_dc1: fb@80000000 {

You can't have 2 nodes with the same name (actually, you can, they
just get merged together). Different names with the same unit-address
is a dtc warning. I'd really like to make that a full blown
overlapping region check.

>                         reg = <0x80000000 0x01000000>;
>                         iommu-addresses = <0xb0000000 0x01000000>;
>                 };
>         };
>
> We could make the code identify that this is for the same physical
> reservation (maybe make it so that reg needs to match exactly for this
> to be recognized) but with different virtual allocations.
>
> On a side-note: do we really need to repeat the size? I'd think if we
> want mappings then we'd likely want them for the whole reservation.

Humm, I suppose not, but dropping it paints us into a corner if we
come up with wanting a different size later. You could have a carveout
for double/triple buffering your framebuffer, but the bootloader
framebuffer is only single buffered. So would you want actual size?
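
To illustrate with made-up numbers (iommu-addresses is still only the
property proposed above, shown here without an IOMMU phandle): a 48 MiB
carveout sized for triple buffering, of which the bootloader only scans
out the first 16 MiB, might be described like this:

        reserved-memory {
                #address-cells = <1>;
                #size-cells = <1>;
                ranges;

                fb: framebuffer@80000000 {
                        /* 48 MiB carveout: 3 x 16 MiB buffers */
                        reg = <0x80000000 0x03000000>;
                        /* hypothetical: only the 16 MiB boot framebuffer */
                        iommu-addresses = <0x80000000 0x01000000>;
                };
        };

Without a size in iommu-addresses there would be no way to express the
difference between the two.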

> I'd like to keep references to IOMMUs out of this because they would be
> duplicated. We will only use these nodes if they are referenced by a
> device node that also has an iommus property. Also, the IOMMU reference
> itself isn't enough. We'd also need to support the complete specifier
> because you can have things like SIDs in there to specify the exact
> address space that a device uses.
>
> Also, for some of these they may be reused independently of the IOMMU
> address space. For example the Tegra framebuffer identity mapping can
> be used by either of the 2-4 display controllers, each with (at least
> potentially) their own address space. But we don't want to have to
> describe the identity mapping separately for each display controller.

Okay, but I'd rather have to duplicate things in your case than not be
able to express some other case.

> Another thing to consider is that these nodes will often be added by
> firmware (e.g. firmware will allocate the framebuffer and set up the
> corresponding reserved memory region in DT). Wiring up references like
> this would get very complicated very quickly.

Yes.

The option below of using the 'iommus' property can be optional and
doesn't have to be defined/supported now. I'm just trying to think ahead
so that we're not stuck with something that can't be extended.

> > Or the other way to do this is reuse 'iommus' property to define the
> > mapping of each address entry to iommu.
> >
> > > Obviously we can fix that in Linux, but what about other operating
> > > systems? Currently "reg" is a required property for statically allocated
> > > regions (which all of these would be). Do you have an idea of how widely
> > > that's used? What about other OSes, or bootloaders, what if they
> > > encounter these nodes that don't have a "reg" property?
> >
> > Without 'reg', there must be a compatible that the client understands
> > or the node should be ignored.
> >
> > My suspicion is that /reserved-memory is abused for all sorts of
> > things downstream, but that's not really relevant here.
>
> Yeah, my only concern was that we might break users of this that are not
> sophisticated enough to handle the nuances that we'd introduce here. If
> we can assume that nodes without a reg property will be ignored, then I
> think that's good enough.

I'm pretty sure we should be okay, but check the code.

Rob

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-09-07 15:33                       ` Rob Herring
@ 2021-09-07 17:44                         ` Thierry Reding
  2021-09-15 15:19                           ` Thierry Reding
  0 siblings, 1 reply; 41+ messages in thread
From: Thierry Reding @ 2021-09-07 17:44 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alyssa Rosenzweig, Sven Peter, Dmitry Osipenko, Joerg Roedel,
	Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, Linux IOMMU, linux-tegra, dri-devel

On Tue, Sep 07, 2021 at 10:33:24AM -0500, Rob Herring wrote:
> On Fri, Sep 3, 2021 at 10:36 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> >
> > On Fri, Sep 03, 2021 at 09:36:33AM -0500, Rob Herring wrote:
> > > On Fri, Sep 3, 2021 at 8:52 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > >
> > > > On Fri, Sep 03, 2021 at 08:20:55AM -0500, Rob Herring wrote:
> > > > > On Wed, Sep 1, 2021 at 9:13 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Jul 02, 2021 at 05:16:25PM +0300, Dmitry Osipenko wrote:
> > > > > > > 01.07.2021 21:14, Thierry Reding wrote:
> > > > > > > > On Tue, Jun 08, 2021 at 06:51:40PM +0200, Thierry Reding wrote:
> > > > > > > >> On Fri, May 28, 2021 at 06:54:55PM +0200, Thierry Reding wrote:
> > > > > > > >>> On Thu, May 20, 2021 at 05:03:06PM -0500, Rob Herring wrote:
> > > > > > > >>>> On Fri, Apr 23, 2021 at 06:32:30PM +0200, Thierry Reding wrote:
> > > > > > > >>>>> From: Thierry Reding <treding@nvidia.com>
> > > > > > > >>>>>
> > > > > > > >>>>> Reserved memory region phandle references can be accompanied by a
> > > > > > > >>>>> specifier that provides additional information about how that specific
> > > > > > > >>>>> reference should be treated.
> > > > > > > >>>>>
> > > > > > > >>>>> One use-case is to mark a memory region as needing an identity mapping
> > > > > > > >>>>> in the system's IOMMU for the device that references the region. This is
> > > > > > > >>>>> needed for example when the bootloader has set up hardware (such as a
> > > > > > > >>>>> display controller) to actively access a memory region (e.g. a boot
> > > > > > > >>>>> splash screen framebuffer) during boot. The operating system can use the
> > > > > > > >>>>> identity mapping flag from the specifier to make sure an IOMMU identity
> > > > > > > >>>>> mapping is set up for the framebuffer before IOMMU translations are
> > > > > > > >>>>> enabled for the display controller.
> > > > > > > >>>>>
> > > > > > > >>>>> Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > > > > > >>>>> ---
> > > > > > > >>>>>  .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
> > > > > > > >>>>>  include/dt-bindings/reserved-memory.h         |  8 +++++++
> > > > > > > >>>>>  2 files changed, 29 insertions(+)
> > > > > > > >>>>>  create mode 100644 include/dt-bindings/reserved-memory.h
> > > > > > > >>>>
> > > > > > > >>>> Sorry for being slow on this. I have 2 concerns.
> > > > > > > >>>>
> > > > > > > >>>> First, this creates an ABI issue. A DT with cells in 'memory-region'
> > > > > > > >>>> will not be understood by an existing OS. I'm less concerned about this
> > > > > > > >>>> if we address that with a stable fix. (Though I'm pretty sure we've
> > > > > > > >>>> naively added #?-cells in the past ignoring this issue.)
> > > > > > > >>>
> > > > > > > >>> A while ago I had proposed adding memory-region*s* as an alternative
> > > > > > > >>> name for memory-region to make the naming more consistent with other
> > > > > > > >>> types of properties (think clocks, resets, gpios, ...). If we added
> > > > > > > >>> that, we could easily differentiate between the "legacy" cases where
> > > > > > > >>> no #memory-region-cells was allowed and the new cases where it was.
> > > > > > > >>>
> > > > > > > >>>> Second, it could be the bootloader setting up the reserved region. If a
> > > > > > > >>>> node already has 'memory-region', then adding more regions is more
> > > > > > > >>>> complicated compared to adding new properties. And defining what each
> > > > > > > >>>> memory-region entry is or how many in schemas is impossible.
> > > > > > > >>>
> > > > > > > >>> It's true that updating the property gets a bit complicated, but it's
> > > > > > > >>> not exactly rocket science. We really just need to splice the array. I
> > > > > > > >>> have a working implementation for this in U-Boot.
> > > > > > > >>>
> > > > > > > >>> For what it's worth, we could run into the same issue with any new
> > > > > > > >>> property that we add. Even if we renamed this to iommu-memory-region,
> > > > > > > >>> it's still possible that a bootloader may have to update this property
> > > > > > > >>> if it already exists (it could be hard-coded in DT, or it could have
> > > > > > > >>> been added by some earlier bootloader or firmware).
> > > > > > > >>>
> > > > > > > >>>> Both could be addressed with a new property. Perhaps something like
> > > > > > > >>>> 'iommu-memory-region = <&phandle>;'. I think the 'iommu' prefix is
> > > > > > > >>>> appropriate given this is entirely because of the IOMMU being in the
> > > > > > > >>>> mix. I might feel differently if we had other uses for cells, but I
> > > > > > > >>>> don't really see it in this case.
> > > > > > > >>>
> > > > > > > >>> I'm afraid that down the road we'll end up with other cases and then we
> > > > > > > >>> might proliferate a number of *-memory-region properties with varying
> > > > > > > >>> prefixes.
> > > > > > > >>>
> > > > > > > >>> I am aware of one other case where we might need something like this: on
> > > > > > > >>> some Tegra SoCs we have audio processors that will access memory buffers
> > > > > > > >>> using a DMA engine. These processors are booted from early firmware
> > > > > > > >>> using firmware from system memory. In order to avoid trashing the
> > > > > > > >>> firmware, we need to reserve memory. We can do this using reserved
> > > > > > > >>> memory nodes. However, the audio DMA engine also uses the SMMU, so we
> > > > > > > >>> need to make sure that the firmware memory is marked as reserved within
> > > > > > > >>> the SMMU. This is similar to the identity mapping case, but not exactly
> > > > > > > >>> the same. Instead of creating a 1:1 mapping, we just want that IOVA
> > > > > > > >>> region to be reserved (i.e. IOMMU_RESV_RESERVED instead of
> > > > > > > >>> IOMMU_RESV_DIRECT{,_RELAXABLE}).
> > > > > > > >>>
> > > > > > > >>> That would also fall into the IOMMU domain, but we can't reuse the
> > > > > > > >>> iommu-memory-region property for that because then we don't have enough
> > > > > > > >>> information to decide which type of reservation we need.
> > > > > > > >>>
> > > > > > > >>> We could obviously make iommu-memory-region take a specifier, but we
> > > > > > > >>> could just as well use memory-regions in that case since we have
> > > > > > > >>> something more generic anyway.
> > > > > > > >>>
> > > > > > > >>> With the #memory-region-cells proposal, we can easily extend the cell in
> > > > > > > >>> the specifier with an additional MEMORY_REGION_IOMMU_RESERVE flag to
> > > > > > > >>> take that other use case into account. If we then also change to the new
> > > > > > > >>> memory-regions property name, we avoid the ABI issue (and we gain a bit
> > > > > > > >>> of consistency while at it).
> > > > > > > >>
> > > > > > > >> Ping? Rob, do you want me to add this second use-case to the patch
> > > > > > > >> series to make it more obvious that this isn't just a one-off thing? Or
> > > > > > > >> how do we proceed?
> > > > > > > >
> > > > > > > > Rob, given that additional use-case, do you want me to run with this
> > > > > > > > proposal and send out an updated series?
> > > > > > >
> > > > > > >
> > > > > > > What about a variant with "descriptor" properties that describe
> > > > > > > each region:
> > > > > > >
> > > > > > > fb_desc: display-framebuffer-memory-descriptor {
> > > > > > >       needs-identity-mapping;
> > > > > > > };
> > > > > > >
> > > > > > > display@52400000 {
> > > > > > >       memory-region = <&fb ...>;
> > > > > > >       memory-region-descriptor = <&fb_desc ...>;
> > > > > > > };
> > > > > > >
> > > > > > > It could be a more flexible/extendible variant.
> > > > > >
> > > > > > This problem recently came up on #dri-devel again. Adding Alyssa and
> > > > > > Sven who are facing a similar challenge on their work on Apple M1 (if I
> > > > > > understood correctly). Also adding dri-devel for visibility since this
> > > > > > is a very common problem for display in particular.
> > > > > >
> > > > > > On M1 the situation is slightly more complicated: the firmware will
> > > > > > allocate a couple of buffers (including the framebuffer) in high memory
> > > > > > (> 4 GiB) and use the IOMMU to map that into an IOVA region below 4 GiB
> > > > > > so that the display hardware can access it. This makes it impossible to
> > > > > > bypass the IOMMU like we do on other chips (in particular to work around
> > > > > > the fault-by-default policy of the ARM SMMU driver). It also means that
> > > > > > in addition to the simple reserved regions I mentioned we need for audio
> > > > > > use-cases and identity mapping use-cases we need for display on Tegra,
> > > > > > we now also need to be able to convey physical to IOVA mappings.
> > > > > >
> > > > > > Fitting the latter into the original proposal sounds difficult. A quick
> > > > > > fix would've been to generate a mapping table in memory and pass that to
> > > > > > the kernel using a reserved-memory node (similar to what's done for
> > > > > > example on Tegra for the EMC frequency table on Tegra210) and mark it as
> > > > > > such using a special flag. But that then involves two layers of parsing,
> > > > > > which seems a bit suboptimal. Another way to shoehorn that into the
> > > > > > original proposal would've been to add flags for physical and virtual
> > > > > > address regions and use pairs to pass them using special flags. Again,
> > > > > > this is a bit wonky because it needs these to be carefully parsed and
> > > > > > matched up.
> > > > > >
> > > > > > Another downside is that we now have a situation where some of these
> > > > > > regions are no longer "reserved-memory regions" in the traditional
> > > > > > sense. This would require an additional flag in the reserved-memory
> > > > > > region nodes to prevent the IOVA regions from being reserved. By the
> > > > > > way, this is something that would also be needed for the audio use-case
> > > > > > I mentioned before, because the physical memory at that address can
> > > > > > still be used by an operating system.
> > > > > >
> > > > > > A more general solution would be to draw a bit from Dmitry's proposal
> > > > > > and introduce a new top-level "iov-reserved-memory" node. This could be
> > > > > > modelled on the existing reserved-memory node, except that the physical
> > > > > > memory pages for regions represented by child nodes would not be marked
> > > > > > as reserved. Only the IOVA range described by the region would be
> > > > > > reserved subsequently by the IOMMU framework and/or IOMMU driver.
> > > > > >
> > > > > > The simplest case where we just want to reserve some IOVA region could
> > > > > > then be done like this:
> > > > > >
> > > > > >         iov-reserved-memory {
> > > > > >                 /*
> > > > > >                  * Probably safest to default to <2>, <2> here given
> > > > > >                  * that most IOMMUs support either > 32 bits of IAS
> > > > > >                  * or OAS.
> > > > > >                  */
> > > > > >                 #address-cells = <2>;
> > > > > >                 #size-cells = <2>;
> > > > > >
> > > > > >                 firmware: firmware@80000000 {
> > > > > >                         reg = <0 0x80000000 0 0x01000000>;
> > > > > >                 };
> > > > > >         };
> > > > > >
> > > > > >         audio@30000000 {
> > > > > >                 ...
> > > > > >                 iov-memory-regions = <&firmware>;
> > > > > >                 ...
> > > > > >         };
> > > > > >
> > > > > > Mappings could be represented by an IOV reserved region taking a
> > > > > > reference to the reserved-region that they map:
> > > > > >
> > > > > >         reserved-memory {
> > > > > >                 #address-cells = <2>;
> > > > > >                 #size-cells = <2>;
> > > > > >
> > > > > >                 /* 16 MiB of framebuffer at top-of-memory */
> > > > > >                 framebuffer: framebuffer@1,ff000000 {
> > > > > >                         reg = <0x1 0xff000000 0 0x01000000>;
> > > > > >                         no-map;
> > > > > >                 };
> > > > > >         };
> > > > > >
> > > > > >         iov-reserved-memory {
> > > > > >                 /* IOMMU supports only 32-bit output address space */
> > > > > >                 #address-cells = <1>;
> > > > > >                 #size-cells = <1>;
> > > > > >
> > > > > >                 /* 16 MiB of framebuffer mapped to top of IOVA */
> > > > > >                 fb: fb@ff000000 {
> > > > > > >                         reg = <0xff000000 0x01000000>;
> > > > > >                         memory-region = <&framebuffer>;
> > > > > >                 };
> > > > > >         };
> > > > > >
> > > > > >         display@40000000 {
> > > > > >                 ...
> > > > > >                 /* optional? */
> > > > > >                 memory-region = <&framebuffer>;
> > > > > >                 iov-memory-regions = <&fb>;
> > > > > >                 ...
> > > > > >         };
> > > > > >
> > > > > > It's interesting how identity mapped regions now become a trivial
> > > > > > special case of mappings. All that is needed is to make the reg property
> > > > > > of the IOV reserved region correspond to the reg property of the normal
> > > > > > reserved region. Alternatively, as a small optimization for lazy people
> > > > > > like me, we could just allow these cases to omit the reg property and
> > > > > > instead inherit it from the referenced reserved region.
> > > > > >
> > > > > > As the second example shows it might be convenient if memory-region
> > > > > > could be derived from iov-memory-regions. This could be useful for cases
> > > > > > where the driver wants to do something with the physical pages of the
> > > > > > reserved region (such as mapping them and copying out the framebuffer
> > > > > > data to another buffer so that the reserved memory can be recycled). If
> > > > > > we have the IOV reserved region, we could provide an API to extract the
> > > > > > physical reserved region (if it exists). That way we could avoid
> > > > > > referencing it twice in DT. Then again, there's something elegant about
> > > > > > > the explicit second reference too. It indicates the intent that we may
> > > > > > want to use the region for something other than just the IOV mapping.
> > > > > >
> > > > > > Anyway, this has been long enough. Let me know what you think. Alyssa,
> > > > > > Sven, it'd be interesting to hear if you think this could work as a
> > > > > > solution to the problem on M1.
> > > > > >
> > > > > > Rob, I think you might like this alternative because it basically gets
> > > > > > rid of all the points in the original proposal that you were concerned
> > > > > > about. Let me know what you think.
> > > > >
> > > > > Couldn't we keep this all in /reserved-memory? Just add an iova
> > > > > version of reg. Perhaps abuse 'assigned-address' for this purpose. The
> > > > > issue I see would be handling reserved iova areas without a physical
> > > > > area. That can be handled with just an iova and no reg. We already have
> > > > > a no reg case.
> > > >
> > > > I had thought about that initially. One thing I'm worried about is that
> > > > every child node in /reserved-memory will effectively cause the memory
> > > > > that it describes to be reserved. But we don't want that for regions
> > > > that are "virtual only" (i.e. IOMMU reservations).
> > >
> > > By virtual only, you mean no physical mapping, just a region of
> > > virtual space, right? For that we'd have no 'reg' and therefore no
> > > (physical) reservation by the OS. It's similar to non-static regions.
> > > You need a specific handler for them. We'd probably want a compatible
> > > as well for these virtual reservations.
> >
> > Yeah, these would be purely used for reserving regions in the IOVA so
> > that they won't be used by the IOVA allocator. Typically these would be
> > used for cases where those addresses have some special meaning.
> >
> > Do we want something like:
> >
> >         compatible = "iommu-reserved";
> >
> > for these? Or would that need to be:
> >
> >         compatible = "linux,iommu-reserved";
> >
> > ? There seems to be a mix of vendor-prefix vs. non-vendor-prefix
> > compatible strings in the reserved-memory DT bindings directory.
> 
> I would not use 'linux,' here.
> 
> >
> > On the other hand, do we actually need the compatible string? Because we
> > don't really want to associate much extra information with this like we
> > do for example with "shared-dma-pool". The logic to handle this would
> > all be within the IOMMU framework. All we really need is for the
> > standard reservation code to skip nodes that don't have a reg property
> > so we don't reserve memory for "virtual-only" allocations.
> 
> It doesn't hurt to have one and I can imagine we might want to iterate
> over all the nodes. It's slightly easier and more common to iterate
> over compatible nodes rather than nodes with some property.
> 
> > > Are these being global in DT going to be a problem? Presumably we have
> > > a virtual space per IOMMU. We'd know which IOMMU based on a device's
> > > 'iommus' and 'memory-region' properties, but within /reserved-memory
> > > we wouldn't be able to distinguish overlapping addresses from separate
> > > address spaces. Or we could have 2 different IOVAs for 1 physical
> > > space. That could be solved with something like this:
> > >
> > > iommu-addresses = <&iommu1 <address cells> <size cells>>;
> >
> > The only case that would be problematic would be if we have overlapping
> > physical regions, because that will probably trip up the standard code.
> >
> > But this could also be worked around by looking at iommu-addresses. For
> > example, if we had something like this:
> >
> >         reserved-memory {
> >                 fb_dc0: fb@80000000 {
> >                         reg = <0x80000000 0x01000000>;
> >                         iommu-addresses = <0xa0000000 0x01000000>;
> >                 };
> >
> >                 fb_dc1: fb@80000000 {
> 
> You can't have 2 nodes with the same name (actually, you can, they
> just get merged together). Different names with the same unit-address
> is a dtc warning. I'd really like to make that a full blown
> overlapping region check.

Right... so this would be a lot easier to deal with using that earlier
proposal where the IOMMU regions were a separate thing and referencing
the reserved-memory nodes. In those cases we could just have the
physical reservation for the framebuffer once (so we don't get any
duplicates or overlaps) and then have each IOVA reservation reference
that to create the mapping.

> 
> >                         reg = <0x80000000 0x01000000>;
> >                         iommu-addresses = <0xb0000000 0x01000000>;
> >                 };
> >         };
> >
> > We could make the code identify that this is for the same physical
> > reservation (maybe make it so that reg needs to match exactly for this
> > to be recognized) but with different virtual allocations.
> >
> > On a side-note: do we really need to repeat the size? I'd think if we
> > want mappings then we'd likely want them for the whole reservation.
> 
> Humm, I suppose not, but dropping it paints us into a corner if we
> come up with wanting a different size later. You could have a carveout
> for double/triple buffering your framebuffer, but the bootloader
> framebuffer is only single buffered. So would you want actual size?

Perhaps this needs to be a bit more verbose then. If we want the ability
to create a mapping for only a partial reservation, I could imagine we
may as well want one that doesn't start at the beginning. So perhaps an
even better solution would be to have a complete mapping, something that
works similar to "ranges" perhaps, like so:

	fb@80000000 {
		reg = <0x80000000 0x01000000>;
		iommu-ranges = <0x80000000 0x01000000 0x80000000>;
	};

That would be for a full identity mapping, but we could also have
something along the lines of this:

	fb@80000000 {
		reg = <0x80000000 0x01000000>;
		iommu-ranges = <0x80100000 0x00100000 0xa0000000>;
	};

So that would only map a 1 MiB chunk at offset 1 MiB (of the physical
reservation) to I/O virtual address 0xa0000000.
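
To make the intended semantics of such a triple concrete, here is a rough
sketch in C of how a consumer might translate a physical address through one
"iommu-ranges" entry. The struct and function names are hypothetical and only
illustrate the <phys size iova> layout used in the examples above; this is not
an existing kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One <phys size iova> triple, laid out as in the "iommu-ranges" examples. */
struct iommu_range {
	uint64_t phys;	/* start of the physical chunk to map */
	uint64_t size;	/* length of the chunk in bytes */
	uint64_t iova;	/* I/O virtual address the chunk is mapped at */
};

/*
 * Translate a physical address to an IOVA through a single range.
 * Returns false if the address is not covered by the range.
 */
static bool iommu_range_translate(const struct iommu_range *r, uint64_t phys,
				  uint64_t *iova)
{
	assert(r != NULL && iova != NULL);

	/* Written as a subtraction so that phys + size cannot overflow. */
	if (phys < r->phys || phys - r->phys >= r->size)
		return false;

	*iova = r->iova + (phys - r->phys);
	return true;
}
```

With the partial mapping from the second example, an access to physical
address 0x80100000 would translate to 0xa0000000, while anything outside the
1 MiB chunk would not be covered by the mapping at all.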

> > I'd like to keep references to IOMMUs out of this because they would be
> > duplicated. We will only use these nodes if they are referenced by a
> > device node that also has an iommus property. Also, the IOMMU reference
> > itself isn't enough. We'd also need to support the complete specifier
> > because you can have things like SIDs in there to specify the exact
> > address space that a device uses.
> >
> > Also, for some of these they may be reused independently of the IOMMU
> > address space. For example the Tegra framebuffer identity mapping can
> > be used by either of the 2-4 display controllers, each with (at least
> > potentially) their own address space. But we don't want to have to
> > describe the identity mapping separately for each display controller.
> 
> Okay, but I'd rather have to duplicate things in your case than not be
> able to express some other case.

The earlier "separate iov-reserved-memory" proposal would be a good
compromise here. It'd allow us to duplicate only the necessary bits
(i.e. the IOVA mappings) but keep the common bits simple. And even
the IOVA mappings could be shared for cases like identity mappings.
See below for more on that.

> > Another thing to consider is that these nodes will often be added by
> > firmware (e.g. firmware will allocate the framebuffer and set up the
> > corresponding reserved memory region in DT). Wiring up references like
> > this would get very complicated very quickly.
> 
> Yes.
> 
> The option of using the 'iommus' property below can be optional and doesn't
> have to be defined/supported now. Just trying to think ahead and not
> be stuck with something that can't be extended.

One other benefit of the separate iov-reserved-memory node would be that
the iommus property could be simplified. If we have a physical
reservation that needs to be accessed by multiple different display
controllers, we'd end up with something fairly complex, such as this:

	fb: fb@80000000 {
		reg = <0x80000000 0x01000000>;
		iommus = <&dc0_iommu 0xa0000000 0x01000000>,
			 <&dc1_iommu 0xb0000000 0x01000000>,
			 <&dc2_iommu 0xc0000000 0x01000000>;
	};

This would get even worse if we want to support partial mappings. Also,
it'd become quite complicated to correlate this with the memory-region
references:

	dc0: dc@40000000 {
		...
		memory-region = <&fb>;
		iommus = <&dc0_iommu>;
		...
	};

So now you have to go match up the phandle (and potentially specifier)
in the iommus property of the dc0 node with an entry in the fb node's
iommus property. That's all fairly complicated stuff.
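
For illustration, the matching step could look roughly like the C sketch
below. It assumes the fb node's entries have already been parsed into an
array, and it compares only the IOMMU phandle; a real implementation would
also have to compare the rest of the specifier (e.g. the SID). All names are
made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One <&iommu iova size> entry from the reserved-memory node (hypothetical). */
struct resv_iommu_entry {
	uint32_t iommu_phandle;	/* phandle of the IOMMU this entry applies to */
	uint64_t iova;		/* I/O virtual address for that IOMMU */
	uint64_t size;		/* size of the mapping in bytes */
};

/*
 * Find the entry that belongs to the IOMMU referenced by the device's own
 * "iommus" property. Returns NULL if the device's IOMMU is not listed.
 */
static const struct resv_iommu_entry *
resv_find_entry(const struct resv_iommu_entry *entries, size_t count,
		uint32_t dev_iommu_phandle)
{
	assert(entries != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (entries[i].iommu_phandle == dev_iommu_phandle)
			return &entries[i];
	}

	return NULL;
}
```

Even in this simplified form, every consumer needs a lookup like this before
it knows which IOVA applies to it.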

With separate iov-reserved-memory, this would be a bit more verbose, but
each individual node would be simpler:

	reserved-memory {
		fb: fb@80000000 {
			reg = <0x80000000 0x01000000>;
		};
	};

	iov-reserved-memory {
		fb0: fb@80000000 {
			/* identity mapping, "reg" optional? */
			reg = <0x80000000 0x01000000>;
			memory-region = <&fb>;
		};

		fb1: fb@90000000 {
			/* but doesn't have to be */
			reg = <0x90000000 0x01000000>;
			memory-region = <&fb>;
		};

		fb2: fb@a0000000 {
			/* can be partial, too */
			ranges = <0x80000000 0x00800000 0xa0000000>;
			memory-region = <&fb>;
		};
	};

	dc0: dc@40000000 {
		iov-memory-regions = <&fb0>;
		/* optional? */
		memory-region = <&fb>;
		iommus = <&dc0_iommu>;
	};

Alternatively, if we want to support partial mappings, we could replace
those reg properties by ranges properties that I showed earlier. We may
even want to support both. Use "reg" for virtual-only reservations and
identity mappings, or "simple partial mappings" (that map a sub-region
starting from the beginning). Identity mappings could still be
simplified by just omitting the "reg" property. For more complicated
mappings, such as the ones on M1, the "ranges" property could be used.
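
As a sketch of how these rules could be evaluated, the C fragment below
resolves one iov-reserved-memory child node into either a mapping or a
virtual-only reservation. It assumes the properties have already been parsed
into a plain struct; the struct layout, the enum and the precedence of
"ranges" over "reg" are assumptions made for the example, not something the
proposal has settled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum iov_kind {
	IOV_INVALID,		/* node cannot be interpreted */
	IOV_RESERVE_ONLY,	/* reserve the IOVA range, map nothing */
	IOV_MAPPING,		/* map physical memory at the given IOVA */
};

/* Hypothetical, already-parsed view of one iov-reserved-memory child node. */
struct iov_node {
	bool has_region;	/* "memory-region" reference present */
	uint64_t region_base, region_size;
	bool has_reg;		/* "reg" = <iova size> */
	uint64_t reg_iova, reg_size;
	bool has_ranges;	/* "ranges" = <phys size iova> */
	uint64_t ranges_phys, ranges_size, ranges_iova;
};

struct iov_mapping {
	uint64_t phys, iova, size;
};

static enum iov_kind iov_resolve(const struct iov_node *n,
				 struct iov_mapping *out)
{
	assert(n != NULL && out != NULL);

	if (n->has_ranges) {
		/* Arbitrary sub-range: must lie inside the referenced region. */
		if (!n->has_region || n->ranges_phys < n->region_base ||
		    n->ranges_size > n->region_size ||
		    n->ranges_phys - n->region_base >
			    n->region_size - n->ranges_size)
			return IOV_INVALID;
		*out = (struct iov_mapping){ n->ranges_phys, n->ranges_iova,
					     n->ranges_size };
		return IOV_MAPPING;
	}

	if (n->has_reg && !n->has_region) {
		/* Virtual-only reservation: no physical memory behind it. */
		*out = (struct iov_mapping){ 0, n->reg_iova, n->reg_size };
		return IOV_RESERVE_ONLY;
	}

	if (n->has_reg) {
		/* Simple, possibly partial, mapping from the region start. */
		if (n->reg_size > n->region_size)
			return IOV_INVALID;
		*out = (struct iov_mapping){ n->region_base, n->reg_iova,
					     n->reg_size };
		return IOV_MAPPING;
	}

	if (n->has_region) {
		/* No "reg": identity-map the whole referenced region. */
		*out = (struct iov_mapping){ n->region_base, n->region_base,
					     n->region_size };
		return IOV_MAPPING;
	}

	return IOV_INVALID;
}
```

The point of the sketch is mostly that the three forms do not conflict: each
combination of properties resolves to exactly one interpretation.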

Note how this looks a bit boilerplate-y, but it's actually really quite
simple to understand, even for humans, I think.

Also, the phandles in this are comparatively easy to wire up because
they can all be generated in a hierarchical way: generate physical
reservation and store phandle, then generate I/O virtual reservation
to reference that phandle and store the new phandle as well. Finally,
wire this up to the display controller (using either the IOV phandle or
both).
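
The ordering can be shown with a deliberately simplified toy model in C. It
is not libfdt and not what a bootloader would actually use; the types and the
helper are invented purely to show that each step only needs the phandle
returned by the previous one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_NODES 8

/* Toy stand-in for a DT node: a phandle and one optional reference. */
struct toy_node {
	uint32_t phandle;	/* assigned at creation, never 0 */
	uint32_t ref;		/* phandle this node refers to, 0 if none */
};

struct toy_tree {
	struct toy_node nodes[TOY_MAX_NODES];
	size_t count;
	uint32_t next_phandle;	/* must be initialized to 1 */
};

/*
 * Add a node that references @ref (0 for none) and return the phandle
 * assigned to it, or 0 if the tree is full.
 */
static uint32_t toy_add_node(struct toy_tree *t, uint32_t ref)
{
	assert(t != NULL);

	if (t->count == TOY_MAX_NODES)
		return 0;

	t->nodes[t->count].phandle = t->next_phandle;
	t->nodes[t->count].ref = ref;
	t->count++;

	return t->next_phandle++;
}
```

Firmware would call this three times in a row: once for the physical
reservation (no reference), once for the I/O virtual reservation (referencing
the first phandle) and once for the display controller (referencing the
second), without ever having to look anything up after the fact.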

Granted, this requires the addition of a new top-level node, but given
how expressive this becomes, I think it might be worth a second
consideration.

Thierry



* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-09-07 17:44                         ` Thierry Reding
@ 2021-09-15 15:19                           ` Thierry Reding
  2022-02-06 22:27                             ` Janne Grunau
  0 siblings, 1 reply; 41+ messages in thread
From: Thierry Reding @ 2021-09-15 15:19 UTC (permalink / raw)
  To: Rob Herring
  Cc: Alyssa Rosenzweig, Sven Peter, Dmitry Osipenko, Joerg Roedel,
	Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, Linux IOMMU, linux-tegra, dri-devel


On Tue, Sep 07, 2021 at 07:44:44PM +0200, Thierry Reding wrote:
> On Tue, Sep 07, 2021 at 10:33:24AM -0500, Rob Herring wrote:
> > On Fri, Sep 3, 2021 at 10:36 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > >
> > > On Fri, Sep 03, 2021 at 09:36:33AM -0500, Rob Herring wrote:
> > > > On Fri, Sep 3, 2021 at 8:52 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > > >
> > > > > On Fri, Sep 03, 2021 at 08:20:55AM -0500, Rob Herring wrote:
> > > > > > On Wed, Sep 1, 2021 at 9:13 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > > > > >
> > > > > > > On Fri, Jul 02, 2021 at 05:16:25PM +0300, Dmitry Osipenko wrote:
> > > > > > > > 01.07.2021 21:14, Thierry Reding пишет:
> > > > > > > > > On Tue, Jun 08, 2021 at 06:51:40PM +0200, Thierry Reding wrote:
> > > > > > > > >> On Fri, May 28, 2021 at 06:54:55PM +0200, Thierry Reding wrote:
> > > > > > > > >>> On Thu, May 20, 2021 at 05:03:06PM -0500, Rob Herring wrote:
> > > > > > > > >>>> On Fri, Apr 23, 2021 at 06:32:30PM +0200, Thierry Reding wrote:
> > > > > > > > >>>>> From: Thierry Reding <treding@nvidia.com>
> > > > > > > > >>>>>
> > > > > > > > >>>>> Reserved memory region phandle references can be accompanied by a
> > > > > > > > >>>>> specifier that provides additional information about how that specific
> > > > > > > > >>>>> reference should be treated.
> > > > > > > > >>>>>
> > > > > > > > >>>>> One use-case is to mark a memory region as needing an identity mapping
> > > > > > > > >>>>> in the system's IOMMU for the device that references the region. This is
> > > > > > > > >>>>> needed for example when the bootloader has set up hardware (such as a
> > > > > > > > >>>>> display controller) to actively access a memory region (e.g. a boot
> > > > > > > > >>>>> splash screen framebuffer) during boot. The operating system can use the
> > > > > > > > >>>>> identity mapping flag from the specifier to make sure an IOMMU identity
> > > > > > > > >>>>> mapping is set up for the framebuffer before IOMMU translations are
> > > > > > > > >>>>> enabled for the display controller.
> > > > > > > > >>>>>
> > > > > > > > >>>>> Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > > > > > > >>>>> ---
> > > > > > > > >>>>>  .../reserved-memory/reserved-memory.txt       | 21 +++++++++++++++++++
> > > > > > > > >>>>>  include/dt-bindings/reserved-memory.h         |  8 +++++++
> > > > > > > > >>>>>  2 files changed, 29 insertions(+)
> > > > > > > > >>>>>  create mode 100644 include/dt-bindings/reserved-memory.h
> > > > > > > > >>>>
> > > > > > > > >>>> Sorry for being slow on this. I have 2 concerns.
> > > > > > > > >>>>
> > > > > > > > >>>> First, this creates an ABI issue. A DT with cells in 'memory-region'
> > > > > > > > >>>> will not be understood by an existing OS. I'm less concerned about this
> > > > > > > > >>>> if we address that with a stable fix. (Though I'm pretty sure we've
> > > > > > > > >>>> naively added #?-cells in the past ignoring this issue.)
> > > > > > > > >>>
> > > > > > > > >>> A while ago I had proposed adding memory-region*s* as an alternative
> > > > > > > > >>> name for memory-region to make the naming more consistent with other
> > > > > > > > >>> types of properties (think clocks, resets, gpios, ...). If we added
> > > > > > > > >>> that, we could easily differentiate between the "legacy" cases where
> > > > > > > > >>> no #memory-region-cells was allowed and the new cases where it was.
> > > > > > > > >>>
> > > > > > > > >>>> Second, it could be the bootloader setting up the reserved region. If a
> > > > > > > > >>>> node already has 'memory-region', then adding more regions is more
> > > > > > > > >>>> complicated compared to adding new properties. And defining what each
> > > > > > > > >>>> memory-region entry is or how many in schemas is impossible.
> > > > > > > > >>>
> > > > > > > > >>> It's true that updating the property gets a bit complicated, but it's
> > > > > > > > >>> not exactly rocket science. We really just need to splice the array. I
> > > > > > > > > >>> have a working implementation for this in U-Boot.
> > > > > > > > >>>
> > > > > > > > >>> For what it's worth, we could run into the same issue with any new
> > > > > > > > >>> property that we add. Even if we renamed this to iommu-memory-region,
> > > > > > > > >>> it's still possible that a bootloader may have to update this property
> > > > > > > > >>> if it already exists (it could be hard-coded in DT, or it could have
> > > > > > > > >>> been added by some earlier bootloader or firmware).
> > > > > > > > >>>
> > > > > > > > >>>> Both could be addressed with a new property. Perhaps something like
> > > > > > > > >>>> 'iommu-memory-region = <&phandle>;'. I think the 'iommu' prefix is
> > > > > > > > >>>> appropriate given this is entirely because of the IOMMU being in the
> > > > > > > > >>>> mix. I might feel differently if we had other uses for cells, but I
> > > > > > > > >>>> don't really see it in this case.
> > > > > > > > >>>
> > > > > > > > >>> I'm afraid that down the road we'll end up with other cases and then we
> > > > > > > > >>> might proliferate a number of *-memory-region properties with varying
> > > > > > > > >>> prefixes.
> > > > > > > > >>>
> > > > > > > > >>> I am aware of one other case where we might need something like this: on
> > > > > > > > >>> some Tegra SoCs we have audio processors that will access memory buffers
> > > > > > > > >>> using a DMA engine. These processors are booted from early firmware
> > > > > > > > >>> using firmware from system memory. In order to avoid trashing the
> > > > > > > > >>> firmware, we need to reserve memory. We can do this using reserved
> > > > > > > > >>> memory nodes. However, the audio DMA engine also uses the SMMU, so we
> > > > > > > > >>> need to make sure that the firmware memory is marked as reserved within
> > > > > > > > >>> the SMMU. This is similar to the identity mapping case, but not exactly
> > > > > > > > >>> the same. Instead of creating a 1:1 mapping, we just want that IOVA
> > > > > > > > >>> region to be reserved (i.e. IOMMU_RESV_RESERVED instead of
> > > > > > > > >>> IOMMU_RESV_DIRECT{,_RELAXABLE}).
> > > > > > > > >>>
> > > > > > > > >>> That would also fall into the IOMMU domain, but we can't reuse the
> > > > > > > > >>> iommu-memory-region property for that because then we don't have enough
> > > > > > > > >>> information to decide which type of reservation we need.
> > > > > > > > >>>
> > > > > > > > >>> We could obviously make iommu-memory-region take a specifier, but we
> > > > > > > > >>> could just as well use memory-regions in that case since we have
> > > > > > > > >>> something more generic anyway.
> > > > > > > > >>>
> > > > > > > > >>> With the #memory-region-cells proposal, we can easily extend the cell in
> > > > > > > > >>> the specifier with an additional MEMORY_REGION_IOMMU_RESERVE flag to
> > > > > > > > > >>> take that other use case into account. If we then also change to the new
> > > > > > > > >>> memory-regions property name, we avoid the ABI issue (and we gain a bit
> > > > > > > > >>> of consistency while at it).
> > > > > > > > >>
> > > > > > > > >> Ping? Rob, do you want me to add this second use-case to the patch
> > > > > > > > >> series to make it more obvious that this isn't just a one-off thing? Or
> > > > > > > > >> how do we proceed?
> > > > > > > > >
> > > > > > > > > Rob, given that additional use-case, do you want me to run with this
> > > > > > > > > proposal and send out an updated series?
> > > > > > > >
> > > > > > > >
> > > > > > > > > What about a variant with "descriptor" properties that describe
> > > > > > > > each region:
> > > > > > > >
> > > > > > > > fb_desc: display-framebuffer-memory-descriptor {
> > > > > > > >       needs-identity-mapping;
> > > > > > > > }
> > > > > > > >
> > > > > > > > display@52400000 {
> > > > > > > >       memory-region = <&fb ...>;
> > > > > > > >       memory-region-descriptor = <&fb_desc ...>;
> > > > > > > > };
> > > > > > > >
> > > > > > > > It could be a more flexible/extendible variant.
> > > > > > >
> > > > > > > This problem recently came up on #dri-devel again. Adding Alyssa and
> > > > > > > Sven who are facing a similar challenge on their work on Apple M1 (if I
> > > > > > > understood correctly). Also adding dri-devel for visibility since this
> > > > > > > is a very common problem for display in particular.
> > > > > > >
> > > > > > > On M1 the situation is slightly more complicated: the firmware will
> > > > > > > allocate a couple of buffers (including the framebuffer) in high memory
> > > > > > > (> 4 GiB) and use the IOMMU to map that into an IOVA region below 4 GiB
> > > > > > > so that the display hardware can access it. This makes it impossible to
> > > > > > > bypass the IOMMU like we do on other chips (in particular to work around
> > > > > > > the fault-by-default policy of the ARM SMMU driver). It also means that
> > > > > > > in addition to the simple reserved regions I mentioned we need for audio
> > > > > > > use-cases and identity mapping use-cases we need for display on Tegra,
> > > > > > > we now also need to be able to convey physical to IOVA mappings.
> > > > > > >
> > > > > > > Fitting the latter into the original proposal sounds difficult. A quick
> > > > > > > fix would've been to generate a mapping table in memory and pass that to
> > > > > > > the kernel using a reserved-memory node (similar to what's done for
> > > > > > > example on Tegra for the EMC frequency table on Tegra210) and mark it as
> > > > > > > such using a special flag. But that then involves two layers of parsing,
> > > > > > > which seems a bit suboptimal. Another way to shoehorn that into the
> > > > > > > original proposal would've been to add flags for physical and virtual
> > > > > > > address regions and use pairs to pass them using special flags. Again,
> > > > > > > this is a bit wonky because it needs these to be carefully parsed and
> > > > > > > matched up.
> > > > > > >
> > > > > > > Another downside is that we now have a situation where some of these
> > > > > > > regions are no longer "reserved-memory regions" in the traditional
> > > > > > > sense. This would require an additional flag in the reserved-memory
> > > > > > > region nodes to prevent the IOVA regions from being reserved. By the
> > > > > > > way, this is something that would also be needed for the audio use-case
> > > > > > > I mentioned before, because the physical memory at that address can
> > > > > > > still be used by an operating system.
> > > > > > >
> > > > > > > A more general solution would be to draw a bit from Dmitry's proposal
> > > > > > > and introduce a new top-level "iov-reserved-memory" node. This could be
> > > > > > > modelled on the existing reserved-memory node, except that the physical
> > > > > > > memory pages for regions represented by child nodes would not be marked
> > > > > > > as reserved. Only the IOVA range described by the region would be
> > > > > > > reserved subsequently by the IOMMU framework and/or IOMMU driver.
> > > > > > >
> > > > > > > The simplest case where we just want to reserve some IOVA region could
> > > > > > > then be done like this:
> > > > > > >
> > > > > > >         iov-reserved-memory {
> > > > > > >                 /*
> > > > > > >                  * Probably safest to default to <2>, <2> here given
> > > > > > >                  * that most IOMMUs support either > 32 bits of IAS
> > > > > > >                  * or OAS.
> > > > > > >                  */
> > > > > > >                 #address-cells = <2>;
> > > > > > >                 #size-cells = <2>;
> > > > > > >
> > > > > > >                 firmware: firmware@80000000 {
> > > > > > >                         reg = <0 0x80000000 0 0x01000000>;
> > > > > > >                 };
> > > > > > >         };
> > > > > > >
> > > > > > >         audio@30000000 {
> > > > > > >                 ...
> > > > > > >                 iov-memory-regions = <&firmware>;
> > > > > > >                 ...
> > > > > > >         };
> > > > > > >
> > > > > > > Mappings could be represented by an IOV reserved region taking a
> > > > > > > reference to the reserved-region that they map:
> > > > > > >
> > > > > > >         reserved-memory {
> > > > > > >                 #address-cells = <2>;
> > > > > > >                 #size-cells = <2>;
> > > > > > >
> > > > > > >                 /* 16 MiB of framebuffer at top-of-memory */
> > > > > > >                 framebuffer: framebuffer@1,ff000000 {
> > > > > > >                         reg = <0x1 0xff000000 0 0x01000000>;
> > > > > > >                         no-map;
> > > > > > >                 };
> > > > > > >         };
> > > > > > >
> > > > > > >         iov-reserved-memory {
> > > > > > > >                 /* IOMMU supports only 32-bit input address space */
> > > > > > >                 #address-cells = <1>;
> > > > > > >                 #size-cells = <1>;
> > > > > > >
> > > > > > >                 /* 16 MiB of framebuffer mapped to top of IOVA */
> > > > > > >                 fb: fb@ff000000 {
> > > > > > > >                         reg = <0xff000000 0x01000000>;
> > > > > > >                         memory-region = <&framebuffer>;
> > > > > > >                 };
> > > > > > >         };
> > > > > > >
> > > > > > >         display@40000000 {
> > > > > > >                 ...
> > > > > > >                 /* optional? */
> > > > > > >                 memory-region = <&framebuffer>;
> > > > > > >                 iov-memory-regions = <&fb>;
> > > > > > >                 ...
> > > > > > >         };
> > > > > > >
> > > > > > > It's interesting how identity mapped regions now become a trivial
> > > > > > > special case of mappings. All that is needed is to make the reg property
> > > > > > > of the IOV reserved region correspond to the reg property of the normal
> > > > > > > reserved region. Alternatively, as a small optimization for lazy people
> > > > > > > like me, we could just allow these cases to omit the reg property and
> > > > > > > instead inherit it from the referenced reserved region.
> > > > > > >
> > > > > > > As the second example shows it might be convenient if memory-region
> > > > > > > could be derived from iov-memory-regions. This could be useful for cases
> > > > > > > where the driver wants to do something with the physical pages of the
> > > > > > > reserved region (such as mapping them and copying out the framebuffer
> > > > > > > data to another buffer so that the reserved memory can be recycled). If
> > > > > > > we have the IOV reserved region, we could provide an API to extract the
> > > > > > > physical reserved region (if it exists). That way we could avoid
> > > > > > > referencing it twice in DT. Then again, there's something elegant about
> > > > > > > > the explicit second reference too. It indicates the intent that we may
> > > > > > > want to use the region for something other than just the IOV mapping.
> > > > > > >
> > > > > > > Anyway, this has been long enough. Let me know what you think. Alyssa,
> > > > > > > Sven, it'd be interesting to hear if you think this could work as a
> > > > > > > solution to the problem on M1.
> > > > > > >
> > > > > > > Rob, I think you might like this alternative because it basically gets
> > > > > > > rid of all the points in the original proposal that you were concerned
> > > > > > > about. Let me know what you think.
> > > > > >
> > > > > > Couldn't we keep this all in /reserved-memory? Just add an iova
> > > > > > version of reg. Perhaps abuse 'assigned-address' for this purpose. The
> > > > > > issue I see would be handling reserved iova areas without a physical
> > > > > > area. That can be handled with just a iova and no reg. We already have
> > > > > > a no reg case.
> > > > >
> > > > > I had thought about that initially. One thing I'm worried about is that
> > > > > every child node in /reserved-memory will effectively cause the memory
> > > > > that it described to be reserved. But we don't want that for regions
> > > > > that are "virtual only" (i.e. IOMMU reservations).
> > > >
> > > > By virtual only, you mean no physical mapping, just a region of
> > > > virtual space, right? For that we'd have no 'reg' and therefore no
> > > > (physical) reservation by the OS. It's similar to non-static regions.
> > > > You need a specific handler for them. We'd probably want a compatible
> > > > as well for these virtual reservations.
> > >
> > > Yeah, these would be purely used for reserving regions in the IOVA so
> > > that they won't be used by the IOVA allocator. Typically these would be
> > > used for cases where those addresses have some special meaning.
> > >
> > > Do we want something like:
> > >
> > >         compatible = "iommu-reserved";
> > >
> > > for these? Or would that need to be:
> > >
> > >         compatible = "linux,iommu-reserved";
> > >
> > > ? There seems to be a mix of vendor-prefix vs. non-vendor-prefix
> > > compatible strings in the reserved-memory DT bindings directory.
> > 
> > I would not use 'linux,' here.
> > 
> > >
> > > On the other hand, do we actually need the compatible string? Because we
> > > don't really want to associate much extra information with this like we
> > > do for example with "shared-dma-pool". The logic to handle this would
> > > all be within the IOMMU framework. All we really need is for the
> > > standard reservation code to skip nodes that don't have a reg property
> > > so we don't reserve memory for "virtual-only" allocations.
> > 
> > It doesn't hurt to have one and I can imagine we might want to iterate
> > over all the nodes. It's slightly easier and more common to iterate
> > over compatible nodes rather than nodes with some property.
> > 
> > > > Are these being global in DT going to be a problem? Presumably we have
> > > > a virtual space per IOMMU. We'd know which IOMMU based on a device's
> > > > 'iommus' and 'memory-region' properties, but within /reserved-memory
> > > > we wouldn't be able to distinguish overlapping addresses from separate
> > > > address spaces. Or we could have 2 different IOVAs for 1 physical
> > > > space. That could be solved with something like this:
> > > >
> > > > iommu-addresses = <&iommu1 <address cells> <size cells>>;
> > >
> > > The only case that would be problematic would be if we have overlapping
> > > physical regions, because that will probably trip up the standard code.
> > >
> > > But this could also be worked around by looking at iommu-addresses. For
> > > example, if we had something like this:
> > >
> > >         reserved-memory {
> > >                 fb_dc0: fb@80000000 {
> > >                         reg = <0x80000000 0x01000000>;
> > >                         iommu-addresses = <0xa0000000 0x01000000>;
> > >                 };
> > >
> > >                 fb_dc1: fb@80000000 {
> > 
> > You can't have 2 nodes with the same name (actually, you can, they
> > just get merged together). Different names with the same unit-address
> > is a dtc warning. I'd really like to make that a full blown
> > overlapping region check.
> 
> Right... so this would be a lot easier to deal with using that earlier
> proposal where the IOMMU regions were a separate thing and referencing
> the reserved-memory nodes. In those cases we could just have the
> physical reservation for the framebuffer once (so we don't get any
> duplicates or overlaps) and then have each IOVA reservation reference
> that to create the mapping.
> 
> > 
> > >                         reg = <0x80000000 0x01000000>;
> > >                         iommu-addresses = <0xb0000000 0x01000000>;
> > >                 };
> > >         };
> > >
> > > We could make the code identify that this is for the same physical
> > > reservation (maybe make it so that reg needs to match exactly for this
> > > to be recognized) but with different virtual allocations.
> > >
> > > On a side-note: do we really need to repeat the size? I'd think if we
> > > want mappings then we'd likely want them for the whole reservation.
> > 
> > Humm, I suppose not, but dropping it paints us into a corner if we
> > come up with wanting a different size later. You could have a carveout
> > for double/triple buffering your framebuffer, but the bootloader
> > framebuffer is only single buffered. So would you want actual size?
> 
> Perhaps this needs to be a bit more verbose then. If we want the ability
> to create a mapping for only a partial reservation, I could imagine we
> may as well want one that doesn't start at the beginning. So perhaps an
> even better solution would be to have a complete mapping, something that
> works similar to "ranges" perhaps, like so:
> 
> 	fb@80000000 {
> 		reg = <0x80000000 0x01000000>;
> 		iommu-ranges = <0x80000000 0x01000000 0x80000000>;
> 	};
> 
> That would be for a full identity mapping, but we could also have
> something along the lines of this:
> 
> 	fb@80000000 {
> 		reg = <0x80000000 0x01000000>;
> 		iommu-ranges = <0x80100000 0x00100000 0xa0000000>;
> 	};
> 
> So that would only map a 1 MiB chunk at offset 1 MiB (of the physical
> reservation) to I/O virtual address 0xa0000000.
> 
> > > I'd like to keep references to IOMMUs out of this because they would be
> > > duplicated. We will only use these nodes if they are referenced by a
> > > device node that also has an iommus property. Also, the IOMMU reference
> > > itself isn't enough. We'd also need to support the complete specifier
> > > because you can have things like SIDs in there to specify the exact
> > > address space that a device uses.
> > >
> > > Also, for some of these they may be reused independently of the IOMMU
> > > address space. For example the Tegra framebuffer identity mapping can
> > > be used by either of the 2-4 display controllers, each with (at least
> > > potentially) their own address space. But we don't want to have to
> > > describe the identity mapping separately for each display controller.
> > 
> > Okay, but I'd rather have to duplicate things in your case than not be
> > able to express some other case.
> 
> The earlier "separate iov-reserved-memory" proposal would be a good
> compromise here. It'd allow us to duplicate only the necessary bits
> (i.e. the IOVA mappings) but keep the common bits simple. And even
> the IOVA mappings could be shared for cases like identity mappings.
> See below for more on that.
> 
> > > Another thing to consider is that these nodes will often be added by
> > > firmware (e.g. firmware will allocate the framebuffer and set up the
> > > corresponding reserved memory region in DT). Wiring up references like
> > > this would get very complicated very quickly.
> > 
> > Yes.
> > 
> > The option of using the 'iommus' property below can be optional and doesn't
> > have to be defined/supported now. Just trying to think ahead and not
> > be stuck with something that can't be extended.
> 
> One other benefit of the separate iov-reserved-memory node would be that
> the iommus property could be simplified. If we have a physical
> reservation that needs to be accessed by multiple different display
> controllers, we'd end up with something fairly complex, such as this:
> 
> 	fb: fb@80000000 {
> 		reg = <0x80000000 0x01000000>;
> 		iommus = <&dc0_iommu 0xa0000000 0x01000000>,
> 			 <&dc1_iommu 0xb0000000 0x01000000>,
> 			 <&dc2_iommu 0xc0000000 0x01000000>;
> 	};
> 
> This would get even worse if we want to support partial mappings. Also,
> it'd become quite complicated to correlate this with the memory-region
> references:
> 
> 	dc0: dc@40000000 {
> 		...
> 		memory-region = <&fb>;
> 		iommus = <&dc0_iommu>;
> 		...
> 	};
> 
> So now you have to go match up the phandle (and potentially specifier)
> in the iommus property of the dc0 node with an entry in the fb node's
> iommus property. That's all fairly complicated stuff.
> 
> With separate iov-reserved-memory, this would be a bit more verbose, but
> each individual node would be simpler:
> 
> 	reserved-memory {
> 		fb: fb@80000000 {
> 			reg = <0x80000000 0x01000000>;
> 		};
> 	};
> 
> 	iov-reserved-memory {
> 		fb0: fb@80000000 {
> 			/* identity mapping, "reg" optional? */
> 			reg = <0x80000000 0x01000000>;
> 			memory-region = <&fb>;
> 		};
> 
> 		fb1: fb@90000000 {
> 			/* but doesn't have to be */
> 			reg = <0x90000000 0x01000000>;
> 			memory-region = <&fb>;
> 		};
> 
> 		fb2: fb@a0000000 {
> 			/* can be partial, too */
> 			ranges = <0x80000000 0x00800000 0xa0000000>;
> 			memory-region = <&fb>;
> 		};
> 	};
> 
> 	dc0: dc@40000000 {
> 		iov-memory-regions = <&fb0>;
> 		/* optional? */
> 		memory-region = <&fb>;
> 		iommus = <&dc0_iommu>;
> 	};
> 
> Alternatively, if we want to support partial mappings, we could replace
> those reg properties by ranges properties that I showed earlier. We may
> even want to support both. Use "reg" for virtual-only reservations and
> identity mappings, or "simple partial mappings" (that map a sub-region
> starting from the beginning). Identity mappings could still be
> simplified by just omitting the "reg" property. For more complicated
> mappings, such as the ones on M1, the "ranges" property could be used.
> 
> Note how this looks a bit boilerplate-y, but it's actually really quite
> simple to understand, even for humans, I think.
> 
> Also, the phandles in this are comparatively easy to wire up because
> they can all be generated in a hierarchical way: generate physical
> reservation and store phandle, then generate I/O virtual reservation
> to reference that phandle and store the new phandle as well. Finally,
> wire this up to the display controller (using either the IOV phandle or
> both).
> 
> Granted, this requires the addition of a new top-level node, but given
> how expressive this becomes, I think it might be worth a second
> consideration.

I guess as a middle-ground between your suggestion and mine, we could
also move the IOV nodes back into reserved-memory, as long as we make
sure the names (together with unit-addresses) are unique. That would
support cases where we want to identity map, or have multiple mappings
at the same address. So it'd look something like this:

	reserved-memory {
		fb: fb@80000000 {
			reg = <0x80000000 0x01000000>;
		};

		audio-firmware@ff000000 {
			/* perhaps add "iommu-reserved" for this case */
			compatible = "iommu-mapping";
			/*
			 * no memory-region referencing a physical
			 * reservation, indicates that this is an
			 * IOMMU reservation, rather than a mapping
			 */
			reg = <0xff000000 0x01000000>;
		};

		fb0: fb-mapping@80000000 {
			compatible = "iommu-mapping";
			/* identity mapping, "reg" optional? */
			reg = <0x80000000 0x01000000>;
			memory-region = <&fb>;
		};

		fb1: fb-mapping@90000000 {
			compatible = "iommu-mapping";
			/* but doesn't have to be */
			reg = <0x90000000 0x01000000>;
			memory-region = <&fb>;
		};

		fb2: fb-mapping@a0000000 {
			compatible = "iommu-mapping";
			/* can be partial, too */
			ranges = <0xa0000000 0x00800000 0x80000000>;
			memory-region = <&fb>;
		};
	};

	dc0: dc@40000000 {
		memory-region = <&fb0>;
		iommus = <&dc0_iommu>;
	};
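
To spell out how I'd expect such nodes to be interpreted, here is a rough
Python model (not kernel code). It assumes the "iommu-mapping" compatible
from the example above, and that the three cells in "ranges" are
<IOVA, size, physical address>, which is only how they are used in this
example. The dict layout and the classify() helper are made up purely for
illustration:

```python
# Rough model of the proposed node semantics. The node layout and the
# "iommu-mapping" compatible are assumptions taken from the example above.

def classify(node, nodes):
    """Return (kind, iova, phys, size) for one reserved-memory child."""
    if node.get("compatible") != "iommu-mapping":
        # Plain physical reservation, handled by the existing code.
        base, size = node["reg"]
        return ("physical", None, base, size)

    target = node.get("memory-region")
    if target is None:
        # No physical reservation referenced: IOVA-only reservation.
        iova, size = node["reg"]
        return ("iova-reserved", iova, None, size)

    phys_base, phys_size = nodes[target]["reg"]
    if "ranges" in node:
        # Assumed cell order: <iova size phys>; may cover a sub-region.
        iova, size, phys = node["ranges"]
    elif "reg" in node:
        # "reg" gives the IOVA window, mapped from the start of the region.
        iova, size = node["reg"]
        phys = phys_base
    else:
        # "reg" omitted: identity mapping of the whole region.
        iova, size, phys = phys_base, phys_size, phys_base

    # The mapped chunk has to lie within the physical reservation.
    if not (phys_base <= phys and phys + size <= phys_base + phys_size):
        raise ValueError("mapping exceeds physical reservation")
    return ("mapping", iova, phys, size)


nodes = {
    "fb": {"reg": (0x80000000, 0x01000000)},
    "audio-firmware": {"compatible": "iommu-mapping",
                       "reg": (0xff000000, 0x01000000)},
    "fb0": {"compatible": "iommu-mapping", "memory-region": "fb",
            "reg": (0x80000000, 0x01000000)},
    "fb1": {"compatible": "iommu-mapping", "memory-region": "fb",
            "reg": (0x90000000, 0x01000000)},
    "fb2": {"compatible": "iommu-mapping", "memory-region": "fb",
            "ranges": (0xa0000000, 0x00800000, 0x80000000)},
}

for name, node in nodes.items():
    print(name, classify(node, nodes))
```

The point is that the presence or absence of "memory-region" alone would
distinguish a mapping from a pure IOVA reservation.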

What do you think?

Thierry

* Re: [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions
  2021-04-23 16:32 [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions Thierry Reding
                   ` (7 preceding siblings ...)
  2021-04-28  5:59 ` Dmitry Osipenko
@ 2021-10-03  1:09 ` Dmitry Osipenko
  2021-10-04 19:23   ` Thierry Reding
  8 siblings, 1 reply; 41+ messages in thread
From: Dmitry Osipenko @ 2021-10-03  1:09 UTC (permalink / raw)
  To: Thierry Reding, Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, iommu, linux-tegra

23.04.2021 19:32, Thierry Reding wrote:
> I've made corresponding changes in the proprietary bootloader, added a
> compatibility shim in U-Boot (which forwards information created by the
> proprietary bootloader to the kernel) and the attached patches to test
> this on Jetson TX1, Jetson TX2 and Jetson AGX Xavier.

Could you please tell me how the downstream kernel copes with the
identity mappings in conjunction with the original proprietary bootloader?

If there is some other method of passing the mappings to the kernel, could
it be supported upstream? Putting the burden on users to upgrade the
bootloader feels a bit inconvenient.

* Re: [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions
  2021-10-03  1:09 ` Dmitry Osipenko
@ 2021-10-04 19:23   ` Thierry Reding
  2021-10-04 20:32     ` Dmitry Osipenko
  0 siblings, 1 reply; 41+ messages in thread
From: Thierry Reding @ 2021-10-04 19:23 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Joerg Roedel, Rob Herring, Will Deacon, Robin Murphy,
	Nicolin Chen, Krishna Reddy, devicetree, iommu, linux-tegra

On Sun, Oct 03, 2021 at 04:09:56AM +0300, Dmitry Osipenko wrote:
> 23.04.2021 19:32, Thierry Reding wrote:
> > I've made corresponding changes in the proprietary bootloader, added a
> > compatibility shim in U-Boot (which forwards information created by the
> > proprietary bootloader to the kernel) and the attached patches to test
> > this on Jetson TX1, Jetson TX2 and Jetson AGX Xavier.
> 
> Could you please tell what downstream kernel does for coping with the
> identity mappings in conjunction with the original proprietary bootloader?
> 
> If there is some other method of passing mappings to kernel, could it be
> supported by upstream? Putting burden on users to upgrade bootloader
> feels a bit inconvenient.

It depends on the chip generation. As far as I know there have been
several iterations. The earliest was to pass this information via a
command-line option, but more recent versions use device tree to pass
this information in a similar way as described here. However, these
use non-standard DT bindings, so I don't think we can just implement
them as-is.

Thierry

* Re: [PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions
  2021-10-04 19:23   ` Thierry Reding
@ 2021-10-04 20:32     ` Dmitry Osipenko
  0 siblings, 0 replies; 41+ messages in thread
From: Dmitry Osipenko @ 2021-10-04 20:32 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Joerg Roedel, Rob Herring, Will Deacon, Robin Murphy,
	Nicolin Chen, Krishna Reddy, devicetree, iommu, linux-tegra

04.10.2021 22:23, Thierry Reding wrote:
> On Sun, Oct 03, 2021 at 04:09:56AM +0300, Dmitry Osipenko wrote:
>> 23.04.2021 19:32, Thierry Reding wrote:
>>> I've made corresponding changes in the proprietary bootloader, added a
>>> compatibility shim in U-Boot (which forwards information created by the
>>> proprietary bootloader to the kernel) and the attached patches to test
>>> this on Jetson TX1, Jetson TX2 and Jetson AGX Xavier.
>>
>> Could you please tell what downstream kernel does for coping with the
>> identity mappings in conjunction with the original proprietary bootloader?
>>
>> If there is some other method of passing mappings to kernel, could it be
>> supported by upstream? Putting burden on users to upgrade bootloader
>> feels a bit inconvenient.
> 
> It depends on the chip generation. As far as I know there have been
> several iterations. The earliest was to pass this information via a
> command-line option, but more recent versions use device tree to pass
> this information in a similar way as described here. However, these
> use non-standard DT bindings, so I don't think we can just implement
> them as-is.

Is it possible to boot an upstream kernel with that original bootloader?

I remember seeing other platforms, like QCOM, supporting downstream
quirks in the upstream kernel on the side, i.e. they are undocumented, but
the additional support code is there. That is what "normal" people want.
You should consider doing that for Tegra too, if possible.

* Re: [PATCH v2 5/5] iommu/tegra-smmu: Support managed domains
  2021-04-23 16:32 ` [PATCH v2 5/5] iommu/tegra-smmu: Support managed domains Thierry Reding
@ 2021-10-11 23:25   ` Dmitry Osipenko
  0 siblings, 0 replies; 41+ messages in thread
From: Dmitry Osipenko @ 2021-10-11 23:25 UTC (permalink / raw)
  To: Thierry Reding, Joerg Roedel, Rob Herring
  Cc: Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, iommu, linux-tegra

23.04.2021 19:32, Thierry Reding wrote:
> From: Navneet Kumar <navneetk@nvidia.com>
> 
> Allow creating identity and DMA API compatible IOMMU domains. When
> creating a DMA API compatible domain, make sure to also create the
> required cookie.

IOMMU_DOMAIN_DMA is likely to be a disaster. It shouldn't work without
first preparing the DRM and VDE drivers. We discussed this briefly in
the past.

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2021-09-15 15:19                           ` Thierry Reding
@ 2022-02-06 22:27                             ` Janne Grunau
  2022-02-09 16:31                               ` Thierry Reding
  0 siblings, 1 reply; 41+ messages in thread
From: Janne Grunau @ 2022-02-06 22:27 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Rob Herring, Alyssa Rosenzweig, Sven Peter, Dmitry Osipenko,
	Joerg Roedel, Will Deacon, Robin Murphy, Nicolin Chen,
	Krishna Reddy, devicetree, Linux IOMMU, linux-tegra, dri-devel

On 2021-09-15 17:19:39 +0200, Thierry Reding wrote:
> On Tue, Sep 07, 2021 at 07:44:44PM +0200, Thierry Reding wrote:
> > On Tue, Sep 07, 2021 at 10:33:24AM -0500, Rob Herring wrote:
> > > On Fri, Sep 3, 2021 at 10:36 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > >
> > > > On Fri, Sep 03, 2021 at 09:36:33AM -0500, Rob Herring wrote:
> > > > > On Fri, Sep 3, 2021 at 8:52 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Sep 03, 2021 at 08:20:55AM -0500, Rob Herring wrote:
> > > > > > >
> > > > > > > Couldn't we keep this all in /reserved-memory? Just add an iova
> > > > > > > version of reg. Perhaps abuse 'assigned-address' for this purpose. The
> > > > > > > issue I see would be handling reserved iova areas without a physical
> > > > > > > area. That can be handled with just a iova and no reg. We already have
> > > > > > > a no reg case.
> > > > > >
> > > > > > I had thought about that initially. One thing I'm worried about is that
> > > > > > every child node in /reserved-memory will effectively cause the memory
> > > > > > that it described to be reserved. But we don't want that for regions
> > > > > > that are "virtual only" (i.e. IOMMU reservations).
> > > > >
> > > > > By virtual only, you mean no physical mapping, just a region of
> > > > > virtual space, right? For that we'd have no 'reg' and therefore no
> > > > > (physical) reservation by the OS. It's similar to non-static regions.
> > > > > You need a specific handler for them. We'd probably want a compatible
> > > > > as well for these virtual reservations.
> > > >
> > > > Yeah, these would be purely used for reserving regions in the IOVA so
> > > > that they won't be used by the IOVA allocator. Typically these would be
> > > > used for cases where those addresses have some special meaning.
> > > >
> > > > Do we want something like:
> > > >
> > > >         compatible = "iommu-reserved";
> > > >
> > > > for these? Or would that need to be:
> > > >
> > > >         compatible = "linux,iommu-reserved";
> > > >
> > > > ? There seems to be a mix of vendor-prefix vs. non-vendor-prefix
> > > > compatible strings in the reserved-memory DT bindings directory.
> > > 
> > > I would not use 'linux,' here.
> > > 
> > > >
> > > > On the other hand, do we actually need the compatible string? Because we
> > > > don't really want to associate much extra information with this like we
> > > > do for example with "shared-dma-pool". The logic to handle this would
> > > > all be within the IOMMU framework. All we really need is for the
> > > > standard reservation code to skip nodes that don't have a reg property
> > > > so we don't reserve memory for "virtual-only" allocations.
> > > 
> > > It doesn't hurt to have one and I can imagine we might want to iterate
> > > over all the nodes. It's slightly easier and more common to iterate
> > > over compatible nodes rather than nodes with some property.
> > > 
> > > > > Are these being global in DT going to be a problem? Presumably we have
> > > > > a virtual space per IOMMU. We'd know which IOMMU based on a device's
> > > > > 'iommus' and 'memory-region' properties, but within /reserved-memory
> > > > > we wouldn't be able to distinguish overlapping addresses from separate
> > > > > address spaces. Or we could have 2 different IOVAs for 1 physical
> > > > > space. That could be solved with something like this:
> > > > >
> > > > > iommu-addresses = <&iommu1 <address cells> <size cells>>;
> > > >
> > > > The only case that would be problematic would be if we have overlapping
> > > > physical regions, because that will probably trip up the standard code.
> > > >
> > > > But this could also be worked around by looking at iommu-addresses. For
> > > > example, if we had something like this:
> > > >
> > > >         reserved-memory {
> > > >                 fb_dc0: fb@80000000 {
> > > >                         reg = <0x80000000 0x01000000>;
> > > >                         iommu-addresses = <0xa0000000 0x01000000>;
> > > >                 };
> > > >
> > > >                 fb_dc1: fb@80000000 {
> > > 
> > > You can't have 2 nodes with the same name (actually, you can, they
> > > just get merged together). Different names with the same unit-address
> > > is a dtc warning. I'd really like to make that a full blown
> > > overlapping region check.
> > 
> > Right... so this would be a lot easier to deal with using that earlier
> > proposal where the IOMMU regions were a separate thing and referencing
> > the reserved-memory nodes. In those cases we could just have the
> > physical reservation for the framebuffer once (so we don't get any
> > duplicates or overlaps) and then have each IOVA reservation reference
> > that to create the mapping.
> > 
> > > 
> > > >                         reg = <0x80000000 0x01000000>;
> > > >                         iommu-addresses = <0xb0000000 0x01000000>;
> > > >                 };
> > > >         };
> > > >
> > > > We could make the code identify that this is for the same physical
> > > > reservation (maybe make it so that reg needs to match exactly for this
> > > > to be recognized) but with different virtual allocations.
> > > >
> > > > On a side-note: do we really need to repeat the size? I'd think if we
> > > > want mappings then we'd likely want them for the whole reservation.
> > > 
> > > Humm, I suppose not, but dropping it paints us into a corner if we
> > > come up with wanting a different size later. You could have a carveout
> > > for double/triple buffering your framebuffer, but the bootloader
> > > framebuffer is only single buffered. So would you want actual size?
> > 
> > Perhaps this needs to be a bit more verbose then. If we want the ability
> > to create a mapping for only a partial reservation, I could imagine we
> > may as well want one that doesn't start at the beginning. So perhaps an
> > > even better solution would be to have a complete mapping, something that
> > works similar to "ranges" perhaps, like so:
> > 
> > 	fb@80000000 {
> > 		reg = <0x80000000 0x01000000>;
> > 		iommu-ranges = <0x80000000 0x01000000 0x80000000>;
> > 	};
> > 
> > That would be for a full identity mapping, but we could also have
> > something along the lines of this:
> > 
> > 	fb@80000000 {
> > 		reg = <0x80000000 0x01000000>;
> > 		iommu-ranges = <0x80100000 0x00100000 0xa0000000>;
> > 	};
> > 
> > So that would only map a 1 MiB chunk at offset 1 MiB (of the physical
> > reservation) to I/O virtual address 0xa0000000.
> > 
> > > > I'd like to keep references to IOMMUs out of this because they would be
> > > > duplicated. We will only use these nodes if they are referenced by a
> > > > device node that also has an iommus property. Also, the IOMMU reference
> > > > itself isn't enough. We'd also need to support the complete specifier
> > > > because you can have things like SIDs in there to specify the exact
> > > > address space that a device uses.
> > > >
> > > > Also, for some of these they may be reused independently of the IOMMU
> > > > address space. For example the Tegra framebuffer identity mapping can
> > > > be used by either of the 2-4 display controllers, each with (at least
> > > > potentially) their own address space. But we don't want to have to
> > > > describe the identity mapping separately for each display controller.
> > > 
> > > Okay, but I'd rather have to duplicate things in your case than not be
> > > able to express some other case.
> > 
> > The earlier "separate iov-reserved-memory" proposal would be a good
> > compromise here. It'd allow us to duplicate only the necessary bits
> > (i.e. the IOVA mappings) but keep the common bits simple. And even
> > the IOVA mappings could be shared for cases like identity mappings.
> > See below for more on that.
> > 
> > > > Another thing to consider is that these nodes will often be added by
> > > > firmware (e.g. firmware will allocate the framebuffer and set up the
> > > > corresponding reserved memory region in DT). Wiring up references like
> > > > this would get very complicated very quickly.
> > > 
> > > Yes.
> > > 
> > > The using 'iommus' property option below can be optional and doesn't
> > > have to be defined/supported now. Just trying to think ahead and not
> > > be stuck with something that can't be extended.
> > 
> > One other benefit of the separate iov-reserved-memory node would be that
> > the iommus property could be simplified. If we have a physical
> > reservation that needs to be accessed by multiple different display
> > controllers, we'd end up with something fairly complex, such as this:
> > 
> > 	fb: fb@80000000 {
> > 		reg = <0x80000000 0x01000000>;
> > 		iommus = <&dc0_iommu 0xa0000000 0x01000000>,
> > 			 <&dc1_iommu 0xb0000000 0x01000000>,
> > 			 <&dc2_iommu 0xc0000000 0x01000000>;
> > 	};
> > 
> > This would get even worse if we want to support partial mappings. Also,
> > it'd become quite complicated to correlate this with the memory-region
> > references:
> > 
> > 	dc0: dc@40000000 {
> > 		...
> > 		memory-region = <&fb>;
> > 		iommus = <&dc0_iommu>;
> > 		...
> > 	};
> > 
> > So now you have to go match up the phandle (and potentially specifier)
> > in the iommus property of the dc0 node with an entry in the fb node's
> > iommus property. That's all fairly complicated stuff.
> > 
> > With separate iov-reserved-memory, this would be a bit more verbose, but
> > each individual node would be simpler:
> > 
> > 	reserved-memory {
> > 		fb: fb@80000000 {
> > 			reg = <0x80000000 0x01000000>;
> > 		};
> > 	};
> > 
> > 	iov-reserved-memory {
> > 		fb0: fb@80000000 {
> > 			/* identity mapping, "reg" optional? */
> > 			reg = <0x80000000 0x01000000>;
> > 			memory-region = <&fb>;
> > 		};
> > 
> > 		fb1: fb@90000000 {
> > 			/* but doesn't have to be */
> > 			reg = <0x90000000 0x01000000>;
> > 			memory-region = <&fb>;
> > 		};
> > 
> > 		fb2: fb@a0000000 {
> > 			/* can be partial, too */
> > 			ranges = <0x80000000 0x00800000 0xa0000000>;
> > 			memory-region = <&fb>;
> > 		};
> > 	};
> > 
> > 	dc0: dc@40000000 {
> > 		iov-memory-regions = <&fb0>;
> > 		/* optional? */
> > 		memory-region = <&fb>;
> > 		iommus = <&dc0_iommu>;
> > 	};
> > 
> > Alternatively, if we want to support partial mappings, we could replace
> > those reg properties by ranges properties that I showed earlier. We may
> > even want to support both. Use "reg" for virtual-only reservations and
> > identity mappings, or "simple partial mappings" (that map a sub-region
> > starting from the beginning). Identity mappings could still be
> > simplified by just omitting the "reg" property. For more complicated
> > mappings, such as the ones on M1, the "ranges" property could be used.
> > 
> > Note how this looks a bit boilerplate-y, but it's actually really quite
> > simple to understand, even for humans, I think.
> > 
> > Also, the phandles in this are comparatively easy to wire up because
> > they can all be generated in a hierarchical way: generate physical
> > reservation and store phandle, then generate I/O virtual reservation
> > to reference that phandle and store the new phandle as well. Finally,
> > wire this up to the display controller (using either the IOV phandle or
> > both).
> > 
> > Granted, this requires the addition of a new top-level node, but given
> > how expressive this becomes, I think it might be worth a second
> > consideration.
> 
> I guess as a middle-ground between your suggestion and mine, we could
> also move the IOV nodes back into reserved-memory. If we make sure the
> names (together with unit-addresses) are unique, to support cases where
> we want to identity map, or have multiple mappings at the same address.
> So it'd look something like this:
> 
> 	reserved-memory {
> 		fb: fb@80000000 {
> 			reg = <0x80000000 0x01000000>;
> 		};
> 
> 		audio-firmware@ff000000 {
> 			/* perhaps add "iommu-reserved" for this case */
> 			compatible = "iommu-mapping";
> 			/*
> 			 * no memory-region referencing a physical
> 			 * reservation, indicates that this is an
> 			 * IOMMU reservation, rather than a mapping
> 			 */
> 			reg = <0xff000000 0x01000000>;
> 		};
> 
> 		fb0: fb-mapping@80000000 {
> 			compatible = "iommu-mapping";
> 			/* identity mapping, "reg" optional? */
> 			reg = <0x80000000 0x01000000>;
> 			memory-region = <&fb>;
> 		};
> 
> 		fb1: fb-mapping@90000000 {
> 			compatible = "iommu-mapping";
> 			/* but doesn't have to be */
> 			reg = <0x90000000 0x01000000>;
> 			memory-region = <&fb>;
> 		};
> 
> 		fb2: fb-mapping@a0000000 {
> 			compatible = "iommu-mapping";
> 			/* can be partial, too */
> 			ranges = <0xa0000000 0x00800000 0x80000000>;
> 			memory-region = <&fb>;
> 		};
> 	};
> 
> 	dc0: dc@40000000 {
> 		memory-region = <&fb0>;
> 		iommus = <&dc0_iommu>;
> 	};
> 
> What do you think?

I converted the Apple M1 display controller driver to use reserved 
regions with these bindings. They are sufficient for the needs of the M1 
display controller, which is so far the only device requiring this.

I encountered two problems with this bindings proposal:

1) It is impossible to express which iommu needs to be used if a device 
has multiple "iommus" specified. On the M1 this is only a theoretical 
problem as the display co-processor devices use a single iommu.

2) The reserved regions cannot easily be looked up at iommu probe time.
The Apple M1 iommu driver resets the iommu at probe. This breaks the 
framebuffer: the display controller appears to crash when an active 
scan-out framebuffer is unmapped. Resetting the iommu looks like a 
sensible approach in general, though.
To work around this I added a custom property to the affected iommu node 
to avoid the reset. This doesn't feel correct, since the real reason to 
avoid the reset is that we have to maintain the reserved regions mapping 
until the display controller driver takes over.
As far as I can see, the only way for the iommu to find devices with 
reserved memory is to iterate over all devices. This looks 
impractical. The M1 has over 20 distinct iommus.
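
To illustrate both problems, here is a small Python model. It is not the 
actual driver code; the device names, the dict layout and the 
mappings_for_iommu() helper are all made up for illustration:

```python
# Toy model of the current proposal: mappings are only reachable from the
# devices that reference them via "memory-region". All names are hypothetical.

devices = {
    "dc0":   {"iommus": ["dc0_iommu"], "memory-region": ["fb0"]},
    "gpu":   {"iommus": ["gpu_iommu"], "memory-region": []},
    "multi": {"iommus": ["iommu_a", "iommu_b"], "memory-region": ["fw0"]},
}

def mappings_for_iommu(iommu, devices):
    """Find mappings an IOMMU must preserve, visiting every device."""
    found, ambiguous, visited = [], [], 0
    for name, dev in devices.items():
        # Problem 2: every device has to be inspected, for each IOMMU.
        visited += 1
        if iommu not in dev["iommus"]:
            continue
        for region in dev["memory-region"]:
            if len(dev["iommus"]) > 1:
                # Problem 1: nothing says which IOMMU the region is for.
                ambiguous.append((name, region))
            else:
                found.append((name, region))
    return found, ambiguous, visited

print(mappings_for_iommu("dc0_iommu", devices))
print(mappings_for_iommu("iommu_a", devices))
```

With over 20 iommus each doing such a walk at probe time, the cost and the 
ambiguity for multi-iommu devices both become visible.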

One way to avoid both problems would be to move the mappings to the 
iommu node as sub nodes. The device would then reference those.  This 
way the mapping is readily available at iommu probe time and adding 
iommu type specific parameters to map the region correctly is possible.

The sample above would transform to:

	reserved-memory {
		fb: fb@80000000 {
			reg = <0x80000000 0x01000000>;
		};
	};

	dc0_iommu: iommu@20000000 {
		#iommu-cells = <1>;

		fb0: fb-mapping@80000000 {
			compatible = "iommu-mapping";
			/* identity mapping, "reg" optional? */
			reg = <0x80000000 0x01000000>;
			memory-region = <&fb>;
			device-id = <0>; /* for #iommu-cells*/
		};

		fb1: fb-mapping@90000000 {
			compatible = "iommu-mapping";
			/* but doesn't have to be */
			reg = <0x90000000 0x01000000>;
			memory-region = <&fb>;
			device-id = <1>; /* for #iommu-cells*/
		};
	};

	dc0: dc@40000000 {
		iommu-region = <&fb0>;
		iommus = <&dc0_iommu 0>;
	};
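
As a rough sketch of why this helps at probe time, here is a small Python 
model of the layout above. It is not driver code; the dict structure and 
the probe_mappings() helper are hypothetical, and "device-id" is only the 
placeholder property name from the example:

```python
# Toy model of the sub-node layout: each IOMMU node carries its own mappings.
# Structure and property names mirror the example above and are not final.

iommus = {
    "dc0_iommu": {
        "fb0": {"reg": (0x80000000, 0x01000000),
                "memory-region": "fb", "device-id": 0},
        "fb1": {"reg": (0x90000000, 0x01000000),
                "memory-region": "fb", "device-id": 1},
    },
    "gpu_iommu": {},
}
reserved_memory = {"fb": {"reg": (0x80000000, 0x01000000)}}

def probe_mappings(iommu_name):
    """Return (device-id, iova, phys, size) tuples to keep mapped at probe."""
    result = []
    # Only the IOMMU's own child nodes are visited, no device walk needed.
    for node in iommus[iommu_name].values():
        iova, size = node["reg"]
        phys, _ = reserved_memory[node["memory-region"]]["reg"]
        result.append((node["device-id"], iova, phys, size))
    return result

for entry in probe_mappings("dc0_iommu"):
    print(entry)
```

The iommu driver would then know which mappings to install before (or 
instead of) resetting, without having to look at any other device node.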

Does anyone see problems with this approach or can think of something 
better?

Janne

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2022-02-06 22:27                             ` Janne Grunau
@ 2022-02-09 16:31                               ` Thierry Reding
  2022-02-10 23:15                                 ` Janne Grunau
  0 siblings, 1 reply; 41+ messages in thread
From: Thierry Reding @ 2022-02-09 16:31 UTC (permalink / raw)
  To: Janne Grunau
  Cc: Rob Herring, Alyssa Rosenzweig, Sven Peter, Dmitry Osipenko,
	Joerg Roedel, Will Deacon, Robin Murphy, Nicolin Chen,
	Krishna Reddy, devicetree, Linux IOMMU, linux-tegra, dri-devel

On Sun, Feb 06, 2022 at 11:27:00PM +0100, Janne Grunau wrote:
> On 2021-09-15 17:19:39 +0200, Thierry Reding wrote:
> > On Tue, Sep 07, 2021 at 07:44:44PM +0200, Thierry Reding wrote:
> > > On Tue, Sep 07, 2021 at 10:33:24AM -0500, Rob Herring wrote:
> > > > On Fri, Sep 3, 2021 at 10:36 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > > >
> > > > > On Fri, Sep 03, 2021 at 09:36:33AM -0500, Rob Herring wrote:
> > > > > > On Fri, Sep 3, 2021 at 8:52 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > > > > >
> > > > > > > On Fri, Sep 03, 2021 at 08:20:55AM -0500, Rob Herring wrote:
> > > > > > > >
> > > > > > > > Couldn't we keep this all in /reserved-memory? Just add an iova
> > > > > > > > version of reg. Perhaps abuse 'assigned-address' for this purpose. The
> > > > > > > > issue I see would be handling reserved iova areas without a physical
> > > > > > > > area. That can be handled with just a iova and no reg. We already have
> > > > > > > > a no reg case.
> > > > > > >
> > > > > > > I had thought about that initially. One thing I'm worried about is that
> > > > > > > every child node in /reserved-memory will effectively cause the memory
> > > > > > > that it described to be reserved. But we don't want that for regions
> > > > > > > that are "virtual only" (i.e. IOMMU reservations).
> > > > > >
> > > > > > By virtual only, you mean no physical mapping, just a region of
> > > > > > virtual space, right? For that we'd have no 'reg' and therefore no
> > > > > > (physical) reservation by the OS. It's similar to non-static regions.
> > > > > > You need a specific handler for them. We'd probably want a compatible
> > > > > > as well for these virtual reservations.
> > > > >
> > > > > Yeah, these would be purely used for reserving regions in the IOVA so
> > > > > that they won't be used by the IOVA allocator. Typically these would be
> > > > > used for cases where those addresses have some special meaning.
> > > > >
> > > > > Do we want something like:
> > > > >
> > > > >         compatible = "iommu-reserved";
> > > > >
> > > > > for these? Or would that need to be:
> > > > >
> > > > >         compatible = "linux,iommu-reserved";
> > > > >
> > > > > ? There seems to be a mix of vendor-prefix vs. non-vendor-prefix
> > > > > compatible strings in the reserved-memory DT bindings directory.
> > > > 
> > > > I would not use 'linux,' here.
> > > > 
> > > > >
> > > > > On the other hand, do we actually need the compatible string? Because we
> > > > > don't really want to associate much extra information with this like we
> > > > > do for example with "shared-dma-pool". The logic to handle this would
> > > > > all be within the IOMMU framework. All we really need is for the
> > > > > standard reservation code to skip nodes that don't have a reg property
> > > > > so we don't reserve memory for "virtual-only" allocations.
> > > > 
> > > > It doesn't hurt to have one and I can imagine we might want to iterate
> > > > over all the nodes. It's slightly easier and more common to iterate
> > > > over compatible nodes rather than nodes with some property.
> > > > 
> > > > > > Are these being global in DT going to be a problem? Presumably we have
> > > > > > a virtual space per IOMMU. We'd know which IOMMU based on a device's
> > > > > > 'iommus' and 'memory-region' properties, but within /reserved-memory
> > > > > > we wouldn't be able to distinguish overlapping addresses from separate
> > > > > > address spaces. Or we could have 2 different IOVAs for 1 physical
> > > > > > space. That could be solved with something like this:
> > > > > >
> > > > > > iommu-addresses = <&iommu1 <address cells> <size cells>>;
> > > > >
> > > > > The only case that would be problematic would be if we have overlapping
> > > > > physical regions, because that will probably trip up the standard code.
> > > > >
> > > > > But this could also be worked around by looking at iommu-addresses. For
> > > > > example, if we had something like this:
> > > > >
> > > > >         reserved-memory {
> > > > >                 fb_dc0: fb@80000000 {
> > > > >                         reg = <0x80000000 0x01000000>;
> > > > >                         iommu-addresses = <0xa0000000 0x01000000>;
> > > > >                 };
> > > > >
> > > > >                 fb_dc1: fb@80000000 {
> > > > 
> > > > You can't have 2 nodes with the same name (actually, you can, they
> > > > just get merged together). Different names with the same unit-address
> > > > is a dtc warning. I'd really like to make that a full blown
> > > > overlapping region check.
> > > 
> > > Right... so this would be a lot easier to deal with using that earlier
> > > proposal where the IOMMU regions were a separate thing and referencing
> > > the reserved-memory nodes. In those cases we could just have the
> > > physical reservation for the framebuffer once (so we don't get any
> > > duplicates or overlaps) and then have each IOVA reservation reference
> > > that to create the mapping.
> > > 
> > > > 
> > > > >                         reg = <0x80000000 0x01000000>;
> > > > >                         iommu-addresses = <0xb0000000 0x01000000>;
> > > > >                 };
> > > > >         };
> > > > >
> > > > > We could make the code identify that this is for the same physical
> > > > > reservation (maybe make it so that reg needs to match exactly for this
> > > > > to be recognized) but with different virtual allocations.
> > > > >
> > > > > On a side-note: do we really need to repeat the size? I'd think if we
> > > > > want mappings then we'd likely want them for the whole reservation.
> > > > 
> > > > Humm, I suppose not, but dropping it paints us into a corner if we
> > > > come up with wanting a different size later. You could have a carveout
> > > > for double/triple buffering your framebuffer, but the bootloader
> > > > framebuffer is only single buffered. So would you want actual size?
> > > 
> > > Perhaps this needs to be a bit more verbose then. If we want the ability
> > > to create a mapping for only a partial reservation, I could imagine we
> > > may as well want one that doesn't start at the beginning. So perhaps an
> > > > even better solution would be to have a complete mapping, something that
> > > works similar to "ranges" perhaps, like so:
> > > 
> > > 	fb@80000000 {
> > > 		reg = <0x80000000 0x01000000>;
> > > 		iommu-ranges = <0x80000000 0x01000000 0x80000000>;
> > > 	};
> > > 
> > > That would be for a full identity mapping, but we could also have
> > > something along the lines of this:
> > > 
> > > 	fb@80000000 {
> > > 		reg = <0x80000000 0x01000000>;
> > > 		iommu-ranges = <0x80100000 0x00100000 0xa0000000>;
> > > 	};
> > > 
> > > So that would only map a 1 MiB chunk at offset 1 MiB (of the physical
> > > reservation) to I/O virtual address 0xa0000000.
> > > 
> > > > > I'd like to keep references to IOMMUs out of this because they would be
> > > > > duplicated. We will only use these nodes if they are referenced by a
> > > > > device node that also has an iommus property. Also, the IOMMU reference
> > > > > itself isn't enough. We'd also need to support the complete specifier
> > > > > because you can have things like SIDs in there to specify the exact
> > > > > address space that a device uses.
> > > > >
> > > > > Also, for some of these they may be reused independently of the IOMMU
> > > > > address space. For example the Tegra framebuffer identity mapping can
> > > > > be used by either of the 2-4 display controllers, each with (at least
> > > > > potentially) their own address space. But we don't want to have to
> > > > > describe the identity mapping separately for each display controller.
> > > > 
> > > > Okay, but I'd rather have to duplicate things in your case than not be
> > > > able to express some other case.
> > > 
> > > The earlier "separate iov-reserved-memory" proposal would be a good
> > > compromise here. It'd allow us to duplicate only the necessary bits
> > > (i.e. the IOVA mappings) but keep the common bits simple. And even
> > > the IOVA mappings could be shared for cases like identity mappings.
> > > See below for more on that.
> > > 
> > > > > Another thing to consider is that these nodes will often be added by
> > > > > firmware (e.g. firmware will allocate the framebuffer and set up the
> > > > > corresponding reserved memory region in DT). Wiring up references like
> > > > > this would get very complicated very quickly.
> > > > 
> > > > Yes.
> > > > 
> > > > > The option of using the 'iommus' property below can be optional and doesn't
> > > > have to be defined/supported now. Just trying to think ahead and not
> > > > be stuck with something that can't be extended.
> > > 
> > > One other benefit of the separate iov-reserved-memory node would be that
> > > the iommus property could be simplified. If we have a physical
> > > reservation that needs to be accessed by multiple different display
> > > controllers, we'd end up with something fairly complex, such as this:
> > > 
> > > 	fb: fb@80000000 {
> > > 		reg = <0x80000000 0x01000000>;
> > > 		iommus = <&dc0_iommu 0xa0000000 0x01000000>,
> > > 			 <&dc1_iommu 0xb0000000 0x01000000>,
> > > 			 <&dc2_iommu 0xc0000000 0x01000000>;
> > > 	};
> > > 
> > > This would get even worse if we want to support partial mappings. Also,
> > > it'd become quite complicated to correlate this with the memory-region
> > > references:
> > > 
> > > 	dc0: dc@40000000 {
> > > 		...
> > > 		memory-region = <&fb>;
> > > 		iommus = <&dc0_iommu>;
> > > 		...
> > > 	};
> > > 
> > > So now you have to go match up the phandle (and potentially specifier)
> > > > in the iommus property of the dc0 node with an entry in the fb node's
> > > iommus property. That's all fairly complicated stuff.
> > > 
> > > With separate iov-reserved-memory, this would be a bit more verbose, but
> > > each individual node would be simpler:
> > > 
> > > 	reserved-memory {
> > > 		fb: fb@80000000 {
> > > 			reg = <0x80000000 0x01000000>;
> > > 		};
> > > 	};
> > > 
> > > 	iov-reserved-memory {
> > > 		fb0: fb@80000000 {
> > > 			/* identity mapping, "reg" optional? */
> > > 			reg = <0x80000000 0x01000000>;
> > > 			memory-region = <&fb>;
> > > 		};
> > > 
> > > 		fb1: fb@90000000 {
> > > 			/* but doesn't have to be */
> > > 			reg = <0x90000000 0x01000000>;
> > > 			memory-region = <&fb>;
> > > 		};
> > > 
> > > 		fb2: fb@a0000000 {
> > > 			/* can be partial, too */
> > > 			ranges = <0x80000000 0x00800000 0xa0000000>;
> > > 			memory-region = <&fb>;
> > > 		};
> > > 	}
> > > 
> > > 	dc0: dc@40000000 {
> > > 		iov-memory-regions = <&fb0>;
> > > 		/* optional? */
> > > 		memory-region = <&fb>;
> > > 		iommus = <&dc0_iommu>;
> > > 	};
> > > 
> > > Alternatively, if we want to support partial mappings, we could replace
> > > those reg properties by ranges properties that I showed earlier. We may
> > > even want to support both. Use "reg" for virtual-only reservations and
> > > identity mappings, or "simple partial mappings" (that map a sub-region
> > > starting from the beginning). Identity mappings could still be
> > > simplified by just omitting the "reg" property. For more complicated
> > > mappings, such as the ones on M1, the "ranges" property could be used.
> > > 
> > > Note how this looks a bit boilerplate-y, but it's actually really quite
> > > simple to understand, even for humans, I think.
> > > 
> > > Also, the phandles in this are comparatively easy to wire up because
> > > they can all be generated in a hierarchical way: generate physical
> > > reservation and store phandle, then generate I/O virtual reservation
> > > to reference that phandle and store the new phandle as well. Finally,
> > > wire this up to the display controller (using either the IOV phandle or
> > > both).
> > > 
> > > Granted, this requires the addition of a new top-level node, but given
> > > how expressive this becomes, I think it might be worth a second
> > > consideration.
> > 
> > I guess as a middle-ground between your suggestion and mine, we could
> > also move the IOV nodes back into reserved-memory. We'd just have to make
> > sure the names (together with unit-addresses) are unique, to support cases
> > where we want to identity map, or have multiple mappings at the same
> > address. So it'd look something like this:
> > 
> > 	reserved-memory {
> > 		fb: fb@80000000 {
> > 			reg = <0x80000000 0x01000000>;
> > 		};
> > 
> > 		audio-firmware@ff000000 {
> > 			/* perhaps add "iommu-reserved" for this case */
> > 			compatible = "iommu-mapping";
> > 			/*
> > 			 * no memory-region referencing a physical
> > 			 * reservation, indicates that this is an
> > 			 * IOMMU reservation, rather than a mapping
> > 			 */
> > 			reg = <0xff000000 0x01000000>;
> > 		};
> > 
> > 		fb0: fb-mapping@80000000 {
> > 			compatible = "iommu-mapping";
> > 			/* identity mapping, "reg" optional? */
> > 			reg = <0x80000000 0x01000000>;
> > 			memory-region = <&fb>;
> > 		};
> > 
> > 		fb1: fb-mapping@90000000 {
> > 			compatible = "iommu-mapping";
> > 			/* but doesn't have to be */
> > 			reg = <0x90000000 0x01000000>;
> > 			memory-region = <&fb>;
> > 		};
> > 
> > 		fb2: fb-mapping@a0000000 {
> > 			compatible = "iommu-mapping";
> > 			/* can be partial, too */
> > 			ranges = <0xa0000000 0x00800000 0x80000000>;
> > 			memory-region = <&fb>;
> > 		};
> > 	};
> > 
> > 	dc0: dc@40000000 {
> > 		memory-region = <&fb0>;
> > 		iommus = <&dc0_iommu>;
> > 	};
> > 
> > What do you think?
> 
> I converted the Apple M1 display controller driver to use reserved
> regions with these bindings. They are sufficient for the needs of the M1
> display controller, which is so far the only device requiring this.

Thanks for trying this out. I've been meaning to resume this discussion
to finally get closure because we really want to enable this for various
Tegra SoCs.

> I encountered two problems with this bindings proposal:
> 
> 1) It is impossible to express which iommu needs to be used if a device
> has multiple "iommus" specified. On the M1 this is only a theoretical
> problem as the display co-processor devices use a single iommu.

From what I recall this is something that we don't fully support either
way. If you've got a struct device and you want to allocate DMA'able
memory, you can only pass that struct device to the DMA API upon
allocation but you have no way of specifying separate instances
depending on use-case.

> 2) The reserved regions cannot easily be looked up at iommu probe time.
> The Apple M1 iommu driver resets the iommu at probe. This breaks the
> framebuffer. The display controller appears to crash when an active
> scan-out framebuffer is unmapped. Resetting the iommu looks like a
> sensible approach though.
>
> To work around this I added a custom property to the affected iommu node
> to avoid the reset. This doesn't feel correct since the reason to avoid 
> the reset is that we have to maintain the reserved regions mapping until 
> the display controller driver takes over.
> As far as I can see the only method to retrieve devices with reserved 
> memory from the iommu is to iterate over all devices. This looks 
> impractical. The M1 has over 20 distinct iommus.

Do I understand correctly that on the M1, the firmware sets up a mapping
in the IOMMU already and then you want to recreate that mapping after
the IOMMU driver has reset the IOMMU?

In that case, how do you make sure that you atomically transition from
the firmware mapping to the kernel mapping? As soon as you reset the
IOMMU, the display controller will cause IOMMU faults because it's now
scanning out from an unmapped buffer, right?

So that approach of avoiding the reset doesn't seem wrong to me.
Obviously that's not altogether trivial to do either. Typically the
IOMMU mappings would be contained in system memory, so you'd have to
reserve those via reserved-memory nodes as well, etc.
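
To illustrate the point about page tables, here is a minimal sketch of
what such a reservation could look like. The iommu-pagetables node, its
label, and its address and size are made up for this example and are not
part of any proposed binding; only the fb node is taken from the examples
in this thread.

```dts
/dts-v1/;

/ {
	#address-cells = <1>;
	#size-cells = <1>;

	reserved-memory {
		#address-cells = <1>;
		#size-cells = <1>;
		ranges;

		/* framebuffer set up by the bootloader (from the thread) */
		fb: fb@80000000 {
			reg = <0x80000000 0x01000000>;
		};

		/*
		 * hypothetical: page tables that firmware built for the
		 * IOMMU, kept out of the kernel's allocator so the live
		 * mapping survives until a driver takes over
		 */
		iommu_pt: iommu-pagetables@81000000 {
			reg = <0x81000000 0x00100000>;
		};
	};
};
```

Whether such a node would also need a compatible string or a reference
from the IOMMU node is left open here.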

> One way to avoid both problems would be to move the mappings to the 
> iommu node as sub nodes. The device would then reference those.  This 
> way the mapping is readily available at iommu probe time and adding 
> iommu type specific parameters to map the region correctly is possible.
> 
> The sample above would transform to:
> 
> 	reserved-memory {
> 		fb: fb@80000000 {
> 			reg = <0x80000000 0x01000000>;
> 		};
> 	};
> 
> 	dc0_iommu: iommu@20000000 {
> 		#iommu-cells = <1>;
> 
> 		fb0: fb-mapping@80000000 {
> 			compatible = "iommu-mapping";
> 			/* identity mapping, "reg" optional? */
> 			reg = <0x80000000 0x01000000>;
> 			memory-region = <&fb>;
> 			device-id = <0>; /* for #iommu-cells*/
> 		};
> 
> 		fb1: fb-mapping@90000000 {
> 			compatible = "iommu-mapping";
> 			/* but doesn't have to be */
> 			reg = <0x90000000 0x01000000>;
> 			memory-region = <&fb>;
> 			device-id = <1>; /* for #iommu-cells*/
> 		};
> 	};
> 
> 	dc0: dc@40000000 {
> 		iommu-region = <&fb0>;
> 		iommus = <&dc0_iommu 0>;
> 	};
> 
> Does anyone see problems with this approach or can think of something 
> better?

The device tree description of this looks a bit weird because it
sprinkles things all around. For instance now we've got the "stream ID"
(i.e. what you seem to be referring to as "device-id") in two places,
once in the iommus property of the DC node and once in the mapping.

Would it work if you added back-references to the devices that are
active on boot to the IOMMU node? Something along these lines:

	reserved-memory {
		fb: fb@80000000 {
			reg = <0x80000000 0x01000000>;
		};
	};

	dc0_iommu: iommu@20000000 {
		#iommu-cells = <1>;

		mapped-devices = <&dc0>;
	};

	dc0: dc@40000000 {
		memory-region = <&fb0>;
		iommus = <&dc0_iommu 0>;
	};

Depending on how you look at it that's a circular dependency, but it
won't be in practice. It makes things a bit more compact and puts the
data where it belongs.
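
At the source level, at least, the mutual references are unproblematic:
dtc resolves labels regardless of the order in which nodes appear, so a
fragment like the following should compile as written. This is only a
sketch. "mapped-devices" is the property suggested above, not an existing
binding, the reg sizes of the IOMMU and DC nodes are placeholders, and it
assumes that dc0 references the physical reservation directly via &fb.

```dts
/dts-v1/;

/ {
	#address-cells = <1>;
	#size-cells = <1>;

	reserved-memory {
		#address-cells = <1>;
		#size-cells = <1>;
		ranges;

		fb: fb@80000000 {
			reg = <0x80000000 0x01000000>;
		};
	};

	dc0_iommu: iommu@20000000 {
		reg = <0x20000000 0x1000>;	/* placeholder size */
		#iommu-cells = <1>;

		/* proposed back-reference, refers to a node defined later */
		mapped-devices = <&dc0>;
	};

	dc0: dc@40000000 {
		reg = <0x40000000 0x1000>;	/* placeholder size */
		memory-region = <&fb>;
		iommus = <&dc0_iommu 0>;
	};
};
```

Any ordering concerns would therefore be on the driver probe side rather
than in the device tree itself.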

Thierry


* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2022-02-09 16:31                               ` Thierry Reding
@ 2022-02-10 23:15                                 ` Janne Grunau
  2022-03-31 16:25                                   ` Thierry Reding
  0 siblings, 1 reply; 41+ messages in thread
From: Janne Grunau @ 2022-02-10 23:15 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Rob Herring, Alyssa Rosenzweig, Sven Peter, Dmitry Osipenko,
	Joerg Roedel, Will Deacon, Robin Murphy, Nicolin Chen,
	Krishna Reddy, devicetree, Linux IOMMU, linux-tegra, dri-devel

On 2022-02-09 17:31:16 +0100, Thierry Reding wrote:
> On Sun, Feb 06, 2022 at 11:27:00PM +0100, Janne Grunau wrote:
> > On 2021-09-15 17:19:39 +0200, Thierry Reding wrote:
> > > On Tue, Sep 07, 2021 at 07:44:44PM +0200, Thierry Reding wrote:
> > > > On Tue, Sep 07, 2021 at 10:33:24AM -0500, Rob Herring wrote:
> > > > > On Fri, Sep 3, 2021 at 10:36 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Sep 03, 2021 at 09:36:33AM -0500, Rob Herring wrote:
> > > > > > > On Fri, Sep 3, 2021 at 8:52 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Fri, Sep 03, 2021 at 08:20:55AM -0500, Rob Herring wrote:
> > > > > > > > >
> > > > > > > > > Couldn't we keep this all in /reserved-memory? Just add an iova
> > > > > > > > > version of reg. Perhaps abuse 'assigned-address' for this purpose. The
> > > > > > > > > issue I see would be handling reserved iova areas without a physical
> > > > > > > > > area. That can be handled with just a iova and no reg. We already have
> > > > > > > > > a no reg case.
> > > > > > > >
> > > > > > > > I had thought about that initially. One thing I'm worried about is that
> > > > > > > > every child node in /reserved-memory will effectively cause the memory
> > > > > > > > that it described to be reserved. But we don't want that for regions
> > > > > > > > that are "virtual only" (i.e. IOMMU reservations).
> > > > > > >
> > > > > > > By virtual only, you mean no physical mapping, just a region of
> > > > > > > virtual space, right? For that we'd have no 'reg' and therefore no
> > > > > > > (physical) reservation by the OS. It's similar to non-static regions.
> > > > > > > You need a specific handler for them. We'd probably want a compatible
> > > > > > > as well for these virtual reservations.
> > > > > >
> > > > > > Yeah, these would be purely used for reserving regions in the IOVA so
> > > > > > that they won't be used by the IOVA allocator. Typically these would be
> > > > > > used for cases where those addresses have some special meaning.
> > > > > >
> > > > > > Do we want something like:
> > > > > >
> > > > > >         compatible = "iommu-reserved";
> > > > > >
> > > > > > for these? Or would that need to be:
> > > > > >
> > > > > >         compatible = "linux,iommu-reserved";
> > > > > >
> > > > > > ? There seems to be a mix of vendor-prefix vs. non-vendor-prefix
> > > > > > compatible strings in the reserved-memory DT bindings directory.
> > > > > 
> > > > > I would not use 'linux,' here.
> > > > > 
> > > > > >
> > > > > > On the other hand, do we actually need the compatible string? Because we
> > > > > > don't really want to associate much extra information with this like we
> > > > > > do for example with "shared-dma-pool". The logic to handle this would
> > > > > > all be within the IOMMU framework. All we really need is for the
> > > > > > standard reservation code to skip nodes that don't have a reg property
> > > > > > so we don't reserve memory for "virtual-only" allocations.
> > > > > 
> > > > > It doesn't hurt to have one and I can imagine we might want to iterate
> > > > > over all the nodes. It's slightly easier and more common to iterate
> > > > > over compatible nodes rather than nodes with some property.
> > > > > 
> > > > > > > Are these being global in DT going to be a problem? Presumably we have
> > > > > > > a virtual space per IOMMU. We'd know which IOMMU based on a device's
> > > > > > > 'iommus' and 'memory-region' properties, but within /reserved-memory
> > > > > > > we wouldn't be able to distinguish overlapping addresses from separate
> > > > > > > address spaces. Or we could have 2 different IOVAs for 1 physical
> > > > > > > space. That could be solved with something like this:
> > > > > > >
> > > > > > > iommu-addresses = <&iommu1 <address cells> <size cells>>;
> > > > > >
> > > > > > The only case that would be problematic would be if we have overlapping
> > > > > > physical regions, because that will probably trip up the standard code.
> > > > > >
> > > > > > But this could also be worked around by looking at iommu-addresses. For
> > > > > > example, if we had something like this:
> > > > > >
> > > > > >         reserved-memory {
> > > > > >                 fb_dc0: fb@80000000 {
> > > > > >                         reg = <0x80000000 0x01000000>;
> > > > > >                         iommu-addresses = <0xa0000000 0x01000000>;
> > > > > >                 };
> > > > > >
> > > > > >                 fb_dc1: fb@80000000 {
> > > > > 
> > > > > You can't have 2 nodes with the same name (actually, you can, they
> > > > > just get merged together). Different names with the same unit-address
> > > > > is a dtc warning. I'd really like to make that a full blown
> > > > > overlapping region check.
> > > > 
> > > > Right... so this would be a lot easier to deal with using that earlier
> > > > proposal where the IOMMU regions were a separate thing and referencing
> > > > the reserved-memory nodes. In those cases we could just have the
> > > > physical reservation for the framebuffer once (so we don't get any
> > > > duplicates or overlaps) and then have each IOVA reservation reference
> > > > that to create the mapping.
> > > > 
> > > > > 
> > > > > >                         reg = <0x80000000 0x01000000>;
> > > > > >                         iommu-addresses = <0xb0000000 0x01000000>;
> > > > > >                 };
> > > > > >         };
> > > > > >
> > > > > > We could make the code identify that this is for the same physical
> > > > > > reservation (maybe make it so that reg needs to match exactly for this
> > > > > > to be recognized) but with different virtual allocations.
> > > > > >
> > > > > > On a side-note: do we really need to repeat the size? I'd think if we
> > > > > > want mappings then we'd likely want them for the whole reservation.
> > > > > 
> > > > > Humm, I suppose not, but dropping it paints us into a corner if we
> > > > > come up with wanting a different size later. You could have a carveout
> > > > > for double/triple buffering your framebuffer, but the bootloader
> > > > > framebuffer is only single buffered. So would you want actual size?
> > > > 
> > > > Perhaps this needs to be a bit more verbose then. If we want the ability
> > > > to create a mapping for only a partial reservation, I could imagine we
> > > > may as well want one that doesn't start at the beginning. So perhaps an
> > > > > even better solution would be to have a complete mapping, something that
> > > > works similar to "ranges" perhaps, like so:
> > > > 
> > > > 	fb@80000000 {
> > > > 		reg = <0x80000000 0x01000000>;
> > > > 		iommu-ranges = <0x80000000 0x01000000 0x80000000>;
> > > > 	};
> > > > 
> > > > That would be for a full identity mapping, but we could also have
> > > > something along the lines of this:
> > > > 
> > > > 	fb@80000000 {
> > > > 		reg = <0x80000000 0x01000000>;
> > > > 		iommu-ranges = <0x80100000 0x00100000 0xa0000000>;
> > > > 	};
> > > > 
> > > > So that would only map a 1 MiB chunk at offset 1 MiB (of the physical
> > > > reservation) to I/O virtual address 0xa0000000.
> > > > 
> > > > > > I'd like to keep references to IOMMUs out of this because they would be
> > > > > > duplicated. We will only use these nodes if they are referenced by a
> > > > > > device node that also has an iommus property. Also, the IOMMU reference
> > > > > > itself isn't enough. We'd also need to support the complete specifier
> > > > > > because you can have things like SIDs in there to specify the exact
> > > > > > address space that a device uses.
> > > > > >
> > > > > > Also, for some of these they may be reused independently of the IOMMU
> > > > > > address space. For example the Tegra framebuffer identity mapping can
> > > > > > be used by either of the 2-4 display controllers, each with (at least
> > > > > > potentially) their own address space. But we don't want to have to
> > > > > > describe the identity mapping separately for each display controller.
> > > > > 
> > > > > Okay, but I'd rather have to duplicate things in your case than not be
> > > > > able to express some other case.
> > > > 
> > > > The earlier "separate iov-reserved-memory" proposal would be a good
> > > > compromise here. It'd allow us to duplicate only the necessary bits
> > > > (i.e. the IOVA mappings) but keep the common bits simple. And even
> > > > the IOVA mappings could be shared for cases like identity mappings.
> > > > See below for more on that.
> > > > 
> > > > > > Another thing to consider is that these nodes will often be added by
> > > > > > firmware (e.g. firmware will allocate the framebuffer and set up the
> > > > > > corresponding reserved memory region in DT). Wiring up references like
> > > > > > this would get very complicated very quickly.
> > > > > 
> > > > > Yes.
> > > > > 
> > > > > > The option of using the 'iommus' property below can be optional and doesn't
> > > > > have to be defined/supported now. Just trying to think ahead and not
> > > > > be stuck with something that can't be extended.
> > > > 
> > > > One other benefit of the separate iov-reserved-memory node would be that
> > > > the iommus property could be simplified. If we have a physical
> > > > reservation that needs to be accessed by multiple different display
> > > > controllers, we'd end up with something fairly complex, such as this:
> > > > 
> > > > 	fb: fb@80000000 {
> > > > 		reg = <0x80000000 0x01000000>;
> > > > 		iommus = <&dc0_iommu 0xa0000000 0x01000000>,
> > > > 			 <&dc1_iommu 0xb0000000 0x01000000>,
> > > > 			 <&dc2_iommu 0xc0000000 0x01000000>;
> > > > 	};
> > > > 
> > > > This would get even worse if we want to support partial mappings. Also,
> > > > it'd become quite complicated to correlate this with the memory-region
> > > > references:
> > > > 
> > > > 	dc0: dc@40000000 {
> > > > 		...
> > > > 		memory-region = <&fb>;
> > > > 		iommus = <&dc0_iommu>;
> > > > 		...
> > > > 	};
> > > > 
> > > > So now you have to go match up the phandle (and potentially specifier)
> > > > > in the iommus property of the dc0 node with an entry in the fb node's
> > > > iommus property. That's all fairly complicated stuff.
> > > > 
> > > > With separate iov-reserved-memory, this would be a bit more verbose, but
> > > > each individual node would be simpler:
> > > > 
> > > > 	reserved-memory {
> > > > 		fb: fb@80000000 {
> > > > 			reg = <0x80000000 0x01000000>;
> > > > 		};
> > > > 	};
> > > > 
> > > > 	iov-reserved-memory {
> > > > 		fb0: fb@80000000 {
> > > > 			/* identity mapping, "reg" optional? */
> > > > 			reg = <0x80000000 0x01000000>;
> > > > 			memory-region = <&fb>;
> > > > 		};
> > > > 
> > > > 		fb1: fb@90000000 {
> > > > 			/* but doesn't have to be */
> > > > 			reg = <0x90000000 0x01000000>;
> > > > 			memory-region = <&fb>;
> > > > 		};
> > > > 
> > > > 		fb2: fb@a0000000 {
> > > > 			/* can be partial, too */
> > > > 			ranges = <0x80000000 0x00800000 0xa0000000>;
> > > > 			memory-region = <&fb>;
> > > > 		};
> > > > 	}
> > > > 
> > > > 	dc0: dc@40000000 {
> > > > 		iov-memory-regions = <&fb0>;
> > > > 		/* optional? */
> > > > 		memory-region = <&fb>;
> > > > 		iommus = <&dc0_iommu>;
> > > > 	};
> > > > 
> > > > Alternatively, if we want to support partial mappings, we could replace
> > > > those reg properties by ranges properties that I showed earlier. We may
> > > > even want to support both. Use "reg" for virtual-only reservations and
> > > > identity mappings, or "simple partial mappings" (that map a sub-region
> > > > starting from the beginning). Identity mappings could still be
> > > > simplified by just omitting the "reg" property. For more complicated
> > > > mappings, such as the ones on M1, the "ranges" property could be used.
> > > > 
> > > > Note how this looks a bit boilerplate-y, but it's actually really quite
> > > > simple to understand, even for humans, I think.
> > > > 
> > > > Also, the phandles in this are comparatively easy to wire up because
> > > > they can all be generated in a hierarchical way: generate physical
> > > > reservation and store phandle, then generate I/O virtual reservation
> > > > to reference that phandle and store the new phandle as well. Finally,
> > > > wire this up to the display controller (using either the IOV phandle or
> > > > both).
> > > > 
> > > > Granted, this requires the addition of a new top-level node, but given
> > > > how expressive this becomes, I think it might be worth a second
> > > > consideration.
> > > 
> > > I guess as a middle-ground between your suggestion and mine, we could
> > > also move the IOV nodes back into reserved-memory. We'd just have to make
> > > sure the names (together with unit-addresses) are unique, to support cases
> > > where we want to identity map, or have multiple mappings at the same
> > > address. So it'd look something like this:
> > > 
> > > 	reserved-memory {
> > > 		fb: fb@80000000 {
> > > 			reg = <0x80000000 0x01000000>;
> > > 		};
> > > 
> > > 		audio-firmware@ff000000 {
> > > 			/* perhaps add "iommu-reserved" for this case */
> > > 			compatible = "iommu-mapping";
> > > 			/*
> > > 			 * no memory-region referencing a physical
> > > 			 * reservation, indicates that this is an
> > > 			 * IOMMU reservation, rather than a mapping
> > > 			 */
> > > 			reg = <0xff000000 0x01000000>;
> > > 		};
> > > 
> > > 		fb0: fb-mapping@80000000 {
> > > 			compatible = "iommu-mapping";
> > > 			/* identity mapping, "reg" optional? */
> > > 			reg = <0x80000000 0x01000000>;
> > > 			memory-region = <&fb>;
> > > 		};
> > > 
> > > 		fb1: fb-mapping@90000000 {
> > > 			compatible = "iommu-mapping";
> > > 			/* but doesn't have to be */
> > > 			reg = <0x90000000 0x01000000>;
> > > 			memory-region = <&fb>;
> > > 		};
> > > 
> > > 		fb2: fb-mapping@a0000000 {
> > > 			compatible = "iommu-mapping";
> > > 			/* can be partial, too */
> > > 			ranges = <0xa0000000 0x00800000 0x80000000>;
> > > 			memory-region = <&fb>;
> > > 		};
> > > 	};
> > > 
> > > 	dc0: dc@40000000 {
> > > 		memory-region = <&fb0>;
> > > 		iommus = <&dc0_iommu>;
> > > 	};
> > > 
> > > What do you think?
> > 
> > I converted the Apple M1 display controller driver to use reserved
> > regions with these bindings. They are sufficient for the needs of the M1
> > display controller, which is so far the only device requiring this.
> 
> Thanks for trying this out. I've been meaning to resume this discussion
> to finally get closure because we really want to enable this for various
> Tegra SoCs.
> 
> > I encountered two problems with this bindings proposal:
> > 
> > 1) It is impossible to express which iommu needs to be used if a device
> > has multiple "iommus" specified. On the M1 this is only a theoretical
> > problem as the display co-processor devices use a single iommu.
> 
> From what I recall this is something that we don't fully support either
> way. If you've got a struct device and you want to allocate DMA'able
> memory, you can only pass that struct device to the DMA API upon
> allocation but you have no way of specifying separate instances
> depending on use-case.

OK, let's ignore my complicated proposal then. It is not a problem we
need to solve for the M1.

> > 2) The reserved regions cannot easily be looked up at iommu probe
> > time.  The Apple M1 iommu driver resets the iommu at probe. This
> > breaks the framebuffer. The display controller appears to crash when
> > an active scan-out framebuffer is unmapped. Resetting the iommu
> > looks like a sensible approach though.
> >
> > To work around this I added a custom property to the affected iommu node
> > to avoid the reset. This doesn't feel correct since the reason to avoid 
> > the reset is that we have to maintain the reserved regions mapping until 
> > the display controller driver takes over.
> > As far as I can see the only method to retrieve devices with reserved 
> > memory from the iommu is to iterate over all devices. This looks 
> > impractical. The M1 has over 20 distinct iommus.
> 
> Do I understand correctly that on the M1, the firmware sets up a mapping
> in the IOMMU already and then you want to recreate that mapping after
> the IOMMU driver has reset the IOMMU?

The mappings are already set up by firmware, since it already uses the
framebuffer itself. We need to make the kernel aware of the existing
mapping so it can use the IOMMU. Using reserved memory regions and
mappings seems to be a clean way to do this. We want to reset IOMMUs
without pre-existing mappings (the M1 has over 20 IOMMUs). We need a way
to identify the two IOMMUs which must not be reset at driver probe time.
A simple property in the IOMMU node would be enough. It would duplicate
information though, since the only reason why we can't reset the IOMMU is
the pre-existing mapping.

> In that case, how do you make sure that you atomically transition from
> the firmware mapping to the kernel mapping? As soon as you reset the
> IOMMU, the display controller will cause IOMMU faults because it's now
> scanning out from an unmapped buffer, right?

We are replacing the entire firmware-managed page table with a
kernel-managed one with a TTBR MMIO register write. Unfortunately, the
second IOMMU with a pre-existing mapping has its TTBR locked. Dealing with
this is more complicated, but the device using this IOMMU appears to
sleep.

> So that approach of avoiding the reset doesn't seem wrong to me.
> Obviously that's not altogether trivial to do either. Typically the
> IOMMU mappings would be contained in system memory, so you'd have to
> reserve those via reserved-memory nodes as well, etc.

That system memory is currently not expressed as reserved-memory; it
simply lies outside of the specified memory.
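
As a sketch of what "outside of the specified memory" means here: the
memory node simply stops short of the range that firmware keeps for
itself, so the kernel never treats it as RAM. All addresses and sizes
below are invented for illustration and do not correspond to a real M1
memory map.

```dts
/dts-v1/;

/ {
	#address-cells = <2>;
	#size-cells = <2>;

	/*
	 * Hypothetical layout: 4 GiB of DRAM starts at 0x8_0000_0000, but
	 * only the first 0xf000_0000 bytes are handed to the kernel.
	 */
	memory@800000000 {
		device_type = "memory";
		reg = <0x8 0x00000000 0x0 0xf0000000>;
	};

	/*
	 * The remaining 256 MiB at 0x8_f000_0000 (assumed here to hold the
	 * firmware-built page tables) are not described by any node.
	 */
};
```

The downside compared to a reserved-memory node is that nothing in the
device tree documents what the hidden range is used for.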
 
> > One way to avoid both problems would be to move the mappings to the 
> > iommu node as sub nodes. The device would then reference those.  
> > This way the mapping is readily available at iommu probe time and 
> > adding iommu type specific parameters to map the region correctly is 
> > possible.
> > 
> > The sample above would transform to:
> > 
> > 	reserved-memory {
> > 		fb: fb@80000000 {
> > 			reg = <0x80000000 0x01000000>;
> > 		};
> > 	};
> > 
> > 	dc0_iommu: iommu@20000000 {
> > 		#iommu-cells = <1>;
> > 
> > 		fb0: fb-mapping@80000000 {
> > 			compatible = "iommu-mapping";
> > 			/* identity mapping, "reg" optional? */
> > 			reg = <0x80000000 0x01000000>;
> > 			memory-region = <&fb>;
> > 			device-id = <0>; /* for #iommu-cells*/
> > 		};
> > 
> > 		fb1: fb-mapping@90000000 {
> > 			compatible = "iommu-mapping";
> > 			/* but doesn't have to be */
> > 			reg = <0x90000000 0x01000000>;
> > 			memory-region = <&fb>;
> > 			device-id = <1>; /* for #iommu-cells*/
> > 		};
> > 	};
> > 
> > 	dc0: dc@40000000 {
> > 		iommu-region = <&fb0>;
> > 		iommus = <&dc0_iommu 0>;
> > 	};
> > 
> > Does anyone see problems with this approach or can think of something 
> > better?
> 
> The device tree description of this looks a bit weird because it
> sprinkles things all around. For instance now we've got the "stream ID"
> (i.e. what you seem to be referring to as "device-id") in two places,
> once in the iommus property of the DC node and once in the mapping.

Yes, stream_id would be the device-id. It is the term used in the
apple-dart IOMMU driver. It is duplicated to deal with the multiple
IOMMU problem. Let's ignore that and scrap my proposal.
 
> Would it work if you added back-references to the devices that are
> active on boot to the IOMMU node? Something along these lines:
> 
> 	reserved-memory {
> 		fb: fb@80000000 {
> 			reg = <0x80000000 0x01000000>;
> 		};
> 	};
> 
> 	dc0_iommu: iommu@20000000 {
> 		#iommu-cells = <1>;
> 
> 		mapped-devices = <&dc0>;
> 	};
> 
> 	dc0: dc@40000000 {
> 		memory-region = <&fb>;
> 		iommus = <&dc0_iommu 0>;
> 	};
> 
> Depending on how you look at it that's a circular dependency, but it
> won't be in practice. It makes things a bit more compact and puts the
> data where it belongs.

Yes, this works for the Apple M1 display co-processor. I've changed the 
dts and my apple-dart private parsing code to use "mapped-devices" 
back-references and it works as before. We probably need an automated 
check to ensure the references between device and IOMMU remain 
consistent.
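
A rough illustration of what such a check could verify, written as a
minimal userspace sketch rather than kernel or dtc code. The structs are
a toy model of the nodes, and every name in it is hypothetical. The only
rule it encodes is that each device listed in an IOMMU's "mapped-devices"
must reference that same IOMMU from its own "iommus" property:

```c
#include <stdbool.h>
#include <stddef.h>

struct dev_node;

/* Toy model of an IOMMU node carrying a "mapped-devices" list. */
struct iommu_node {
	const struct dev_node *const *mapped_devices;
	size_t num_mapped;
};

/* Toy model of a consumer device carrying an "iommus" list. */
struct dev_node {
	const struct iommu_node *const *iommus;
	size_t num_iommus;
};

/* Does the device's "iommus" list reference the given IOMMU? */
static bool device_uses_iommu(const struct dev_node *dev,
			      const struct iommu_node *iommu)
{
	for (size_t i = 0; i < dev->num_iommus; i++)
		if (dev->iommus[i] == iommu)
			return true;

	return false;
}

/*
 * True if every device in "mapped-devices" points back at this IOMMU
 * through its own "iommus" property.
 */
static bool mapped_devices_consistent(const struct iommu_node *iommu)
{
	for (size_t i = 0; i < iommu->num_mapped; i++)
		if (!device_uses_iommu(iommu->mapped_devices[i], iommu))
			return false;

	return true;
}
```

The sketch compares node identity only. A real check would presumably
also have to look at the specifier cells following the IOMMU phandle.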

thanks
Janne

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2022-02-10 23:15                                 ` Janne Grunau
@ 2022-03-31 16:25                                   ` Thierry Reding
  2022-04-01 17:08                                     ` Janne Grunau
  0 siblings, 1 reply; 41+ messages in thread
From: Thierry Reding @ 2022-03-31 16:25 UTC (permalink / raw)
  To: Janne Grunau, Rob Herring
  Cc: Alyssa Rosenzweig, Sven Peter, Dmitry Osipenko, Joerg Roedel,
	Will Deacon, Robin Murphy, Nicolin Chen, Krishna Reddy,
	devicetree, Linux IOMMU, linux-tegra, dri-devel

On Fri, Feb 11, 2022 at 12:15:44AM +0100, Janne Grunau wrote:
> On 2022-02-09 17:31:16 +0100, Thierry Reding wrote:
> > On Sun, Feb 06, 2022 at 11:27:00PM +0100, Janne Grunau wrote:
> > > On 2021-09-15 17:19:39 +0200, Thierry Reding wrote:
> > > > On Tue, Sep 07, 2021 at 07:44:44PM +0200, Thierry Reding wrote:
> > > > > On Tue, Sep 07, 2021 at 10:33:24AM -0500, Rob Herring wrote:
> > > > > > On Fri, Sep 3, 2021 at 10:36 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > > > > >
> > > > > > > On Fri, Sep 03, 2021 at 09:36:33AM -0500, Rob Herring wrote:
> > > > > > > > On Fri, Sep 3, 2021 at 8:52 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On Fri, Sep 03, 2021 at 08:20:55AM -0500, Rob Herring wrote:
> > > > > > > > > >
> > > > > > > > > > Couldn't we keep this all in /reserved-memory? Just add an iova
> > > > > > > > > > version of reg. Perhaps abuse 'assigned-address' for this purpose. The
> > > > > > > > > > issue I see would be handling reserved iova areas without a physical
> > > > > > > > > > area. That can be handled with just a iova and no reg. We already have
> > > > > > > > > > a no reg case.
> > > > > > > > >
> > > > > > > > > I had thought about that initially. One thing I'm worried about is that
> > > > > > > > > every child node in /reserved-memory will effectively cause the memory
> > > > > > > > > that it described to be reserved. But we don't want that for regions
> > > > > > > > > that are "virtual only" (i.e. IOMMU reservations).
> > > > > > > >
> > > > > > > > By virtual only, you mean no physical mapping, just a region of
> > > > > > > > virtual space, right? For that we'd have no 'reg' and therefore no
> > > > > > > > (physical) reservation by the OS. It's similar to non-static regions.
> > > > > > > > You need a specific handler for them. We'd probably want a compatible
> > > > > > > > as well for these virtual reservations.
> > > > > > >
> > > > > > > Yeah, these would be purely used for reserving regions in the IOVA so
> > > > > > > that they won't be used by the IOVA allocator. Typically these would be
> > > > > > > used for cases where those addresses have some special meaning.
> > > > > > >
> > > > > > > Do we want something like:
> > > > > > >
> > > > > > >         compatible = "iommu-reserved";
> > > > > > >
> > > > > > > for these? Or would that need to be:
> > > > > > >
> > > > > > >         compatible = "linux,iommu-reserved";
> > > > > > >
> > > > > > > ? There seems to be a mix of vendor-prefix vs. non-vendor-prefix
> > > > > > > compatible strings in the reserved-memory DT bindings directory.
> > > > > > 
> > > > > > I would not use 'linux,' here.
> > > > > > 
> > > > > > >
> > > > > > > On the other hand, do we actually need the compatible string? Because we
> > > > > > > don't really want to associate much extra information with this like we
> > > > > > > do for example with "shared-dma-pool". The logic to handle this would
> > > > > > > all be within the IOMMU framework. All we really need is for the
> > > > > > > standard reservation code to skip nodes that don't have a reg property
> > > > > > > so we don't reserve memory for "virtual-only" allocations.
> > > > > > 
> > > > > > It doesn't hurt to have one and I can imagine we might want to iterate
> > > > > > over all the nodes. It's slightly easier and more common to iterate
> > > > > > over compatible nodes rather than nodes with some property.
> > > > > > 
> > > > > > > > Are these being global in DT going to be a problem? Presumably we have
> > > > > > > > a virtual space per IOMMU. We'd know which IOMMU based on a device's
> > > > > > > > 'iommus' and 'memory-region' properties, but within /reserved-memory
> > > > > > > > we wouldn't be able to distinguish overlapping addresses from separate
> > > > > > > > address spaces. Or we could have 2 different IOVAs for 1 physical
> > > > > > > > space. That could be solved with something like this:
> > > > > > > >
> > > > > > > > iommu-addresses = <&iommu1 <address cells> <size cells>>;
> > > > > > >
> > > > > > > The only case that would be problematic would be if we have overlapping
> > > > > > > physical regions, because that will probably trip up the standard code.
> > > > > > >
> > > > > > > But this could also be worked around by looking at iommu-addresses. For
> > > > > > > example, if we had something like this:
> > > > > > >
> > > > > > >         reserved-memory {
> > > > > > >                 fb_dc0: fb@80000000 {
> > > > > > >                         reg = <0x80000000 0x01000000>;
> > > > > > >                         iommu-addresses = <0xa0000000 0x01000000>;
> > > > > > >                 };
> > > > > > >
> > > > > > >                 fb_dc1: fb@80000000 {
> > > > > > 
> > > > > > You can't have 2 nodes with the same name (actually, you can, they
> > > > > > just get merged together). Different names with the same unit-address
> > > > > > is a dtc warning. I'd really like to make that a full blown
> > > > > > overlapping region check.
> > > > > 
> > > > > Right... so this would be a lot easier to deal with using that earlier
> > > > > proposal where the IOMMU regions were a separate thing and referencing
> > > > > the reserved-memory nodes. In those cases we could just have the
> > > > > physical reservation for the framebuffer once (so we don't get any
> > > > > duplicates or overlaps) and then have each IOVA reservation reference
> > > > > that to create the mapping.
> > > > > 
> > > > > > 
> > > > > > >                         reg = <0x80000000 0x01000000>;
> > > > > > >                         iommu-addresses = <0xb0000000 0x01000000>;
> > > > > > >                 };
> > > > > > >         };
> > > > > > >
> > > > > > > We could make the code identify that this is for the same physical
> > > > > > > reservation (maybe make it so that reg needs to match exactly for this
> > > > > > > to be recognized) but with different virtual allocations.
> > > > > > >
> > > > > > > On a side-note: do we really need to repeat the size? I'd think if we
> > > > > > > want mappings then we'd likely want them for the whole reservation.
> > > > > > 
> > > > > > Humm, I suppose not, but dropping it paints us into a corner if we
> > > > > > come up with wanting a different size later. You could have a carveout
> > > > > > for double/triple buffering your framebuffer, but the bootloader
> > > > > > framebuffer is only single buffered. So would you want actual size?
> > > > > 
> > > > > Perhaps this needs to be a bit more verbose then. If we want the ability
> > > > > to create a mapping for only a partial reservation, I could imagine we
> > > > > may as well want one that doesn't start at the beginning. So perhaps an
> > > > > > even better solution would be to have a complete mapping, something that
> > > > > works similar to "ranges" perhaps, like so:
> > > > > 
> > > > > 	fb@80000000 {
> > > > > 		reg = <0x80000000 0x01000000>;
> > > > > 		iommu-ranges = <0x80000000 0x01000000 0x80000000>;
> > > > > 	};
> > > > > 
> > > > > That would be for a full identity mapping, but we could also have
> > > > > something along the lines of this:
> > > > > 
> > > > > 	fb@80000000 {
> > > > > 		reg = <0x80000000 0x01000000>;
> > > > > 		iommu-ranges = <0x80100000 0x00100000 0xa0000000>;
> > > > > 	};
> > > > > 
> > > > > So that would only map a 1 MiB chunk at offset 1 MiB (of the physical
> > > > > reservation) to I/O virtual address 0xa0000000.
> > > > > 
> > > > > > > I'd like to keep references to IOMMUs out of this because they would be
> > > > > > > duplicated. We will only use these nodes if they are referenced by a
> > > > > > > device node that also has an iommus property. Also, the IOMMU reference
> > > > > > > itself isn't enough. We'd also need to support the complete specifier
> > > > > > > because you can have things like SIDs in there to specify the exact
> > > > > > > address space that a device uses.
> > > > > > >
> > > > > > > Also, for some of these they may be reused independently of the IOMMU
> > > > > > > address space. For example the Tegra framebuffer identity mapping can
> > > > > > > be used by either of the 2-4 display controllers, each with (at least
> > > > > > > potentially) their own address space. But we don't want to have to
> > > > > > > describe the identity mapping separately for each display controller.
> > > > > > 
> > > > > > Okay, but I'd rather have to duplicate things in your case than not be
> > > > > > able to express some other case.
> > > > > 
> > > > > The earlier "separate iov-reserved-memory" proposal would be a good
> > > > > compromise here. It'd allow us to duplicate only the necessary bits
> > > > > (i.e. the IOVA mappings) but keep the common bits simple. And even
> > > > > the IOVA mappings could be shared for cases like identity mappings.
> > > > > See below for more on that.
> > > > > 
> > > > > > > Another thing to consider is that these nodes will often be added by
> > > > > > > firmware (e.g. firmware will allocate the framebuffer and set up the
> > > > > > > corresponding reserved memory region in DT). Wiring up references like
> > > > > > > this would get very complicated very quickly.
> > > > > > 
> > > > > > Yes.
> > > > > > 
> > > > > > The using 'iommus' property option below can be optional and doesn't
> > > > > > have to be defined/supported now. Just trying to think ahead and not
> > > > > > be stuck with something that can't be extended.
> > > > > 
> > > > > One other benefit of the separate iov-reserved-memory node would be that
> > > > > the iommus property could be simplified. If we have a physical
> > > > > reservation that needs to be accessed by multiple different display
> > > > > controllers, we'd end up with something fairly complex, such as this:
> > > > > 
> > > > > 	fb: fb@80000000 {
> > > > > 		reg = <0x80000000 0x01000000>;
> > > > > 		iommus = <&dc0_iommu 0xa0000000 0x01000000>,
> > > > > 			 <&dc1_iommu 0xb0000000 0x01000000>,
> > > > > 			 <&dc2_iommu 0xc0000000 0x01000000>;
> > > > > 	};
> > > > > 
> > > > > This would get even worse if we want to support partial mappings. Also,
> > > > > it'd become quite complicated to correlate this with the memory-region
> > > > > references:
> > > > > 
> > > > > 	dc0: dc@40000000 {
> > > > > 		...
> > > > > 		memory-region = <&fb>;
> > > > > 		iommus = <&dc0_iommu>;
> > > > > 		...
> > > > > 	};
> > > > > 
> > > > > So now you have to go match up the phandle (and potentially specifier)
> > > > > > in the iommus property of the dc0 node with an entry in the fb node's
> > > > > iommus property. That's all fairly complicated stuff.
> > > > > 
> > > > > With separate iov-reserved-memory, this would be a bit more verbose, but
> > > > > each individual node would be simpler:
> > > > > 
> > > > > 	reserved-memory {
> > > > > 		fb: fb@80000000 {
> > > > > 			reg = <0x80000000 0x01000000>;
> > > > > 		};
> > > > > 	};
> > > > > 
> > > > > 	iov-reserved-memory {
> > > > > 		fb0: fb@80000000 {
> > > > > 			/* identity mapping, "reg" optional? */
> > > > > 			reg = <0x80000000 0x01000000>;
> > > > > 			memory-region = <&fb>;
> > > > > 		};
> > > > > 
> > > > > 		fb1: fb@90000000 {
> > > > > 			/* but doesn't have to be */
> > > > > 			reg = <0x90000000 0x01000000>;
> > > > > 			memory-region = <&fb>;
> > > > > 		};
> > > > > 
> > > > > 		fb2: fb@a0000000 {
> > > > > 			/* can be partial, too */
> > > > > 			ranges = <0x80000000 0x00800000 0xa0000000>;
> > > > > 			memory-region = <&fb>;
> > > > > 		};
> > > > > 	}
> > > > > 
> > > > > 	dc0: dc@40000000 {
> > > > > 		iov-memory-regions = <&fb0>;
> > > > > 		/* optional? */
> > > > > 		memory-region = <&fb>;
> > > > > 		iommus = <&dc0_iommu>;
> > > > > 	};
> > > > > 
> > > > > Alternatively, if we want to support partial mappings, we could replace
> > > > > those reg properties by ranges properties that I showed earlier. We may
> > > > > even want to support both. Use "reg" for virtual-only reservations and
> > > > > identity mappings, or "simple partial mappings" (that map a sub-region
> > > > > starting from the beginning). Identity mappings could still be
> > > > > simplified by just omitting the "reg" property. For more complicated
> > > > > mappings, such as the ones on M1, the "ranges" property could be used.
> > > > > 
> > > > > Note how this looks a bit boilerplate-y, but it's actually really quite
> > > > > simple to understand, even for humans, I think.
> > > > > 
> > > > > Also, the phandles in this are comparatively easy to wire up because
> > > > > they can all be generated in a hierarchical way: generate physical
> > > > > reservation and store phandle, then generate I/O virtual reservation
> > > > > to reference that phandle and store the new phandle as well. Finally,
> > > > > wire this up to the display controller (using either the IOV phandle or
> > > > > both).
> > > > > 
> > > > > Granted, this requires the addition of a new top-level node, but given
> > > > > how expressive this becomes, I think it might be worth a second
> > > > > consideration.
> > > > 
> > > > I guess as a middle-ground between your suggestion and mine, we could
> > > > also move the IOV nodes back into reserved-memory. If we make sure the
> > > > names (together with unit-addresses) are unique, to support cases where
> > > > we want to identity map, or have multiple mappings at the same address.
> > > > So it'd look something like this:
> > > > 
> > > > 	reserved-memory {
> > > > 		fb: fb@80000000 {
> > > > 			reg = <0x80000000 0x01000000>;
> > > > 		};
> > > > 
> > > > 		audio-firmware@ff000000 {
> > > > 			/* perhaps add "iommu-reserved" for this case */
> > > > 			compatible = "iommu-mapping";
> > > > 			/*
> > > > 			 * no memory-region referencing a physical
> > > > 			 * reservation, indicates that this is an
> > > > 			 * IOMMU reservation, rather than a mapping
> > > > 			 /
> > > > 			reg = <0xff000000 0x01000000>;
> > > > 		};
> > > > 
> > > > 		fb0: fb-mapping@80000000 {
> > > > 			compatible = "iommu-mapping";
> > > > 			/* identity mapping, "reg" optional? */
> > > > 			reg = <0x80000000 0x01000000>;
> > > > 			memory-region = <&fb>;
> > > > 		};
> > > > 
> > > > 		fb1: fb-mapping@90000000 {
> > > > 			compatible = "iommu-mapping";
> > > > 			/* but doesn't have to be */
> > > > 			reg = <0x90000000 0x01000000>;
> > > > 			memory-region = <&fb>;
> > > > 		};
> > > > 
> > > > 		fb2: fb-mapping@a0000000 {
> > > > 			compatible = "iommu-mapping";
> > > > 			/* can be partial, too */
> > > > 			ranges = <0xa0000000 0x00800000 0x80000000>;
> > > > 			memory-region = <&fb>;
> > > > 		};
> > > > 	}
> > > > 
> > > > 	dc0: dc@40000000 {
> > > > 		memory-region = <&fb0>;
> > > > 		iommus = <&dc0_iommu>;
> > > > 	};
> > > > 
> > > > What do you think?
> > > 
> > > I converted the Apple M1 display controller driver to using reserved 
> > > regions using these bindings. It is sufficient for the needs of the M1 
> > > display controller which is so far the only device requiring this.
> > 
> > Thanks for trying this out. I've been meaning to resume this discussion
> > to finally get closure because we really want to enable this for various
> > Tegra SoCs.
> > 
> > > I encountered two problems with this bindings proposal:
> > > 
> > > 1) It is impossible to express which iommu needs to be used if a device 
> > > has multiple "iommus" specified. This is on the M1 only a theoretical 
> > > problem as the display co-processor devices use a single iommu.
> > 
> > From what I recall this is something that we don't fully support either
> > way. If you've got a struct device and you want to allocate DMA'able
> > memory, you can only pass that struct device to the DMA API upon
> > allocation but you have no way of specifying separate instances
> > depending on use-case.
> 
> > Ok, let's ignore my complicated proposal then. It is not a problem we 
> need to solve for the M1.
> 
> > > > 2) The reserved regions cannot easily be looked up at iommu probe 
> > > time.  The Apple M1 iommu driver resets the iommu at probe. This 
> > > > breaks the framebuffer. The display controller appears to crash when 
> > > an active scan-out framebuffer is unmapped. Resetting the iommu 
> > > looks like a sensible approach though.
> > > 
> > > > To work around this I added a custom property to the affected iommu node 
> > > to avoid the reset. This doesn't feel correct since the reason to avoid 
> > > the reset is that we have to maintain the reserved regions mapping until 
> > > the display controller driver takes over.
> > > As far as I can see the only method to retrieve devices with reserved 
> > > memory from the iommu is to iterate over all devices. This looks 
> > > impractical. The M1 has over 20 distinct iommus.
> > 
> > Do I understand correctly that on the M1, the firmware sets up a mapping
> > in the IOMMU already and then you want to recreate that mapping after
> > the IOMMU driver has reset the IOMMU?
> 
> The mappings are already set up by firmware as it uses the frame buffer 
> already itself. We need to make the kernel aware of the existing mapping 
> so it can use the IOMMU. Using reserved memory regions and mappings 
> > seems to be a clean way to do this. We want to reset IOMMUs without 
> > pre-existing mappings (the M1 has over 20 IOMMUs). We need a way to 
> > identify the two IOMMUs which must not be reset at driver probe time.  
> > A simple property in the IOMMU node would be enough. It would duplicate 
> > information though since the only reason why we can't reset the IOMMU is 
> > the pre-existing mapping.
> 
> > In that case, how do you make sure that you atomically transition from
> > the firmware mapping to the kernel mapping? As soon as you reset the
> > > IOMMU, the display controller will cause IOMMU faults because it's now
> > scanning out from an unmapped buffer, right?
> 
> We are replacing the entire firmware managed page table with a kernel 
> managed one with a TTBR MMIO register write. The second IOMMU with 
> pre-existing mapping has unfortunately the TTBR locked. Dealing with 
> this is more complicated but the device using this IOMMU appears to
> sleep.
> 
> > So that approach of avoiding the reset doesn't seem wrong to me.
> > Obviously that's not altogether trivial to do either. Typically the
> > IOMMU mappings would be contained in system memory, so you'd have to
> > reserve those via reserved-memory nodes as well, etc.
> 
> The system memory is currently not expressed as reserved-memory but 
> simply outside of the specified memory.
>  
> > > One way to avoid both problems would be to move the mappings to the 
> > > iommu node as sub nodes. The device would then reference those.  
> > > This way the mapping is readily available at iommu probe time and 
> > > adding iommu type specific parameters to map the region correctly is 
> > > possible.
> > > 
> > > > The sample above would transform to:
> > > 
> > > 	reserved-memory {
> > > 		fb: fb@80000000 {
> > > 			reg = <0x80000000 0x01000000>;
> > > 		};
> > > 	};
> > > 
> > > 	dc0_iommu: iommu@20000000 {
> > > 		#iommu-cells = <1>;
> > > 
> > > 		fb0: fb-mapping@80000000 {
> > > 			compatible = "iommu-mapping";
> > > 			/* identity mapping, "reg" optional? */
> > > 			reg = <0x80000000 0x01000000>;
> > > 			memory-region = <&fb>;
> > > 			device-id = <0>; /* for #iommu-cells*/
> > > 		};
> > > 
> > > 		fb1: fb-mapping@90000000 {
> > > 			compatible = "iommu-mapping";
> > > 			/* but doesn't have to be */
> > > 			reg = <0x90000000 0x01000000>;
> > > 			memory-region = <&fb>;
> > > 			device-id = <1>; /* for #iommu-cells*/
> > > 		};
> > > 	};
> > > 
> > > 	dc0: dc@40000000 {
> > > 		iommu-region = <&fb0>;
> > > 		iommus = <&dc0_iommu 0>;
> > > 	};
> > > 
> > > Does anyone see problems with this approach or can think of something 
> > > better?
> > 
> > The device tree description of this looks a bit weird because it
> > sprinkles things all around. For instance now we've got the "stream ID"
> > (i.e. what you seem to be referring to as "device-id") in two places,
> > once in the iommus property of the DC node and once in the mapping.
> 
> Yes, stream_id would be the device-id. It is the term used in the 
> apple-dart IOMMU driver. It is duplicated to deal with the multiple 
> IOMMU problem. Let's ignore that and scrape my proposal.
>  
> > Would it work if you added back-references to the devices that are
> > active on boot to the IOMMU node? Something along these lines:
> > 
> > 	reserved-memory {
> > 		fb: fb@80000000 {
> > 			reg = <0x80000000 0x01000000>;
> > 		};
> > 	};
> > 
> > 	dc0_iommu: iommu@20000000 {
> > 		#iommu-cells = <1>;
> > 
> > 		mapped-devices = <&dc0>;
> > 	};
> > 
> > 	dc0: dc@40000000 {
> > 		memory-region = <&fb0>;
> > 		iommus = <&dc0_iommu 0>;
> > 	};
> > 
> > Depending on how you look at it that's a circular dependency, but it
> > won't be in practice. It makes things a bit more compact and puts the
> > data where it belongs.
> 
> Yes, this works for the Apple M1 display co-processor. I've changed the 
> dts and my apple-dart private parsing code to use "mapped-devices" 
> back-references and it works as before. We probably need an automated 
> check to ensure the references between device and IOMMU remains 
> consistent.

Circling back to this... again. I've been thinking about this some more
and have come up with a mix between what Rob, Janne and I had proposed.
This is how it would look (based on Tegra210):

	reserved-memory {
		fb: framebuffer@80000000 {
			/*
			 * Physical memory region that is reserved. If
			 * this property is omitted, this region should
			 * be treated as an IOVA reservation.
			 */
			reg = <0x80000000 0x01000000>;

			/*
			 * Create 1:1 mapping for display controller.
			 *
			 * Note how instead of the IOMMU reference we
			 * actually pass the device reference here. This
			 * combines the "mapped-devices" property that
			 * was proposed earlier and makes it easier to
			 * find the device that needs this mapping. The
			 * IOMMU phandle and specifier can be obtained
			 * via this backlink to the consumer device.
			 *
			 * More than one entry could be specified here
			 * to allow mappings for multiple devices. This
			 * avoids the problem of having multiple nodes
			 * with the same name.
			 *
			 * Could also be "iommu-addresses" as Rob had
			 * suggested earlier, but "iommu-mapping" seems
			 * a bit more appropriate given that there's
			 * also the phandle now.
			 */
			iommu-mapping = <&dc 0x80000000 0x01000000>;
		};
	};

	mc: memory-controller@70019000 {
		...
		#iommu-cells = <1>;
		...
	};

	dc: dc@54200000 {
		...
		iommus = <&mc TEGRA_SWGROUP_DC>;

		/*
		 * As in earlier proposals, this could be optional if
		 * all we need is the IOMMU mapping. It can be specified
		 * if there's a need for the driver to use the physical
		 * memory region (i.e. to copy out existing framebuffer
		 * content and recycle memory).
		 */
		memory-region = <&fb>;
		...
	};

One last remaining question that I have for this is whether we also need
some sort of #address-cells and #size-cells for the IOMMU which we need
to determine how many cells the addresses in iommu-mapping need to have.
I suppose we could derive that from the dma-ranges property somehow,
since that defines the addressable region of the device that needs the
mapping.
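
To make the cell-count question concrete, here is a minimal userspace
sketch (not kernel code) of decoding one <&device address size> entry of
the proposed iommu-mapping property. The struct and function names are
made up for illustration, and the address and size cell counts (na, ns)
are simply passed in, because where they would come from is exactly what
is still open. In the kernel, of_read_number() performs the same folding
of cells into a 64-bit value.

```c
#include <stddef.h>
#include <stdint.h>

/* One decoded <&device address size> entry (hypothetical layout). */
struct iommu_mapping {
	uint32_t phandle;
	uint64_t iova;
	uint64_t size;
};

/* Fold 'count' 32-bit cells, most significant first, into one value. */
static uint64_t read_cells(const uint32_t *cells, unsigned int count)
{
	uint64_t value = 0;

	while (count--)
		value = (value << 32) | *cells++;

	return value;
}

/*
 * Decode entry 'index' of a property that has already been converted to
 * CPU byte order. Each entry is one phandle cell followed by 'na' address
 * cells and 'ns' size cells. Returns 0 on success, or -1 if the cell
 * counts are unsupported or the entry does not exist.
 */
static int parse_iommu_mapping(const uint32_t *prop, size_t num_cells,
			       unsigned int na, unsigned int ns,
			       size_t index, struct iommu_mapping *out)
{
	size_t stride = 1 + na + ns;
	const uint32_t *entry;

	if (na < 1 || na > 2 || ns < 1 || ns > 2)
		return -1;

	if (index >= num_cells / stride)
		return -1;

	entry = prop + index * stride;
	out->phandle = entry[0];
	out->iova = read_cells(entry + 1, na);
	out->size = read_cells(entry + 1 + na, ns);

	return 0;
}
```

With one address cell and one size cell, the example above decodes to
IOVA 0x80000000 and size 0x01000000. With two cells each, the same entry
would need five cells, so a wrong guess at the counts misparses every
entry after the first.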

Thierry

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier
  2022-03-31 16:25                                   ` Thierry Reding
@ 2022-04-01 17:08                                     ` Janne Grunau
  0 siblings, 0 replies; 41+ messages in thread
From: Janne Grunau @ 2022-04-01 17:08 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Rob Herring, Alyssa Rosenzweig, Sven Peter, Dmitry Osipenko,
	Joerg Roedel, Will Deacon, Robin Murphy, Nicolin Chen,
	Krishna Reddy, devicetree, Linux IOMMU, linux-tegra, dri-devel

On 2022-03-31 18:25:05 +0200, Thierry Reding wrote:
> On Fri, Feb 11, 2022 at 12:15:44AM +0100, Janne Grunau wrote:
> > On 2022-02-09 17:31:16 +0100, Thierry Reding wrote:
> > > On Sun, Feb 06, 2022 at 11:27:00PM +0100, Janne Grunau wrote:
> > > > On 2021-09-15 17:19:39 +0200, Thierry Reding wrote:
> > > > > On Tue, Sep 07, 2021 at 07:44:44PM +0200, Thierry Reding wrote:
> > > > > > On Tue, Sep 07, 2021 at 10:33:24AM -0500, Rob Herring wrote:
> > > > > > > On Fri, Sep 3, 2021 at 10:36 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Fri, Sep 03, 2021 at 09:36:33AM -0500, Rob Herring wrote:
> > > > > > > > > On Fri, Sep 3, 2021 at 8:52 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Fri, Sep 03, 2021 at 08:20:55AM -0500, Rob Herring wrote:
> > > > > > > > > > >
> > > > > > > > > > > Couldn't we keep this all in /reserved-memory? Just add an iova
> > > > > > > > > > > version of reg. Perhaps abuse 'assigned-address' for this purpose. The
> > > > > > > > > > > issue I see would be handling reserved iova areas without a physical
> > > > > > > > > > > area. That can be handled with just a iova and no reg. We already have
> > > > > > > > > > > a no reg case.
> > > > > > > > > >
> > > > > > > > > > I had thought about that initially. One thing I'm worried about is that
> > > > > > > > > > every child node in /reserved-memory will effectively cause the memory
> > > > > > > > > > that it described to be reserved. But we don't want that for regions
> > > > > > > > > > that are "virtual only" (i.e. IOMMU reservations).
> > > > > > > > >
> > > > > > > > > By virtual only, you mean no physical mapping, just a region of
> > > > > > > > > virtual space, right? For that we'd have no 'reg' and therefore no
> > > > > > > > > (physical) reservation by the OS. It's similar to non-static regions.
> > > > > > > > > You need a specific handler for them. We'd probably want a compatible
> > > > > > > > > as well for these virtual reservations.
> > > > > > > >
> > > > > > > > Yeah, these would be purely used for reserving regions in the IOVA so
> > > > > > > > that they won't be used by the IOVA allocator. Typically these would be
> > > > > > > > used for cases where those addresses have some special meaning.
> > > > > > > >
> > > > > > > > Do we want something like:
> > > > > > > >
> > > > > > > >         compatible = "iommu-reserved";
> > > > > > > >
> > > > > > > > for these? Or would that need to be:
> > > > > > > >
> > > > > > > >         compatible = "linux,iommu-reserved";
> > > > > > > >
> > > > > > > > ? There seems to be a mix of vendor-prefix vs. non-vendor-prefix
> > > > > > > > compatible strings in the reserved-memory DT bindings directory.
> > > > > > > 
> > > > > > > I would not use 'linux,' here.
> > > > > > > 
> > > > > > > >
> > > > > > > > On the other hand, do we actually need the compatible string? Because we
> > > > > > > > don't really want to associate much extra information with this like we
> > > > > > > > do for example with "shared-dma-pool". The logic to handle this would
> > > > > > > > all be within the IOMMU framework. All we really need is for the
> > > > > > > > standard reservation code to skip nodes that don't have a reg property
> > > > > > > > so we don't reserve memory for "virtual-only" allocations.
> > > > > > > 
> > > > > > > It doesn't hurt to have one and I can imagine we might want to iterate
> > > > > > > over all the nodes. It's slightly easier and more common to iterate
> > > > > > > over compatible nodes rather than nodes with some property.
> > > > > > > 
> > > > > > > > > Are these being global in DT going to be a problem? Presumably we have
> > > > > > > > > a virtual space per IOMMU. We'd know which IOMMU based on a device's
> > > > > > > > > 'iommus' and 'memory-region' properties, but within /reserved-memory
> > > > > > > > > we wouldn't be able to distinguish overlapping addresses from separate
> > > > > > > > > address spaces. Or we could have 2 different IOVAs for 1 physical
> > > > > > > > > space. That could be solved with something like this:
> > > > > > > > >
> > > > > > > > > iommu-addresses = <&iommu1 <address cells> <size cells>>;
> > > > > > > >
> > > > > > > > The only case that would be problematic would be if we have overlapping
> > > > > > > > physical regions, because that will probably trip up the standard code.
> > > > > > > >
> > > > > > > > But this could also be worked around by looking at iommu-addresses. For
> > > > > > > > example, if we had something like this:
> > > > > > > >
> > > > > > > >         reserved-memory {
> > > > > > > >                 fb_dc0: fb@80000000 {
> > > > > > > >                         reg = <0x80000000 0x01000000>;
> > > > > > > >                         iommu-addresses = <0xa0000000 0x01000000>;
> > > > > > > >                 };
> > > > > > > >
> > > > > > > >                 fb_dc1: fb@80000000 {
> > > > > > > 
> > > > > > > You can't have 2 nodes with the same name (actually, you can, they
> > > > > > > just get merged together). Different names with the same unit-address
> > > > > > > is a dtc warning. I'd really like to make that a full blown
> > > > > > > overlapping region check.
> > > > > > 
> > > > > > Right... so this would be a lot easier to deal with using that earlier
> > > > > > proposal where the IOMMU regions were a separate thing and referencing
> > > > > > the reserved-memory nodes. In those cases we could just have the
> > > > > > physical reservation for the framebuffer once (so we don't get any
> > > > > > duplicates or overlaps) and then have each IOVA reservation reference
> > > > > > that to create the mapping.
> > > > > > 
> > > > > > > 
> > > > > > > >                         reg = <0x80000000 0x01000000>;
> > > > > > > >                         iommu-addresses = <0xb0000000 0x01000000>;
> > > > > > > >                 };
> > > > > > > >         };
> > > > > > > >
> > > > > > > > We could make the code identify that this is for the same physical
> > > > > > > > reservation (maybe make it so that reg needs to match exactly for this
> > > > > > > > to be recognized) but with different virtual allocations.
> > > > > > > >
> > > > > > > > On a side-note: do we really need to repeat the size? I'd think if we
> > > > > > > > want mappings then we'd likely want them for the whole reservation.
> > > > > > > 
> > > > > > > Humm, I suppose not, but dropping it paints us into a corner if we
> > > > > > > come up with wanting a different size later. You could have a carveout
> > > > > > > for double/triple buffering your framebuffer, but the bootloader
> > > > > > > framebuffer is only single buffered. So would you want actual size?
> > > > > > 
> > > > > > Perhaps this needs to be a bit more verbose then. If we want the ability
> > > > > > to create a mapping for only a partial reservation, I could imagine we
> > > > > > may as well want one that doesn't start at the beginning. So perhaps an
> > > > > > > even better solution would be to have a complete mapping, something that
> > > > > > works similar to "ranges" perhaps, like so:
> > > > > > 
> > > > > > 	fb@80000000 {
> > > > > > 		reg = <0x80000000 0x01000000>;
> > > > > > 		iommu-ranges = <0x80000000 0x01000000 0x80000000>;
> > > > > > 	};
> > > > > > 
> > > > > > That would be for a full identity mapping, but we could also have
> > > > > > something along the lines of this:
> > > > > > 
> > > > > > 	fb@80000000 {
> > > > > > 		reg = <0x80000000 0x01000000>;
> > > > > > 		iommu-ranges = <0x80100000 0x00100000 0xa0000000>;
> > > > > > 	};
> > > > > > 
> > > > > > So that would only map a 1 MiB chunk at offset 1 MiB (of the physical
> > > > > > reservation) to I/O virtual address 0xa0000000.
> > > > > > 
> > > > > > > > I'd like to keep references to IOMMUs out of this because they would be
> > > > > > > > duplicated. We will only use these nodes if they are referenced by a
> > > > > > > > device node that also has an iommus property. Also, the IOMMU reference
> > > > > > > > itself isn't enough. We'd also need to support the complete specifier
> > > > > > > > because you can have things like SIDs in there to specify the exact
> > > > > > > > address space that a device uses.
> > > > > > > >
> > > > > > > > Also, for some of these they may be reused independently of the IOMMU
> > > > > > > > address space. For example the Tegra framebuffer identity mapping can
> > > > > > > > be used by either of the 2-4 display controllers, each with (at least
> > > > > > > > potentially) their own address space. But we don't want to have to
> > > > > > > > describe the identity mapping separately for each display controller.
> > > > > > > 
> > > > > > > Okay, but I'd rather have to duplicate things in your case than not be
> > > > > > > able to express some other case.
> > > > > > 
> > > > > > The earlier "separate iov-reserved-memory" proposal would be a good
> > > > > > compromise here. It'd allow us to duplicate only the necessary bits
> > > > > > (i.e. the IOVA mappings) but keep the common bits simple. And even
> > > > > > the IOVA mappings could be shared for cases like identity mappings.
> > > > > > See below for more on that.
> > > > > > 
> > > > > > > > Another thing to consider is that these nodes will often be added by
> > > > > > > > firmware (e.g. firmware will allocate the framebuffer and set up the
> > > > > > > > corresponding reserved memory region in DT). Wiring up references like
> > > > > > > > this would get very complicated very quickly.
> > > > > > > 
> > > > > > > Yes.
> > > > > > > 
> > > > > > > > The option of using the 'iommus' property below can be optional and doesn't
> > > > > > > > have to be defined/supported now. I'm just trying to think ahead and not
> > > > > > > > be stuck with something that can't be extended.
> > > > > > 
> > > > > > One other benefit of the separate iov-reserved-memory node would be that
> > > > > > the iommus property could be simplified. If we have a physical
> > > > > > reservation that needs to be accessed by multiple different display
> > > > > > controllers, we'd end up with something fairly complex, such as this:
> > > > > > 
> > > > > > 	fb: fb@80000000 {
> > > > > > 		reg = <0x80000000 0x01000000>;
> > > > > > 		iommus = <&dc0_iommu 0xa0000000 0x01000000>,
> > > > > > 			 <&dc1_iommu 0xb0000000 0x01000000>,
> > > > > > 			 <&dc2_iommu 0xc0000000 0x01000000>;
> > > > > > 	};
> > > > > > 
> > > > > > This would get even worse if we want to support partial mappings. Also,
> > > > > > it'd become quite complicated to correlate this with the memory-region
> > > > > > references:
> > > > > > 
> > > > > > 	dc0: dc@40000000 {
> > > > > > 		...
> > > > > > 		memory-region = <&fb>;
> > > > > > 		iommus = <&dc0_iommu>;
> > > > > > 		...
> > > > > > 	};
> > > > > > 
> > > > > > So now you have to go match up the phandle (and potentially specifier)
> > > > > > > in the iommus property of the dc0 node with an entry in the fb node's
> > > > > > iommus property. That's all fairly complicated stuff.
> > > > > > 
> > > > > > With separate iov-reserved-memory, this would be a bit more verbose, but
> > > > > > each individual node would be simpler:
> > > > > > 
> > > > > > 	reserved-memory {
> > > > > > 		fb: fb@80000000 {
> > > > > > 			reg = <0x80000000 0x01000000>;
> > > > > > 		};
> > > > > > 	};
> > > > > > 
> > > > > > 	iov-reserved-memory {
> > > > > > 		fb0: fb@80000000 {
> > > > > > 			/* identity mapping, "reg" optional? */
> > > > > > 			reg = <0x80000000 0x01000000>;
> > > > > > 			memory-region = <&fb>;
> > > > > > 		};
> > > > > > 
> > > > > > 		fb1: fb@90000000 {
> > > > > > 			/* but doesn't have to be */
> > > > > > 			reg = <0x90000000 0x01000000>;
> > > > > > 			memory-region = <&fb>;
> > > > > > 		};
> > > > > > 
> > > > > > 		fb2: fb@a0000000 {
> > > > > > 			/* can be partial, too */
> > > > > > 			ranges = <0x80000000 0x00800000 0xa0000000>;
> > > > > > 			memory-region = <&fb>;
> > > > > > 		};
> > > > > > 	}
> > > > > > 
> > > > > > 	dc0: dc@40000000 {
> > > > > > 		iov-memory-regions = <&fb0>;
> > > > > > 		/* optional? */
> > > > > > 		memory-region = <&fb>;
> > > > > > 		iommus = <&dc0_iommu>;
> > > > > > 	};
> > > > > > 
> > > > > > Alternatively, if we want to support partial mappings, we could replace
> > > > > > those reg properties by ranges properties that I showed earlier. We may
> > > > > > even want to support both. Use "reg" for virtual-only reservations and
> > > > > > identity mappings, or "simple partial mappings" (that map a sub-region
> > > > > > starting from the beginning). Identity mappings could still be
> > > > > > simplified by just omitting the "reg" property. For more complicated
> > > > > > mappings, such as the ones on M1, the "ranges" property could be used.
> > > > > > 
> > > > > > Note how this looks a bit boilerplate-y, but it's actually really quite
> > > > > > simple to understand, even for humans, I think.
> > > > > > 
> > > > > > Also, the phandles in this are comparatively easy to wire up because
> > > > > > they can all be generated in a hierarchical way: generate physical
> > > > > > reservation and store phandle, then generate I/O virtual reservation
> > > > > > to reference that phandle and store the new phandle as well. Finally,
> > > > > > wire this up to the display controller (using either the IOV phandle or
> > > > > > both).
> > > > > > 
> > > > > > Granted, this requires the addition of a new top-level node, but given
> > > > > > how expressive this becomes, I think it might be worth a second
> > > > > > consideration.
> > > > > 
> > > > > I guess as a middle-ground between your suggestion and mine, we could
> > > > > > also move the IOV nodes back into reserved-memory. We'd have to make sure
> > > > > > the names (together with unit-addresses) are unique, to support cases where
> > > > > > we want to identity map, or have multiple mappings at the same address.
> > > > > So it'd look something like this:
> > > > > 
> > > > > 	reserved-memory {
> > > > > 		fb: fb@80000000 {
> > > > > 			reg = <0x80000000 0x01000000>;
> > > > > 		};
> > > > > 
> > > > > 		audio-firmware@ff000000 {
> > > > > 			/* perhaps add "iommu-reserved" for this case */
> > > > > 			compatible = "iommu-mapping";
> > > > > 			/*
> > > > > 			 * no memory-region referencing a physical
> > > > > 			 * reservation, indicates that this is an
> > > > > 			 * IOMMU reservation, rather than a mapping
> > > > > 			 /
> > > > > 			reg = <0xff000000 0x01000000>;
> > > > > 		};
> > > > > 
> > > > > 		fb0: fb-mapping@80000000 {
> > > > > 			compatible = "iommu-mapping";
> > > > > 			/* identity mapping, "reg" optional? */
> > > > > 			reg = <0x80000000 0x01000000>;
> > > > > 			memory-region = <&fb>;
> > > > > 		};
> > > > > 
> > > > > 		fb1: fb-mapping@90000000 {
> > > > > 			compatible = "iommu-mapping";
> > > > > 			/* but doesn't have to be */
> > > > > 			reg = <0x90000000 0x01000000>;
> > > > > 			memory-region = <&fb>;
> > > > > 		};
> > > > > 
> > > > > 		fb2: fb-mapping@a0000000 {
> > > > > 			compatible = "iommu-mapping";
> > > > > 			/* can be partial, too */
> > > > > 			ranges = <0xa0000000 0x00800000 0x80000000>;
> > > > > 			memory-region = <&fb>;
> > > > > 		};
> > > > > 	}
> > > > > 
> > > > > 	dc0: dc@40000000 {
> > > > > 		memory-region = <&fb0>;
> > > > > 		iommus = <&dc0_iommu>;
> > > > > 	};
> > > > > 
> > > > > What do you think?
> > > > 
> > > > I converted the Apple M1 display controller driver to using reserved 
> > > > regions using these bindings. It is sufficient for the needs of the M1 
> > > > display controller which is so far the only device requiring this.
> > > 
> > > Thanks for trying this out. I've been meaning to resume this discussion
> > > to finally get closure because we really want to enable this for various
> > > Tegra SoCs.
> > > 
> > > > I encountered two problems with this bindings proposal:
> > > > 
> > > > 1) It is impossible to express which iommu needs to be used if a device 
> > > > has multiple "iommus" specified. On the M1 this is only a theoretical 
> > > > problem as the display co-processor devices use a single iommu.
> > > 
> > > From what I recall this is something that we don't fully support either
> > > way. If you've got a struct device and you want to allocate DMA'able
> > > memory, you can only pass that struct device to the DMA API upon
> > > allocation but you have no way of specifying separate instances
> > > depending on use-case.
> > 
> > Ok, let us then ignore my complicated proposal. It is not a problem we 
> > need to solve for the M1.
> > 
> > > > 2) The reserved regions cannot easily be looked up at iommu probe 
> > > > time.  The Apple M1 iommu driver resets the iommu at probe. This 
> > > > breaks the framebuffer. The display controller appears to crash when 
> > > > an active scan-out framebuffer is unmapped. Resetting the iommu 
> > > > looks like a sensible approach though.
> > > > 
> > > > To work around this I added a custom property to the affected iommu node 
> > > > to avoid the reset. This doesn't feel correct since the reason to avoid 
> > > > the reset is that we have to maintain the reserved regions mapping until 
> > > > the display controller driver takes over.
> > > > As far as I can see the only method to retrieve devices with reserved 
> > > > memory from the iommu is to iterate over all devices. This looks 
> > > > impractical. The M1 has over 20 distinct iommus.
> > > 
> > > Do I understand correctly that on the M1, the firmware sets up a mapping
> > > in the IOMMU already and then you want to recreate that mapping after
> > > the IOMMU driver has reset the IOMMU?
> > 
> > The mappings are already set up by firmware as it uses the frame buffer 
> > already itself. We need to make the kernel aware of the existing mapping 
> > so it can use the IOMMU. Using reserved memory regions and mappings 
> > seems to be a clean way to do this. We want to reset IOMMUs without 
> > pre-existing mappings (the M1 has over 20 IOMMUs). We need a way to 
> > identify the two IOMMUs which must not be reset at driver probe time.  
> > A simple property in the IOMMU node would be enough. It would duplicate 
> > information though since the only reason why we can't reset the IOMMU is 
> > the pre-existing mapping.
> > 
> > > In that case, how do you make sure that you atomically transition from
> > > the firmware mapping to the kernel mapping? As soon as you reset the
> > > IOMMU, the display controller will cause IOMMU faults because it's now
> > > scanning out from an unmapped buffer, right?
> > 
> > We are replacing the entire firmware managed page table with a kernel 
> > managed one with a TTBR MMIO register write. The second IOMMU with a 
> > pre-existing mapping unfortunately has the TTBR locked. Dealing with 
> > this is more complicated but the device using this IOMMU appears to
> > sleep.
> > 
> > > So that approach of avoiding the reset doesn't seem wrong to me.
> > > Obviously that's not altogether trivial to do either. Typically the
> > > IOMMU mappings would be contained in system memory, so you'd have to
> > > reserve those via reserved-memory nodes as well, etc.
> > 
> > The system memory is currently not expressed as reserved-memory but 
> > simply outside of the specified memory.
> >  
> > > > One way to avoid both problems would be to move the mappings to the 
> > > > iommu node as sub nodes. The device would then reference those.  
> > > > This way the mapping is readily available at iommu probe time and 
> > > > adding iommu type specific parameters to map the region correctly is 
> > > > possible.
> > > > 
> > > > The sample above would transform to:
> > > > 
> > > > 	reserved-memory {
> > > > 		fb: fb@80000000 {
> > > > 			reg = <0x80000000 0x01000000>;
> > > > 		};
> > > > 	};
> > > > 
> > > > 	dc0_iommu: iommu@20000000 {
> > > > 		#iommu-cells = <1>;
> > > > 
> > > > 		fb0: fb-mapping@80000000 {
> > > > 			compatible = "iommu-mapping";
> > > > 			/* identity mapping, "reg" optional? */
> > > > 			reg = <0x80000000 0x01000000>;
> > > > 			memory-region = <&fb>;
> > > > 			device-id = <0>; /* for #iommu-cells*/
> > > > 		};
> > > > 
> > > > 		fb1: fb-mapping@90000000 {
> > > > 			compatible = "iommu-mapping";
> > > > 			/* but doesn't have to be */
> > > > 			reg = <0x90000000 0x01000000>;
> > > > 			memory-region = <&fb>;
> > > > 			device-id = <1>; /* for #iommu-cells*/
> > > > 		};
> > > > 	};
> > > > 
> > > > 	dc0: dc@40000000 {
> > > > 		iommu-region = <&fb0>;
> > > > 		iommus = <&dc0_iommu 0>;
> > > > 	};
> > > > 
> > > > Does anyone see problems with this approach or can think of something 
> > > > better?
> > > 
> > > The device tree description of this looks a bit weird because it
> > > sprinkles things all around. For instance now we've got the "stream ID"
> > > (i.e. what you seem to be referring to as "device-id") in two places,
> > > once in the iommus property of the DC node and once in the mapping.
> > 
> > Yes, stream_id would be the device-id. It is the term used in the 
> > apple-dart IOMMU driver. It is duplicated to deal with the multiple 
> > IOMMU problem. Let's ignore that and scrap my proposal.
> >  
> > > Would it work if you added back-references to the devices that are
> > > active on boot to the IOMMU node? Something along these lines:
> > > 
> > > 	reserved-memory {
> > > 		fb: fb@80000000 {
> > > 			reg = <0x80000000 0x01000000>;
> > > 		};
> > > 	};
> > > 
> > > 	dc0_iommu: iommu@20000000 {
> > > 		#iommu-cells = <1>;
> > > 
> > > 		mapped-devices = <&dc0>;
> > > 	};
> > > 
> > > 	dc0: dc@40000000 {
> > > 		memory-region = <&fb>;
> > > 		iommus = <&dc0_iommu 0>;
> > > 	};
> > > 
> > > Depending on how you look at it that's a circular dependency, but it
> > > won't be in practice. It makes things a bit more compact and puts the
> > > data where it belongs.
> > 
> > Yes, this works for the Apple M1 display co-processor. I've changed the 
> > dts and my apple-dart private parsing code to use "mapped-devices" 
> > back-references and it works as before. We probably need an automated 
> > check to ensure the references between device and IOMMU remain 
> > consistent.
> 
> Circling back to this... again. I've been thinking about this some more
> and have come up with a mix between what Rob, Janne and I had proposed.
> This is how it would look (based on Tegra210):
> 
> 	reserved-memory {
> 		fb: framebuffer@80000000 {
> 			/*
> 			 * Physical memory region that is reserved. If
> 			 * this property is omitted, this region should
> 			 * be treated as an IOVA reservation.
> 			 */
> 			reg = <0x80000000 0x01000000>;
> 
> 			/*
> 			 * Create 1:1 mapping for display controller.
> 			 *
> 			 * Note how instead of the IOMMU reference we
> 			 * actually pass the device reference here. This
> 			 * combines the "mapped-devices" property that
> 			 * was proposed earlier and makes it easier to
> 			 * find the device that needs this mapping. The
> 			 * IOMMU phandle and specifier can be obtained
> 			 * via this backlink to the consumer device.

The reset issue on Apple silicon SoCs needs to be solved in a different 
way, so there is no longer a need to discover the mappings at IOMMU 
probe time.  I'll probably still apply the mappings at probe time 
to have a defined state as early as possible.

The device reference is still required when multiple devices map the same 
memory region, as is the case on the Apple silicon platform.
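
To illustrate why the device reference matters, here is a minimal Python
sketch (plain Python, not kernel code) of selecting the entries that
apply to one consumer when several devices share a region. The tuple
layout, the phandle values and the function name are assumptions made up
for this example; only the <&device address size> shape of an entry
comes from the quoted proposal.

```python
from typing import List, Tuple

# One decoded mapping entry: (device phandle, IOVA, size). Hypothetical layout.
Mapping = Tuple[int, int, int]


def mappings_for_device(entries: List[Mapping], phandle: int) -> List[Tuple[int, int]]:
    """Return the (IOVA, size) pairs whose device reference matches phandle."""
    return [(iova, size) for dev, iova, size in entries if dev == phandle]


# Made-up phandle values for two display controllers sharing one region.
DC0, DC1 = 1, 2
entries = [
    (DC0, 0x80000000, 0x01000000),
    (DC1, 0xB0000000, 0x01000000),
]

for iova, size in mappings_for_device(entries, DC1):
    print(f"dc1: iova={iova:#x} size={size:#x}")
```

Without the device reference in each entry there would be nothing to
filter on, and both controllers would end up with both mappings.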

> 			 *
> 			 * More than one entry could be specified here
> 			 * to allow mappings for multiple devices. This
> 			 * avoids the problem of having multiple nodes
> 			 * with the same name.
> 			 *
> 			 * Could also be "iommu-addresses" as Rob had
> 			 * suggested earlier, but "iommu-mapping" seems
> 			 * a bit more appropriate given that there's
> 			 * also the phandle now.
> 			 */
> 			iommu-mapping = <&dc 0x80000000 0x01000000>;
> 		};
> 	};

This binding covers the needs of the display subsystem of all known 
Apple silicon systems. I'll test our work-in-progress DRM driver with 
this binding when I find the time. I don't foresee any problems though.
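
As a rough illustration of how a consumer of this binding might tell the
cases apart, here is a small Python sketch (not kernel code). It assumes
a node is reduced to an optional (address, size) "reg" pair plus one
(IOVA, size) pair taken from "iommu-mapping"; the function name and the
returned labels are made up for the example. The rule that a node
without "reg" is an IOVA reservation is taken from the comment in the
quoted proposal; treating only an exact match as a 1:1 mapping is an
assumption.

```python
from typing import Optional, Tuple

Region = Tuple[int, int]  # (start address, size in bytes)


def classify(reg: Optional[Region], mapping: Region) -> str:
    """Classify one mapping entry relative to the physical reservation."""
    if reg is None:
        # No physical reservation: only the I/O virtual range is reserved.
        return "iova-reservation"
    if reg == mapping:
        # Same start and size: the 1:1 case from the Tegra210 example.
        return "identity"
    # Anything else maps the reservation at a different IOVA (or partially).
    return "remapped"


print(classify((0x80000000, 0x01000000), (0x80000000, 0x01000000)))
print(classify((0x80000000, 0x01000000), (0xA0000000, 0x01000000)))
print(classify(None, (0xFF000000, 0x01000000)))
```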

I prefer this binding over the previous one with separate 
reserved-memory nodes with an "iommu-mapping" compatible. It's more compact 
and keeps related information in a single place. It avoids the need to 
find multiple names for essentially the same thing.

I'm not sure if I should have considered this a deal breaker for the 
previous proposal. Using reserved memory nodes with an "iommu-mapping" 
compatible string is not backwards compatible with existing u-boot 
installs. This proposal looks safe as the problematic IOVA is hidden 
from all software not using the new "iommu-mapping" property.

> 	mc: memory-controller@70019000 {
> 		...
> 		#iommu-cells = <1>;
> 		...
> 	};
> 
> 	dc: dc@54200000 {
> 		...
> 		iommus = <&mc TEGRA_SWGROUP_DC>;
> 
> 		/*
> 		 * As in earlier proposals, this could be optional if
> 		 * all we need is the IOMMU mapping. It can be specified
> 		 * if there's a need for the driver to use the physical
> 		 * memory region (i.e. to copy out existing framebuffer
> 		 * content and recycle memory).
> 		 */
> 		memory-region = <&fb>;
> 		...
> 	};
> 
> One last remaining question that I have for this is whether we also need
> some sort of #address-cells and #size-cells for the IOMMU which we need
> to determine how many cells the addresses in iommu-mapping need to have.
> I suppose we could derive that from the dma-ranges property somehow,
> since that defines the addressable region of the device that needs the
> mapping.

To my understanding "dma-ranges" is not mandatory when using IOMMUs.  
None of the device trees for Apple silicon devices currently has 
"dma-ranges" and as far as I'm aware all DMA-capable devices on the 
platform are behind an IOMMU. I see no other standardized property in 
the device tree which holds information about the device's/IOMMU's virtual 
address space.
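
To make the cell-count question concrete, here is a minimal Python
sketch (not kernel code) of decoding "iommu-mapping" entries once the
number of address and size cells is known. Where those counts would
come from is exactly the open question, so they are plain parameters
here; the class and function names are made up for the example.

```python
from typing import List, NamedTuple, Sequence


class IommuMapping(NamedTuple):
    phandle: int  # consumer device, e.g. &dc
    iova: int     # start of the I/O virtual range
    size: int     # length of the range in bytes


def join_cells(cells: Sequence[int]) -> int:
    """Combine 32-bit cells, most significant first, into one integer."""
    value = 0
    for cell in cells:
        value = (value << 32) | (cell & 0xFFFFFFFF)
    return value


def decode_iommu_mapping(cells: Sequence[int], addr_cells: int,
                         size_cells: int) -> List[IommuMapping]:
    """Split a flat cell list into (phandle, IOVA, size) entries."""
    stride = 1 + addr_cells + size_cells
    if len(cells) % stride != 0:
        raise ValueError("property length is not a multiple of the entry size")
    entries = []
    for i in range(0, len(cells), stride):
        addr_end = i + 1 + addr_cells
        iova = join_cells(cells[i + 1:addr_end])
        size = join_cells(cells[addr_end:i + stride])
        entries.append(IommuMapping(cells[i], iova, size))
    return entries


# One cell each, as in the Tegra210 example (7 is a made-up phandle value).
print(decode_iommu_mapping([7, 0x80000000, 0x01000000], 1, 1))
```

The same property cannot be decoded without knowing both counts, which
is why some source for them is needed if "dma-ranges" is absent.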

Janne


