iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/6] iommu/arm-smmu: Auxiliary domain and per instance pagetables
@ 2020-01-28 22:16 Jordan Crouse
  2020-01-28 22:16 ` [PATCH v1 1/6] iommu: Add DOMAIN_ATTR_PTBASE Jordan Crouse
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Jordan Crouse @ 2020-01-28 22:16 UTC (permalink / raw)
  To: iommu
  Cc: freedreno, David Airlie, will, Sharat Masetty, robin.murphy,
	dri-devel, linux-kernel, Daniel Vetter, linux-arm-msm, Sean Paul,
	linux-arm-kernel

Some clients have a requirement to sandbox memory mappings for security and
advanced features like SVM. This series adds support to enable per-instance
pagetables as auxiliary domains in the arm-smmu driver and adds per-instance
support for the Adreno GPU.

This patchset builds on the split pagetable support from [1]. In that series the
TTBR1 address space is programmed for the default ("master") domain and enables
support for auxiliary domains. Each new auxiliary domain will allocate a
pagetable which the leaf driver can program through the usual IOMMU APIs. It can
also query the physical address of the pagetable.

In the SMMU driver the first auxiliary domain will enable and program the TTBR0
space. Subsequent auxiliary domains won't touch the hardware. Similarly when
the last auxiliary domain is detached the TTBR0 region will be disabled again.

In the Adreno driver each new file descriptor instance will create a new
auxiliary domain / pagetable and use it for all the memory allocations of that
instance. The driver will query the base address of each pagetable and switch
them dynamically using the built-in table switch capability of the GPU. If any
of these features fail the driver will automatically fall back to using the
default (global) pagetable.

This patchset had previously been submitted as [2] but has been significantly
modified since then.

Jordan

[1] https://lists.linuxfoundation.org/pipermail/iommu/2020-January/041438.html
[2] https://patchwork.freedesktop.org/series/57441/


Jordan Crouse (6):
  iommu: Add DOMAIN_ATTR_PTBASE
  arm/smmu: Add auxiliary domain support for arm-smmuv2
  drm/msm/adreno: ADd support for IOMMU auxiliary domains
  drm/msm: Add support to create target specific address spaces
  drm/msm/gpu: Add ttbr0 to the memptrs
  drm/msm/a6xx: Support per-instance pagetables

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  89 +++++++++++++
 drivers/gpu/drm/msm/msm_drv.c         |  22 +++-
 drivers/gpu/drm/msm/msm_gpu.h         |   2 +
 drivers/gpu/drm/msm/msm_iommu.c       |  72 +++++++++++
 drivers/gpu/drm/msm/msm_mmu.h         |   3 +
 drivers/gpu/drm/msm/msm_ringbuffer.h  |   1 +
 drivers/iommu/arm-smmu.c              | 230 +++++++++++++++++++++++++++++++---
 drivers/iommu/arm-smmu.h              |   3 +
 include/linux/iommu.h                 |   2 +
 9 files changed, 405 insertions(+), 19 deletions(-)

-- 
2.7.4
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v1 1/6] iommu: Add DOMAIN_ATTR_PTBASE
  2020-01-28 22:16 [PATCH v1 0/6] iommu/arm-smmu: Auxiliary domain and per instance pagetables Jordan Crouse
@ 2020-01-28 22:16 ` Jordan Crouse
  2020-01-28 22:16 ` [PATCH v1 2/6] arm/smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Jordan Crouse @ 2020-01-28 22:16 UTC (permalink / raw)
  To: iommu; +Cc: robin.murphy, linux-kernel, linux-arm-msm, will, linux-arm-kernel

Add an attribute to return the base address of the pagetable. This is used
by auxiliary domains from arm-smmu to return the address of the pagetable
to the domain so that it can set the appropriate pagetable through it's
own means.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 include/linux/iommu.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index b14398b..0e9bcd9 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -128,6 +128,8 @@ enum iommu_attr {
 	DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
 	/* Enable split pagetables (for example, TTBR1 on arm-smmu) */
 	DOMAIN_ATTR_SPLIT_TABLES,
+	/* Return the pagetable base address */
+	DOMAIN_ATTR_PTBASE,
 	DOMAIN_ATTR_MAX,
 };
 
-- 
2.7.4
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v1 2/6] arm/smmu: Add auxiliary domain support for arm-smmuv2
  2020-01-28 22:16 [PATCH v1 0/6] iommu/arm-smmu: Auxiliary domain and per instance pagetables Jordan Crouse
  2020-01-28 22:16 ` [PATCH v1 1/6] iommu: Add DOMAIN_ATTR_PTBASE Jordan Crouse
@ 2020-01-28 22:16 ` Jordan Crouse
  2020-03-18 22:48   ` Will Deacon
  2020-01-28 22:16 ` [PATCH v1 3/6] drm/msm/adreno: ADd support for IOMMU auxiliary domains Jordan Crouse
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Jordan Crouse @ 2020-01-28 22:16 UTC (permalink / raw)
  To: iommu; +Cc: robin.murphy, linux-kernel, linux-arm-msm, will, linux-arm-kernel

Support auxiliary domains for arm-smmu-v2 to initialize and support
multiple pagetables for a single SMMU context bank. Since the smmu-v2
hardware doesn't have any built in support for switching the pagetable
base it is left as an exercise to the caller to actually use the pagetable.

Aux domains are supported if split pagetable (TTBR1) support has been
enabled on the master domain.  Each auxiliary domain will reuse the
configuration of the master domain. By default the a domain with TTBR1
support will have the TTBR0 region disabled so the first attached aux
domain will enable the TTBR0 region in the hardware and conversely the
last domain to be detached will disable TTBR0 translations.  All subsequent
auxiliary domains create a pagetable but not touch the hardware.

The leaf driver will be able to query the physical address of the
pagetable with the DOMAIN_ATTR_PTBASE attribute so that it can use the
address with whatever means it has to switch the pagetable base.

Following is a pseudo code example of how a domain can be created

 /* Check to see if aux domains are supported */
 if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
	 iommu = iommu_domain_alloc(...);

	 if (iommu_aux_attach_device(domain, dev))
		 return FAIL;

	/* Save the base address of the pagetable for use by the driver
	iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
 }

Then 'domain' can be used like any other iommu domain to map and
unmap iova addresses in the pagetable.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/iommu/arm-smmu.c | 230 +++++++++++++++++++++++++++++++++++++++++++----
 drivers/iommu/arm-smmu.h |   3 +
 2 files changed, 217 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 23b22fa..85a6773 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -91,6 +91,8 @@ struct arm_smmu_cb {
 	u32				tcr[2];
 	u32				mair[2];
 	struct arm_smmu_cfg		*cfg;
+	atomic_t			aux;
+	atomic_t			refcount;
 };
 
 struct arm_smmu_master_cfg {
@@ -533,6 +535,7 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
 	bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS;
 
+	atomic_inc(&cb->refcount);
 	cb->cfg = cfg;
 
 	/* TCR */
@@ -671,6 +674,91 @@ static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
 	arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
 }
 
+/*
+ * Update the context context bank to enable TTBR0. Assumes AARCH64 S1
+ * configuration.
+ */
+static void arm_smmu_context_set_ttbr0(struct arm_smmu_cb *cb,
+		struct io_pgtable_cfg *pgtbl_cfg)
+{
+	u32 tcr = cb->tcr[0];
+
+	/* Add the TCR configuration from the new pagetable config */
+	tcr |= arm_smmu_lpae_tcr(pgtbl_cfg);
+
+	/* Make sure that both TTBR0 and TTBR1 are enabled */
+	tcr &= ~(ARM_SMMU_TCR_EPD0 | ARM_SMMU_TCR_EPD1);
+
+	/* Udate the TCR register */
+	cb->tcr[0] = tcr;
+
+	/* Program the new TTBR0 */
+	cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
+	cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
+}
+
+/*
+ * Thus function assumes that the current model only allows aux domains for
+ * AARCH64 S1 configurations
+ */
+static int arm_smmu_aux_init_domain_context(struct iommu_domain *domain,
+		struct arm_smmu_device *smmu, struct arm_smmu_cfg *master)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct io_pgtable_ops *pgtbl_ops;
+	struct io_pgtable_cfg pgtbl_cfg;
+
+	mutex_lock(&smmu_domain->init_mutex);
+
+	/* Copy the configuration from the master */
+	memcpy(&smmu_domain->cfg, master, sizeof(smmu_domain->cfg));
+
+	smmu_domain->flush_ops = &arm_smmu_s1_tlb_ops;
+	smmu_domain->smmu = smmu;
+
+	pgtbl_cfg = (struct io_pgtable_cfg) {
+		.pgsize_bitmap = smmu->pgsize_bitmap,
+		.ias = smmu->va_size,
+		.oas = smmu->ipa_size,
+		.coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENT_WALK,
+		.tlb = smmu_domain->flush_ops,
+		.iommu_dev = smmu->dev,
+		.quirks = 0,
+	};
+
+	if (smmu_domain->non_strict)
+		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
+
+	pgtbl_ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1, &pgtbl_cfg,
+		smmu_domain);
+	if (!pgtbl_ops) {
+		mutex_unlock(&smmu_domain->init_mutex);
+		return -ENOMEM;
+	}
+
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+
+	domain->geometry.aperture_end = (1UL << smmu->va_size) - 1;
+	domain->geometry.force_aperture = true;
+
+	/*
+	 * Add a reference to the context bank to make sure it stays programmed
+	 * wile we are attached
+	 */
+	atomic_inc(&smmu->cbs[master->cbndx].refcount);
+
+	/* enable TTBR0 when the the first aux domain is attached */
+	if (atomic_inc_return(&smmu->cbs[master->cbndx].aux) == 1) {
+		arm_smmu_context_set_ttbr0(&smmu->cbs[master->cbndx],
+			&pgtbl_cfg);
+		arm_smmu_write_context_bank(smmu, master->cbndx);
+	}
+
+	smmu_domain->ptbase = pgtbl_cfg.arm_lpae_s1_cfg.ttbr;
+	smmu_domain->pgtbl_ops = pgtbl_ops;
+	return 0;
+}
+
 static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 					struct arm_smmu_device *smmu)
 {
@@ -882,36 +970,70 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	return ret;
 }
 
+static void
+arm_smmu_destroy_aux_domain_context(struct arm_smmu_domain *smmu_domain)
+{
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	int ret;
+
+	/*
+	 * If this is the last aux domain to be freed, disable TTBR0 by turning
+	 * off translations and clearing TTBR0
+	 */
+	if (atomic_dec_return(&smmu->cbs[cfg->cbndx].aux) == 0) {
+		/* Clear out the T0 region */
+		smmu->cbs[cfg->cbndx].tcr[0] &= ~GENMASK(15, 0);
+		/* Disable TTBR0 translations */
+		smmu->cbs[cfg->cbndx].tcr[0] |= ARM_SMMU_TCR_EPD0;
+		/* Clear the TTBR0 pagetable address */
+		smmu->cbs[cfg->cbndx].ttbr[0] =
+			FIELD_PREP(ARM_SMMU_TTBRn_ASID, cfg->asid);
+
+		ret = arm_smmu_rpm_get(smmu);
+		if (!ret) {
+			arm_smmu_write_context_bank(smmu, cfg->cbndx);
+			arm_smmu_rpm_put(smmu);
+		}
+	}
+
+}
+
 static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 {
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-	int ret, irq;
 
 	if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)
 		return;
 
-	ret = arm_smmu_rpm_get(smmu);
-	if (ret < 0)
-		return;
+	if (smmu_domain->aux)
+		arm_smmu_destroy_aux_domain_context(smmu_domain);
 
-	/*
-	 * Disable the context bank and free the page tables before freeing
-	 * it.
-	 */
-	smmu->cbs[cfg->cbndx].cfg = NULL;
-	arm_smmu_write_context_bank(smmu, cfg->cbndx);
+	/* Check if the last user is done with the context bank */
+	if (atomic_dec_return(&smmu->cbs[cfg->cbndx].refcount) == 0) {
+		int ret = arm_smmu_rpm_get(smmu);
+		int irq;
 
-	if (cfg->irptndx != ARM_SMMU_INVALID_IRPTNDX) {
-		irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
-		devm_free_irq(smmu->dev, irq, domain);
+		if (ret < 0)
+			return;
+
+		/* Disable the context bank */
+		smmu->cbs[cfg->cbndx].cfg = NULL;
+		arm_smmu_write_context_bank(smmu, cfg->cbndx);
+
+		if (cfg->irptndx != ARM_SMMU_INVALID_IRPTNDX) {
+			irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
+			devm_free_irq(smmu->dev, irq, domain);
+		}
+
+		__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
+		arm_smmu_rpm_put(smmu);
 	}
 
+	/* Destroy the pagetable */
 	free_io_pgtable_ops(smmu_domain->pgtbl_ops);
-	__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
-
-	arm_smmu_rpm_put(smmu);
 }
 
 static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
@@ -1181,6 +1303,73 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 	return 0;
 }
 
+static bool arm_smmu_dev_has_feat(struct device *dev,
+		enum iommu_dev_features feat)
+{
+	if (feat != IOMMU_DEV_FEAT_AUX)
+		return false;
+
+	return true;
+}
+
+static int arm_smmu_dev_enable_feat(struct device *dev,
+		enum iommu_dev_features feat)
+{
+	/* aux domain support is always available */
+	if (feat == IOMMU_DEV_FEAT_AUX)
+		return 0;
+
+	return -ENODEV;
+}
+
+static int arm_smmu_dev_disable_feat(struct device *dev,
+		enum iommu_dev_features feat)
+{
+	return -EBUSY;
+}
+
+static int arm_smmu_aux_attach_dev(struct iommu_domain *domain,
+		struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+	struct arm_smmu_device *smmu = fwspec_smmu(fwspec);
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_cb *cb;
+	int idx, i, ret, cbndx = -1;
+
+	/* Try to find the context bank configured for this device */
+	for_each_cfg_sme(fwspec, i, idx) {
+		if (idx != INVALID_SMENDX) {
+			cbndx = smmu->s2crs[idx].cbndx;
+			break;
+		}
+	}
+
+	if (cbndx == -1)
+		return -ENODEV;
+
+	cb = &smmu->cbs[cbndx];
+
+	/* Aux domains are only supported for AARCH64 configurations */
+	if (cb->cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64)
+		return -EINVAL;
+
+	/* Make sure that TTBR1 is enabled in the hardware */
+	if ((cb->tcr[0] & ARM_SMMU_TCR_EPD1))
+		return -EINVAL;
+
+	smmu_domain->aux = true;
+
+	ret = arm_smmu_rpm_get(smmu);
+	if (ret < 0)
+		return ret;
+
+	ret = arm_smmu_aux_init_domain_context(domain, smmu, cb->cfg);
+
+	arm_smmu_rpm_put(smmu);
+	return ret;
+}
+
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret;
@@ -1549,6 +1738,11 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 		case DOMAIN_ATTR_SPLIT_TABLES:
 			*(int *)data = smmu_domain->split_pagetables;
 			return 0;
+		case DOMAIN_ATTR_PTBASE:
+			if (!smmu_domain->aux)
+				return -EINVAL;
+			*(u64 *)data = smmu_domain->ptbase;
+			return 0;
 		default:
 			return -ENODEV;
 		}
@@ -1667,6 +1861,10 @@ static struct iommu_ops arm_smmu_ops = {
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
 	.put_resv_regions	= generic_iommu_put_resv_regions,
+	.dev_has_feat		= arm_smmu_dev_has_feat,
+	.dev_enable_feat	= arm_smmu_dev_enable_feat,
+	.dev_disable_feat	= arm_smmu_dev_disable_feat,
+	.aux_attach_dev		= arm_smmu_aux_attach_dev,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 };
 
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index 53053fd..8266b29 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -345,6 +345,9 @@ struct arm_smmu_domain {
 	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
 	struct iommu_domain		domain;
 	bool				split_pagetables;
+	bool				aux;
+	/* Shadow of the pagetable base for aux domains */
+	u64				ptbase;
 };
 
 static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
-- 
2.7.4
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v1 3/6] drm/msm/adreno: ADd support for IOMMU auxiliary domains
  2020-01-28 22:16 [PATCH v1 0/6] iommu/arm-smmu: Auxiliary domain and per instance pagetables Jordan Crouse
  2020-01-28 22:16 ` [PATCH v1 1/6] iommu: Add DOMAIN_ATTR_PTBASE Jordan Crouse
  2020-01-28 22:16 ` [PATCH v1 2/6] arm/smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
@ 2020-01-28 22:16 ` Jordan Crouse
  2020-01-28 22:16 ` [PATCH v1 4/6] drm/msm: Add support to create target specific address spaces Jordan Crouse
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Jordan Crouse @ 2020-01-28 22:16 UTC (permalink / raw)
  To: iommu
  Cc: freedreno, David Airlie, will, robin.murphy, dri-devel,
	linux-kernel, Daniel Vetter, linux-arm-msm, Sean Paul,
	linux-arm-kernel

Add support for creating a auxiliary domain from the IOMMU device to
implement per-instance pagetables. Also add a helper function to
return the pagetable base address (ttbr) and asid to the caller so
that the GPU target code can set up the pagetable switch.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/msm_iommu.c | 72 +++++++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/msm/msm_mmu.h   |  3 ++
 2 files changed, 75 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index e773ef8..df0d70a 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -7,9 +7,17 @@
 #include "msm_drv.h"
 #include "msm_mmu.h"
 
+/*
+ * It is up to us to assign ASIDS for our instances. Start at 32 to give a
+ * cushion to account for ASIDS assigned to real context banks
+ */
+static int msm_iommu_asid = 32;
+
 struct msm_iommu {
 	struct msm_mmu base;
 	struct iommu_domain *domain;
+	u64 ttbr;
+	int asid;
 };
 #define to_msm_iommu(x) container_of(x, struct msm_iommu, base)
 
@@ -58,6 +66,20 @@ static void msm_iommu_destroy(struct msm_mmu *mmu)
 	kfree(iommu);
 }
 
+static void msm_iommu_aux_detach(struct msm_mmu *mmu)
+{
+	struct msm_iommu *iommu = to_msm_iommu(mmu);
+
+	iommu_aux_detach_device(iommu->domain, mmu->dev);
+}
+
+static const struct msm_mmu_funcs aux_funcs = {
+		.detach = msm_iommu_aux_detach,
+		.map = msm_iommu_map,
+		.unmap = msm_iommu_unmap,
+		.destroy = msm_iommu_destroy,
+};
+
 static const struct msm_mmu_funcs funcs = {
 		.detach = msm_iommu_detach,
 		.map = msm_iommu_map,
@@ -65,6 +87,56 @@ static const struct msm_mmu_funcs funcs = {
 		.destroy = msm_iommu_destroy,
 };
 
+bool msm_iommu_get_ptinfo(struct msm_mmu *mmu, u64 *ttbr, u32 *asid)
+{
+	struct msm_iommu *iommu = to_msm_iommu(mmu);
+
+	if (!iommu->ttbr)
+		return false;
+
+	if (ttbr)
+		*ttbr = iommu->ttbr;
+	if (asid)
+		*asid = iommu->asid;
+
+	return true;
+}
+
+struct msm_mmu *msm_iommu_new_instance(struct device *dev,
+		struct iommu_domain *domain)
+{
+	struct msm_iommu *iommu;
+	u64 ptbase;
+	int ret;
+
+	ret = iommu_aux_attach_device(domain, dev);
+	if (ret)
+		return ERR_PTR(ret);
+
+	ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
+	if (ret) {
+		iommu_aux_detach_device(domain, dev);
+		return ERR_PTR(ret);
+	}
+
+	iommu = kzalloc(sizeof(*iommu), GFP_KERNEL);
+	if (!iommu) {
+		iommu_aux_detach_device(domain, dev);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	iommu->domain = domain;
+	iommu->ttbr = ptbase;
+	iommu->asid = msm_iommu_asid++;
+
+	if (msm_iommu_asid > 0xff)
+		msm_iommu_asid = 32;
+
+	msm_mmu_init(&iommu->base, dev, &aux_funcs);
+
+	return &iommu->base;
+}
+
 struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain)
 {
 	struct msm_iommu *iommu;
diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
index bae9e8e..65a5cb2 100644
--- a/drivers/gpu/drm/msm/msm_mmu.h
+++ b/drivers/gpu/drm/msm/msm_mmu.h
@@ -32,6 +32,9 @@ static inline void msm_mmu_init(struct msm_mmu *mmu, struct device *dev,
 }
 
 struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain);
+struct msm_mmu *msm_iommu_new_instance(struct device *dev,
+		struct iommu_domain *domain);
+bool msm_iommu_get_ptinfo(struct msm_mmu *mmu, u64 *ttbr, u32 *asid);
 struct msm_mmu *msm_gpummu_new(struct device *dev, struct msm_gpu *gpu);
 
 static inline void msm_mmu_set_fault_handler(struct msm_mmu *mmu, void *arg,
-- 
2.7.4
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v1 4/6] drm/msm: Add support to create target specific address spaces
  2020-01-28 22:16 [PATCH v1 0/6] iommu/arm-smmu: Auxiliary domain and per instance pagetables Jordan Crouse
                   ` (2 preceding siblings ...)
  2020-01-28 22:16 ` [PATCH v1 3/6] drm/msm/adreno: ADd support for IOMMU auxiliary domains Jordan Crouse
@ 2020-01-28 22:16 ` Jordan Crouse
  2020-01-28 22:16 ` [PATCH v1 5/6] drm/msm/gpu: Add ttbr0 to the memptrs Jordan Crouse
  2020-01-28 22:16 ` [PATCH v1 6/6] drm/msm/a6xx: Support per-instance pagetables Jordan Crouse
  5 siblings, 0 replies; 15+ messages in thread
From: Jordan Crouse @ 2020-01-28 22:16 UTC (permalink / raw)
  To: iommu
  Cc: freedreno, David Airlie, will, robin.murphy, dri-devel,
	linux-kernel, Daniel Vetter, linux-arm-msm, Sean Paul,
	linux-arm-kernel

Add support to create a GPU target specific address space for
a context. For those targets that support per-instance
pagetables they will return a new address space set up for
the instance if possible otherwise just use the global
device pagetable.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/msm_drv.c | 22 +++++++++++++++++++---
 drivers/gpu/drm/msm/msm_gpu.h |  2 ++
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index e4b750b..e485dc1 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -585,6 +585,18 @@ static void load_gpu(struct drm_device *dev)
 	mutex_unlock(&init_lock);
 }
 
+static struct msm_gem_address_space *context_address_space(struct msm_gpu *gpu)
+{
+	if (!gpu)
+		return NULL;
+
+	if (gpu->funcs->create_instance_space)
+		return gpu->funcs->create_instance_space(gpu);
+
+	/* If all else fails use the default global space */
+	return gpu->aspace;
+}
+
 static int context_init(struct drm_device *dev, struct drm_file *file)
 {
 	struct msm_drm_private *priv = dev->dev_private;
@@ -596,7 +608,7 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
 
 	msm_submitqueue_init(dev, ctx);
 
-	ctx->aspace = priv->gpu ? priv->gpu->aspace : NULL;
+	ctx->aspace = context_address_space(priv->gpu);
 	file->driver_priv = ctx;
 
 	return 0;
@@ -612,8 +624,12 @@ static int msm_open(struct drm_device *dev, struct drm_file *file)
 	return context_init(dev, file);
 }
 
-static void context_close(struct msm_file_private *ctx)
+static void context_close(struct msm_drm_private *priv,
+		struct msm_file_private *ctx)
 {
+	if (priv->gpu && ctx->aspace != priv->gpu->aspace)
+		msm_gem_address_space_put(ctx->aspace);
+
 	msm_submitqueue_close(ctx);
 	kfree(ctx);
 }
@@ -628,7 +644,7 @@ static void msm_postclose(struct drm_device *dev, struct drm_file *file)
 		priv->lastctx = NULL;
 	mutex_unlock(&dev->struct_mutex);
 
-	context_close(ctx);
+	context_close(priv, ctx);
 }
 
 static irqreturn_t msm_irq(int irq, void *arg)
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index d496b68..76636da 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -64,6 +64,8 @@ struct msm_gpu_funcs {
 	void (*gpu_set_freq)(struct msm_gpu *gpu, unsigned long freq);
 	struct msm_gem_address_space *(*create_address_space)
 		(struct msm_gpu *gpu, struct platform_device *pdev);
+	struct msm_gem_address_space *(*create_instance_space)
+		(struct msm_gpu *gpu);
 };
 
 struct msm_gpu {
-- 
2.7.4
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v1 5/6] drm/msm/gpu: Add ttbr0 to the memptrs
  2020-01-28 22:16 [PATCH v1 0/6] iommu/arm-smmu: Auxiliary domain and per instance pagetables Jordan Crouse
                   ` (3 preceding siblings ...)
  2020-01-28 22:16 ` [PATCH v1 4/6] drm/msm: Add support to create target specific address spaces Jordan Crouse
@ 2020-01-28 22:16 ` Jordan Crouse
  2020-01-28 22:16 ` [PATCH v1 6/6] drm/msm/a6xx: Support per-instance pagetables Jordan Crouse
  5 siblings, 0 replies; 15+ messages in thread
From: Jordan Crouse @ 2020-01-28 22:16 UTC (permalink / raw)
  To: iommu
  Cc: freedreno, David Airlie, will, robin.murphy, dri-devel,
	linux-kernel, Daniel Vetter, linux-arm-msm, Sean Paul,
	linux-arm-kernel

Targets that support per-instance pagetable switching will have to keep
track of which pagetable belongs to each instance to be able to recover
for preemption.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/msm_ringbuffer.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
index 7764373..c5822bd 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.h
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
@@ -29,6 +29,7 @@ struct msm_gpu_submit_stats {
 struct msm_rbmemptrs {
 	volatile uint32_t rptr;
 	volatile uint32_t fence;
+	volatile uint64_t ttbr0;
 
 	volatile struct msm_gpu_submit_stats stats[MSM_GPU_SUBMIT_STATS_COUNT];
 };
-- 
2.7.4
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v1 6/6] drm/msm/a6xx: Support per-instance pagetables
  2020-01-28 22:16 [PATCH v1 0/6] iommu/arm-smmu: Auxiliary domain and per instance pagetables Jordan Crouse
                   ` (4 preceding siblings ...)
  2020-01-28 22:16 ` [PATCH v1 5/6] drm/msm/gpu: Add ttbr0 to the memptrs Jordan Crouse
@ 2020-01-28 22:16 ` Jordan Crouse
  5 siblings, 0 replies; 15+ messages in thread
From: Jordan Crouse @ 2020-01-28 22:16 UTC (permalink / raw)
  To: iommu
  Cc: freedreno, David Airlie, will, Sharat Masetty, robin.murphy,
	dri-devel, linux-kernel, Daniel Vetter, linux-arm-msm, Sean Paul,
	linux-arm-kernel

Add support for per-instance pagetables for a6xx targets. Add support
to handle split pagetables and create a new instance if the needed
IOMMU support exists and insert the necessary PM4 commands to trigger
a pagetable switch at the beginning of a user command.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 89 +++++++++++++++++++++++++++++++++++
 1 file changed, 89 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 9bec603c..e1a257e 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -12,6 +12,62 @@
 
 #define GPU_PAS_ID 13
 
+static void a6xx_set_pagetable(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
+		struct msm_file_private *ctx)
+{
+	u64 ttbr;
+	u32 asid;
+
+	if (!msm_iommu_get_ptinfo(ctx->aspace->mmu, &ttbr, &asid))
+		return;
+
+	ttbr = ttbr | ((u64) asid) << 48;
+
+	/* Turn off protected mode */
+	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+	OUT_RING(ring, 0);
+
+	/* Turn on APIV mode to access critical regions */
+	OUT_PKT4(ring, REG_A6XX_CP_MISC_CNTL, 1);
+	OUT_RING(ring, 1);
+
+	/* Make sure the ME is synchronized before staring the update */
+	OUT_PKT7(ring, CP_WAIT_FOR_ME, 0);
+
+	/* Execute the table update */
+	OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4);
+	OUT_RING(ring, lower_32_bits(ttbr));
+	OUT_RING(ring, upper_32_bits(ttbr));
+	/* CONTEXTIDR is currently unused */
+	OUT_RING(ring, 0);
+	/* CONTEXTBANK is currently unused */
+	OUT_RING(ring, 0);
+
+	/*
+	 * Write the new TTBR0 to the preemption records - this will be used to
+	 * reload the pagetable if the current ring gets preempted out.
+	 */
+	OUT_PKT7(ring, CP_MEM_WRITE, 4);
+	OUT_RING(ring, lower_32_bits(rbmemptr(ring, ttbr0)));
+	OUT_RING(ring, upper_32_bits(rbmemptr(ring, ttbr0)));
+	OUT_RING(ring, lower_32_bits(ttbr));
+	OUT_RING(ring, upper_32_bits(ttbr));
+
+	/* Invalidate the draw state so we start off fresh */
+	OUT_PKT7(ring, CP_SET_DRAW_STATE, 3);
+	OUT_RING(ring, 0x40000);
+	OUT_RING(ring, 1);
+	OUT_RING(ring, 0);
+
+	/* Turn off APRIV */
+	OUT_PKT4(ring, REG_A6XX_CP_MISC_CNTL, 1);
+	OUT_RING(ring, 0);
+
+	/* Turn off protected mode */
+	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+	OUT_RING(ring, 1);
+}
+
 static inline bool _a6xx_check_idle(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -89,6 +145,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 	struct msm_ringbuffer *ring = submit->ring;
 	unsigned int i;
 
+	a6xx_set_pagetable(gpu, ring, ctx);
+
 	get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO,
 		rbmemptr_stats(ring, index, cpcycles_start));
 
@@ -878,6 +936,36 @@ static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
 	return (unsigned long)busy_time;
 }
 
+static struct msm_gem_address_space*
+a6xx_create_instance_space(struct msm_gpu *gpu)
+{
+	struct msm_gem_address_space *aspace;
+	struct iommu_domain *iommu;
+	struct msm_mmu *mmu;
+
+	if (!iommu_dev_has_feature(&gpu->pdev->dev, IOMMU_DEV_FEAT_AUX))
+		return gpu->aspace;
+
+	iommu = iommu_domain_alloc(&platform_bus_type);
+	if (!iommu)
+		return gpu->aspace;
+
+	mmu = msm_iommu_new_instance(&gpu->pdev->dev, iommu);
+	if (IS_ERR(mmu)) {
+		iommu_domain_free(iommu);
+		return gpu->aspace;
+	}
+
+	aspace = msm_gem_address_space_create(mmu, "gpu",
+		0x100000000ULL, 0x1ffffffffULL);
+	if (IS_ERR(aspace)) {
+		mmu->funcs->destroy(mmu);
+		return gpu->aspace;
+	}
+
+	return aspace;
+}
+
 static struct msm_gem_address_space *
 a6xx_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev)
 {
@@ -951,6 +1039,7 @@ static const struct adreno_gpu_funcs funcs = {
 		.gpu_state_put = a6xx_gpu_state_put,
 #endif
 		.create_address_space = a6xx_create_address_space,
+		.create_instance_space = a6xx_create_instance_space,
 	},
 	.get_timestamp = a6xx_get_timestamp,
 };
-- 
2.7.4
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/6] arm/smmu: Add auxiliary domain support for arm-smmuv2
  2020-01-28 22:16 ` [PATCH v1 2/6] arm/smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
@ 2020-03-18 22:48   ` Will Deacon
  2020-03-18 23:43     ` Rob Clark
  0 siblings, 1 reply; 15+ messages in thread
From: Will Deacon @ 2020-03-18 22:48 UTC (permalink / raw)
  To: Jordan Crouse
  Cc: linux-arm-msm, linux-kernel, iommu, robin.murphy, linux-arm-kernel

On Tue, Jan 28, 2020 at 03:16:06PM -0700, Jordan Crouse wrote:
> Support auxiliary domains for arm-smmu-v2 to initialize and support
> multiple pagetables for a single SMMU context bank. Since the smmu-v2
> hardware doesn't have any built in support for switching the pagetable
> base it is left as an exercise to the caller to actually use the pagetable.
> 
> Aux domains are supported if split pagetable (TTBR1) support has been
> enabled on the master domain.  Each auxiliary domain will reuse the
> configuration of the master domain. By default the a domain with TTBR1
> support will have the TTBR0 region disabled so the first attached aux
> domain will enable the TTBR0 region in the hardware and conversely the
> last domain to be detached will disable TTBR0 translations.  All subsequent
> auxiliary domains create a pagetable but not touch the hardware.
> 
> The leaf driver will be able to query the physical address of the
> pagetable with the DOMAIN_ATTR_PTBASE attribute so that it can use the
> address with whatever means it has to switch the pagetable base.
> 
> Following is a pseudo code example of how a domain can be created
> 
>  /* Check to see if aux domains are supported */
>  if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
> 	 iommu = iommu_domain_alloc(...);
> 
> 	 if (iommu_aux_attach_device(domain, dev))
> 		 return FAIL;
> 
> 	/* Save the base address of the pagetable for use by the driver
> 	iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
>  }

I'm not really understanding what the pagetable base gets used for here and,
to be honest with you, the whole thing feels like a huge layering violation
with the way things are structured today. Why doesn't the caller just
interface with io-pgtable directly?

Finally, if we need to support context-switching TTBR0 for a live domain
then that code really needs to live inside the SMMU driver because the
ASID and TLB management necessary to do that safely doesn't belong anywhere
else.

Will
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/6] arm/smmu: Add auxiliary domain support for arm-smmuv2
  2020-03-18 22:48   ` Will Deacon
@ 2020-03-18 23:43     ` Rob Clark
  2020-03-19 15:23       ` Jordan Crouse
  2020-05-18 15:18       ` Will Deacon
  0 siblings, 2 replies; 15+ messages in thread
From: Rob Clark @ 2020-03-18 23:43 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-msm, Linux Kernel Mailing List,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>, ,
	Robin Murphy,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

On Wed, Mar 18, 2020 at 3:48 PM Will Deacon <will@kernel.org> wrote:
>
> On Tue, Jan 28, 2020 at 03:16:06PM -0700, Jordan Crouse wrote:
> > Support auxiliary domains for arm-smmu-v2 to initialize and support
> > multiple pagetables for a single SMMU context bank. Since the smmu-v2
> > hardware doesn't have any built in support for switching the pagetable
> > base it is left as an exercise to the caller to actually use the pagetable.
> >
> > Aux domains are supported if split pagetable (TTBR1) support has been
> > enabled on the master domain.  Each auxiliary domain will reuse the
> > configuration of the master domain. By default the a domain with TTBR1
> > support will have the TTBR0 region disabled so the first attached aux
> > domain will enable the TTBR0 region in the hardware and conversely the
> > last domain to be detached will disable TTBR0 translations.  All subsequent
> > auxiliary domains create a pagetable but not touch the hardware.
> >
> > The leaf driver will be able to query the physical address of the
> > pagetable with the DOMAIN_ATTR_PTBASE attribute so that it can use the
> > address with whatever means it has to switch the pagetable base.
> >
> > Following is a pseudo code example of how a domain can be created
> >
> >  /* Check to see if aux domains are supported */
> >  if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
> >        iommu = iommu_domain_alloc(...);
> >
> >        if (iommu_aux_attach_device(domain, dev))
> >                return FAIL;
> >
> >       /* Save the base address of the pagetable for use by the driver
> >       iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
> >  }
>
> I'm not really understanding what the pagetable base gets used for here and,
> to be honest with you, the whole thing feels like a huge layering violation
> with the way things are structured today. Why doesn't the caller just
> interface with io-pgtable directly?
>
> Finally, if we need to support context-switching TTBR0 for a live domain
> then that code really needs to live inside the SMMU driver because the
> ASID and TLB management necessary to do that safely doesn't belong anywhere
> else.

Hi Will,

We do in fact need live domain switching, that is really the whole
point.  The GPU CP (command processor/parser) is directly updating
TTBR0 and triggering TLB flush, asynchronously from the CPU.

And I think the answer about ASID is easy (on current hw).. it must be zero[*].

BR,
-R

[*] my rough theory/plan there, and to solve the issue with drm/msm
getting dma-iommu ops when it really would rather not (since
blacklisting idea wasn't popular and I couldn't figure out a way to
deal with case where device gets attached before driver shows up) is
to invent some API that drm/msm can call to unhook the dma-iommu ops
and detatch the DMA domain.  Hopefully that at least gets us closer to
the point where, when drm/msm attaches it's UNMANAGED domain, we get
cbidx/asid zero.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/6] arm/smmu: Add auxiliary domain support for arm-smmuv2
  2020-03-18 23:43     ` Rob Clark
@ 2020-03-19 15:23       ` Jordan Crouse
  2020-05-18 15:18       ` Will Deacon
  1 sibling, 0 replies; 15+ messages in thread
From: Jordan Crouse @ 2020-03-19 15:23 UTC (permalink / raw)
  To: Rob Clark
  Cc: linux-arm-msm, Linux Kernel Mailing List,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>, ,
	Will Deacon,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE,
	Robin Murphy

On Wed, Mar 18, 2020 at 04:43:07PM -0700, Rob Clark wrote:
> On Wed, Mar 18, 2020 at 3:48 PM Will Deacon <will@kernel.org> wrote:
> >
> > On Tue, Jan 28, 2020 at 03:16:06PM -0700, Jordan Crouse wrote:
> > > Support auxiliary domains for arm-smmu-v2 to initialize and support
> > > multiple pagetables for a single SMMU context bank. Since the smmu-v2
> > > hardware doesn't have any built in support for switching the pagetable
> > > base it is left as an exercise to the caller to actually use the pagetable.
> > >
> > > Aux domains are supported if split pagetable (TTBR1) support has been
> > > enabled on the master domain.  Each auxiliary domain will reuse the
> > > configuration of the master domain. By default the a domain with TTBR1
> > > support will have the TTBR0 region disabled so the first attached aux
> > > domain will enable the TTBR0 region in the hardware and conversely the
> > > last domain to be detached will disable TTBR0 translations.  All subsequent
> > > auxiliary domains create a pagetable but not touch the hardware.
> > >
> > > The leaf driver will be able to query the physical address of the
> > > pagetable with the DOMAIN_ATTR_PTBASE attribute so that it can use the
> > > address with whatever means it has to switch the pagetable base.
> > >
> > > Following is a pseudo code example of how a domain can be created
> > >
> > >  /* Check to see if aux domains are supported */
> > >  if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
> > >        iommu = iommu_domain_alloc(...);
> > >
> > >        if (iommu_aux_attach_device(domain, dev))
> > >                return FAIL;
> > >
> > >       /* Save the base address of the pagetable for use by the driver
> > >       iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
> > >  }
> >
> > I'm not really understanding what the pagetable base gets used for here and,
> > to be honest with you, the whole thing feels like a huge layering violation
> > with the way things are structured today. Why doesn't the caller just
> > interface with io-pgtable directly?
> >
> > Finally, if we need to support context-switching TTBR0 for a live domain
> > then that code really needs to live inside the SMMU driver because the
> > ASID and TLB management necessary to do that safely doesn't belong anywhere
> > else.
> 
> Hi Will,
> 
> We do in fact need live domain switching, that is really the whole
> point.  The GPU CP (command processor/parser) is directly updating
> TTBR0 and triggering TLB flush, asynchronously from the CPU.

Right. This is entirely done in hardware with a GPU that has complete access to
the context bank registers. All the driver does is send the PTBASE to the
command stream see [1] and especially [2] (look for CP_SMMU_TABLE_UPDATE).

As for interacting with the io-pgtable directly I would love to do that but it
would need some new infrastructure to either pull the io-pgtable from the aux
domain or to create an io-pgtable ourselves and pass it for use by the aux
domain. I'm not sure if that is better for the layering violation.

> And I think the answer about ASID is easy (on current hw).. it must be zero[*].

Right now the GPU microcode still uses TLBIALL. I want to assign each new aux
domain its own ASID in the hopes that we could some day change that but for now
having a uinque ASID doesn't help.

Jordan

[1] https://patchwork.freedesktop.org/patch/351089/
[2] https://patchwork.freedesktop.org/patch/351090/

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/6] arm/smmu: Add auxiliary domain support for arm-smmuv2
  2020-03-18 23:43     ` Rob Clark
  2020-03-19 15:23       ` Jordan Crouse
@ 2020-05-18 15:18       ` Will Deacon
  2020-05-18 15:50         ` Rob Clark
  1 sibling, 1 reply; 15+ messages in thread
From: Will Deacon @ 2020-05-18 15:18 UTC (permalink / raw)
  To: Rob Clark
  Cc: linux-arm-msm, Linux Kernel Mailing List,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>, ,
	Robin Murphy,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

On Wed, Mar 18, 2020 at 04:43:07PM -0700, Rob Clark wrote:
> On Wed, Mar 18, 2020 at 3:48 PM Will Deacon <will@kernel.org> wrote:
> > On Tue, Jan 28, 2020 at 03:16:06PM -0700, Jordan Crouse wrote:
> > > Support auxiliary domains for arm-smmu-v2 to initialize and support
> > > multiple pagetables for a single SMMU context bank. Since the smmu-v2
> > > hardware doesn't have any built in support for switching the pagetable
> > > base it is left as an exercise to the caller to actually use the pagetable.
> > >
> > > Aux domains are supported if split pagetable (TTBR1) support has been
> > > enabled on the master domain.  Each auxiliary domain will reuse the
> > > configuration of the master domain. By default the a domain with TTBR1
> > > support will have the TTBR0 region disabled so the first attached aux
> > > domain will enable the TTBR0 region in the hardware and conversely the
> > > last domain to be detached will disable TTBR0 translations.  All subsequent
> > > auxiliary domains create a pagetable but not touch the hardware.
> > >
> > > The leaf driver will be able to query the physical address of the
> > > pagetable with the DOMAIN_ATTR_PTBASE attribute so that it can use the
> > > address with whatever means it has to switch the pagetable base.
> > >
> > > Following is a pseudo code example of how a domain can be created
> > >
> > >  /* Check to see if aux domains are supported */
> > >  if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
> > >        iommu = iommu_domain_alloc(...);
> > >
> > >        if (iommu_aux_attach_device(domain, dev))
> > >                return FAIL;
> > >
> > >       /* Save the base address of the pagetable for use by the driver
> > >       iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
> > >  }
> >
> > I'm not really understanding what the pagetable base gets used for here and,
> > to be honest with you, the whole thing feels like a huge layering violation
> > with the way things are structured today. Why doesn't the caller just
> > interface with io-pgtable directly?
> >
> > Finally, if we need to support context-switching TTBR0 for a live domain
> > then that code really needs to live inside the SMMU driver because the
> > ASID and TLB management necessary to do that safely doesn't belong anywhere
> > else.
> 
> We do in fact need live domain switching, that is really the whole
> point.  The GPU CP (command processor/parser) is directly updating
> TTBR0 and triggering TLB flush, asynchronously from the CPU.
> 
> And I think the answer about ASID is easy (on current hw).. it must be zero[*].

Using ASID zero is really bad, because it means that you will end up sharing
TLB entries with whichever device is using context bank 0.

Is the SMMU only used by the GPU in your SoC?

Will
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/6] arm/smmu: Add auxiliary domain support for arm-smmuv2
  2020-05-18 15:18       ` Will Deacon
@ 2020-05-18 15:50         ` Rob Clark
  2020-05-20 12:57           ` Will Deacon
  0 siblings, 1 reply; 15+ messages in thread
From: Rob Clark @ 2020-05-18 15:50 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-msm, Linux Kernel Mailing List,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>, ,
	Robin Murphy,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

On Mon, May 18, 2020 at 8:18 AM Will Deacon <will@kernel.org> wrote:
>
> On Wed, Mar 18, 2020 at 04:43:07PM -0700, Rob Clark wrote:
> > On Wed, Mar 18, 2020 at 3:48 PM Will Deacon <will@kernel.org> wrote:
> > > On Tue, Jan 28, 2020 at 03:16:06PM -0700, Jordan Crouse wrote:
> > > > Support auxiliary domains for arm-smmu-v2 to initialize and support
> > > > multiple pagetables for a single SMMU context bank. Since the smmu-v2
> > > > hardware doesn't have any built in support for switching the pagetable
> > > > base it is left as an exercise to the caller to actually use the pagetable.
> > > >
> > > > Aux domains are supported if split pagetable (TTBR1) support has been
> > > > enabled on the master domain.  Each auxiliary domain will reuse the
> > > > configuration of the master domain. By default the a domain with TTBR1
> > > > support will have the TTBR0 region disabled so the first attached aux
> > > > domain will enable the TTBR0 region in the hardware and conversely the
> > > > last domain to be detached will disable TTBR0 translations.  All subsequent
> > > > auxiliary domains create a pagetable but not touch the hardware.
> > > >
> > > > The leaf driver will be able to query the physical address of the
> > > > pagetable with the DOMAIN_ATTR_PTBASE attribute so that it can use the
> > > > address with whatever means it has to switch the pagetable base.
> > > >
> > > > Following is a pseudo code example of how a domain can be created
> > > >
> > > >  /* Check to see if aux domains are supported */
> > > >  if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
> > > >        iommu = iommu_domain_alloc(...);
> > > >
> > > >        if (iommu_aux_attach_device(domain, dev))
> > > >                return FAIL;
> > > >
> > > >       /* Save the base address of the pagetable for use by the driver
> > > >       iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
> > > >  }
> > >
> > > I'm not really understanding what the pagetable base gets used for here and,
> > > to be honest with you, the whole thing feels like a huge layering violation
> > > with the way things are structured today. Why doesn't the caller just
> > > interface with io-pgtable directly?
> > >
> > > Finally, if we need to support context-switching TTBR0 for a live domain
> > > then that code really needs to live inside the SMMU driver because the
> > > ASID and TLB management necessary to do that safely doesn't belong anywhere
> > > else.
> >
> > We do in fact need live domain switching, that is really the whole
> > point.  The GPU CP (command processor/parser) is directly updating
> > TTBR0 and triggering TLB flush, asynchronously from the CPU.
> >
> > And I think the answer about ASID is easy (on current hw).. it must be zero[*].
>
> Using ASID zero is really bad, because it means that you will end up sharing
> TLB entries with whichever device is using context bank 0.
>
> Is the SMMU only used by the GPU in your SoC?
>

yes, the snapdragon SoCs have two SMMU instances, one used by the GPU,
where ASID0/cb0 is the gpu itself, and another cb is the GMU
(basically power control for the gpu), and the second SMMU is
everything else.

BR,
-R
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/6] arm/smmu: Add auxiliary domain support for arm-smmuv2
  2020-05-18 15:50         ` Rob Clark
@ 2020-05-20 12:57           ` Will Deacon
  2020-05-20 15:13             ` Jordan Crouse
  0 siblings, 1 reply; 15+ messages in thread
From: Will Deacon @ 2020-05-20 12:57 UTC (permalink / raw)
  To: Rob Clark
  Cc: linux-arm-msm, Linux Kernel Mailing List,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>, ,
	Robin Murphy,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

On Mon, May 18, 2020 at 08:50:27AM -0700, Rob Clark wrote:
> On Mon, May 18, 2020 at 8:18 AM Will Deacon <will@kernel.org> wrote:
> > On Wed, Mar 18, 2020 at 04:43:07PM -0700, Rob Clark wrote:
> > > We do in fact need live domain switching, that is really the whole
> > > point.  The GPU CP (command processor/parser) is directly updating
> > > TTBR0 and triggering TLB flush, asynchronously from the CPU.
> > >
> > > And I think the answer about ASID is easy (on current hw).. it must be zero[*].
> >
> > Using ASID zero is really bad, because it means that you will end up sharing
> > TLB entries with whichever device is using context bank 0.
> >
> > Is the SMMU only used by the GPU in your SoC?
> >
> 
> yes, the snapdragon SoCs have two SMMU instances, one used by the GPU,
> where ASID0/cb0 is the gpu itself, and another cb is the GMU
> (basically power control for the gpu), and the second SMMU is
> everything else.

Right, in which case I'm starting to think that we should treat this GPU
SMMU instance specially. Give it its own compatible string (looks like you
need this for HUPCFG anyway) and hook in via arm_smmu_impl_init(). You can
then set IO_PGTABLE_QUIRK_ARM_TTBR1 when talking to the io-pgtable code
without having to add a domain attribute.

With that. you'll need to find a way to allow the GPU driver to call into
your own hooks for getting at the TTBR0 tables -- given that you're
programming these in the hardware, I don't think it makes sense to expose
that in the IOMMU API, since most devices won't be able to do anything with
that data. Perhaps you could install a couple of function pointers
(subdomain_alloc/subdomain_free) in the GPU device when you see it appear
from the SMMU driver? Alternatively, you could make an io_pgtable_cfg
available so that the GPU driver can interface with io-pgtable directly.

Yes, it's ugly, but I don't think it's worth trying to abstract this.

Thoughts? It's taken me a long time to figure out what's going on here,
so sorry if it feels like I'm leading you round the houses.

Will
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/6] arm/smmu: Add auxiliary domain support for arm-smmuv2
  2020-05-20 12:57           ` Will Deacon
@ 2020-05-20 15:13             ` Jordan Crouse
  2020-05-20 16:35               ` Rob Clark
  0 siblings, 1 reply; 15+ messages in thread
From: Jordan Crouse @ 2020-05-20 15:13 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-msm, Linux Kernel Mailing List,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>, ,
	Robin Murphy,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

On Wed, May 20, 2020 at 01:57:01PM +0100, Will Deacon wrote:
> On Mon, May 18, 2020 at 08:50:27AM -0700, Rob Clark wrote:
> > On Mon, May 18, 2020 at 8:18 AM Will Deacon <will@kernel.org> wrote:
> > > On Wed, Mar 18, 2020 at 04:43:07PM -0700, Rob Clark wrote:
> > > > We do in fact need live domain switching, that is really the whole
> > > > point.  The GPU CP (command processor/parser) is directly updating
> > > > TTBR0 and triggering TLB flush, asynchronously from the CPU.
> > > >
> > > > And I think the answer about ASID is easy (on current hw).. it must be zero[*].
> > >
> > > Using ASID zero is really bad, because it means that you will end up sharing
> > > TLB entries with whichever device is using context bank 0.
> > >
> > > Is the SMMU only used by the GPU in your SoC?
> > >
> > 
> > yes, the snapdragon SoCs have two SMMU instances, one used by the GPU,
> > where ASID0/cb0 is the gpu itself, and another cb is the GMU
> > (basically power control for the gpu), and the second SMMU is
> > everything else.
> 
> Right, in which case I'm starting to think that we should treat this GPU
> SMMU instance specially. Give it its own compatible string (looks like you
> need this for HUPCFG anyway) and hook in via arm_smmu_impl_init(). You can
> then set IO_PGTABLE_QUIRK_ARM_TTBR1 when talking to the io-pgtable code
> without having to add a domain attribute.

If we did this via a special GPU SMMU instance then we could also create and
register a dummy TTBR0 instance along with the TTBR1 instance and then we
wouldn't need to worry about the aux domains at all.

> With that. you'll need to find a way to allow the GPU driver to call into
> your own hooks for getting at the TTBR0 tables -- given that you're
> programming these in the hardware, I don't think it makes sense to expose
> that in the IOMMU API, since most devices won't be able to do anything with
> that data. Perhaps you could install a couple of function pointers
> (subdomain_alloc/subdomain_free) in the GPU device when you see it appear
> from the SMMU driver? Alternatively, you could make an io_pgtable_cfg
> available so that the GPU driver can interface with io-pgtable directly.
 
I don't want to speak for Rob but I think that this is the same direction we've
landed on. If we use the implementation specific code to initialize the base
pagetables then the GPU driver can use io-pgtable directly. We can easily
construct an io_pgtable_cfg. This feature will only be available for opt-in
GPU targets that will have a known configuration.

The only gotcha is TLB maintenance but Rob and I have ideas about coordinating
with the GPU hardware (which has to do a TLBIALL during a switch anyway) and we
can always use the iommu_tlb_flush_all() hammer from software if we really need
it. It might take a bit of thought, but it is doable.

> Yes, it's ugly, but I don't think it's worth trying to abstract this.

I'm not sure how ugly it is. I've always operated under the assumption that the
GPU SMMU was special (though it had generic registers) just because of where it
was and how it it was used.  In the long run baking in a implementation specific
solution would probably be preferable to lots of domain attributes and aux
domains that would never be used except by us.

> Thoughts? It's taken me a long time to figure out what's going on here,
> so sorry if it feels like I'm leading you round the houses.

I'll hack on this and try to get something in place. It might be dumber on the
GPU side than we would like but it would at least spur some more conversation.

Jordan

> Will

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/6] arm/smmu: Add auxiliary domain support for arm-smmuv2
  2020-05-20 15:13             ` Jordan Crouse
@ 2020-05-20 16:35               ` Rob Clark
  0 siblings, 0 replies; 15+ messages in thread
From: Rob Clark @ 2020-05-20 16:35 UTC (permalink / raw)
  To: Will Deacon, Rob Clark, linux-arm-msm, Linux Kernel Mailing List,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>, ,
	Robin Murphy,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

On Wed, May 20, 2020 at 8:13 AM Jordan Crouse <jcrouse@codeaurora.org> wrote:
>
> On Wed, May 20, 2020 at 01:57:01PM +0100, Will Deacon wrote:
> > On Mon, May 18, 2020 at 08:50:27AM -0700, Rob Clark wrote:
> > > On Mon, May 18, 2020 at 8:18 AM Will Deacon <will@kernel.org> wrote:
> > > > On Wed, Mar 18, 2020 at 04:43:07PM -0700, Rob Clark wrote:
> > > > > We do in fact need live domain switching, that is really the whole
> > > > > point.  The GPU CP (command processor/parser) is directly updating
> > > > > TTBR0 and triggering TLB flush, asynchronously from the CPU.
> > > > >
> > > > > And I think the answer about ASID is easy (on current hw).. it must be zero[*].
> > > >
> > > > Using ASID zero is really bad, because it means that you will end up sharing
> > > > TLB entries with whichever device is using context bank 0.
> > > >
> > > > Is the SMMU only used by the GPU in your SoC?
> > > >
> > >
> > > yes, the snapdragon SoCs have two SMMU instances, one used by the GPU,
> > > where ASID0/cb0 is the gpu itself, and another cb is the GMU
> > > (basically power control for the gpu), and the second SMMU is
> > > everything else.
> >
> > Right, in which case I'm starting to think that we should treat this GPU
> > SMMU instance specially. Give it its own compatible string (looks like you
> > need this for HUPCFG anyway) and hook in via arm_smmu_impl_init(). You can
> > then set IO_PGTABLE_QUIRK_ARM_TTBR1 when talking to the io-pgtable code
> > without having to add a domain attribute.
>
> If we did this via a special GPU SMMU instance then we could also create and
> register a dummy TTBR0 instance along with the TTBR1 instance and then we
> wouldn't need to worry about the aux domains at all.
>
> > With that. you'll need to find a way to allow the GPU driver to call into
> > your own hooks for getting at the TTBR0 tables -- given that you're
> > programming these in the hardware, I don't think it makes sense to expose
> > that in the IOMMU API, since most devices won't be able to do anything with
> > that data. Perhaps you could install a couple of function pointers
> > (subdomain_alloc/subdomain_free) in the GPU device when you see it appear
> > from the SMMU driver? Alternatively, you could make an io_pgtable_cfg
> > available so that the GPU driver can interface with io-pgtable directly.
>
> I don't want to speak for Rob but I think that this is the same direction we've
> landed on. If we use the implementation specific code to initialize the base
> pagetables then the GPU driver can use io-pgtable directly. We can easily
> construct an io_pgtable_cfg. This feature will only be available for opt-in
> GPU targets that will have a known configuration.

Agreed about using io-pgtable helpers directly.. the gpu's use-case is
pretty far different from anything normal/sane, and I don't think it
is worth designing some generic iommu interfaces with precisely one
user[*].  We just need enough in arm-smmu(/-impl) to bootstrap things
when we power up the gpu.

BR,
-R

[*] all the other gpu's that I've seen so far, even if they sit behind
an iommu, they have their own internal mmu

> The only gotcha is TLB maintenance but Rob and I have ideas about coordinating
> with the GPU hardware (which has to do a TLBIALL during a switch anyway) and we
> can always use the iommu_tlb_flush_all() hammer from software if we really need
> it. It might take a bit of thought, but it is doable.
>
> > Yes, it's ugly, but I don't think it's worth trying to abstract this.
>
> I'm not sure how ugly it is. I've always operated under the assumption that the
> GPU SMMU was special (though it had generic registers) just because of where it
> was and how it it was used.  In the long run baking in a implementation specific
> solution would probably be preferable to lots of domain attributes and aux
> domains that would never be used except by us.
>
> > Thoughts? It's taken me a long time to figure out what's going on here,
> > so sorry if it feels like I'm leading you round the houses.
>
> I'll hack on this and try to get something in place. It might be dumber on the
> GPU side than we would like but it would at least spur some more conversation.
>
> Jordan
>
> > Will
>
> --
> The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> a Linux Foundation Collaborative Project
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-05-20 16:35 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-28 22:16 [PATCH v1 0/6] iommu/arm-smmu: Auxiliary domain and per instance pagetables Jordan Crouse
2020-01-28 22:16 ` [PATCH v1 1/6] iommu: Add DOMAIN_ATTR_PTBASE Jordan Crouse
2020-01-28 22:16 ` [PATCH v1 2/6] arm/smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
2020-03-18 22:48   ` Will Deacon
2020-03-18 23:43     ` Rob Clark
2020-03-19 15:23       ` Jordan Crouse
2020-05-18 15:18       ` Will Deacon
2020-05-18 15:50         ` Rob Clark
2020-05-20 12:57           ` Will Deacon
2020-05-20 15:13             ` Jordan Crouse
2020-05-20 16:35               ` Rob Clark
2020-01-28 22:16 ` [PATCH v1 3/6] drm/msm/adreno: ADd support for IOMMU auxiliary domains Jordan Crouse
2020-01-28 22:16 ` [PATCH v1 4/6] drm/msm: Add support to create target specific address spaces Jordan Crouse
2020-01-28 22:16 ` [PATCH v1 5/6] drm/msm/gpu: Add ttbr0 to the memptrs Jordan Crouse
2020-01-28 22:16 ` [PATCH v1 6/6] drm/msm/a6xx: Support per-instance pagetables Jordan Crouse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).