linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/6] iommu-arm-smmu: Add auxiliary domains and per-instance pagetables
@ 2020-06-26 20:04 Jordan Crouse
  2020-06-26 20:04 ` [PATCH v2 1/6] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Jordan Crouse @ 2020-06-26 20:04 UTC (permalink / raw)
  To: linux-arm-msm
  Cc: Sai Prakash Ranjan, iommu, John Stultz, freedreno,
	Akhil P Oommen, Daniel Vetter, David Airlie, Emil Velikov,
	Eric Anholt, Joerg Roedel, Joerg Roedel, Jonathan Marek,
	Rob Clark, Robin Murphy, Sean Paul, Sharat Masetty, Will Deacon,
	Yong Wu, dri-devel, linux-arm-kernel, linux-kernel


This is a new refresh of support for auxiliary domains for arm-smmu-v2
and per-instance pagetables for drm/msm. The big change from past
efforts is that, outside of creating a single aux domain to enable TTBR0,
all of the per-instance pagetables are created and managed exclusively
in drm/msm without involving the arm-smmu driver. This fits in with the
suggested model of letting the GPU hardware do what it needs and leaving
the arm-smmu driver blissfully unaware.

Almost. In order to set up the io-pgtable properly in drm/msm we need to
query the pagetable configuration from the currently active domain, and we
need to rely on the iommu API to flush TLBs after an unmap. In the future we
can optimize this in the drm/msm driver by tracking the state of the TLBs, but
for now the big hammer lets us get off the ground.

This series is built on the split pagetable support [1].

[1] https://patchwork.kernel.org/patch/11628543/

v2: Remove unneeded cruft in the a6xx page switch sequence

Jordan Crouse (6):
  iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2
  iommu/io-pgtable: Allow a pgtable implementation to skip TLB
    operations
  iommu/arm-smmu: Add a domain attribute to pass the pagetable config
  drm/msm: Add support to create a local pagetable
  drm/msm: Add support for address space instances
  drm/msm/a6xx: Add support for per-instance pagetables

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c |  43 +++++
 drivers/gpu/drm/msm/msm_drv.c         |  15 +-
 drivers/gpu/drm/msm/msm_drv.h         |   4 +
 drivers/gpu/drm/msm/msm_gem_vma.c     |   9 +
 drivers/gpu/drm/msm/msm_gpu.c         |  17 ++
 drivers/gpu/drm/msm/msm_gpu.h         |   5 +
 drivers/gpu/drm/msm/msm_gpummu.c      |   2 +-
 drivers/gpu/drm/msm/msm_iommu.c       | 180 +++++++++++++++++++-
 drivers/gpu/drm/msm/msm_mmu.h         |  16 +-
 drivers/gpu/drm/msm/msm_ringbuffer.h  |   1 +
 drivers/iommu/arm-smmu.c              | 231 ++++++++++++++++++++++++--
 drivers/iommu/arm-smmu.h              |   1 +
 include/linux/io-pgtable.h            |  11 +-
 include/linux/iommu.h                 |   1 +
 14 files changed, 507 insertions(+), 29 deletions(-)

-- 
2.17.1



* [PATCH v2 1/6] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2
  2020-06-26 20:04 [PATCH v2 0/6] iommu-arm-smmu: Add auxiliary domains and per-instance pagetables Jordan Crouse
@ 2020-06-26 20:04 ` Jordan Crouse
  2020-07-07 10:48   ` Jean-Philippe Brucker
  2020-07-07 12:34   ` Robin Murphy
  2020-06-26 20:04 ` [PATCH v2 2/6] iommu/io-pgtable: Allow a pgtable implementation to skip TLB operations Jordan Crouse
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 21+ messages in thread
From: Jordan Crouse @ 2020-06-26 20:04 UTC (permalink / raw)
  To: linux-arm-msm
  Cc: Sai Prakash Ranjan, iommu, John Stultz, freedreno, Joerg Roedel,
	Robin Murphy, Will Deacon, linux-arm-kernel, linux-kernel

Support auxiliary domains for arm-smmu-v2 to initialize and support
multiple pagetables for a single SMMU context bank. Since the smmu-v2
hardware doesn't have any built-in support for switching the pagetable
base, it is left as an exercise to the caller to actually use the pagetable.

Aux domains are supported if split pagetable (TTBR1) support has been
enabled on the master domain.  Each auxiliary domain will reuse the
configuration of the master domain. By default a domain with TTBR1
support will have the TTBR0 region disabled, so the first attached aux
domain will enable the TTBR0 region in the hardware and, conversely, the
last domain to be detached will disable TTBR0 translations.  All subsequent
auxiliary domains create a pagetable but do not touch the hardware.

The leaf driver will be able to query the physical address of the
pagetable with the DOMAIN_ATTR_PTBASE attribute so that it can use the
address with whatever means it has to switch the pagetable base.

Following is a pseudo code example of how a domain can be created

 /* Check to see if aux domains are supported */
 if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
	 domain = iommu_domain_alloc(...);

	 if (iommu_aux_attach_device(domain, dev))
		 return FAIL;

	/* Save the base address of the pagetable for use by the driver */
	iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
 }

Then 'domain' can be used like any other iommu domain to map and
unmap iova addresses in the pagetable.
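The first-attach-enables / last-detach-disables bookkeeping described above
can be modeled in isolation. The following is an illustrative userspace
sketch (not kernel API): a plain atomic counter stands in for the
per-context-bank aux refcount, and a boolean stands in for the EPD0 state
that arm_smmu_write_context_bank() would actually program:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the per-context-bank state: an aux-domain refcount plus
 * a flag mirroring whether TTBR0 translations are enabled (~EPD0) */
struct cb_model {
	atomic_int aux;
	bool ttbr0_enabled;
};

static void aux_attach(struct cb_model *cb)
{
	/* Only the first attached aux domain touches the hardware */
	if (atomic_fetch_add(&cb->aux, 1) == 0)
		cb->ttbr0_enabled = true;
}

static void aux_detach(struct cb_model *cb)
{
	/* Only the last detach disables TTBR0 translations */
	if (atomic_fetch_sub(&cb->aux, 1) == 1)
		cb->ttbr0_enabled = false;
}
```

Intermediate attaches and detaches leave the context bank untouched, which
is why the patch only writes the context bank when the counter transitions
to or from zero.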

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/iommu/arm-smmu.c | 219 ++++++++++++++++++++++++++++++++++++---
 drivers/iommu/arm-smmu.h |   1 +
 2 files changed, 204 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 060139452c54..ce6d654301bf 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -91,6 +91,7 @@ struct arm_smmu_cb {
 	u32				tcr[2];
 	u32				mair[2];
 	struct arm_smmu_cfg		*cfg;
+	atomic_t			aux;
 };
 
 struct arm_smmu_master_cfg {
@@ -667,6 +668,86 @@ static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
 	arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
 }
 
+/*
+ * Update the context bank to enable TTBR0. Assumes AARCH64 S1
+ * configuration.
+ */
+static void arm_smmu_context_set_ttbr0(struct arm_smmu_cb *cb,
+		struct io_pgtable_cfg *pgtbl_cfg)
+{
+	u32 tcr = cb->tcr[0];
+
+	/* Add the TCR configuration from the new pagetable config */
+	tcr |= arm_smmu_lpae_tcr(pgtbl_cfg);
+
+	/* Make sure that both TTBR0 and TTBR1 are enabled */
+	tcr &= ~(ARM_SMMU_TCR_EPD0 | ARM_SMMU_TCR_EPD1);
+
+	/* Update the TCR register */
+	cb->tcr[0] = tcr;
+
+	/* Program the new TTBR0 */
+	cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
+	cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
+}
+
+/*
+ * This function assumes that the current model only allows aux domains for
+ * AARCH64 S1 configurations
+ */
+static int arm_smmu_aux_init_domain_context(struct iommu_domain *domain,
+		struct arm_smmu_device *smmu, struct arm_smmu_cfg *master)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct io_pgtable_ops *pgtbl_ops;
+	struct io_pgtable_cfg pgtbl_cfg;
+
+	mutex_lock(&smmu_domain->init_mutex);
+
+	/* Copy the configuration from the master */
+	memcpy(&smmu_domain->cfg, master, sizeof(smmu_domain->cfg));
+
+	smmu_domain->flush_ops = &arm_smmu_s1_tlb_ops;
+	smmu_domain->smmu = smmu;
+
+	pgtbl_cfg = (struct io_pgtable_cfg) {
+		.pgsize_bitmap = smmu->pgsize_bitmap,
+		.ias = smmu->va_size,
+		.oas = smmu->ipa_size,
+		.coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENT_WALK,
+		.tlb = smmu_domain->flush_ops,
+		.iommu_dev = smmu->dev,
+		.quirks = 0,
+	};
+
+	if (smmu_domain->non_strict)
+		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
+
+	pgtbl_ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1, &pgtbl_cfg,
+		smmu_domain);
+	if (!pgtbl_ops) {
+		mutex_unlock(&smmu_domain->init_mutex);
+		return -ENOMEM;
+	}
+
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+
+	domain->geometry.aperture_end = (1UL << smmu->va_size) - 1;
+	domain->geometry.force_aperture = true;
+
+	/* enable TTBR0 when the first aux domain is attached */
+	if (atomic_inc_return(&smmu->cbs[master->cbndx].aux) == 1) {
+		arm_smmu_context_set_ttbr0(&smmu->cbs[master->cbndx],
+			&pgtbl_cfg);
+		arm_smmu_write_context_bank(smmu, master->cbndx);
+	}
+
+	smmu_domain->pgtbl_ops = pgtbl_ops;
+	mutex_unlock(&smmu_domain->init_mutex);
+
+	return 0;
+}
+
 static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 					struct arm_smmu_device *smmu,
 					struct device *dev)
@@ -871,36 +952,70 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	return ret;
 }
 
+static void
+arm_smmu_destroy_aux_domain_context(struct arm_smmu_domain *smmu_domain)
+{
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	int ret;
+
+	/*
+	 * If this is the last aux domain to be freed, disable TTBR0 by turning
+	 * off translations and clearing TTBR0
+	 */
+	if (atomic_dec_return(&smmu->cbs[cfg->cbndx].aux) == 0) {
+		/* Clear out the T0 region */
+		smmu->cbs[cfg->cbndx].tcr[0] &= ~GENMASK(15, 0);
+		/* Disable TTBR0 translations */
+		smmu->cbs[cfg->cbndx].tcr[0] |= ARM_SMMU_TCR_EPD0;
+		/* Clear the TTBR0 pagetable address */
+		smmu->cbs[cfg->cbndx].ttbr[0] =
+			FIELD_PREP(ARM_SMMU_TTBRn_ASID, cfg->asid);
+
+		ret = arm_smmu_rpm_get(smmu);
+		if (!ret) {
+			arm_smmu_write_context_bank(smmu, cfg->cbndx);
+			arm_smmu_rpm_put(smmu);
+		}
+	}
+
+}
+
 static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 {
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-	int ret, irq;
 
 	if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)
 		return;
 
-	ret = arm_smmu_rpm_get(smmu);
-	if (ret < 0)
-		return;
+	if (smmu_domain->aux)
+		arm_smmu_destroy_aux_domain_context(smmu_domain);
 
-	/*
-	 * Disable the context bank and free the page tables before freeing
-	 * it.
-	 */
-	smmu->cbs[cfg->cbndx].cfg = NULL;
-	arm_smmu_write_context_bank(smmu, cfg->cbndx);
+	/* Check if the last user is done with the context bank */
+	if (atomic_read(&smmu->cbs[cfg->cbndx].aux) == 0) {
+		int ret = arm_smmu_rpm_get(smmu);
+		int irq;
 
-	if (cfg->irptndx != ARM_SMMU_INVALID_IRPTNDX) {
-		irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
-		devm_free_irq(smmu->dev, irq, domain);
+		if (ret < 0)
+			return;
+
+		/* Disable the context bank */
+		smmu->cbs[cfg->cbndx].cfg = NULL;
+		arm_smmu_write_context_bank(smmu, cfg->cbndx);
+
+		if (cfg->irptndx != ARM_SMMU_INVALID_IRPTNDX) {
+			irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
+			devm_free_irq(smmu->dev, irq, domain);
+		}
+
+		__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
+		arm_smmu_rpm_put(smmu);
 	}
 
+	/* Destroy the pagetable */
 	free_io_pgtable_ops(smmu_domain->pgtbl_ops);
-	__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
-
-	arm_smmu_rpm_put(smmu);
 }
 
 static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
@@ -1161,6 +1276,74 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 	return 0;
 }
 
+static bool arm_smmu_dev_has_feat(struct device *dev,
+		enum iommu_dev_features feat)
+{
+	if (feat != IOMMU_DEV_FEAT_AUX)
+		return false;
+
+	return true;
+}
+
+static int arm_smmu_dev_enable_feat(struct device *dev,
+		enum iommu_dev_features feat)
+{
+	/* aux domain support is always available */
+	if (feat == IOMMU_DEV_FEAT_AUX)
+		return 0;
+
+	return -ENODEV;
+}
+
+static int arm_smmu_dev_disable_feat(struct device *dev,
+		enum iommu_dev_features feat)
+{
+	return -EBUSY;
+}
+
+static int arm_smmu_aux_attach_dev(struct iommu_domain *domain,
+		struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+	struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
+	struct arm_smmu_device *smmu = cfg->smmu;
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_cb *cb;
+	int idx, i, ret, cbndx = -1;
+
+	/* Try to find the context bank configured for this device */
+	for_each_cfg_sme(cfg, fwspec, i, idx) {
+		if (idx != INVALID_SMENDX) {
+			cbndx = smmu->s2crs[idx].cbndx;
+			break;
+		}
+	}
+
+	if (cbndx == -1)
+		return -ENODEV;
+
+	cb = &smmu->cbs[cbndx];
+
+	/* Aux domains are only supported for AARCH64 configurations */
+	if (cb->cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64)
+		return -EINVAL;
+
+	/* Make sure that TTBR1 is enabled in the hardware */
+	if ((cb->tcr[0] & ARM_SMMU_TCR_EPD1))
+		return -EINVAL;
+
+	smmu_domain->aux = true;
+
+	ret = arm_smmu_rpm_get(smmu);
+	if (ret < 0)
+		return ret;
+
+	ret = arm_smmu_aux_init_domain_context(domain, smmu, cb->cfg);
+
+	arm_smmu_rpm_put(smmu);
+	return ret;
+}
+
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
@@ -1653,6 +1836,10 @@ static struct iommu_ops arm_smmu_ops = {
 	.get_resv_regions	= arm_smmu_get_resv_regions,
 	.put_resv_regions	= generic_iommu_put_resv_regions,
 	.def_domain_type	= arm_smmu_def_domain_type,
+	.dev_has_feat		= arm_smmu_dev_has_feat,
+	.dev_enable_feat	= arm_smmu_dev_enable_feat,
+	.dev_disable_feat	= arm_smmu_dev_disable_feat,
+	.aux_attach_dev		= arm_smmu_aux_attach_dev,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 };
 
diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
index c417814f1d98..79d441024043 100644
--- a/drivers/iommu/arm-smmu.h
+++ b/drivers/iommu/arm-smmu.h
@@ -346,6 +346,7 @@ struct arm_smmu_domain {
 	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
 	struct iommu_domain		domain;
 	struct device			*dev;	/* Device attached to this domain */
+	bool				aux;
 };
 
 static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
-- 
2.17.1



* [PATCH v2 2/6] iommu/io-pgtable: Allow a pgtable implementation to skip TLB operations
  2020-06-26 20:04 [PATCH v2 0/6] iommu-arm-smmu: Add auxiliary domains and per-instance pagetables Jordan Crouse
  2020-06-26 20:04 ` [PATCH v2 1/6] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
@ 2020-06-26 20:04 ` Jordan Crouse
  2020-07-07 11:34   ` Robin Murphy
  2020-06-26 20:04 ` [PATCH v2 3/6] iommu/arm-smmu: Add a domain attribute to pass the pagetable config Jordan Crouse
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Jordan Crouse @ 2020-06-26 20:04 UTC (permalink / raw)
  To: linux-arm-msm
  Cc: Sai Prakash Ranjan, iommu, John Stultz, freedreno, Joerg Roedel,
	Robin Murphy, Will Deacon, Yong Wu, linux-kernel

Allow an io-pgtable implementation to skip TLB operations by checking for
NULL pointers in the helper functions. It will be up to the owner
of the io-pgtable instance to make sure that they independently handle
the TLB correctly.
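The NULL-guarded dispatch this patch adds can be sketched outside the
kernel with a cut-down model of the flush-ops table (the struct and helper
names below are illustrative stand-ins, not the real io-pgtable types):

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down model of the flush ops table: the io-pgtable owner may set
 * the whole table to NULL to opt out of TLB maintenance entirely */
struct flush_ops {
	void (*tlb_flush_all)(void *cookie);
};

struct pgtable_model {
	const struct flush_ops *tlb;
	void *cookie;
};

/* Mirrors io_pgtable_tlb_flush_all(): silently do nothing if no ops */
static void model_tlb_flush_all(struct pgtable_model *iop)
{
	if (iop->tlb)
		iop->tlb->tlb_flush_all(iop->cookie);
}

/* A counting implementation to observe when a flush actually happens */
static int flush_count;
static void counting_flush(void *cookie)
{
	(void)cookie;
	flush_count++;
}
static const struct flush_ops counting_ops = { counting_flush };
```

With the guard in place, a pagetable created with a NULL ops table (as
drm/msm does later in this series) simply never issues TLB callbacks.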

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 include/linux/io-pgtable.h | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 53d53c6c2be9..bbed1d3925ba 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -210,21 +210,24 @@ struct io_pgtable {
 
 static inline void io_pgtable_tlb_flush_all(struct io_pgtable *iop)
 {
-	iop->cfg.tlb->tlb_flush_all(iop->cookie);
+	if (iop->cfg.tlb)
+		iop->cfg.tlb->tlb_flush_all(iop->cookie);
 }
 
 static inline void
 io_pgtable_tlb_flush_walk(struct io_pgtable *iop, unsigned long iova,
 			  size_t size, size_t granule)
 {
-	iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
+	if (iop->cfg.tlb)
+		iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
 }
 
 static inline void
 io_pgtable_tlb_flush_leaf(struct io_pgtable *iop, unsigned long iova,
 			  size_t size, size_t granule)
 {
-	iop->cfg.tlb->tlb_flush_leaf(iova, size, granule, iop->cookie);
+	if (iop->cfg.tlb)
+		iop->cfg.tlb->tlb_flush_leaf(iova, size, granule, iop->cookie);
 }
 
 static inline void
@@ -232,7 +235,7 @@ io_pgtable_tlb_add_page(struct io_pgtable *iop,
 			struct iommu_iotlb_gather * gather, unsigned long iova,
 			size_t granule)
 {
-	if (iop->cfg.tlb->tlb_add_page)
+	if (iop->cfg.tlb && iop->cfg.tlb->tlb_add_page)
 		iop->cfg.tlb->tlb_add_page(gather, iova, granule, iop->cookie);
 }
 
-- 
2.17.1



* [PATCH v2 3/6] iommu/arm-smmu: Add a domain attribute to pass the pagetable config
  2020-06-26 20:04 [PATCH v2 0/6] iommu-arm-smmu: Add auxiliary domains and per-instance pagetables Jordan Crouse
  2020-06-26 20:04 ` [PATCH v2 1/6] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
  2020-06-26 20:04 ` [PATCH v2 2/6] iommu/io-pgtable: Allow a pgtable implementation to skip TLB operations Jordan Crouse
@ 2020-06-26 20:04 ` Jordan Crouse
  2020-06-26 20:04 ` [PATCH v2 4/6] drm/msm: Add support to create a local pagetable Jordan Crouse
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Jordan Crouse @ 2020-06-26 20:04 UTC (permalink / raw)
  To: linux-arm-msm
  Cc: Sai Prakash Ranjan, iommu, John Stultz, freedreno, Joerg Roedel,
	Robin Murphy, Will Deacon, linux-arm-kernel, linux-kernel

The Adreno GPU has the capability to manage its own pagetables and switch
them dynamically in hardware. Add a domain attribute for arm-smmu-v2
to query the default pagetable configuration so that the GPU driver can match
the format for its own pagetables.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/iommu/arm-smmu.c | 12 ++++++++++++
 include/linux/iommu.h    |  1 +
 2 files changed, 13 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index ce6d654301bf..4bd247dfd703 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1714,6 +1714,18 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 		case DOMAIN_ATTR_NESTING:
 			*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
 			return 0;
+		case DOMAIN_ATTR_PGTABLE_CFG: {
+			struct io_pgtable *pgtable;
+			struct io_pgtable_cfg *dest = data;
+
+			if (!smmu_domain->pgtbl_ops)
+				return -ENODEV;
+
+			pgtable = io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
+
+			memcpy(dest, &pgtable->cfg, sizeof(*dest));
+			return 0;
+		}
 		default:
 			return -ENODEV;
 		}
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 5f0b7859d2eb..2388117641f1 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -124,6 +124,7 @@ enum iommu_attr {
 	DOMAIN_ATTR_FSL_PAMUV1,
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
 	DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
+	DOMAIN_ATTR_PGTABLE_CFG,
 	DOMAIN_ATTR_MAX,
 };
 
-- 
2.17.1



* [PATCH v2 4/6] drm/msm: Add support to create a local pagetable
  2020-06-26 20:04 [PATCH v2 0/6] iommu-arm-smmu: Add auxiliary domains and per-instance pagetables Jordan Crouse
                   ` (2 preceding siblings ...)
  2020-06-26 20:04 ` [PATCH v2 3/6] iommu/arm-smmu: Add a domain attribute to pass the pagetable config Jordan Crouse
@ 2020-06-26 20:04 ` Jordan Crouse
  2020-07-07 11:36   ` Robin Murphy
  2020-06-26 20:04 ` [PATCH v2 5/6] drm/msm: Add support for address space instances Jordan Crouse
  2020-06-26 20:04 ` [PATCH v2 6/6] drm/msm/a6xx: Add support for per-instance pagetables Jordan Crouse
  5 siblings, 1 reply; 21+ messages in thread
From: Jordan Crouse @ 2020-06-26 20:04 UTC (permalink / raw)
  To: linux-arm-msm
  Cc: Sai Prakash Ranjan, iommu, John Stultz, freedreno, Daniel Vetter,
	David Airlie, Rob Clark, Sean Paul, dri-devel, linux-kernel

Add support to create an io-pgtable for use by targets that support
per-instance pagetables.  In order to support per-instance pagetables the
GPU SMMU device needs to have the qcom,adreno-smmu compatible string, and
split pagetables and auxiliary domains need to be supported and enabled.
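One detail worth calling out is the ASID assignment in
msm_iommu_pagetable_create(): per-instance pagetables cycle through ASIDs
16..254, keeping the low ASIDs reserved. An extracted, standalone model of
that arithmetic (function name is illustrative):

```c
#include <assert.h>

/* Model of the ASID assignment in this patch: per-instance pagetables
 * draw from the range [16, 254], leaving 0..15 untouched */
static int next_asid = 16;

static int alloc_asid(void)
{
	int asid = next_asid;

	/* Advance, wrapping back into the reserved-free range */
	next_asid = (next_asid + 1) % 255;
	if (next_asid < 16)
		next_asid = 16;

	return asid;
}
```

Note that this is round-robin reuse rather than true allocation: after 239
pagetables the same ASIDs are handed out again, which is acceptable here
because the GPU driver manages the TLB itself.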

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/msm_gpummu.c |   2 +-
 drivers/gpu/drm/msm/msm_iommu.c  | 180 ++++++++++++++++++++++++++++++-
 drivers/gpu/drm/msm/msm_mmu.h    |  16 ++-
 3 files changed, 195 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gpummu.c b/drivers/gpu/drm/msm/msm_gpummu.c
index 310a31b05faa..aab121f4beb7 100644
--- a/drivers/gpu/drm/msm/msm_gpummu.c
+++ b/drivers/gpu/drm/msm/msm_gpummu.c
@@ -102,7 +102,7 @@ struct msm_mmu *msm_gpummu_new(struct device *dev, struct msm_gpu *gpu)
 	}
 
 	gpummu->gpu = gpu;
-	msm_mmu_init(&gpummu->base, dev, &funcs);
+	msm_mmu_init(&gpummu->base, dev, &funcs, MSM_MMU_GPUMMU);
 
 	return &gpummu->base;
 }
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 1b6635504069..f455c597f76d 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -4,15 +4,192 @@
  * Author: Rob Clark <robdclark@gmail.com>
  */
 
+#include <linux/io-pgtable.h>
 #include "msm_drv.h"
 #include "msm_mmu.h"
 
 struct msm_iommu {
 	struct msm_mmu base;
 	struct iommu_domain *domain;
+	struct iommu_domain *aux_domain;
 };
+
 #define to_msm_iommu(x) container_of(x, struct msm_iommu, base)
 
+struct msm_iommu_pagetable {
+	struct msm_mmu base;
+	struct msm_mmu *parent;
+	struct io_pgtable_ops *pgtbl_ops;
+	phys_addr_t ttbr;
+	u32 asid;
+};
+
+static struct msm_iommu_pagetable *to_pagetable(struct msm_mmu *mmu)
+{
+	return container_of(mmu, struct msm_iommu_pagetable, base);
+}
+
+static int msm_iommu_pagetable_unmap(struct msm_mmu *mmu, u64 iova,
+		size_t size)
+{
+	struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
+	struct io_pgtable_ops *ops = pagetable->pgtbl_ops;
+	size_t unmapped = 0, len = size;
+
+	/* Unmap the block one page at a time */
+	while (size) {
+		unmapped += ops->unmap(ops, iova, 4096, NULL);
+		iova += 4096;
+		size -= 4096;
+	}
+
+	iommu_flush_tlb_all(to_msm_iommu(pagetable->parent)->domain);
+
+	return (unmapped == len) ? 0 : -EINVAL;
+}
+
+static int msm_iommu_pagetable_map(struct msm_mmu *mmu, u64 iova,
+		struct sg_table *sgt, size_t len, int prot)
+{
+	struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
+	struct io_pgtable_ops *ops = pagetable->pgtbl_ops;
+	struct scatterlist *sg;
+	size_t mapped = 0;
+	u64 addr = iova;
+	unsigned int i;
+
+	for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+		size_t size = sg->length;
+		phys_addr_t phys = sg_phys(sg);
+
+		/* Map the block one page at a time */
+		while (size) {
+			if (ops->map(ops, addr, phys, 4096, prot)) {
+				msm_iommu_pagetable_unmap(mmu, iova, mapped);
+				return -EINVAL;
+			}
+
+			phys += 4096;
+			addr += 4096;
+			size -= 4096;
+			mapped += 4096;
+		}
+	}
+
+	return 0;
+}
+
+static void msm_iommu_pagetable_destroy(struct msm_mmu *mmu)
+{
+	struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
+
+	free_io_pgtable_ops(pagetable->pgtbl_ops);
+	kfree(pagetable);
+}
+
+/*
+ * Given a parent device, create and return an aux domain. This will enable the
+ * TTBR0 region
+ */
+static struct iommu_domain *msm_iommu_get_aux_domain(struct msm_mmu *parent)
+{
+	struct msm_iommu *iommu = to_msm_iommu(parent);
+	struct iommu_domain *domain;
+	int ret;
+
+	if (iommu->aux_domain)
+		return iommu->aux_domain;
+
+	if (!iommu_dev_has_feature(parent->dev, IOMMU_DEV_FEAT_AUX))
+		return ERR_PTR(-ENODEV);
+
+	domain = iommu_domain_alloc(&platform_bus_type);
+	if (!domain)
+		return ERR_PTR(-ENODEV);
+
+	ret = iommu_aux_attach_device(domain, parent->dev);
+	if (ret) {
+		iommu_domain_free(domain);
+		return ERR_PTR(ret);
+	}
+
+	iommu->aux_domain = domain;
+	return domain;
+}
+
+int msm_iommu_pagetable_params(struct msm_mmu *mmu,
+		phys_addr_t *ttbr, int *asid)
+{
+	struct msm_iommu_pagetable *pagetable;
+
+	if (mmu->type != MSM_MMU_IOMMU_PAGETABLE)
+		return -EINVAL;
+
+	pagetable = to_pagetable(mmu);
+
+	if (ttbr)
+		*ttbr = pagetable->ttbr;
+
+	if (asid)
+		*asid = pagetable->asid;
+
+	return 0;
+}
+
+static const struct msm_mmu_funcs pagetable_funcs = {
+		.map = msm_iommu_pagetable_map,
+		.unmap = msm_iommu_pagetable_unmap,
+		.destroy = msm_iommu_pagetable_destroy,
+};
+
+struct msm_mmu *msm_iommu_pagetable_create(struct msm_mmu *parent)
+{
+	static int next_asid = 16;
+	struct msm_iommu_pagetable *pagetable;
+	struct iommu_domain *aux_domain;
+	struct io_pgtable_cfg cfg;
+	int ret;
+
+	/* Make sure that the parent has an aux domain attached */
+	aux_domain = msm_iommu_get_aux_domain(parent);
+	if (IS_ERR(aux_domain))
+		return ERR_CAST(aux_domain);
+
+	/* Get the pagetable configuration from the aux domain */
+	ret = iommu_domain_get_attr(aux_domain, DOMAIN_ATTR_PGTABLE_CFG, &cfg);
+	if (ret)
+		return ERR_PTR(ret);
+
+	pagetable = kzalloc(sizeof(*pagetable), GFP_KERNEL);
+	if (!pagetable)
+		return ERR_PTR(-ENOMEM);
+
+	msm_mmu_init(&pagetable->base, parent->dev, &pagetable_funcs,
+		MSM_MMU_IOMMU_PAGETABLE);
+
+	cfg.tlb = NULL;
+
+	pagetable->pgtbl_ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1,
+		&cfg, aux_domain);
+
+	if (!pagetable->pgtbl_ops) {
+		kfree(pagetable);
+		return ERR_PTR(-ENOMEM);
+	}
+
+
+	/* Needed later for TLB flush */
+	pagetable->parent = parent;
+	pagetable->ttbr = cfg.arm_lpae_s1_cfg.ttbr;
+
+	pagetable->asid = next_asid;
+	next_asid = (next_asid + 1) % 255;
+	if (next_asid < 16)
+		next_asid = 16;
+
+	return &pagetable->base;
+}
+
 static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
 		unsigned long iova, int flags, void *arg)
 {
@@ -40,6 +217,7 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
 	if (iova & BIT_ULL(48))
 		iova |= GENMASK_ULL(63, 49);
 
+
 	ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, prot);
 	WARN_ON(!ret);
 
@@ -85,7 +263,7 @@ struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain)
 		return ERR_PTR(-ENOMEM);
 
 	iommu->domain = domain;
-	msm_mmu_init(&iommu->base, dev, &funcs);
+	msm_mmu_init(&iommu->base, dev, &funcs, MSM_MMU_IOMMU);
 	iommu_set_fault_handler(domain, msm_fault_handler, iommu);
 
 	ret = iommu_attach_device(iommu->domain, dev);
diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
index 3a534ee59bf6..61ade89d9e48 100644
--- a/drivers/gpu/drm/msm/msm_mmu.h
+++ b/drivers/gpu/drm/msm/msm_mmu.h
@@ -17,18 +17,26 @@ struct msm_mmu_funcs {
 	void (*destroy)(struct msm_mmu *mmu);
 };
 
+enum msm_mmu_type {
+	MSM_MMU_GPUMMU,
+	MSM_MMU_IOMMU,
+	MSM_MMU_IOMMU_PAGETABLE,
+};
+
 struct msm_mmu {
 	const struct msm_mmu_funcs *funcs;
 	struct device *dev;
 	int (*handler)(void *arg, unsigned long iova, int flags);
 	void *arg;
+	enum msm_mmu_type type;
 };
 
 static inline void msm_mmu_init(struct msm_mmu *mmu, struct device *dev,
-		const struct msm_mmu_funcs *funcs)
+		const struct msm_mmu_funcs *funcs, enum msm_mmu_type type)
 {
 	mmu->dev = dev;
 	mmu->funcs = funcs;
+	mmu->type = type;
 }
 
 struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain);
@@ -41,7 +49,13 @@ static inline void msm_mmu_set_fault_handler(struct msm_mmu *mmu, void *arg,
 	mmu->handler = handler;
 }
 
+struct msm_mmu *msm_iommu_pagetable_create(struct msm_mmu *parent);
+
 void msm_gpummu_params(struct msm_mmu *mmu, dma_addr_t *pt_base,
 		dma_addr_t *tran_error);
 
+
+int msm_iommu_pagetable_params(struct msm_mmu *mmu, phys_addr_t *ttbr,
+		int *asid);
+
 #endif /* __MSM_MMU_H__ */
-- 
2.17.1



* [PATCH v2 5/6] drm/msm: Add support for address space instances
  2020-06-26 20:04 [PATCH v2 0/6] iommu-arm-smmu: Add auxiliary domains and per-instance pagetables Jordan Crouse
                   ` (3 preceding siblings ...)
  2020-06-26 20:04 ` [PATCH v2 4/6] drm/msm: Add support to create a local pagetable Jordan Crouse
@ 2020-06-26 20:04 ` Jordan Crouse
  2020-06-26 20:04 ` [PATCH v2 6/6] drm/msm/a6xx: Add support for per-instance pagetables Jordan Crouse
  5 siblings, 0 replies; 21+ messages in thread
From: Jordan Crouse @ 2020-06-26 20:04 UTC (permalink / raw)
  To: linux-arm-msm
  Cc: Sai Prakash Ranjan, iommu, John Stultz, freedreno, Daniel Vetter,
	David Airlie, Rob Clark, Sean Paul, dri-devel, linux-kernel

Add support for allocating an address space instance. Targets that support
per-instance pagetables should implement their own function to allocate a
new instance. The default will return the existing generic address space.
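The fallback path described above (no per-target hook means sharing the
default address space with an extra reference) can be sketched as a small
standalone model; the struct members and helper names here are illustrative
stand-ins for the msm structures, with a bare counter in place of a kref:

```c
#include <assert.h>
#include <stddef.h>

/* Bare counter standing in for the kref on msm_gem_address_space */
struct aspace_model {
	int refcount;
};

struct gpu_model {
	/* Optional per-target hook for instanced address spaces */
	struct aspace_model *(*address_space_instance)(struct gpu_model *gpu);
	struct aspace_model *aspace;	/* default/global address space */
};

static struct aspace_model *aspace_get(struct aspace_model *aspace)
{
	if (aspace)
		aspace->refcount++;
	return aspace;
}

static struct aspace_model *address_space_instance(struct gpu_model *gpu)
{
	if (!gpu)
		return NULL;

	/* No per-instance support: share the default address space */
	if (!gpu->address_space_instance)
		return aspace_get(gpu->aspace);

	return gpu->address_space_instance(gpu);
}
```

Taking a reference on the fallback path is what allows context_close() to
unconditionally put the address space regardless of which path created it.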

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/msm_drv.c     | 15 +++++++++------
 drivers/gpu/drm/msm/msm_drv.h     |  4 ++++
 drivers/gpu/drm/msm/msm_gem_vma.c |  9 +++++++++
 drivers/gpu/drm/msm/msm_gpu.c     | 17 +++++++++++++++++
 drivers/gpu/drm/msm/msm_gpu.h     |  5 +++++
 5 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 6c57cc72d627..092c49552ddd 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -588,7 +588,7 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
 
 	msm_submitqueue_init(dev, ctx);
 
-	ctx->aspace = priv->gpu ? priv->gpu->aspace : NULL;
+	ctx->aspace = msm_gpu_address_space_instance(priv->gpu);
 	file->driver_priv = ctx;
 
 	return 0;
@@ -607,6 +607,8 @@ static int msm_open(struct drm_device *dev, struct drm_file *file)
 static void context_close(struct msm_file_private *ctx)
 {
 	msm_submitqueue_close(ctx);
+
+	msm_gem_address_space_put(ctx->aspace);
 	kfree(ctx);
 }
 
@@ -771,18 +773,19 @@ static int msm_ioctl_gem_cpu_fini(struct drm_device *dev, void *data,
 }
 
 static int msm_ioctl_gem_info_iova(struct drm_device *dev,
-		struct drm_gem_object *obj, uint64_t *iova)
+		struct drm_file *file, struct drm_gem_object *obj,
+		uint64_t *iova)
 {
-	struct msm_drm_private *priv = dev->dev_private;
+	struct msm_file_private *ctx = file->driver_priv;
 
-	if (!priv->gpu)
+	if (!ctx->aspace)
 		return -EINVAL;
 
 	/*
 	 * Don't pin the memory here - just get an address so that userspace can
 	 * be productive
 	 */
-	return msm_gem_get_iova(obj, priv->gpu->aspace, iova);
+	return msm_gem_get_iova(obj, ctx->aspace, iova);
 }
 
 static int msm_ioctl_gem_info(struct drm_device *dev, void *data,
@@ -821,7 +824,7 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void *data,
 		args->value = msm_gem_mmap_offset(obj);
 		break;
 	case MSM_INFO_GET_IOVA:
-		ret = msm_ioctl_gem_info_iova(dev, obj, &args->value);
+		ret = msm_ioctl_gem_info_iova(dev, file, obj, &args->value);
 		break;
 	case MSM_INFO_SET_NAME:
 		/* length check should leave room for terminating null: */
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index e2d6a6056418..983a8b7e5a74 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -249,6 +249,10 @@ int msm_gem_map_vma(struct msm_gem_address_space *aspace,
 void msm_gem_close_vma(struct msm_gem_address_space *aspace,
 		struct msm_gem_vma *vma);
 
+
+struct msm_gem_address_space *
+msm_gem_address_space_get(struct msm_gem_address_space *aspace);
+
 void msm_gem_address_space_put(struct msm_gem_address_space *aspace);
 
 struct msm_gem_address_space *
diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c
index 5f6a11211b64..29cc1305cf37 100644
--- a/drivers/gpu/drm/msm/msm_gem_vma.c
+++ b/drivers/gpu/drm/msm/msm_gem_vma.c
@@ -27,6 +27,15 @@ void msm_gem_address_space_put(struct msm_gem_address_space *aspace)
 		kref_put(&aspace->kref, msm_gem_address_space_destroy);
 }
 
+struct msm_gem_address_space *
+msm_gem_address_space_get(struct msm_gem_address_space *aspace)
+{
+	if (!IS_ERR_OR_NULL(aspace))
+		kref_get(&aspace->kref);
+
+	return aspace;
+}
+
 /* Actually unmap memory for the vma */
 void msm_gem_purge_vma(struct msm_gem_address_space *aspace,
 		struct msm_gem_vma *vma)
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 86a138641477..0fa614430799 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -821,6 +821,23 @@ static int get_clocks(struct platform_device *pdev, struct msm_gpu *gpu)
 	return 0;
 }
 
+/* Return a new address space instance */
+struct msm_gem_address_space *
+msm_gpu_address_space_instance(struct msm_gpu *gpu)
+{
+	if (!gpu)
+		return NULL;
+
+	/*
+	 * If the GPU doesn't support instanced address spaces return the
+	 * default address space
+	 */
+	if (!gpu->funcs->address_space_instance)
+		return msm_gem_address_space_get(gpu->aspace);
+
+	return gpu->funcs->address_space_instance(gpu);
+}
+
 int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 		struct msm_gpu *gpu, const struct msm_gpu_funcs *funcs,
 		const char *name, struct msm_gpu_config *config)
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 429cb40f7931..f1762b77bea8 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -64,6 +64,8 @@ struct msm_gpu_funcs {
 	void (*gpu_set_freq)(struct msm_gpu *gpu, unsigned long freq);
 	struct msm_gem_address_space *(*create_address_space)
 		(struct msm_gpu *gpu, struct platform_device *pdev);
+	struct msm_gem_address_space *(*address_space_instance)
+		(struct msm_gpu *gpu);
 };
 
 struct msm_gpu {
@@ -286,6 +288,9 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 		struct msm_gpu *gpu, const struct msm_gpu_funcs *funcs,
 		const char *name, struct msm_gpu_config *config);
 
+struct msm_gem_address_space *
+msm_gpu_address_space_instance(struct msm_gpu *gpu);
+
 void msm_gpu_cleanup(struct msm_gpu *gpu);
 
 struct msm_gpu *adreno_load_gpu(struct drm_device *dev);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 6/6] drm/msm/a6xx: Add support for per-instance pagetables
  2020-06-26 20:04 [PATCH v2 0/6] iommu-arm-smmu: Add auxiliary domains and per-instance pagetables Jordan Crouse
                   ` (4 preceding siblings ...)
  2020-06-26 20:04 ` [PATCH v2 5/6] drm/msm: Add support for address space instances Jordan Crouse
@ 2020-06-26 20:04 ` Jordan Crouse
  2020-06-27 19:56   ` Rob Clark
  5 siblings, 1 reply; 21+ messages in thread
From: Jordan Crouse @ 2020-06-26 20:04 UTC (permalink / raw)
  To: linux-arm-msm
  Cc: Sai Prakash Ranjan, iommu, John Stultz, freedreno,
	Akhil P Oommen, Daniel Vetter, David Airlie, Emil Velikov,
	Eric Anholt, Jonathan Marek, Rob Clark, Sean Paul,
	Sharat Masetty, dri-devel, linux-kernel

Add support for using per-instance pagetables if all the dependencies are
available.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 43 +++++++++++++++++++++++++++
 drivers/gpu/drm/msm/msm_ringbuffer.h  |  1 +
 2 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index aa53f47b7e8b..95ed2ceac121 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -79,6 +79,34 @@ static void get_stats_counter(struct msm_ringbuffer *ring, u32 counter,
 	OUT_RING(ring, upper_32_bits(iova));
 }
 
+static void a6xx_set_pagetable(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
+		struct msm_file_private *ctx)
+{
+	phys_addr_t ttbr;
+	u32 asid;
+
+	if (msm_iommu_pagetable_params(ctx->aspace->mmu, &ttbr, &asid))
+		return;
+
+	/* Execute the table update */
+	OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4);
+	OUT_RING(ring, lower_32_bits(ttbr));
+	OUT_RING(ring, (((u64) asid) << 48) | upper_32_bits(ttbr));
+	/* CONTEXTIDR is currently unused */
+	OUT_RING(ring, 0);
+	/* CONTEXTBANK is currently unused */
+	OUT_RING(ring, 0);
+
+	/*
+	 * Write the new TTBR0 to the memstore. This is good for debugging.
+	 */
+	OUT_PKT7(ring, CP_MEM_WRITE, 4);
+	OUT_RING(ring, lower_32_bits(rbmemptr(ring, ttbr0)));
+	OUT_RING(ring, upper_32_bits(rbmemptr(ring, ttbr0)));
+	OUT_RING(ring, lower_32_bits(ttbr));
+	OUT_RING(ring, (((u64) asid) << 48) | upper_32_bits(ttbr));
+}
+
 static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 	struct msm_file_private *ctx)
 {
@@ -89,6 +117,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 	struct msm_ringbuffer *ring = submit->ring;
 	unsigned int i;
 
+	a6xx_set_pagetable(gpu, ring, ctx);
+
 	get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO,
 		rbmemptr_stats(ring, index, cpcycles_start));
 
@@ -872,6 +902,18 @@ static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
 	return (unsigned long)busy_time;
 }
 
+struct msm_gem_address_space *a6xx_address_space_instance(struct msm_gpu *gpu)
+{
+	struct msm_mmu *mmu;
+
+	mmu = msm_iommu_pagetable_create(gpu->aspace->mmu);
+	if (IS_ERR(mmu))
+		return msm_gem_address_space_get(gpu->aspace);
+
+	return msm_gem_address_space_create(mmu,
+		"gpu", 0x100000000ULL, 0x1ffffffffULL);
+}
+
 static const struct adreno_gpu_funcs funcs = {
 	.base = {
 		.get_param = adreno_get_param,
@@ -895,6 +937,7 @@ static const struct adreno_gpu_funcs funcs = {
 		.gpu_state_put = a6xx_gpu_state_put,
 #endif
 		.create_address_space = adreno_iommu_create_address_space,
+		.address_space_instance = a6xx_address_space_instance,
 	},
 	.get_timestamp = a6xx_get_timestamp,
 };
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
index 7764373d0ed2..0987d6bf848c 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.h
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
@@ -31,6 +31,7 @@ struct msm_rbmemptrs {
 	volatile uint32_t fence;
 
 	volatile struct msm_gpu_submit_stats stats[MSM_GPU_SUBMIT_STATS_COUNT];
+	volatile u64 ttbr0;
 };
 
 struct msm_ringbuffer {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 6/6] drm/msm/a6xx: Add support for per-instance pagetables
  2020-06-26 20:04 ` [PATCH v2 6/6] drm/msm/a6xx: Add support for per-instance pagetables Jordan Crouse
@ 2020-06-27 19:56   ` Rob Clark
  2020-06-27 20:11     ` Rob Clark
  0 siblings, 1 reply; 21+ messages in thread
From: Rob Clark @ 2020-06-27 19:56 UTC (permalink / raw)
  To: Jordan Crouse
  Cc: linux-arm-msm, Sai Prakash Ranjan,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,
	John Stultz, freedreno, Akhil P Oommen, Daniel Vetter,
	David Airlie, Emil Velikov, Eric Anholt, Jonathan Marek,
	Sean Paul, Sharat Masetty, dri-devel, Linux Kernel Mailing List

On Fri, Jun 26, 2020 at 1:04 PM Jordan Crouse <jcrouse@codeaurora.org> wrote:
>
> Add support for using per-instance pagetables if all the dependencies are
> available.
>
> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
>
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 43 +++++++++++++++++++++++++++
>  drivers/gpu/drm/msm/msm_ringbuffer.h  |  1 +
>  2 files changed, 44 insertions(+)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index aa53f47b7e8b..95ed2ceac121 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -79,6 +79,34 @@ static void get_stats_counter(struct msm_ringbuffer *ring, u32 counter,
>         OUT_RING(ring, upper_32_bits(iova));
>  }
>
> +static void a6xx_set_pagetable(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
> +               struct msm_file_private *ctx)
> +{
> +       phys_addr_t ttbr;
> +       u32 asid;
> +
> +       if (msm_iommu_pagetable_params(ctx->aspace->mmu, &ttbr, &asid))
> +               return;
> +
> +       /* Execute the table update */
> +       OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4);
> +       OUT_RING(ring, lower_32_bits(ttbr));
> +       OUT_RING(ring, (((u64) asid) << 48) | upper_32_bits(ttbr));
> +       /* CONTEXTIDR is currently unused */
> +       OUT_RING(ring, 0);
> +       /* CONTEXTBANK is currently unused */
> +       OUT_RING(ring, 0);
> +
> +       /*
> +        * Write the new TTBR0 to the memstore. This is good for debugging.
> +        */
> +       OUT_PKT7(ring, CP_MEM_WRITE, 4);
> +       OUT_RING(ring, lower_32_bits(rbmemptr(ring, ttbr0)));
> +       OUT_RING(ring, upper_32_bits(rbmemptr(ring, ttbr0)));
> +       OUT_RING(ring, lower_32_bits(ttbr));
> +       OUT_RING(ring, (((u64) asid) << 48) | upper_32_bits(ttbr));
> +}
> +
>  static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
>         struct msm_file_private *ctx)
>  {
> @@ -89,6 +117,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
>         struct msm_ringbuffer *ring = submit->ring;
>         unsigned int i;
>
> +       a6xx_set_pagetable(gpu, ring, ctx);
> +
>         get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO,
>                 rbmemptr_stats(ring, index, cpcycles_start));
>
> @@ -872,6 +902,18 @@ static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
>         return (unsigned long)busy_time;
>  }
>
> +struct msm_gem_address_space *a6xx_address_space_instance(struct msm_gpu *gpu)
> +{
> +       struct msm_mmu *mmu;
> +
> +       mmu = msm_iommu_pagetable_create(gpu->aspace->mmu);
> +       if (IS_ERR(mmu))
> +               return msm_gem_address_space_get(gpu->aspace);
> +
> +       return msm_gem_address_space_create(mmu,
> +               "gpu", 0x100000000ULL, 0x1ffffffffULL);
> +}
> +
>  static const struct adreno_gpu_funcs funcs = {
>         .base = {
>                 .get_param = adreno_get_param,
> @@ -895,6 +937,7 @@ static const struct adreno_gpu_funcs funcs = {
>                 .gpu_state_put = a6xx_gpu_state_put,
>  #endif
>                 .create_address_space = adreno_iommu_create_address_space,
> +               .address_space_instance = a6xx_address_space_instance,

Hmm, maybe instead of .address_space_instance, something like
.create_context_address_space?

Since like .create_address_space, it is creating an address space..
the difference is that it is a per context/process aspace..

BR,
-R

>         },
>         .get_timestamp = a6xx_get_timestamp,
>  };
> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
> index 7764373d0ed2..0987d6bf848c 100644
> --- a/drivers/gpu/drm/msm/msm_ringbuffer.h
> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
> @@ -31,6 +31,7 @@ struct msm_rbmemptrs {
>         volatile uint32_t fence;
>
>         volatile struct msm_gpu_submit_stats stats[MSM_GPU_SUBMIT_STATS_COUNT];
> +       volatile u64 ttbr0;
>  };
>
>  struct msm_ringbuffer {
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 6/6] drm/msm/a6xx: Add support for per-instance pagetables
  2020-06-27 19:56   ` Rob Clark
@ 2020-06-27 20:11     ` Rob Clark
  2020-06-29 14:56       ` [Freedreno] " Jordan Crouse
  0 siblings, 1 reply; 21+ messages in thread
From: Rob Clark @ 2020-06-27 20:11 UTC (permalink / raw)
  To: Jordan Crouse
  Cc: linux-arm-msm, Sai Prakash Ranjan,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,
	John Stultz, freedreno, Akhil P Oommen, Daniel Vetter,
	David Airlie, Emil Velikov, Eric Anholt, Jonathan Marek,
	Sean Paul, Sharat Masetty, dri-devel, Linux Kernel Mailing List

On Sat, Jun 27, 2020 at 12:56 PM Rob Clark <robdclark@gmail.com> wrote:
>
> On Fri, Jun 26, 2020 at 1:04 PM Jordan Crouse <jcrouse@codeaurora.org> wrote:
> >
> > Add support for using per-instance pagetables if all the dependencies are
> > available.
> >
> > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > ---
> >
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 43 +++++++++++++++++++++++++++
> >  drivers/gpu/drm/msm/msm_ringbuffer.h  |  1 +
> >  2 files changed, 44 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > index aa53f47b7e8b..95ed2ceac121 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > @@ -79,6 +79,34 @@ static void get_stats_counter(struct msm_ringbuffer *ring, u32 counter,
> >         OUT_RING(ring, upper_32_bits(iova));
> >  }
> >
> > +static void a6xx_set_pagetable(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
> > +               struct msm_file_private *ctx)
> > +{
> > +       phys_addr_t ttbr;
> > +       u32 asid;
> > +
> > +       if (msm_iommu_pagetable_params(ctx->aspace->mmu, &ttbr, &asid))
> > +               return;
> > +
> > +       /* Execute the table update */
> > +       OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4);
> > +       OUT_RING(ring, lower_32_bits(ttbr));
> > +       OUT_RING(ring, (((u64) asid) << 48) | upper_32_bits(ttbr));
> > +       /* CONTEXTIDR is currently unused */
> > +       OUT_RING(ring, 0);
> > +       /* CONTEXTBANK is currently unused */
> > +       OUT_RING(ring, 0);
> > +
> > +       /*
> > +        * Write the new TTBR0 to the memstore. This is good for debugging.
> > +        */
> > +       OUT_PKT7(ring, CP_MEM_WRITE, 4);
> > +       OUT_RING(ring, lower_32_bits(rbmemptr(ring, ttbr0)));
> > +       OUT_RING(ring, upper_32_bits(rbmemptr(ring, ttbr0)));
> > +       OUT_RING(ring, lower_32_bits(ttbr));
> > +       OUT_RING(ring, (((u64) asid) << 48) | upper_32_bits(ttbr));
> > +}
> > +
> >  static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
> >         struct msm_file_private *ctx)
> >  {
> > @@ -89,6 +117,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
> >         struct msm_ringbuffer *ring = submit->ring;
> >         unsigned int i;
> >
> > +       a6xx_set_pagetable(gpu, ring, ctx);
> > +
> >         get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO,
> >                 rbmemptr_stats(ring, index, cpcycles_start));
> >
> > @@ -872,6 +902,18 @@ static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
> >         return (unsigned long)busy_time;
> >  }
> >
> > +struct msm_gem_address_space *a6xx_address_space_instance(struct msm_gpu *gpu)
> > +{
> > +       struct msm_mmu *mmu;
> > +
> > +       mmu = msm_iommu_pagetable_create(gpu->aspace->mmu);
> > +       if (IS_ERR(mmu))
> > +               return msm_gem_address_space_get(gpu->aspace);
> > +
> > +       return msm_gem_address_space_create(mmu,
> > +               "gpu", 0x100000000ULL, 0x1ffffffffULL);
> > +}
> > +
> >  static const struct adreno_gpu_funcs funcs = {
> >         .base = {
> >                 .get_param = adreno_get_param,
> > @@ -895,6 +937,7 @@ static const struct adreno_gpu_funcs funcs = {
> >                 .gpu_state_put = a6xx_gpu_state_put,
> >  #endif
> >                 .create_address_space = adreno_iommu_create_address_space,
> > +               .address_space_instance = a6xx_address_space_instance,
>
> Hmm, maybe instead of .address_space_instance, something like
> .create_context_address_space?
>
> Since like .create_address_space, it is creating an address space..
> the difference is that it is a per context/process aspace..
>


or maybe just .create_pgtable and return the 'struct msm_mmu' (which
is itself starting to become less of a great name)..

The only other thing a6xx_address_space_instance() adds is knowing
where the split is between the kernel and user pgtables, and I suppose
that isn't a thing that would really be changing between gens?

BR,
-R

> BR,
> -R
>
> >         },
> >         .get_timestamp = a6xx_get_timestamp,
> >  };
> > diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
> > index 7764373d0ed2..0987d6bf848c 100644
> > --- a/drivers/gpu/drm/msm/msm_ringbuffer.h
> > +++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
> > @@ -31,6 +31,7 @@ struct msm_rbmemptrs {
> >         volatile uint32_t fence;
> >
> >         volatile struct msm_gpu_submit_stats stats[MSM_GPU_SUBMIT_STATS_COUNT];
> > +       volatile u64 ttbr0;
> >  };
> >
> >  struct msm_ringbuffer {
> > --
> > 2.17.1
> >

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Freedreno] [PATCH v2 6/6] drm/msm/a6xx: Add support for per-instance pagetables
  2020-06-27 20:11     ` Rob Clark
@ 2020-06-29 14:56       ` Jordan Crouse
  0 siblings, 0 replies; 21+ messages in thread
From: Jordan Crouse @ 2020-06-29 14:56 UTC (permalink / raw)
  To: Rob Clark
  Cc: Sean Paul, Sai Prakash Ranjan, Jonathan Marek, David Airlie,
	linux-arm-msm, Sharat Masetty, Akhil P Oommen, dri-devel,
	Linux Kernel Mailing List, Eric Anholt,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,
	John Stultz, Daniel Vetter, freedreno, Emil Velikov

On Sat, Jun 27, 2020 at 01:11:14PM -0700, Rob Clark wrote:
> On Sat, Jun 27, 2020 at 12:56 PM Rob Clark <robdclark@gmail.com> wrote:
> >
> > On Fri, Jun 26, 2020 at 1:04 PM Jordan Crouse <jcrouse@codeaurora.org> wrote:
> > >
> > > Add support for using per-instance pagetables if all the dependencies are
> > > available.
> > >
> > > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > > ---
> > >
> > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 43 +++++++++++++++++++++++++++
> > >  drivers/gpu/drm/msm/msm_ringbuffer.h  |  1 +
> > >  2 files changed, 44 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > index aa53f47b7e8b..95ed2ceac121 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > @@ -79,6 +79,34 @@ static void get_stats_counter(struct msm_ringbuffer *ring, u32 counter,
> > >         OUT_RING(ring, upper_32_bits(iova));
> > >  }
> > >
> > > +static void a6xx_set_pagetable(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
> > > +               struct msm_file_private *ctx)
> > > +{
> > > +       phys_addr_t ttbr;
> > > +       u32 asid;
> > > +
> > > +       if (msm_iommu_pagetable_params(ctx->aspace->mmu, &ttbr, &asid))
> > > +               return;
> > > +
> > > +       /* Execute the table update */
> > > +       OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4);
> > > +       OUT_RING(ring, lower_32_bits(ttbr));
> > > +       OUT_RING(ring, (((u64) asid) << 48) | upper_32_bits(ttbr));
> > > +       /* CONTEXTIDR is currently unused */
> > > +       OUT_RING(ring, 0);
> > > +       /* CONTEXTBANK is currently unused */
> > > +       OUT_RING(ring, 0);
> > > +
> > > +       /*
> > > +        * Write the new TTBR0 to the memstore. This is good for debugging.
> > > +        */
> > > +       OUT_PKT7(ring, CP_MEM_WRITE, 4);
> > > +       OUT_RING(ring, lower_32_bits(rbmemptr(ring, ttbr0)));
> > > +       OUT_RING(ring, upper_32_bits(rbmemptr(ring, ttbr0)));
> > > +       OUT_RING(ring, lower_32_bits(ttbr));
> > > +       OUT_RING(ring, (((u64) asid) << 48) | upper_32_bits(ttbr));
> > > +}
> > > +
> > >  static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
> > >         struct msm_file_private *ctx)
> > >  {
> > > @@ -89,6 +117,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
> > >         struct msm_ringbuffer *ring = submit->ring;
> > >         unsigned int i;
> > >
> > > +       a6xx_set_pagetable(gpu, ring, ctx);
> > > +
> > >         get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO,
> > >                 rbmemptr_stats(ring, index, cpcycles_start));
> > >
> > > @@ -872,6 +902,18 @@ static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
> > >         return (unsigned long)busy_time;
> > >  }
> > >
> > > +struct msm_gem_address_space *a6xx_address_space_instance(struct msm_gpu *gpu)
> > > +{
> > > +       struct msm_mmu *mmu;
> > > +
> > > +       mmu = msm_iommu_pagetable_create(gpu->aspace->mmu);
> > > +       if (IS_ERR(mmu))
> > > +               return msm_gem_address_space_get(gpu->aspace);
> > > +
> > > +       return msm_gem_address_space_create(mmu,
> > > +               "gpu", 0x100000000ULL, 0x1ffffffffULL);
> > > +}
> > > +
> > >  static const struct adreno_gpu_funcs funcs = {
> > >         .base = {
> > >                 .get_param = adreno_get_param,
> > > @@ -895,6 +937,7 @@ static const struct adreno_gpu_funcs funcs = {
> > >                 .gpu_state_put = a6xx_gpu_state_put,
> > >  #endif
> > >                 .create_address_space = adreno_iommu_create_address_space,
> > > +               .address_space_instance = a6xx_address_space_instance,
> >
> > Hmm, maybe instead of .address_space_instance, something like
> > .create_context_address_space?
> >
> > Since like .create_address_space, it is creating an address space..
> > the difference is that it is a per context/process aspace..
> >

This is a good suggestion. I'm always open to changing function names.

> 
> 
> or maybe just .create_pgtable and return the 'struct msm_mmu' (which
> is itself starting to become less of a great name)..
> 
> The only other thing a6xx_address_space_instance() adds is knowing
> where the split is between the kernel and user pgtables, and I suppose
> that isn't a thing that would really be changing between gens?

In theory the split is determined by the hardware, but it's been the same for
all a5xx/a6xx targets.

Jordan

> BR,
> -R
> 
> > BR,
> > -R
> >
> > >         },
> > >         .get_timestamp = a6xx_get_timestamp,
> > >  };
> > > diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
> > > index 7764373d0ed2..0987d6bf848c 100644
> > > --- a/drivers/gpu/drm/msm/msm_ringbuffer.h
> > > +++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
> > > @@ -31,6 +31,7 @@ struct msm_rbmemptrs {
> > >         volatile uint32_t fence;
> > >
> > >         volatile struct msm_gpu_submit_stats stats[MSM_GPU_SUBMIT_STATS_COUNT];
> > > +       volatile u64 ttbr0;
> > >  };
> > >
> > >  struct msm_ringbuffer {
> > > --
> > > 2.17.1
> > >
> _______________________________________________
> Freedreno mailing list
> Freedreno@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/freedreno

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/6] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2
  2020-06-26 20:04 ` [PATCH v2 1/6] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
@ 2020-07-07 10:48   ` Jean-Philippe Brucker
  2020-07-07 12:34   ` Robin Murphy
  1 sibling, 0 replies; 21+ messages in thread
From: Jean-Philippe Brucker @ 2020-07-07 10:48 UTC (permalink / raw)
  To: Jordan Crouse
  Cc: linux-arm-msm, Will Deacon, Robin Murphy, linux-kernel, iommu,
	John Stultz, freedreno, linux-arm-kernel

Hi Jordan,

On Fri, Jun 26, 2020 at 02:04:09PM -0600, Jordan Crouse wrote:
> Support auxiliary domains for arm-smmu-v2 to initialize and support
> multiple pagetables for a single SMMU context bank. Since the smmu-v2
> hardware doesn't have any built-in support for switching the pagetable
> base, it is left as an exercise to the caller to actually use the pagetable.
> 
> Aux domains are supported if split pagetable (TTBR1) support has been
> enabled on the master domain.  Each auxiliary domain will reuse the
> configuration of the master domain. By default a domain with TTBR1
> support will have the TTBR0 region disabled so the first attached aux
> domain will enable the TTBR0 region in the hardware and conversely the
> last domain to be detached will disable TTBR0 translations.  All subsequent
> auxiliary domains create a pagetable but do not touch the hardware.
> 
> The leaf driver will be able to query the physical address of the
> pagetable with the DOMAIN_ATTR_PTBASE attribute so that it can use the
> address with whatever means it has to switch the pagetable base.
> 
> Following is a pseudo code example of how a domain can be created
> 
>  /* Check to see if aux domains are supported */
>  if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
> 	 domain = iommu_domain_alloc(...);
> 

The device driver should also call iommu_dev_enable_feature() before using
the AUX feature. I see that you implement them as NOPs and in this case
the GPU is tightly coupled with the SMMU so interoperability between
different IOMMU and device drivers doesn't matter much, but I think it's
still a good idea to follow the same patterns in all drivers to make
future work on the core IOMMU easier.

> 	 if (iommu_aux_attach_device(domain, dev))
> 		 return FAIL;
> 
> 	/* Save the base address of the pagetable for use by the driver */
> 	iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
>  }
> 
> Then 'domain' can be used like any other iommu domain to map and
> unmap iova addresses in the pagetable.
> 
> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
> 
>  drivers/iommu/arm-smmu.c | 219 ++++++++++++++++++++++++++++++++++++---
>  drivers/iommu/arm-smmu.h |   1 +
>  2 files changed, 204 insertions(+), 16 deletions(-)
[...]
> @@ -1653,6 +1836,10 @@ static struct iommu_ops arm_smmu_ops = {
>  	.get_resv_regions	= arm_smmu_get_resv_regions,
>  	.put_resv_regions	= generic_iommu_put_resv_regions,
>  	.def_domain_type	= arm_smmu_def_domain_type,
> +	.dev_has_feat		= arm_smmu_dev_has_feat,
> +	.dev_enable_feat	= arm_smmu_dev_enable_feat,
> +	.dev_disable_feat	= arm_smmu_dev_disable_feat,
> +	.aux_attach_dev		= arm_smmu_aux_attach_dev,

To be complete this also needs dev_feat_enabled() and aux_detach_dev() ops

Thanks,
Jean

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/6] iommu/io-pgtable: Allow a pgtable implementation to skip TLB operations
  2020-06-26 20:04 ` [PATCH v2 2/6] iommu/io-pgtable: Allow a pgtable implementation to skip TLB operations Jordan Crouse
@ 2020-07-07 11:34   ` Robin Murphy
  2020-07-07 14:25     ` [Freedreno] " Rob Clark
  0 siblings, 1 reply; 21+ messages in thread
From: Robin Murphy @ 2020-07-07 11:34 UTC (permalink / raw)
  To: Jordan Crouse, linux-arm-msm
  Cc: Sai Prakash Ranjan, iommu, John Stultz, freedreno, Joerg Roedel,
	Will Deacon, Yong Wu, linux-kernel

On 2020-06-26 21:04, Jordan Crouse wrote:
> Allow an io-pgtable implementation to skip TLB operations by checking for
> NULL pointers in the helper functions. It will be up to the owner
> of the io-pgtable instance to make sure that they independently handle
> the TLB correctly.

I don't really understand what this is for - tricking the IOMMU driver 
into not performing its TLB maintenance at points when that maintenance 
has been deemed necessary doesn't seem like the appropriate way to 
achieve anything good :/

Robin.

> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
> 
>   include/linux/io-pgtable.h | 11 +++++++----
>   1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> index 53d53c6c2be9..bbed1d3925ba 100644
> --- a/include/linux/io-pgtable.h
> +++ b/include/linux/io-pgtable.h
> @@ -210,21 +210,24 @@ struct io_pgtable {
>   
>   static inline void io_pgtable_tlb_flush_all(struct io_pgtable *iop)
>   {
> -	iop->cfg.tlb->tlb_flush_all(iop->cookie);
> +	if (iop->cfg.tlb)
> +		iop->cfg.tlb->tlb_flush_all(iop->cookie);
>   }
>   
>   static inline void
>   io_pgtable_tlb_flush_walk(struct io_pgtable *iop, unsigned long iova,
>   			  size_t size, size_t granule)
>   {
> -	iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
> +	if (iop->cfg.tlb)
> +		iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
>   }
>   
>   static inline void
>   io_pgtable_tlb_flush_leaf(struct io_pgtable *iop, unsigned long iova,
>   			  size_t size, size_t granule)
>   {
> -	iop->cfg.tlb->tlb_flush_leaf(iova, size, granule, iop->cookie);
> +	if (iop->cfg.tlb)
> +		iop->cfg.tlb->tlb_flush_leaf(iova, size, granule, iop->cookie);
>   }
>   
>   static inline void
> @@ -232,7 +235,7 @@ io_pgtable_tlb_add_page(struct io_pgtable *iop,
>   			struct iommu_iotlb_gather * gather, unsigned long iova,
>   			size_t granule)
>   {
> -	if (iop->cfg.tlb->tlb_add_page)
> +	if (iop->cfg.tlb && iop->cfg.tlb->tlb_add_page)
>   		iop->cfg.tlb->tlb_add_page(gather, iova, granule, iop->cookie);
>   }
>   
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 4/6] drm/msm: Add support to create a local pagetable
  2020-06-26 20:04 ` [PATCH v2 4/6] drm/msm: Add support to create a local pagetable Jordan Crouse
@ 2020-07-07 11:36   ` Robin Murphy
  2020-07-07 14:41     ` [Freedreno] " Rob Clark
  2020-07-08 19:35     ` Jordan Crouse
  0 siblings, 2 replies; 21+ messages in thread
From: Robin Murphy @ 2020-07-07 11:36 UTC (permalink / raw)
  To: Jordan Crouse, linux-arm-msm
  Cc: David Airlie, Sean Paul, dri-devel, linux-kernel, iommu,
	John Stultz, Daniel Vetter, freedreno

On 2020-06-26 21:04, Jordan Crouse wrote:
> Add support to create an io-pgtable for use by targets that support
> per-instance pagetables.  In order to support per-instance pagetables the
> GPU SMMU device needs to have the qcom,adreno-smmu compatible string and
> split pagetables and auxiliary domains need to be supported and enabled.
> 
> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
> 
>   drivers/gpu/drm/msm/msm_gpummu.c |   2 +-
>   drivers/gpu/drm/msm/msm_iommu.c  | 180 ++++++++++++++++++++++++++++++-
>   drivers/gpu/drm/msm/msm_mmu.h    |  16 ++-
>   3 files changed, 195 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/msm_gpummu.c b/drivers/gpu/drm/msm/msm_gpummu.c
> index 310a31b05faa..aab121f4beb7 100644
> --- a/drivers/gpu/drm/msm/msm_gpummu.c
> +++ b/drivers/gpu/drm/msm/msm_gpummu.c
> @@ -102,7 +102,7 @@ struct msm_mmu *msm_gpummu_new(struct device *dev, struct msm_gpu *gpu)
>   	}
>   
>   	gpummu->gpu = gpu;
> -	msm_mmu_init(&gpummu->base, dev, &funcs);
> +	msm_mmu_init(&gpummu->base, dev, &funcs, MSM_MMU_GPUMMU);
>   
>   	return &gpummu->base;
>   }
> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
> index 1b6635504069..f455c597f76d 100644
> --- a/drivers/gpu/drm/msm/msm_iommu.c
> +++ b/drivers/gpu/drm/msm/msm_iommu.c
> @@ -4,15 +4,192 @@
>    * Author: Rob Clark <robdclark@gmail.com>
>    */
>   
> +#include <linux/io-pgtable.h>
>   #include "msm_drv.h"
>   #include "msm_mmu.h"
>   
>   struct msm_iommu {
>   	struct msm_mmu base;
>   	struct iommu_domain *domain;
> +	struct iommu_domain *aux_domain;
>   };
> +
>   #define to_msm_iommu(x) container_of(x, struct msm_iommu, base)
>   
> +struct msm_iommu_pagetable {
> +	struct msm_mmu base;
> +	struct msm_mmu *parent;
> +	struct io_pgtable_ops *pgtbl_ops;
> +	phys_addr_t ttbr;
> +	u32 asid;
> +};
> +
> +static struct msm_iommu_pagetable *to_pagetable(struct msm_mmu *mmu)
> +{
> +	return container_of(mmu, struct msm_iommu_pagetable, base);
> +}
> +
> +static int msm_iommu_pagetable_unmap(struct msm_mmu *mmu, u64 iova,
> +		size_t size)
> +{
> +	struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
> +	struct io_pgtable_ops *ops = pagetable->pgtbl_ops;
> +	size_t unmapped = 0;
> +
> +	/* Unmap the block one page at a time */
> +	while (size) {
> +		unmapped += ops->unmap(ops, iova, 4096, NULL);
> +		iova += 4096;
> +		size -= 4096;
> +	}
> +
> +	iommu_flush_tlb_all(to_msm_iommu(pagetable->parent)->domain);
> +
> +	return (unmapped == size) ? 0 : -EINVAL;
> +}

Remember in patch #1 when you said "Then 'domain' can be used like any 
other iommu domain to map and unmap iova addresses in the pagetable."?

This appears to be very much not that :/

Robin.

> +
> +static int msm_iommu_pagetable_map(struct msm_mmu *mmu, u64 iova,
> +		struct sg_table *sgt, size_t len, int prot)
> +{
> +	struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
> +	struct io_pgtable_ops *ops = pagetable->pgtbl_ops;
> +	struct scatterlist *sg;
> +	size_t mapped = 0;
> +	u64 addr = iova;
> +	unsigned int i;
> +
> +	for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> +		size_t size = sg->length;
> +		phys_addr_t phys = sg_phys(sg);
> +
> +		/* Map the block one page at a time */
> +		while (size) {
> +			if (ops->map(ops, addr, phys, 4096, prot)) {
> +				msm_iommu_pagetable_unmap(mmu, iova, mapped);
> +				return -EINVAL;
> +			}
> +
> +			phys += 4096;
> +			addr += 4096;
> +			size -= 4096;
> +			mapped += 4096;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static void msm_iommu_pagetable_destroy(struct msm_mmu *mmu)
> +{
> +	struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
> +
> +	free_io_pgtable_ops(pagetable->pgtbl_ops);
> +	kfree(pagetable);
> +}
> +
> +/*
> + * Given a parent device, create and return an aux domain. This will enable the
> + * TTBR0 region
> + */
> +static struct iommu_domain *msm_iommu_get_aux_domain(struct msm_mmu *parent)
> +{
> +	struct msm_iommu *iommu = to_msm_iommu(parent);
> +	struct iommu_domain *domain;
> +	int ret;
> +
> +	if (iommu->aux_domain)
> +		return iommu->aux_domain;
> +
> +	if (!iommu_dev_has_feature(parent->dev, IOMMU_DEV_FEAT_AUX))
> +		return ERR_PTR(-ENODEV);
> +
> +	domain = iommu_domain_alloc(&platform_bus_type);
> +	if (!domain)
> +		return ERR_PTR(-ENODEV);
> +
> +	ret = iommu_aux_attach_device(domain, parent->dev);
> +	if (ret) {
> +		iommu_domain_free(domain);
> +		return ERR_PTR(ret);
> +	}
> +
> +	iommu->aux_domain = domain;
> +	return domain;
> +}
> +
> +int msm_iommu_pagetable_params(struct msm_mmu *mmu,
> +		phys_addr_t *ttbr, int *asid)
> +{
> +	struct msm_iommu_pagetable *pagetable;
> +
> +	if (mmu->type != MSM_MMU_IOMMU_PAGETABLE)
> +		return -EINVAL;
> +
> +	pagetable = to_pagetable(mmu);
> +
> +	if (ttbr)
> +		*ttbr = pagetable->ttbr;
> +
> +	if (asid)
> +		*asid = pagetable->asid;
> +
> +	return 0;
> +}
> +
> +static const struct msm_mmu_funcs pagetable_funcs = {
> +		.map = msm_iommu_pagetable_map,
> +		.unmap = msm_iommu_pagetable_unmap,
> +		.destroy = msm_iommu_pagetable_destroy,
> +};
> +
> +struct msm_mmu *msm_iommu_pagetable_create(struct msm_mmu *parent)
> +{
> +	static int next_asid = 16;
> +	struct msm_iommu_pagetable *pagetable;
> +	struct iommu_domain *aux_domain;
> +	struct io_pgtable_cfg cfg;
> +	int ret;
> +
> +	/* Make sure that the parent has an aux domain attached */
> +	aux_domain = msm_iommu_get_aux_domain(parent);
> +	if (IS_ERR(aux_domain))
> +		return ERR_CAST(aux_domain);
> +
> +	/* Get the pagetable configuration from the aux domain */
> +	ret = iommu_domain_get_attr(aux_domain, DOMAIN_ATTR_PGTABLE_CFG, &cfg);
> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	pagetable = kzalloc(sizeof(*pagetable), GFP_KERNEL);
> +	if (!pagetable)
> +		return ERR_PTR(-ENOMEM);
> +
> +	msm_mmu_init(&pagetable->base, parent->dev, &pagetable_funcs,
> +		MSM_MMU_IOMMU_PAGETABLE);
> +
> +	cfg.tlb = NULL;
> +
> +	pagetable->pgtbl_ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1,
> +		&cfg, aux_domain);
> +
> +	if (!pagetable->pgtbl_ops) {
> +		kfree(pagetable);
> +		return ERR_PTR(-ENOMEM);
> +	}
> +
> +
> +	/* Needed later for TLB flush */
> +	pagetable->parent = parent;
> +	pagetable->ttbr = cfg.arm_lpae_s1_cfg.ttbr;
> +
> +	pagetable->asid = next_asid;
> +	next_asid = (next_asid + 1)  % 255;
> +	if (next_asid < 16)
> +		next_asid = 16;
> +
> +	return &pagetable->base;
> +}
> +
>   static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
>   		unsigned long iova, int flags, void *arg)
>   {
> @@ -40,6 +217,7 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
>   	if (iova & BIT_ULL(48))
>   		iova |= GENMASK_ULL(63, 49);
>   
> +
>   	ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, prot);
>   	WARN_ON(!ret);
>   
> @@ -85,7 +263,7 @@ struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain)
>   		return ERR_PTR(-ENOMEM);
>   
>   	iommu->domain = domain;
> -	msm_mmu_init(&iommu->base, dev, &funcs);
> +	msm_mmu_init(&iommu->base, dev, &funcs, MSM_MMU_IOMMU);
>   	iommu_set_fault_handler(domain, msm_fault_handler, iommu);
>   
>   	ret = iommu_attach_device(iommu->domain, dev);
> diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
> index 3a534ee59bf6..61ade89d9e48 100644
> --- a/drivers/gpu/drm/msm/msm_mmu.h
> +++ b/drivers/gpu/drm/msm/msm_mmu.h
> @@ -17,18 +17,26 @@ struct msm_mmu_funcs {
>   	void (*destroy)(struct msm_mmu *mmu);
>   };
>   
> +enum msm_mmu_type {
> +	MSM_MMU_GPUMMU,
> +	MSM_MMU_IOMMU,
> +	MSM_MMU_IOMMU_PAGETABLE,
> +};
> +
>   struct msm_mmu {
>   	const struct msm_mmu_funcs *funcs;
>   	struct device *dev;
>   	int (*handler)(void *arg, unsigned long iova, int flags);
>   	void *arg;
> +	enum msm_mmu_type type;
>   };
>   
>   static inline void msm_mmu_init(struct msm_mmu *mmu, struct device *dev,
> -		const struct msm_mmu_funcs *funcs)
> +		const struct msm_mmu_funcs *funcs, enum msm_mmu_type type)
>   {
>   	mmu->dev = dev;
>   	mmu->funcs = funcs;
> +	mmu->type = type;
>   }
>   
>   struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain);
> @@ -41,7 +49,13 @@ static inline void msm_mmu_set_fault_handler(struct msm_mmu *mmu, void *arg,
>   	mmu->handler = handler;
>   }
>   
> +struct msm_mmu *msm_iommu_pagetable_create(struct msm_mmu *parent);
> +
>   void msm_gpummu_params(struct msm_mmu *mmu, dma_addr_t *pt_base,
>   		dma_addr_t *tran_error);
>   
> +
> +int msm_iommu_pagetable_params(struct msm_mmu *mmu, phys_addr_t *ttbr,
> +		int *asid);
> +
>   #endif /* __MSM_MMU_H__ */
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/6] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2
  2020-06-26 20:04 ` [PATCH v2 1/6] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
  2020-07-07 10:48   ` Jean-Philippe Brucker
@ 2020-07-07 12:34   ` Robin Murphy
  2020-07-07 15:09     ` [Freedreno] " Rob Clark
  1 sibling, 1 reply; 21+ messages in thread
From: Robin Murphy @ 2020-07-07 12:34 UTC (permalink / raw)
  To: Jordan Crouse, linux-arm-msm
  Cc: Will Deacon, linux-kernel, iommu, John Stultz, freedreno,
	linux-arm-kernel

On 2020-06-26 21:04, Jordan Crouse wrote:
> Support auxiliary domains for arm-smmu-v2 to initialize and support
> multiple pagetables for a single SMMU context bank. Since the smmu-v2
> hardware doesn't have any built in support for switching the pagetable
> base it is left as an exercise to the caller to actually use the pagetable.

Hmm, I've still been thinking that we could model this as supporting 
exactly 1 aux domain iff the device is currently attached to a primary 
domain with TTBR1 enabled. Then supporting multiple aux domains with 
magic TTBR0 switching is the Adreno-specific extension on top of that.

And if we don't want to go to that length, then - as I think Will was 
getting at - I'm not sure it's worth bothering at all. There doesn't 
seem to be any point in half-implementing a pretend aux domain interface 
while still driving a bus through the rest of the abstractions - it's 
really the worst of both worlds. If we're going to hand over the guts of 
io-pgtable to the GPU driver then couldn't it just use 
DOMAIN_ATTR_PGTABLE_CFG bidirectionally to inject a TTBR0 table straight 
into the TTBR1-ified domain?

Much as I like the idea of the aux domain abstraction and making this 
fit semi-transparently into the IOMMU API, if an almost entirely private 
interface will be the simplest and cleanest way to get it done then at 
this point also I'm starting to lean towards just getting it done. But 
if some other mediated-device type case then turns up that doesn't quite 
fit that private interface, we revisit the proper abstraction again and 
I reserve the right to say "I told you so" ;)

Robin.

> Aux domains are supported if split pagetable (TTBR1) support has been
> enabled on the master domain.  Each auxiliary domain will reuse the
> configuration of the master domain. By default a domain with TTBR1
> support will have the TTBR0 region disabled so the first attached aux
> domain will enable the TTBR0 region in the hardware and conversely the
> last domain to be detached will disable TTBR0 translations.  All subsequent
> auxiliary domains create a pagetable but do not touch the hardware.
> 
> The leaf driver will be able to query the physical address of the
> pagetable with the DOMAIN_ATTR_PTBASE attribute so that it can use the
> address with whatever means it has to switch the pagetable base.
> 
> Following is a pseudo code example of how a domain can be created
> 
>   /* Check to see if aux domains are supported */
>   if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
> 	 domain = iommu_domain_alloc(...);
> 
> 	 if (iommu_aux_attach_device(domain, dev))
> 		 return FAIL;
> 
> 	/* Save the base address of the pagetable for use by the driver */
> 	iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
>   }
> 
> Then 'domain' can be used like any other iommu domain to map and
> unmap iova addresses in the pagetable.
> 
> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
> 
>   drivers/iommu/arm-smmu.c | 219 ++++++++++++++++++++++++++++++++++++---
>   drivers/iommu/arm-smmu.h |   1 +
>   2 files changed, 204 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 060139452c54..ce6d654301bf 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -91,6 +91,7 @@ struct arm_smmu_cb {
>   	u32				tcr[2];
>   	u32				mair[2];
>   	struct arm_smmu_cfg		*cfg;
> +	atomic_t			aux;
>   };
>   
>   struct arm_smmu_master_cfg {
> @@ -667,6 +668,86 @@ static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
>   	arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
>   }
>   
> +/*
> + * Update the context bank to enable TTBR0. Assumes AARCH64 S1
> + * configuration.
> + */
> +static void arm_smmu_context_set_ttbr0(struct arm_smmu_cb *cb,
> +		struct io_pgtable_cfg *pgtbl_cfg)
> +{
> +	u32 tcr = cb->tcr[0];
> +
> +	/* Add the TCR configuration from the new pagetable config */
> +	tcr |= arm_smmu_lpae_tcr(pgtbl_cfg);
> +
> +	/* Make sure that both TTBR0 and TTBR1 are enabled */
> +	tcr &= ~(ARM_SMMU_TCR_EPD0 | ARM_SMMU_TCR_EPD1);
> +
> +	/* Update the TCR register */
> +	cb->tcr[0] = tcr;
> +
> +	/* Program the new TTBR0 */
> +	cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
> +	cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
> +}
> +
> +/*
> + * This function assumes that the current model only allows aux domains for
> + * AARCH64 S1 configurations
> + */
> +static int arm_smmu_aux_init_domain_context(struct iommu_domain *domain,
> +		struct arm_smmu_device *smmu, struct arm_smmu_cfg *master)
> +{
> +	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> +	struct io_pgtable_ops *pgtbl_ops;
> +	struct io_pgtable_cfg pgtbl_cfg;
> +
> +	mutex_lock(&smmu_domain->init_mutex);
> +
> +	/* Copy the configuration from the master */
> +	memcpy(&smmu_domain->cfg, master, sizeof(smmu_domain->cfg));
> +
> +	smmu_domain->flush_ops = &arm_smmu_s1_tlb_ops;
> +	smmu_domain->smmu = smmu;
> +
> +	pgtbl_cfg = (struct io_pgtable_cfg) {
> +		.pgsize_bitmap = smmu->pgsize_bitmap,
> +		.ias = smmu->va_size,
> +		.oas = smmu->ipa_size,
> +		.coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENT_WALK,
> +		.tlb = smmu_domain->flush_ops,
> +		.iommu_dev = smmu->dev,
> +		.quirks = 0,
> +	};
> +
> +	if (smmu_domain->non_strict)
> +		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> +
> +	pgtbl_ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1, &pgtbl_cfg,
> +		smmu_domain);
> +	if (!pgtbl_ops) {
> +		mutex_unlock(&smmu_domain->init_mutex);
> +		return -ENOMEM;
> +	}
> +
> +	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> +
> +	domain->geometry.aperture_end = (1UL << smmu->va_size) - 1;
> +	domain->geometry.force_aperture = true;
> +
> +	/* enable TTBR0 when the first aux domain is attached */
> +	if (atomic_inc_return(&smmu->cbs[master->cbndx].aux) == 1) {
> +		arm_smmu_context_set_ttbr0(&smmu->cbs[master->cbndx],
> +			&pgtbl_cfg);
> +		arm_smmu_write_context_bank(smmu, master->cbndx);
> +	}
> +
> +	smmu_domain->pgtbl_ops = pgtbl_ops;
> +	mutex_unlock(&smmu_domain->init_mutex);
> +
> +	return 0;
> +}
> +
>   static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   					struct arm_smmu_device *smmu,
>   					struct device *dev)
> @@ -871,36 +952,70 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   	return ret;
>   }
>   
> +static void
> +arm_smmu_destroy_aux_domain_context(struct arm_smmu_domain *smmu_domain)
> +{
> +	struct arm_smmu_device *smmu = smmu_domain->smmu;
> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +	int ret;
> +
> +	/*
> +	 * If this is the last aux domain to be freed, disable TTBR0 by turning
> +	 * off translations and clearing TTBR0
> +	 */
> +	if (atomic_dec_return(&smmu->cbs[cfg->cbndx].aux) == 0) {
> +		/* Clear out the T0 region */
> +		smmu->cbs[cfg->cbndx].tcr[0] &= ~GENMASK(15, 0);
> +		/* Disable TTBR0 translations */
> +		smmu->cbs[cfg->cbndx].tcr[0] |= ARM_SMMU_TCR_EPD0;
> +		/* Clear the TTBR0 pagetable address */
> +		smmu->cbs[cfg->cbndx].ttbr[0] =
> +			FIELD_PREP(ARM_SMMU_TTBRn_ASID, cfg->asid);
> +
> +		ret = arm_smmu_rpm_get(smmu);
> +		if (!ret) {
> +			arm_smmu_write_context_bank(smmu, cfg->cbndx);
> +			arm_smmu_rpm_put(smmu);
> +		}
> +	}
> +
> +}
> +
>   static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
>   {
>   	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>   	struct arm_smmu_device *smmu = smmu_domain->smmu;
>   	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> -	int ret, irq;
>   
>   	if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)
>   		return;
>   
> -	ret = arm_smmu_rpm_get(smmu);
> -	if (ret < 0)
> -		return;
> +	if (smmu_domain->aux)
> +		arm_smmu_destroy_aux_domain_context(smmu_domain);
>   
> -	/*
> -	 * Disable the context bank and free the page tables before freeing
> -	 * it.
> -	 */
> -	smmu->cbs[cfg->cbndx].cfg = NULL;
> -	arm_smmu_write_context_bank(smmu, cfg->cbndx);
> +	/* Check if the last user is done with the context bank */
> +	if (atomic_read(&smmu->cbs[cfg->cbndx].aux) == 0) {
> +		int ret = arm_smmu_rpm_get(smmu);
> +		int irq;
>   
> -	if (cfg->irptndx != ARM_SMMU_INVALID_IRPTNDX) {
> -		irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
> -		devm_free_irq(smmu->dev, irq, domain);
> +		if (ret < 0)
> +			return;
> +
> +		/* Disable the context bank */
> +		smmu->cbs[cfg->cbndx].cfg = NULL;
> +		arm_smmu_write_context_bank(smmu, cfg->cbndx);
> +
> +		if (cfg->irptndx != ARM_SMMU_INVALID_IRPTNDX) {
> +			irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
> +			devm_free_irq(smmu->dev, irq, domain);
> +		}
> +
> +		__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
> +		arm_smmu_rpm_put(smmu);
>   	}
>   
> +	/* Destroy the pagetable */
>   	free_io_pgtable_ops(smmu_domain->pgtbl_ops);
> -	__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
> -
> -	arm_smmu_rpm_put(smmu);
>   }
>   
>   static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
> @@ -1161,6 +1276,74 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
>   	return 0;
>   }
>   
> +static bool arm_smmu_dev_has_feat(struct device *dev,
> +		enum iommu_dev_features feat)
> +{
> +	if (feat != IOMMU_DEV_FEAT_AUX)
> +		return false;
> +
> +	return true;
> +}
> +
> +static int arm_smmu_dev_enable_feat(struct device *dev,
> +		enum iommu_dev_features feat)
> +{
> +	/* aux domain support is always available */
> +	if (feat == IOMMU_DEV_FEAT_AUX)
> +		return 0;
> +
> +	return -ENODEV;
> +}
> +
> +static int arm_smmu_dev_disable_feat(struct device *dev,
> +		enum iommu_dev_features feat)
> +{
> +	return -EBUSY;
> +}
> +
> +static int arm_smmu_aux_attach_dev(struct iommu_domain *domain,
> +		struct device *dev)
> +{
> +	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> +	struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
> +	struct arm_smmu_device *smmu = cfg->smmu;
> +	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> +	struct arm_smmu_cb *cb;
> +	int idx, i, ret, cbndx = -1;
> +
> +	/* Try to find the context bank configured for this device */
> +	for_each_cfg_sme(cfg, fwspec, i, idx) {
> +		if (idx != INVALID_SMENDX) {
> +			cbndx = smmu->s2crs[idx].cbndx;
> +			break;
> +		}
> +	}
> +
> +	if (cbndx == -1)
> +		return -ENODEV;
> +
> +	cb = &smmu->cbs[cbndx];
> +
> +	/* Aux domains are only supported for AARCH64 configurations */
> +	if (cb->cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64)
> +		return -EINVAL;
> +
> +	/* Make sure that TTBR1 is enabled in the hardware */
> +	if ((cb->tcr[0] & ARM_SMMU_TCR_EPD1))
> +		return -EINVAL;
> +
> +	smmu_domain->aux = true;
> +
> +	ret = arm_smmu_rpm_get(smmu);
> +	if (ret < 0)
> +		return ret;
> +
> +	ret = arm_smmu_aux_init_domain_context(domain, smmu, cb->cfg);
> +
> +	arm_smmu_rpm_put(smmu);
> +	return ret;
> +}
> +
>   static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>   {
>   	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> @@ -1653,6 +1836,10 @@ static struct iommu_ops arm_smmu_ops = {
>   	.get_resv_regions	= arm_smmu_get_resv_regions,
>   	.put_resv_regions	= generic_iommu_put_resv_regions,
>   	.def_domain_type	= arm_smmu_def_domain_type,
> +	.dev_has_feat		= arm_smmu_dev_has_feat,
> +	.dev_enable_feat	= arm_smmu_dev_enable_feat,
> +	.dev_disable_feat	= arm_smmu_dev_disable_feat,
> +	.aux_attach_dev		= arm_smmu_aux_attach_dev,
>   	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
>   };
>   
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index c417814f1d98..79d441024043 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -346,6 +346,7 @@ struct arm_smmu_domain {
>   	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
>   	struct iommu_domain		domain;
>   	struct device			*dev;	/* Device attached to this domain */
> +	bool				aux;
>   };
>   
>   static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Freedreno] [PATCH v2 2/6] iommu/io-pgtable: Allow a pgtable implementation to skip TLB operations
  2020-07-07 11:34   ` Robin Murphy
@ 2020-07-07 14:25     ` Rob Clark
  2020-07-07 14:58       ` Rob Clark
  0 siblings, 1 reply; 21+ messages in thread
From: Rob Clark @ 2020-07-07 14:25 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jordan Crouse, linux-arm-msm, Sai Prakash Ranjan, Joerg Roedel,
	Will Deacon, Linux Kernel Mailing List,
	iommu,
	John Stultz, freedreno, Yong Wu

On Tue, Jul 7, 2020 at 4:34 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2020-06-26 21:04, Jordan Crouse wrote:
> > Allow an io-pgtable implementation to skip TLB operations by checking for
> > NULL pointers in the helper functions. It will be up to the owner
> > of the io-pgtable instance to make sure that they independently handle
> > the TLB correctly.
>
> I don't really understand what this is for - tricking the IOMMU driver
> into not performing its TLB maintenance at points when that maintenance
> has been deemed necessary doesn't seem like the appropriate way to
> achieve anything good :/

No, for tricking the io-pgtable helpers into not performing TLB
maintenance.  But seriously, since we are creating pgtables ourselves,
and we don't want to be ioremap'ing the GPU's SMMU instance, the
alternative is plugging in no-op helpers.  Which amounts to the same
thing.

Currently (in a later patch in the series) we are using
iommu_flush_tlb_all() when unmapping, which is a bit of a big hammer.
Although I think we could be a bit more clever and do the TLB ops on
the GPU (since the GPU knows if pagetables we are unmapping from are
in-use and could skip the TLB ops otherwise).

On the topic, if we are using unique ASID values per set of
pagetables, how expensive is tlb invalidate for an ASID that has no
entries in the TLB?

BR,
-R

>
> Robin.
>
> > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > ---
> >
> >   include/linux/io-pgtable.h | 11 +++++++----
> >   1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> > index 53d53c6c2be9..bbed1d3925ba 100644
> > --- a/include/linux/io-pgtable.h
> > +++ b/include/linux/io-pgtable.h
> > @@ -210,21 +210,24 @@ struct io_pgtable {
> >
> >   static inline void io_pgtable_tlb_flush_all(struct io_pgtable *iop)
> >   {
> > -     iop->cfg.tlb->tlb_flush_all(iop->cookie);
> > +     if (iop->cfg.tlb)
> > +             iop->cfg.tlb->tlb_flush_all(iop->cookie);
> >   }
> >
> >   static inline void
> >   io_pgtable_tlb_flush_walk(struct io_pgtable *iop, unsigned long iova,
> >                         size_t size, size_t granule)
> >   {
> > -     iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
> > +     if (iop->cfg.tlb)
> > +             iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
> >   }
> >
> >   static inline void
> >   io_pgtable_tlb_flush_leaf(struct io_pgtable *iop, unsigned long iova,
> >                         size_t size, size_t granule)
> >   {
> > -     iop->cfg.tlb->tlb_flush_leaf(iova, size, granule, iop->cookie);
> > +     if (iop->cfg.tlb)
> > +             iop->cfg.tlb->tlb_flush_leaf(iova, size, granule, iop->cookie);
> >   }
> >
> >   static inline void
> > @@ -232,7 +235,7 @@ io_pgtable_tlb_add_page(struct io_pgtable *iop,
> >                       struct iommu_iotlb_gather * gather, unsigned long iova,
> >                       size_t granule)
> >   {
> > -     if (iop->cfg.tlb->tlb_add_page)
> > +     if (iop->cfg.tlb && iop->cfg.tlb->tlb_add_page)
> >               iop->cfg.tlb->tlb_add_page(gather, iova, granule, iop->cookie);
> >   }
> >
> >
> _______________________________________________
> Freedreno mailing list
> Freedreno@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Freedreno] [PATCH v2 4/6] drm/msm: Add support to create a local pagetable
  2020-07-07 11:36   ` Robin Murphy
@ 2020-07-07 14:41     ` Rob Clark
  2020-07-08 19:35     ` Jordan Crouse
  1 sibling, 0 replies; 21+ messages in thread
From: Rob Clark @ 2020-07-07 14:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jordan Crouse, linux-arm-msm, David Airlie, freedreno,
	Linux Kernel Mailing List, dri-devel,
	iommu, Joerg Roedel,
	John Stultz, Daniel Vetter, Sean Paul

On Tue, Jul 7, 2020 at 4:36 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2020-06-26 21:04, Jordan Crouse wrote:
> > Add support to create a io-pgtable for use by targets that support
> > per-instance pagetables.  In order to support per-instance pagetables the
> > GPU SMMU device needs to have the qcom,adreno-smmu compatible string and
> > split pagetables and auxiliary domains need to be supported and enabled.
> >
> > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > ---
> >
> >   drivers/gpu/drm/msm/msm_gpummu.c |   2 +-
> >   drivers/gpu/drm/msm/msm_iommu.c  | 180 ++++++++++++++++++++++++++++++-
> >   drivers/gpu/drm/msm/msm_mmu.h    |  16 ++-
> >   3 files changed, 195 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_gpummu.c b/drivers/gpu/drm/msm/msm_gpummu.c
> > index 310a31b05faa..aab121f4beb7 100644
> > --- a/drivers/gpu/drm/msm/msm_gpummu.c
> > +++ b/drivers/gpu/drm/msm/msm_gpummu.c
> > @@ -102,7 +102,7 @@ struct msm_mmu *msm_gpummu_new(struct device *dev, struct msm_gpu *gpu)
> >       }
> >
> >       gpummu->gpu = gpu;
> > -     msm_mmu_init(&gpummu->base, dev, &funcs);
> > +     msm_mmu_init(&gpummu->base, dev, &funcs, MSM_MMU_GPUMMU);
> >
> >       return &gpummu->base;
> >   }
> > diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
> > index 1b6635504069..f455c597f76d 100644
> > --- a/drivers/gpu/drm/msm/msm_iommu.c
> > +++ b/drivers/gpu/drm/msm/msm_iommu.c
> > @@ -4,15 +4,192 @@
> >    * Author: Rob Clark <robdclark@gmail.com>
> >    */
> >
> > +#include <linux/io-pgtable.h>
> >   #include "msm_drv.h"
> >   #include "msm_mmu.h"
> >
> >   struct msm_iommu {
> >       struct msm_mmu base;
> >       struct iommu_domain *domain;
> > +     struct iommu_domain *aux_domain;
> >   };
> > +
> >   #define to_msm_iommu(x) container_of(x, struct msm_iommu, base)
> >
> > +struct msm_iommu_pagetable {
> > +     struct msm_mmu base;
> > +     struct msm_mmu *parent;
> > +     struct io_pgtable_ops *pgtbl_ops;
> > +     phys_addr_t ttbr;
> > +     u32 asid;
> > +};
> > +
> > +static struct msm_iommu_pagetable *to_pagetable(struct msm_mmu *mmu)
> > +{
> > +     return container_of(mmu, struct msm_iommu_pagetable, base);
> > +}
> > +
> > +static int msm_iommu_pagetable_unmap(struct msm_mmu *mmu, u64 iova,
> > +             size_t size)
> > +{
> > +     struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
> > +     struct io_pgtable_ops *ops = pagetable->pgtbl_ops;
> > +     size_t unmapped = 0, total = size;
> > +
> > +     /* Unmap the block one page at a time */
> > +     while (size) {
> > +             unmapped += ops->unmap(ops, iova, 4096, NULL);
> > +             iova += 4096;
> > +             size -= 4096;
> > +     }
> > +
> > +     iommu_flush_tlb_all(to_msm_iommu(pagetable->parent)->domain);
> > +
> > +     return (unmapped == total) ? 0 : -EINVAL;
> > +}
>
> Remember in patch #1 when you said "Then 'domain' can be used like any
> other iommu domain to map and unmap iova addresses in the pagetable."?
>
> This appears to be very much not that :/
>

I guess that comment is a bit stale.. the original plan was to create
an iommu_domain per set of pgtables, but at some point we realized
that by using the io-pgtable helpers directly, we would inflict a lot
less GPU-crazy on the iommu drivers

BR,
-R

> Robin.
>
> > +
> > +static int msm_iommu_pagetable_map(struct msm_mmu *mmu, u64 iova,
> > +             struct sg_table *sgt, size_t len, int prot)
> > +{
> > +     struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
> > +     struct io_pgtable_ops *ops = pagetable->pgtbl_ops;
> > +     struct scatterlist *sg;
> > +     size_t mapped = 0;
> > +     u64 addr = iova;
> > +     unsigned int i;
> > +
> > +     for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> > +             size_t size = sg->length;
> > +             phys_addr_t phys = sg_phys(sg);
> > +
> > +             /* Map the block one page at a time */
> > +             while (size) {
> > +                     if (ops->map(ops, addr, phys, 4096, prot)) {
> > +                             msm_iommu_pagetable_unmap(mmu, iova, mapped);
> > +                             return -EINVAL;
> > +                     }
> > +
> > +                     phys += 4096;
> > +                     addr += 4096;
> > +                     size -= 4096;
> > +                     mapped += 4096;
> > +             }
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +static void msm_iommu_pagetable_destroy(struct msm_mmu *mmu)
> > +{
> > +     struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
> > +
> > +     free_io_pgtable_ops(pagetable->pgtbl_ops);
> > +     kfree(pagetable);
> > +}
> > +
> > +/*
> > + * Given a parent device, create and return an aux domain. This will enable the
> > + * TTBR0 region
> > + */
> > +static struct iommu_domain *msm_iommu_get_aux_domain(struct msm_mmu *parent)
> > +{
> > +     struct msm_iommu *iommu = to_msm_iommu(parent);
> > +     struct iommu_domain *domain;
> > +     int ret;
> > +
> > +     if (iommu->aux_domain)
> > +             return iommu->aux_domain;
> > +
> > +     if (!iommu_dev_has_feature(parent->dev, IOMMU_DEV_FEAT_AUX))
> > +             return ERR_PTR(-ENODEV);
> > +
> > +     domain = iommu_domain_alloc(&platform_bus_type);
> > +     if (!domain)
> > +             return ERR_PTR(-ENODEV);
> > +
> > +     ret = iommu_aux_attach_device(domain, parent->dev);
> > +     if (ret) {
> > +             iommu_domain_free(domain);
> > +             return ERR_PTR(ret);
> > +     }
> > +
> > +     iommu->aux_domain = domain;
> > +     return domain;
> > +}
> > +
> > +int msm_iommu_pagetable_params(struct msm_mmu *mmu,
> > +             phys_addr_t *ttbr, int *asid)
> > +{
> > +     struct msm_iommu_pagetable *pagetable;
> > +
> > +     if (mmu->type != MSM_MMU_IOMMU_PAGETABLE)
> > +             return -EINVAL;
> > +
> > +     pagetable = to_pagetable(mmu);
> > +
> > +     if (ttbr)
> > +             *ttbr = pagetable->ttbr;
> > +
> > +     if (asid)
> > +             *asid = pagetable->asid;
> > +
> > +     return 0;
> > +}
> > +
> > +static const struct msm_mmu_funcs pagetable_funcs = {
> > +             .map = msm_iommu_pagetable_map,
> > +             .unmap = msm_iommu_pagetable_unmap,
> > +             .destroy = msm_iommu_pagetable_destroy,
> > +};
> > +
> > +struct msm_mmu *msm_iommu_pagetable_create(struct msm_mmu *parent)
> > +{
> > +     static int next_asid = 16;
> > +     struct msm_iommu_pagetable *pagetable;
> > +     struct iommu_domain *aux_domain;
> > +     struct io_pgtable_cfg cfg;
> > +     int ret;
> > +
> > +     /* Make sure that the parent has an aux domain attached */
> > +     aux_domain = msm_iommu_get_aux_domain(parent);
> > +     if (IS_ERR(aux_domain))
> > +             return ERR_CAST(aux_domain);
> > +
> > +     /* Get the pagetable configuration from the aux domain */
> > +     ret = iommu_domain_get_attr(aux_domain, DOMAIN_ATTR_PGTABLE_CFG, &cfg);
> > +     if (ret)
> > +             return ERR_PTR(ret);
> > +
> > +     pagetable = kzalloc(sizeof(*pagetable), GFP_KERNEL);
> > +     if (!pagetable)
> > +             return ERR_PTR(-ENOMEM);
> > +
> > +     msm_mmu_init(&pagetable->base, parent->dev, &pagetable_funcs,
> > +             MSM_MMU_IOMMU_PAGETABLE);
> > +
> > +     cfg.tlb = NULL;
> > +
> > +     pagetable->pgtbl_ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1,
> > +             &cfg, aux_domain);
> > +
> > +     if (!pagetable->pgtbl_ops) {
> > +             kfree(pagetable);
> > +             return ERR_PTR(-ENOMEM);
> > +     }
> > +
> > +
> > +     /* Needed later for TLB flush */
> > +     pagetable->parent = parent;
> > +     pagetable->ttbr = cfg.arm_lpae_s1_cfg.ttbr;
> > +
> > +     pagetable->asid = next_asid;
> > +     next_asid = (next_asid + 1)  % 255;
> > +     if (next_asid < 16)
> > +             next_asid = 16;
> > +
> > +     return &pagetable->base;
> > +}
> > +
> >   static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
> >               unsigned long iova, int flags, void *arg)
> >   {
> > @@ -40,6 +217,7 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
> >       if (iova & BIT_ULL(48))
> >               iova |= GENMASK_ULL(63, 49);
> >
> > +
> >       ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, prot);
> >       WARN_ON(!ret);
> >
> > @@ -85,7 +263,7 @@ struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain)
> >               return ERR_PTR(-ENOMEM);
> >
> >       iommu->domain = domain;
> > -     msm_mmu_init(&iommu->base, dev, &funcs);
> > +     msm_mmu_init(&iommu->base, dev, &funcs, MSM_MMU_IOMMU);
> >       iommu_set_fault_handler(domain, msm_fault_handler, iommu);
> >
> >       ret = iommu_attach_device(iommu->domain, dev);
> > diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
> > index 3a534ee59bf6..61ade89d9e48 100644
> > --- a/drivers/gpu/drm/msm/msm_mmu.h
> > +++ b/drivers/gpu/drm/msm/msm_mmu.h
> > @@ -17,18 +17,26 @@ struct msm_mmu_funcs {
> >       void (*destroy)(struct msm_mmu *mmu);
> >   };
> >
> > +enum msm_mmu_type {
> > +     MSM_MMU_GPUMMU,
> > +     MSM_MMU_IOMMU,
> > +     MSM_MMU_IOMMU_PAGETABLE,
> > +};
> > +
> >   struct msm_mmu {
> >       const struct msm_mmu_funcs *funcs;
> >       struct device *dev;
> >       int (*handler)(void *arg, unsigned long iova, int flags);
> >       void *arg;
> > +     enum msm_mmu_type type;
> >   };
> >
> >   static inline void msm_mmu_init(struct msm_mmu *mmu, struct device *dev,
> > -             const struct msm_mmu_funcs *funcs)
> > +             const struct msm_mmu_funcs *funcs, enum msm_mmu_type type)
> >   {
> >       mmu->dev = dev;
> >       mmu->funcs = funcs;
> > +     mmu->type = type;
> >   }
> >
> >   struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain);
> > @@ -41,7 +49,13 @@ static inline void msm_mmu_set_fault_handler(struct msm_mmu *mmu, void *arg,
> >       mmu->handler = handler;
> >   }
> >
> > +struct msm_mmu *msm_iommu_pagetable_create(struct msm_mmu *parent);
> > +
> >   void msm_gpummu_params(struct msm_mmu *mmu, dma_addr_t *pt_base,
> >               dma_addr_t *tran_error);
> >
> > +
> > +int msm_iommu_pagetable_params(struct msm_mmu *mmu, phys_addr_t *ttbr,
> > +             int *asid);
> > +
> >   #endif /* __MSM_MMU_H__ */
> >
> _______________________________________________
> Freedreno mailing list
> Freedreno@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Freedreno] [PATCH v2 2/6] iommu/io-pgtable: Allow a pgtable implementation to skip TLB operations
  2020-07-07 14:25     ` [Freedreno] " Rob Clark
@ 2020-07-07 14:58       ` Rob Clark
  2020-07-08 19:19         ` Jordan Crouse
  0 siblings, 1 reply; 21+ messages in thread
From: Rob Clark @ 2020-07-07 14:58 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jordan Crouse, linux-arm-msm, Sai Prakash Ranjan, Joerg Roedel,
	Will Deacon, Linux Kernel Mailing List,
	IOMMU DRIVERS <iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,
	John Stultz, freedreno, Yong Wu

On Tue, Jul 7, 2020 at 7:25 AM Rob Clark <robdclark@gmail.com> wrote:
>
> On Tue, Jul 7, 2020 at 4:34 AM Robin Murphy <robin.murphy@arm.com> wrote:
> >
> > On 2020-06-26 21:04, Jordan Crouse wrote:
> > > Allow an io-pgtable implementation to skip TLB operations by checking for
> > > NULL pointers in the helper functions. It will be up to the owner
> > > of the io-pgtable instance to make sure that they independently handle
> > > the TLB correctly.
> >
> > I don't really understand what this is for - tricking the IOMMU driver
> > into not performing its TLB maintenance at points when that maintenance
> > has been deemed necessary doesn't seem like the appropriate way to
> > achieve anything good :/
>
> No, for triggering the io-pgtable helpers into not performing TLB
> maintenance.  But seriously, since we are creating pgtables ourselves,
> and we don't want to be ioremap'ing the GPU's SMMU instance, the
> alternative is plugging in no-op helpers.  Which amounts to the same
> thing.

Hmm, that said, since we are just memcpy'ing the io_pgtable_cfg from
arm-smmu, it will already be populated with arm-smmu's fxn ptrs.  I
guess we could maybe make it work without no-op helpers, although in
that case it looks like we need to fix something about aux-domain vs
tlb helpers:

[  +0.004373] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000019
[  +0.004086] Mem abort info:
[  +0.004319]   ESR = 0x96000004
[  +0.003462]   EC = 0x25: DABT (current EL), IL = 32 bits
[  +0.003494]   SET = 0, FnV = 0
[  +0.002812]   EA = 0, S1PTW = 0
[  +0.002873] Data abort info:
[  +0.003031]   ISV = 0, ISS = 0x00000004
[  +0.003785]   CM = 0, WnR = 0
[  +0.003641] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000261d65000
[  +0.003383] [0000000000000019] pgd=0000000000000000, p4d=0000000000000000
[  +0.003715] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[  +0.002744] Modules linked in: xt_CHECKSUM xt_MASQUERADE
xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle
ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack
nf_defrag_ipv4 libcrc32c bridge stp llc ip6table_filter ip6_tables
iptable_filter ax88179_178a usbnet uvcvideo videobuf2_vmalloc
videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc
hid_multitouch i2c_hid some_battery ti_sn65dsi86 hci_uart btqca btbcm
qcom_spmi_adc5 bluetooth qcom_spmi_temp_alarm qcom_vadc_common
ecdh_generic ecc snd_soc_sdm845 snd_soc_rt5663 snd_soc_qcom_common
ath10k_snoc ath10k_core crct10dif_ce ath mac80211 snd_soc_rl6231
soundwire_bus i2c_qcom_geni libarc4 qcom_rng msm phy_qcom_qusb2
reset_qcom_pdc drm_kms_helper cfg80211 rfkill qcom_q6v5_mss
qcom_q6v5_ipa_notify socinfo qrtr ns panel_simple qcom_q6v5_pas
qcom_common qcom_glink_smem slim_qcom_ngd_ctrl qcom_sysmon drm
qcom_q6v5 slimbus qmi_helpers qcom_wdt mdt_loader rmtfs_mem be2iscsi
bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio
[  +0.000139]  libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp
libiscsi_tcp libiscsi scsi_transport_iscsi fuse ip_tables x_tables
ipv6 nf_defrag_ipv6
[  +0.020933] CPU: 3 PID: 168 Comm: kworker/u16:7 Not tainted
5.8.0-rc1-c630+ #31
[  +0.003828] Hardware name: LENOVO 81JL/LNVNB161216, BIOS
9UCN33WW(V2.06) 06/ 4/2019
[  +0.004039] Workqueue: msm msm_gem_free_work [msm]
[  +0.003885] pstate: 60c00005 (nZCv daif +PAN +UAO BTYPE=--)
[  +0.003859] pc : arm_smmu_tlb_inv_range_s1+0x30/0x148
[  +0.003742] lr : arm_smmu_tlb_add_page_s1+0x1c/0x28
[  +0.003887] sp : ffff800011cdb970
[  +0.003868] x29: ffff800011cdb970 x28: 0000000000000003
[  +0.003930] x27: ffff0001f1882f80 x26: 0000000000000001
[  +0.003886] x25: 0000000000000003 x24: 0000000000000620
[  +0.003932] x23: 0000000000000000 x22: 0000000000001000
[  +0.003886] x21: 0000000000001000 x20: ffff0001cf857300
[  +0.003916] x19: 0000000000000001 x18: 00000000ffffffff
[  +0.003921] x17: ffffd9e6a24ae0e8 x16: 0000000000012577
[  +0.003843] x15: 0000000000012578 x14: 0000000000000000
[  +0.003884] x13: 0000000000012574 x12: ffffd9e6a2550180
[  +0.003834] x11: 0000000000083f80 x10: 0000000000000000
[  +0.003889] x9 : 0000000000000000 x8 : ffff0001f1882f80
[  +0.003812] x7 : 0000000000000001 x6 : 0000000000000048
[  +0.003807] x5 : ffff0001c86e1000 x4 : 0000000000000620
[  +0.003802] x3 : ffff0001ddb57700 x2 : 0000000000001000
[  +0.003809] x1 : 0000000000001000 x0 : 0000000101048000
[  +0.003768] Call trace:
[  +0.003665]  arm_smmu_tlb_inv_range_s1+0x30/0x148
[  +0.003769]  arm_smmu_tlb_add_page_s1+0x1c/0x28
[  +0.003760]  __arm_lpae_unmap+0x3c4/0x498
[  +0.003821]  __arm_lpae_unmap+0xfc/0x498
[  +0.003693]  __arm_lpae_unmap+0xfc/0x498
[  +0.003704]  __arm_lpae_unmap+0xfc/0x498
[  +0.003608]  arm_lpae_unmap+0x60/0x78
[  +0.003653]  msm_iommu_pagetable_unmap+0x5c/0xa0 [msm]
[  +0.003711]  msm_gem_purge_vma+0x48/0x70 [msm]
[  +0.003716]  put_iova+0x68/0xc8 [msm]
[  +0.003792]  msm_gem_free_work+0x118/0x190 [msm]
[  +0.003739]  process_one_work+0x28c/0x6e8
[  +0.003595]  worker_thread+0x4c/0x420
[  +0.003546]  kthread+0x148/0x168
[  +0.003675]  ret_from_fork+0x10/0x1c
[  +0.003596] Code: 2a0403f8 a9046bf9 f9400073 39406077 (b9401a61)

BR,
-R

>
> Currently (in a later patch in the series) we are using
> iommu_flush_tlb_all() when unmapping, which is a bit of a big hammer.
> Although I think we could be a bit more clever and do the TLB ops on
> the GPU (since the GPU knows if pagetables we are unmapping from are
> in-use and could skip the TLB ops otherwise).
>
> On the topic, if we are using unique ASID values per set of
> pagetables, how expensive is tlb invalidate for an ASID that has no
> entries in the TLB?
>
> BR,
> -R
>
> >
> > Robin.
> >
> > > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > > ---
> > >
> > >   include/linux/io-pgtable.h | 11 +++++++----
> > >   1 file changed, 7 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> > > index 53d53c6c2be9..bbed1d3925ba 100644
> > > --- a/include/linux/io-pgtable.h
> > > +++ b/include/linux/io-pgtable.h
> > > @@ -210,21 +210,24 @@ struct io_pgtable {
> > >
> > >   static inline void io_pgtable_tlb_flush_all(struct io_pgtable *iop)
> > >   {
> > > -     iop->cfg.tlb->tlb_flush_all(iop->cookie);
> > > +     if (iop->cfg.tlb)
> > > +             iop->cfg.tlb->tlb_flush_all(iop->cookie);
> > >   }
> > >
> > >   static inline void
> > >   io_pgtable_tlb_flush_walk(struct io_pgtable *iop, unsigned long iova,
> > >                         size_t size, size_t granule)
> > >   {
> > > -     iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
> > > +     if (iop->cfg.tlb)
> > > +             iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
> > >   }
> > >
> > >   static inline void
> > >   io_pgtable_tlb_flush_leaf(struct io_pgtable *iop, unsigned long iova,
> > >                         size_t size, size_t granule)
> > >   {
> > > -     iop->cfg.tlb->tlb_flush_leaf(iova, size, granule, iop->cookie);
> > > +     if (iop->cfg.tlb)
> > > +             iop->cfg.tlb->tlb_flush_leaf(iova, size, granule, iop->cookie);
> > >   }
> > >
> > >   static inline void
> > > @@ -232,7 +235,7 @@ io_pgtable_tlb_add_page(struct io_pgtable *iop,
> > >                       struct iommu_iotlb_gather * gather, unsigned long iova,
> > >                       size_t granule)
> > >   {
> > > -     if (iop->cfg.tlb->tlb_add_page)
> > > +     if (iop->cfg.tlb && iop->cfg.tlb->tlb_add_page)
> > >               iop->cfg.tlb->tlb_add_page(gather, iova, granule, iop->cookie);
> > >   }
> > >
> > >


* Re: [Freedreno] [PATCH v2 1/6] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2
  2020-07-07 12:34   ` Robin Murphy
@ 2020-07-07 15:09     ` Rob Clark
  2020-07-13 17:35       ` Jordan Crouse
  0 siblings, 1 reply; 21+ messages in thread
From: Rob Clark @ 2020-07-07 15:09 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jordan Crouse, linux-arm-msm, Will Deacon,
	Linux Kernel Mailing List,
	IOMMU DRIVERS <iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,
	John Stultz, freedreno,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

On Tue, Jul 7, 2020 at 5:34 AM Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2020-06-26 21:04, Jordan Crouse wrote:
> > Support auxiliary domains for arm-smmu-v2 to initialize and support
> > multiple pagetables for a single SMMU context bank. Since the smmu-v2
> > hardware doesn't have any built in support for switching the pagetable
> > base it is left as an exercise to the caller to actually use the pagetable.
>
> Hmm, I've still been thinking that we could model this as supporting
> exactly 1 aux domain iff the device is currently attached to a primary
> domain with TTBR1 enabled. Then supporting multiple aux domains with
> magic TTBR0 switching is the Adreno-specific extension on top of that.
>
> And if we don't want to go to that length, then - as I think Will was
> getting at - I'm not sure it's worth bothering at all. There doesn't
> seem to be any point in half-implementing a pretend aux domain interface
> while still driving a bus through the rest of the abstractions - it's
> really the worst of both worlds. If we're going to hand over the guts of
> io-pgtable to the GPU driver then couldn't it just use
> DOMAIN_ATTR_PGTABLE_CFG bidirectionally to inject a TTBR0 table straight
> into the TTBR1-ified domain?

So, something along the lines of:

1) qcom_adreno_smmu_impl somehow tells core arms-smmu that we want
   to use TTBR1 instead of TTBR0

2) gpu driver uses iommu_domain_get_attr(PGTABLE_CFG) to snapshot
   the initial pgtable cfg.  (Btw, I kinda feel like we should add
   io_pgtable_fmt to io_pgtable_cfg to make it self-contained.)

3) gpu driver constructs pgtable_ops for TTBR0, and then kicks
   arm-smmu to do the initial setup to enable TTBR0 with
   iommu_domain_set_attr(PGTABLE_CFG, &ttbr0_pgtable_cfg)

if I understood you properly, that sounds simpler.

> Much as I like the idea of the aux domain abstraction and making this
> fit semi-transparently into the IOMMU API, if an almost entirely private
> interface will be the simplest and cleanest way to get it done then at
> this point also I'm starting to lean towards just getting it done. But
> if some other mediated-device type case then turns up that doesn't quite
> fit that private interface, we revisit the proper abstraction again and
> I reserve the right to say "I told you so" ;)

I'm on board with not trying to design this too generically until
there is a second user.

BR,
-R


>
> Robin.
>
> > Aux domains are supported if split pagetable (TTBR1) support has been
> > enabled on the master domain.  Each auxiliary domain will reuse the
> > configuration of the master domain. By default a domain with TTBR1
> > support will have the TTBR0 region disabled so the first attached aux
> > domain will enable the TTBR0 region in the hardware and conversely the
> > last domain to be detached will disable TTBR0 translations.  All subsequent
> > auxiliary domains create a pagetable but do not touch the hardware.
> >
> > The leaf driver will be able to query the physical address of the
> > pagetable with the DOMAIN_ATTR_PTBASE attribute so that it can use the
> > address with whatever means it has to switch the pagetable base.
> >
> > Following is a pseudo code example of how a domain can be created
> >
> >   /* Check to see if aux domains are supported */
> >   if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
> >        iommu = iommu_domain_alloc(...);
> >
> >        if (iommu_aux_attach_device(domain, dev))
> >                return FAIL;
> >
> >       /* Save the base address of the pagetable for use by the driver */
> >       iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
> >   }
> >
> > Then 'domain' can be used like any other iommu domain to map and
> > unmap iova addresses in the pagetable.
> >
> > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > ---
> >
> >   drivers/iommu/arm-smmu.c | 219 ++++++++++++++++++++++++++++++++++++---
> >   drivers/iommu/arm-smmu.h |   1 +
> >   2 files changed, 204 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index 060139452c54..ce6d654301bf 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -91,6 +91,7 @@ struct arm_smmu_cb {
> >       u32                             tcr[2];
> >       u32                             mair[2];
> >       struct arm_smmu_cfg             *cfg;
> > +     atomic_t                        aux;
> >   };
> >
> >   struct arm_smmu_master_cfg {
> > @@ -667,6 +668,86 @@ static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
> >       arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
> >   }
> >
> > +/*
> > + * Update the context bank to enable TTBR0. Assumes AARCH64 S1
> > + * configuration.
> > + */
> > +static void arm_smmu_context_set_ttbr0(struct arm_smmu_cb *cb,
> > +             struct io_pgtable_cfg *pgtbl_cfg)
> > +{
> > +     u32 tcr = cb->tcr[0];
> > +
> > +     /* Add the TCR configuration from the new pagetable config */
> > +     tcr |= arm_smmu_lpae_tcr(pgtbl_cfg);
> > +
> > +     /* Make sure that both TTBR0 and TTBR1 are enabled */
> > +     tcr &= ~(ARM_SMMU_TCR_EPD0 | ARM_SMMU_TCR_EPD1);
> > +
> > +     /* Update the TCR register */
> > +     cb->tcr[0] = tcr;
> > +
> > +     /* Program the new TTBR0 */
> > +     cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
> > +     cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
> > +}
> > +
> > +/*
> > + * This function assumes that the current model only allows aux domains for
> > + * AARCH64 S1 configurations
> > + */
> > +static int arm_smmu_aux_init_domain_context(struct iommu_domain *domain,
> > +             struct arm_smmu_device *smmu, struct arm_smmu_cfg *master)
> > +{
> > +     struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > +     struct io_pgtable_ops *pgtbl_ops;
> > +     struct io_pgtable_cfg pgtbl_cfg;
> > +
> > +     mutex_lock(&smmu_domain->init_mutex);
> > +
> > +     /* Copy the configuration from the master */
> > +     memcpy(&smmu_domain->cfg, master, sizeof(smmu_domain->cfg));
> > +
> > +     smmu_domain->flush_ops = &arm_smmu_s1_tlb_ops;
> > +     smmu_domain->smmu = smmu;
> > +
> > +     pgtbl_cfg = (struct io_pgtable_cfg) {
> > +             .pgsize_bitmap = smmu->pgsize_bitmap,
> > +             .ias = smmu->va_size,
> > +             .oas = smmu->ipa_size,
> > +             .coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENT_WALK,
> > +             .tlb = smmu_domain->flush_ops,
> > +             .iommu_dev = smmu->dev,
> > +             .quirks = 0,
> > +     };
> > +
> > +     if (smmu_domain->non_strict)
> > +             pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> > +
> > +     pgtbl_ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1, &pgtbl_cfg,
> > +             smmu_domain);
> > +     if (!pgtbl_ops) {
> > +             mutex_unlock(&smmu_domain->init_mutex);
> > +             return -ENOMEM;
> > +     }
> > +
> > +     domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> > +
> > +     domain->geometry.aperture_end = (1UL << smmu->va_size) - 1;
> > +     domain->geometry.force_aperture = true;
> > +
> > +     /* enable TTBR0 when the first aux domain is attached */
> > +     if (atomic_inc_return(&smmu->cbs[master->cbndx].aux) == 1) {
> > +             arm_smmu_context_set_ttbr0(&smmu->cbs[master->cbndx],
> > +                     &pgtbl_cfg);
> > +             arm_smmu_write_context_bank(smmu, master->cbndx);
> > +     }
> > +
> > +     smmu_domain->pgtbl_ops = pgtbl_ops;
> > +     mutex_unlock(&smmu_domain->init_mutex);
> > +
> > +     return 0;
> > +}
> > +
> >   static int arm_smmu_init_domain_context(struct iommu_domain *domain,
> >                                       struct arm_smmu_device *smmu,
> >                                       struct device *dev)
> > @@ -871,36 +952,70 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
> >       return ret;
> >   }
> >
> > +static void
> > +arm_smmu_destroy_aux_domain_context(struct arm_smmu_domain *smmu_domain)
> > +{
> > +     struct arm_smmu_device *smmu = smmu_domain->smmu;
> > +     struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> > +     int ret;
> > +
> > +     /*
> > +      * If this is the last aux domain to be freed, disable TTBR0 by turning
> > +      * off translations and clearing TTBR0
> > +      */
> > +     if (atomic_dec_return(&smmu->cbs[cfg->cbndx].aux) == 0) {
> > +             /* Clear out the T0 region */
> > +             smmu->cbs[cfg->cbndx].tcr[0] &= ~GENMASK(15, 0);
> > +             /* Disable TTBR0 translations */
> > +             smmu->cbs[cfg->cbndx].tcr[0] |= ARM_SMMU_TCR_EPD0;
> > +             /* Clear the TTBR0 pagetable address */
> > +             smmu->cbs[cfg->cbndx].ttbr[0] =
> > +                     FIELD_PREP(ARM_SMMU_TTBRn_ASID, cfg->asid);
> > +
> > +             ret = arm_smmu_rpm_get(smmu);
> > +             if (!ret) {
> > +                     arm_smmu_write_context_bank(smmu, cfg->cbndx);
> > +                     arm_smmu_rpm_put(smmu);
> > +             }
> > +     }
> > +
> > +}
> > +
> >   static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
> >   {
> >       struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> >       struct arm_smmu_device *smmu = smmu_domain->smmu;
> >       struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> > -     int ret, irq;
> >
> >       if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)
> >               return;
> >
> > -     ret = arm_smmu_rpm_get(smmu);
> > -     if (ret < 0)
> > -             return;
> > +     if (smmu_domain->aux)
> > +             arm_smmu_destroy_aux_domain_context(smmu_domain);
> >
> > -     /*
> > -      * Disable the context bank and free the page tables before freeing
> > -      * it.
> > -      */
> > -     smmu->cbs[cfg->cbndx].cfg = NULL;
> > -     arm_smmu_write_context_bank(smmu, cfg->cbndx);
> > +     /* Check if the last user is done with the context bank */
> > +     if (atomic_read(&smmu->cbs[cfg->cbndx].aux) == 0) {
> > +             int ret = arm_smmu_rpm_get(smmu);
> > +             int irq;
> >
> > -     if (cfg->irptndx != ARM_SMMU_INVALID_IRPTNDX) {
> > -             irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
> > -             devm_free_irq(smmu->dev, irq, domain);
> > +             if (ret < 0)
> > +                     return;
> > +
> > +             /* Disable the context bank */
> > +             smmu->cbs[cfg->cbndx].cfg = NULL;
> > +             arm_smmu_write_context_bank(smmu, cfg->cbndx);
> > +
> > +             if (cfg->irptndx != ARM_SMMU_INVALID_IRPTNDX) {
> > +                     irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
> > +                     devm_free_irq(smmu->dev, irq, domain);
> > +             }
> > +
> > +             __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
> > +             arm_smmu_rpm_put(smmu);
> >       }
> >
> > +     /* Destroy the pagetable */
> >       free_io_pgtable_ops(smmu_domain->pgtbl_ops);
> > -     __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
> > -
> > -     arm_smmu_rpm_put(smmu);
> >   }
> >
> >   static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
> > @@ -1161,6 +1276,74 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
> >       return 0;
> >   }
> >
> > +static bool arm_smmu_dev_has_feat(struct device *dev,
> > +             enum iommu_dev_features feat)
> > +{
> > +     if (feat != IOMMU_DEV_FEAT_AUX)
> > +             return false;
> > +
> > +     return true;
> > +}
> > +
> > +static int arm_smmu_dev_enable_feat(struct device *dev,
> > +             enum iommu_dev_features feat)
> > +{
> > +     /* aux domain support is always available */
> > +     if (feat == IOMMU_DEV_FEAT_AUX)
> > +             return 0;
> > +
> > +     return -ENODEV;
> > +}
> > +
> > +static int arm_smmu_dev_disable_feat(struct device *dev,
> > +             enum iommu_dev_features feat)
> > +{
> > +     return -EBUSY;
> > +}
> > +
> > +static int arm_smmu_aux_attach_dev(struct iommu_domain *domain,
> > +             struct device *dev)
> > +{
> > +     struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> > +     struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
> > +     struct arm_smmu_device *smmu = cfg->smmu;
> > +     struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > +     struct arm_smmu_cb *cb;
> > +     int idx, i, ret, cbndx = -1;
> > +
> > +     /* Try to find the context bank configured for this device */
> > +     for_each_cfg_sme(cfg, fwspec, i, idx) {
> > +             if (idx != INVALID_SMENDX) {
> > +                     cbndx = smmu->s2crs[idx].cbndx;
> > +                     break;
> > +             }
> > +     }
> > +
> > +     if (cbndx == -1)
> > +             return -ENODEV;
> > +
> > +     cb = &smmu->cbs[cbndx];
> > +
> > +     /* Aux domains are only supported for AARCH64 configurations */
> > +     if (cb->cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64)
> > +             return -EINVAL;
> > +
> > +     /* Make sure that TTBR1 is enabled in the hardware */
> > +     if ((cb->tcr[0] & ARM_SMMU_TCR_EPD1))
> > +             return -EINVAL;
> > +
> > +     smmu_domain->aux = true;
> > +
> > +     ret = arm_smmu_rpm_get(smmu);
> > +     if (ret < 0)
> > +             return ret;
> > +
> > +     ret = arm_smmu_aux_init_domain_context(domain, smmu, cb->cfg);
> > +
> > +     arm_smmu_rpm_put(smmu);
> > +     return ret;
> > +}
> > +
> >   static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> >   {
> >       struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > @@ -1653,6 +1836,10 @@ static struct iommu_ops arm_smmu_ops = {
> >       .get_resv_regions       = arm_smmu_get_resv_regions,
> >       .put_resv_regions       = generic_iommu_put_resv_regions,
> >       .def_domain_type        = arm_smmu_def_domain_type,
> > +     .dev_has_feat           = arm_smmu_dev_has_feat,
> > +     .dev_enable_feat        = arm_smmu_dev_enable_feat,
> > +     .dev_disable_feat       = arm_smmu_dev_disable_feat,
> > +     .aux_attach_dev         = arm_smmu_aux_attach_dev,
> >       .pgsize_bitmap          = -1UL, /* Restricted during device attach */
> >   };
> >
> > diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> > index c417814f1d98..79d441024043 100644
> > --- a/drivers/iommu/arm-smmu.h
> > +++ b/drivers/iommu/arm-smmu.h
> > @@ -346,6 +346,7 @@ struct arm_smmu_domain {
> >       spinlock_t                      cb_lock; /* Serialises ATS1* ops and TLB syncs */
> >       struct iommu_domain             domain;
> >       struct device                   *dev;   /* Device attached to this domain */
> > +     bool                            aux;
> >   };
> >
> >   static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
> >


* Re: [Freedreno] [PATCH v2 2/6] iommu/io-pgtable: Allow a pgtable implementation to skip TLB operations
  2020-07-07 14:58       ` Rob Clark
@ 2020-07-08 19:19         ` Jordan Crouse
  0 siblings, 0 replies; 21+ messages in thread
From: Jordan Crouse @ 2020-07-08 19:19 UTC (permalink / raw)
  To: Rob Clark
  Cc: Robin Murphy, linux-arm-msm, Sai Prakash Ranjan, Joerg Roedel,
	Will Deacon, Linux Kernel Mailing List,
	IOMMU DRIVERS <iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,
	John Stultz, freedreno, Yong Wu

On Tue, Jul 07, 2020 at 07:58:18AM -0700, Rob Clark wrote:
> On Tue, Jul 7, 2020 at 7:25 AM Rob Clark <robdclark@gmail.com> wrote:
> >
> > On Tue, Jul 7, 2020 at 4:34 AM Robin Murphy <robin.murphy@arm.com> wrote:
> > >
> > > On 2020-06-26 21:04, Jordan Crouse wrote:
> > > > Allow an io-pgtable implementation to skip TLB operations by checking for
> > > > NULL pointers in the helper functions. It will be up to the owner
> > > > of the io-pgtable instance to make sure that they independently handle
> > > > the TLB correctly.
> > >
> > > I don't really understand what this is for - tricking the IOMMU driver
> > > into not performing its TLB maintenance at points when that maintenance
> > > has been deemed necessary doesn't seem like the appropriate way to
> > > achieve anything good :/
> >
> > No, for triggering the io-pgtable helpers into not performing TLB
> > maintenance.  But seriously, since we are creating pgtables ourselves,
> > and we don't want to be ioremap'ing the GPU's SMMU instance, the
> > alternative is plugging in no-op helpers.  Which amounts to the same
> > thing.
> 
> Hmm, that said, since we are just memcpy'ing the io_pgtable_cfg from
> arm-smmu, it will already be populated with arm-smmu's fxn ptrs.  I
> guess we could maybe make it work without no-op helpers, although in
> that case it looks like we need to fix something about aux-domain vs
> tlb helpers:

I had a change that handled these correctly but I abandoned it because the
TLB functions didn't kick the power and I didn't think that would be desirable
at the generic level for performance reasons. Since the GPU SMMU is on the same
power domain as the GMU we could enable it in the GPU driver before calling
the TLB operations but we would need to be clever about it to prevent bringing
up the GMU just to unmap memory.

Jordan

> [  +0.004373] Unable to handle kernel NULL pointer dereference at
> virtual address 0000000000000019
> [  +0.004086] Mem abort info:
> [  +0.004319]   ESR = 0x96000004
> [  +0.003462]   EC = 0x25: DABT (current EL), IL = 32 bits
> [  +0.003494]   SET = 0, FnV = 0
> [  +0.002812]   EA = 0, S1PTW = 0
> [  +0.002873] Data abort info:
> [  +0.003031]   ISV = 0, ISS = 0x00000004
> [  +0.003785]   CM = 0, WnR = 0
> [  +0.003641] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000261d65000
> [  +0.003383] [0000000000000019] pgd=0000000000000000, p4d=0000000000000000
> [  +0.003715] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> [  +0.002744] Modules linked in: xt_CHECKSUM xt_MASQUERADE
> xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle
> ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack
> nf_defrag_ipv4 libcrc32c bridge stp llc ip6table_filter ip6_tables
> iptable_filter ax88179_178a usbnet uvcvideo videobuf2_vmalloc
> videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc
> hid_multitouch i2c_hid some_battery ti_sn65dsi86 hci_uart btqca btbcm
> qcom_spmi_adc5 bluetooth qcom_spmi_temp_alarm qcom_vadc_common
> ecdh_generic ecc snd_soc_sdm845 snd_soc_rt5663 snd_soc_qcom_common
> ath10k_snoc ath10k_core crct10dif_ce ath mac80211 snd_soc_rl6231
> soundwire_bus i2c_qcom_geni libarc4 qcom_rng msm phy_qcom_qusb2
> reset_qcom_pdc drm_kms_helper cfg80211 rfkill qcom_q6v5_mss
> qcom_q6v5_ipa_notify socinfo qrtr ns panel_simple qcom_q6v5_pas
> qcom_common qcom_glink_smem slim_qcom_ngd_ctrl qcom_sysmon drm
> qcom_q6v5 slimbus qmi_helpers qcom_wdt mdt_loader rmtfs_mem be2iscsi
> bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio
> [  +0.000139]  libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp
> libiscsi_tcp libiscsi scsi_transport_iscsi fuse ip_tables x_tables
> ipv6 nf_defrag_ipv6
> [  +0.020933] CPU: 3 PID: 168 Comm: kworker/u16:7 Not tainted
> 5.8.0-rc1-c630+ #31
> [  +0.003828] Hardware name: LENOVO 81JL/LNVNB161216, BIOS
> 9UCN33WW(V2.06) 06/ 4/2019
> [  +0.004039] Workqueue: msm msm_gem_free_work [msm]
> [  +0.003885] pstate: 60c00005 (nZCv daif +PAN +UAO BTYPE=--)
> [  +0.003859] pc : arm_smmu_tlb_inv_range_s1+0x30/0x148
> [  +0.003742] lr : arm_smmu_tlb_add_page_s1+0x1c/0x28
> [  +0.003887] sp : ffff800011cdb970
> [  +0.003868] x29: ffff800011cdb970 x28: 0000000000000003
> [  +0.003930] x27: ffff0001f1882f80 x26: 0000000000000001
> [  +0.003886] x25: 0000000000000003 x24: 0000000000000620
> [  +0.003932] x23: 0000000000000000 x22: 0000000000001000
> [  +0.003886] x21: 0000000000001000 x20: ffff0001cf857300
> [  +0.003916] x19: 0000000000000001 x18: 00000000ffffffff
> [  +0.003921] x17: ffffd9e6a24ae0e8 x16: 0000000000012577
> [  +0.003843] x15: 0000000000012578 x14: 0000000000000000
> [  +0.003884] x13: 0000000000012574 x12: ffffd9e6a2550180
> [  +0.003834] x11: 0000000000083f80 x10: 0000000000000000
> [  +0.003889] x9 : 0000000000000000 x8 : ffff0001f1882f80
> [  +0.003812] x7 : 0000000000000001 x6 : 0000000000000048
> [  +0.003807] x5 : ffff0001c86e1000 x4 : 0000000000000620
> [  +0.003802] x3 : ffff0001ddb57700 x2 : 0000000000001000
> [  +0.003809] x1 : 0000000000001000 x0 : 0000000101048000
> [  +0.003768] Call trace:
> [  +0.003665]  arm_smmu_tlb_inv_range_s1+0x30/0x148
> [  +0.003769]  arm_smmu_tlb_add_page_s1+0x1c/0x28
> [  +0.003760]  __arm_lpae_unmap+0x3c4/0x498
> [  +0.003821]  __arm_lpae_unmap+0xfc/0x498
> [  +0.003693]  __arm_lpae_unmap+0xfc/0x498
> [  +0.003704]  __arm_lpae_unmap+0xfc/0x498
> [  +0.003608]  arm_lpae_unmap+0x60/0x78
> [  +0.003653]  msm_iommu_pagetable_unmap+0x5c/0xa0 [msm]
> [  +0.003711]  msm_gem_purge_vma+0x48/0x70 [msm]
> [  +0.003716]  put_iova+0x68/0xc8 [msm]
> [  +0.003792]  msm_gem_free_work+0x118/0x190 [msm]
> [  +0.003739]  process_one_work+0x28c/0x6e8
> [  +0.003595]  worker_thread+0x4c/0x420
> [  +0.003546]  kthread+0x148/0x168
> [  +0.003675]  ret_from_fork+0x10/0x1c
> [  +0.003596] Code: 2a0403f8 a9046bf9 f9400073 39406077 (b9401a61)
> 
> BR,
> -R
> 
> >
> > Currently (in a later patch in the series) we are using
> > iommu_flush_tlb_all() when unmapping, which is a bit of a big hammer.
> > Although I think we could be a bit more clever and do the TLB ops on
> > the GPU (since the GPU knows if pagetables we are unmapping from are
> > in-use and could skip the TLB ops otherwise).
> >
> > On the topic, if we are using unique ASID values per set of
> > pagetables, how expensive is tlb invalidate for an ASID that has no
> > entries in the TLB?
> >
> > BR,
> > -R
> >
> > >
> > > Robin.
> > >
> > > > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > > > ---
> > > >
> > > >   include/linux/io-pgtable.h | 11 +++++++----
> > > >   1 file changed, 7 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
> > > > index 53d53c6c2be9..bbed1d3925ba 100644
> > > > --- a/include/linux/io-pgtable.h
> > > > +++ b/include/linux/io-pgtable.h
> > > > @@ -210,21 +210,24 @@ struct io_pgtable {
> > > >
> > > >   static inline void io_pgtable_tlb_flush_all(struct io_pgtable *iop)
> > > >   {
> > > > -     iop->cfg.tlb->tlb_flush_all(iop->cookie);
> > > > +     if (iop->cfg.tlb)
> > > > +             iop->cfg.tlb->tlb_flush_all(iop->cookie);
> > > >   }
> > > >
> > > >   static inline void
> > > >   io_pgtable_tlb_flush_walk(struct io_pgtable *iop, unsigned long iova,
> > > >                         size_t size, size_t granule)
> > > >   {
> > > > -     iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
> > > > +     if (iop->cfg.tlb)
> > > > +             iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
> > > >   }
> > > >
> > > >   static inline void
> > > >   io_pgtable_tlb_flush_leaf(struct io_pgtable *iop, unsigned long iova,
> > > >                         size_t size, size_t granule)
> > > >   {
> > > > -     iop->cfg.tlb->tlb_flush_leaf(iova, size, granule, iop->cookie);
> > > > +     if (iop->cfg.tlb)
> > > > +             iop->cfg.tlb->tlb_flush_leaf(iova, size, granule, iop->cookie);
> > > >   }
> > > >
> > > >   static inline void
> > > > @@ -232,7 +235,7 @@ io_pgtable_tlb_add_page(struct io_pgtable *iop,
> > > >                       struct iommu_iotlb_gather * gather, unsigned long iova,
> > > >                       size_t granule)
> > > >   {
> > > > -     if (iop->cfg.tlb->tlb_add_page)
> > > > +     if (iop->cfg.tlb && iop->cfg.tlb->tlb_add_page)
> > > >               iop->cfg.tlb->tlb_add_page(gather, iova, granule, iop->cookie);
> > > >   }
> > > >
> > > >
> > > _______________________________________________
> > > Freedreno mailing list
> > > Freedreno@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/freedreno
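[Editor's note: the patch quoted above makes every io-pgtable TLB hook tolerate `cfg.tlb == NULL`, so a caller such as drm/msm can opt out of the SMMU driver's invalidation and issue its own. The dispatch pattern can be sketched in plain userspace C as follows; the type and function names here are illustrative stand-ins, not the kernel's.]

```c
#include <stddef.h>

/* Stand-in for struct iommu_flush_ops */
struct tlb_ops {
	void (*flush_all)(void *cookie);
};

/* Stand-in for struct io_pgtable, with a counter for instrumentation */
struct pgtable {
	const struct tlb_ops *tlb;
	void *cookie;
	int flushes;
};

static void count_flush(void *cookie)
{
	((struct pgtable *)cookie)->flushes++;
}

static const struct tlb_ops counting_ops = { .flush_all = count_flush };

/* Mirrors io_pgtable_tlb_flush_all() after the patch: with tlb == NULL
 * this is a no-op and the caller is responsible for invalidation. */
static void pgtable_flush_all(struct pgtable *p)
{
	if (p->tlb && p->tlb->flush_all)
		p->tlb->flush_all(p->cookie);
}

/* How many flushes an unmap path would trigger, with or without ops */
static int flushes_after_unmap(int with_ops)
{
	struct pgtable p = { .tlb = with_ops ? &counting_ops : NULL,
			     .flushes = 0 };
	p.cookie = &p;
	pgtable_flush_all(&p);
	return p.flushes;
}
```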

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 4/6] drm/msm: Add support to create a local pagetable
  2020-07-07 11:36   ` Robin Murphy
  2020-07-07 14:41     ` [Freedreno] " Rob Clark
@ 2020-07-08 19:35     ` Jordan Crouse
  1 sibling, 0 replies; 21+ messages in thread
From: Jordan Crouse @ 2020-07-08 19:35 UTC (permalink / raw)
  To: Robin Murphy
  Cc: linux-arm-msm, David Airlie, Sean Paul, dri-devel, linux-kernel,
	iommu, John Stultz, Daniel Vetter, freedreno

On Tue, Jul 07, 2020 at 12:36:42PM +0100, Robin Murphy wrote:
> On 2020-06-26 21:04, Jordan Crouse wrote:
> >Add support to create an io-pgtable for use by targets that support
> >per-instance pagetables.  In order to support per-instance pagetables the
> >GPU SMMU device needs to have the qcom,adreno-smmu compatible string and
> >split pagetables and auxiliary domains need to be supported and enabled.
> >
> >Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> >---
> >
> >  drivers/gpu/drm/msm/msm_gpummu.c |   2 +-
> >  drivers/gpu/drm/msm/msm_iommu.c  | 180 ++++++++++++++++++++++++++++++-
> >  drivers/gpu/drm/msm/msm_mmu.h    |  16 ++-
> >  3 files changed, 195 insertions(+), 3 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/msm/msm_gpummu.c b/drivers/gpu/drm/msm/msm_gpummu.c
> >index 310a31b05faa..aab121f4beb7 100644
> >--- a/drivers/gpu/drm/msm/msm_gpummu.c
> >+++ b/drivers/gpu/drm/msm/msm_gpummu.c
> >@@ -102,7 +102,7 @@ struct msm_mmu *msm_gpummu_new(struct device *dev, struct msm_gpu *gpu)
> >  	}
> >  	gpummu->gpu = gpu;
> >-	msm_mmu_init(&gpummu->base, dev, &funcs);
> >+	msm_mmu_init(&gpummu->base, dev, &funcs, MSM_MMU_GPUMMU);
> >  	return &gpummu->base;
> >  }
> >diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
> >index 1b6635504069..f455c597f76d 100644
> >--- a/drivers/gpu/drm/msm/msm_iommu.c
> >+++ b/drivers/gpu/drm/msm/msm_iommu.c
> >@@ -4,15 +4,192 @@
> >   * Author: Rob Clark <robdclark@gmail.com>
> >   */
> >+#include <linux/io-pgtable.h>
> >  #include "msm_drv.h"
> >  #include "msm_mmu.h"
> >  struct msm_iommu {
> >  	struct msm_mmu base;
> >  	struct iommu_domain *domain;
> >+	struct iommu_domain *aux_domain;
> >  };
> >+
> >  #define to_msm_iommu(x) container_of(x, struct msm_iommu, base)
> >+struct msm_iommu_pagetable {
> >+	struct msm_mmu base;
> >+	struct msm_mmu *parent;
> >+	struct io_pgtable_ops *pgtbl_ops;
> >+	phys_addr_t ttbr;
> >+	u32 asid;
> >+};
> >+
> >+static struct msm_iommu_pagetable *to_pagetable(struct msm_mmu *mmu)
> >+{
> >+	return container_of(mmu, struct msm_iommu_pagetable, base);
> >+}
> >+
> >+static int msm_iommu_pagetable_unmap(struct msm_mmu *mmu, u64 iova,
> >+		size_t size)
> >+{
> >+	struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
> >+	struct io_pgtable_ops *ops = pagetable->pgtbl_ops;
> >+	size_t unmapped = 0;
> >+
> >+	/* Unmap the block one page at a time */
> >+	while (size) {
> >+		unmapped += ops->unmap(ops, iova, 4096, NULL);
> >+		iova += 4096;
> >+		size -= 4096;
> >+	}
> >+
> >+	iommu_flush_tlb_all(to_msm_iommu(pagetable->parent)->domain);
> >+
> >+	return (unmapped == size) ? 0 : -EINVAL;
> >+}
> 
> Remember in patch #1 when you said "Then 'domain' can be used like any other
> iommu domain to map and unmap iova addresses in the pagetable."?
> 
> This appears to be very much not that :/
 
The code changed but the commit log stayed the same. I'll reword it.

Jordan

> Robin.
> 
> >+
> >+static int msm_iommu_pagetable_map(struct msm_mmu *mmu, u64 iova,
> >+		struct sg_table *sgt, size_t len, int prot)
> >+{
> >+	struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
> >+	struct io_pgtable_ops *ops = pagetable->pgtbl_ops;
> >+	struct scatterlist *sg;
> >+	size_t mapped = 0;
> >+	u64 addr = iova;
> >+	unsigned int i;
> >+
> >+	for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> >+		size_t size = sg->length;
> >+		phys_addr_t phys = sg_phys(sg);
> >+
> >+		/* Map the block one page at a time */
> >+		while (size) {
> >+			if (ops->map(ops, addr, phys, 4096, prot)) {
> >+				msm_iommu_pagetable_unmap(mmu, iova, mapped);
> >+				return -EINVAL;
> >+			}
> >+
> >+			phys += 4096;
> >+			addr += 4096;
> >+			size -= 4096;
> >+			mapped += 4096;
> >+		}
> >+	}
> >+
> >+	return 0;
> >+}
> >+
> >+static void msm_iommu_pagetable_destroy(struct msm_mmu *mmu)
> >+{
> >+	struct msm_iommu_pagetable *pagetable = to_pagetable(mmu);
> >+
> >+	free_io_pgtable_ops(pagetable->pgtbl_ops);
> >+	kfree(pagetable);
> >+}
> >+
> >+/*
> >+ * Given a parent device, create and return an aux domain. This will enable the
> >+ * TTBR0 region
> >+ */
> >+static struct iommu_domain *msm_iommu_get_aux_domain(struct msm_mmu *parent)
> >+{
> >+	struct msm_iommu *iommu = to_msm_iommu(parent);
> >+	struct iommu_domain *domain;
> >+	int ret;
> >+
> >+	if (iommu->aux_domain)
> >+		return iommu->aux_domain;
> >+
> >+	if (!iommu_dev_has_feature(parent->dev, IOMMU_DEV_FEAT_AUX))
> >+		return ERR_PTR(-ENODEV);
> >+
> >+	domain = iommu_domain_alloc(&platform_bus_type);
> >+	if (!domain)
> >+		return ERR_PTR(-ENODEV);
> >+
> >+	ret = iommu_aux_attach_device(domain, parent->dev);
> >+	if (ret) {
> >+		iommu_domain_free(domain);
> >+		return ERR_PTR(ret);
> >+	}
> >+
> >+	iommu->aux_domain = domain;
> >+	return domain;
> >+}
> >+
> >+int msm_iommu_pagetable_params(struct msm_mmu *mmu,
> >+		phys_addr_t *ttbr, int *asid)
> >+{
> >+	struct msm_iommu_pagetable *pagetable;
> >+
> >+	if (mmu->type != MSM_MMU_IOMMU_PAGETABLE)
> >+		return -EINVAL;
> >+
> >+	pagetable = to_pagetable(mmu);
> >+
> >+	if (ttbr)
> >+		*ttbr = pagetable->ttbr;
> >+
> >+	if (asid)
> >+		*asid = pagetable->asid;
> >+
> >+	return 0;
> >+}
> >+
> >+static const struct msm_mmu_funcs pagetable_funcs = {
> >+		.map = msm_iommu_pagetable_map,
> >+		.unmap = msm_iommu_pagetable_unmap,
> >+		.destroy = msm_iommu_pagetable_destroy,
> >+};
> >+
> >+struct msm_mmu *msm_iommu_pagetable_create(struct msm_mmu *parent)
> >+{
> >+	static int next_asid = 16;
> >+	struct msm_iommu_pagetable *pagetable;
> >+	struct iommu_domain *aux_domain;
> >+	struct io_pgtable_cfg cfg;
> >+	int ret;
> >+
> >+	/* Make sure that the parent has an aux domain attached */
> >+	aux_domain = msm_iommu_get_aux_domain(parent);
> >+	if (IS_ERR(aux_domain))
> >+		return ERR_CAST(aux_domain);
> >+
> >+	/* Get the pagetable configuration from the aux domain */
> >+	ret = iommu_domain_get_attr(aux_domain, DOMAIN_ATTR_PGTABLE_CFG, &cfg);
> >+	if (ret)
> >+		return ERR_PTR(ret);
> >+
> >+	pagetable = kzalloc(sizeof(*pagetable), GFP_KERNEL);
> >+	if (!pagetable)
> >+		return ERR_PTR(-ENOMEM);
> >+
> >+	msm_mmu_init(&pagetable->base, parent->dev, &pagetable_funcs,
> >+		MSM_MMU_IOMMU_PAGETABLE);
> >+
> >+	cfg.tlb = NULL;
> >+
> >+	pagetable->pgtbl_ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1,
> >+		&cfg, aux_domain);
> >+
> >+	if (!pagetable->pgtbl_ops) {
> >+		kfree(pagetable);
> >+		return ERR_PTR(-ENOMEM);
> >+	}
> >+
> >+
> >+	/* Needed later for TLB flush */
> >+	pagetable->parent = parent;
> >+	pagetable->ttbr = cfg.arm_lpae_s1_cfg.ttbr;
> >+
> >+	pagetable->asid = next_asid;
> >+	next_asid = (next_asid + 1)  % 255;
> >+	if (next_asid < 16)
> >+		next_asid = 16;
> >+
> >+	return &pagetable->base;
> >+}
> >+
> >  static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
> >  		unsigned long iova, int flags, void *arg)
> >  {
> >@@ -40,6 +217,7 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
> >  	if (iova & BIT_ULL(48))
> >  		iova |= GENMASK_ULL(63, 49);
> >+
> >  	ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, prot);
> >  	WARN_ON(!ret);
> >@@ -85,7 +263,7 @@ struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain)
> >  		return ERR_PTR(-ENOMEM);
> >  	iommu->domain = domain;
> >-	msm_mmu_init(&iommu->base, dev, &funcs);
> >+	msm_mmu_init(&iommu->base, dev, &funcs, MSM_MMU_IOMMU);
> >  	iommu_set_fault_handler(domain, msm_fault_handler, iommu);
> >  	ret = iommu_attach_device(iommu->domain, dev);
> >diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
> >index 3a534ee59bf6..61ade89d9e48 100644
> >--- a/drivers/gpu/drm/msm/msm_mmu.h
> >+++ b/drivers/gpu/drm/msm/msm_mmu.h
> >@@ -17,18 +17,26 @@ struct msm_mmu_funcs {
> >  	void (*destroy)(struct msm_mmu *mmu);
> >  };
> >+enum msm_mmu_type {
> >+	MSM_MMU_GPUMMU,
> >+	MSM_MMU_IOMMU,
> >+	MSM_MMU_IOMMU_PAGETABLE,
> >+};
> >+
> >  struct msm_mmu {
> >  	const struct msm_mmu_funcs *funcs;
> >  	struct device *dev;
> >  	int (*handler)(void *arg, unsigned long iova, int flags);
> >  	void *arg;
> >+	enum msm_mmu_type type;
> >  };
> >  static inline void msm_mmu_init(struct msm_mmu *mmu, struct device *dev,
> >-		const struct msm_mmu_funcs *funcs)
> >+		const struct msm_mmu_funcs *funcs, enum msm_mmu_type type)
> >  {
> >  	mmu->dev = dev;
> >  	mmu->funcs = funcs;
> >+	mmu->type = type;
> >  }
> >  struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain);
> >@@ -41,7 +49,13 @@ static inline void msm_mmu_set_fault_handler(struct msm_mmu *mmu, void *arg,
> >  	mmu->handler = handler;
> >  }
> >+struct msm_mmu *msm_iommu_pagetable_create(struct msm_mmu *parent);
> >+
> >  void msm_gpummu_params(struct msm_mmu *mmu, dma_addr_t *pt_base,
> >  		dma_addr_t *tran_error);
> >+
> >+int msm_iommu_pagetable_params(struct msm_mmu *mmu, phys_addr_t *ttbr,
> >+		int *asid);
> >+
> >  #endif /* __MSM_MMU_H__ */
> >
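[Editor's note: the ASID rotation at the end of msm_iommu_pagetable_create() above is compact enough to be easy to misread: ASIDs 0-15 are reserved (presumably for non-GPU contexts), per-instance pagetables take 16-254, and the counter wraps from 254 back to 16 without re-entering the reserved range. A standalone re-implementation of the same arithmetic, for illustration only:]

```c
/* Module-scope cursor, like the static in msm_iommu_pagetable_create() */
static int next_asid = 16;

/* Hand out the current ASID and advance, wrapping 254 -> 16.
 * (next_asid + 1) % 255 yields 0..254; anything below 16 is bumped
 * back up to 16, keeping the reserved range untouched. */
static int alloc_gpu_asid(void)
{
	int asid = next_asid;

	next_asid = (next_asid + 1) % 255;
	if (next_asid < 16)
		next_asid = 16;

	return asid;
}
```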

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


* Re: [Freedreno] [PATCH v2 1/6] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2
  2020-07-07 15:09     ` [Freedreno] " Rob Clark
@ 2020-07-13 17:35       ` Jordan Crouse
  0 siblings, 0 replies; 21+ messages in thread
From: Jordan Crouse @ 2020-07-13 17:35 UTC (permalink / raw)
  To: Rob Clark
  Cc: Robin Murphy, linux-arm-msm, Will Deacon,
	Linux Kernel Mailing List,
	iommu, Joerg Roedel,
	John Stultz, freedreno,
	moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE

On Tue, Jul 07, 2020 at 08:09:41AM -0700, Rob Clark wrote:
> On Tue, Jul 7, 2020 at 5:34 AM Robin Murphy <robin.murphy@arm.com> wrote:
> >
> > On 2020-06-26 21:04, Jordan Crouse wrote:
> > > Support auxiliary domains for arm-smmu-v2 to initialize and support
> > > multiple pagetables for a single SMMU context bank. Since the smmu-v2
> > > hardware doesn't have any built in support for switching the pagetable
> > > base it is left as an exercise to the caller to actually use the pagetable.
> >
> > Hmm, I've still been thinking that we could model this as supporting
> > exactly 1 aux domain iff the device is currently attached to a primary
> > domain with TTBR1 enabled. Then supporting multiple aux domains with
> > magic TTBR0 switching is the Adreno-specific extension on top of that.
> >
> > And if we don't want to go to that length, then - as I think Will was
> > getting at - I'm not sure it's worth bothering at all. There doesn't
> > seem to be any point in half-implementing a pretend aux domain interface
> > while still driving a bus through the rest of the abstractions - it's
> > really the worst of both worlds. If we're going to hand over the guts of
> > io-pgtable to the GPU driver then couldn't it just use
> > DOMAIN_ATTR_PGTABLE_CFG bidirectionally to inject a TTBR0 table straight
> > into the TTBR1-ified domain?
> 
> So, something along the lines of:
> 
> 1) qcom_adreno_smmu_impl somehow tells core arms-smmu that we want
>    to use TTBR1 instead of TTBR0
> 
> 2) gpu driver uses iommu_domain_get_attr(PGTABLE_CFG) to snapshot
>    the initial pgtable cfg.  (Btw, I kinda feel like we should add
>    io_pgtable_fmt to io_pgtable_cfg to make it self contained.)
> 
> 3) gpu driver constructs pgtable_ops for TTBR0, and then kicks
>    arm-smmu to do the initial setup to enable TTBR0 with
>    iommu_domain_set_attr(PGTABLE_CFG, &ttbr0_pgtable_cfg)


There being no objections, I'm going to start going in this direction.
I think we should have a quirk on the arm-smmu device to allow the PGTABLE_CFG
to be set; otherwise there is a chance for abuse. Ideally we would filter this
behavior on a stream ID basis if we come up with a scheme to do that cleanly
based on Will's comments in [1].

[1] https://lists.linuxfoundation.org/pipermail/iommu/2020-July/046488.html

Jordan

> if I understood you properly, that sounds simpler.
> 
> > Much as I like the idea of the aux domain abstraction and making this
> > fit semi-transparently into the IOMMU API, if an almost entirely private
> > interface will be the simplest and cleanest way to get it done then at
> > this point also I'm starting to lean towards just getting it done. But
> > if some other mediated-device type case then turns up that doesn't quite
> > fit that private interface, we revisit the proper abstraction again and
> > I reserve the right to say "I told you so" ;)
> 
> I'm on board with not trying to design this too generically until
> there is a second user
> 
> BR,
> -R
> 
> 
> >
> > Robin.
> >
> > > Aux domains are supported if split pagetable (TTBR1) support has been
> > > enabled on the master domain.  Each auxiliary domain will reuse the
> > > configuration of the master domain. By default the a domain with TTBR1
> > > support will have the TTBR0 region disabled so the first attached aux
> > > domain will enable the TTBR0 region in the hardware and conversely the
> > > last domain to be detached will disable TTBR0 translations.  All subsequent
> > > auxiliary domains create a pagetable but do not touch the hardware.
> > >
> > > The leaf driver will be able to query the physical address of the
> > > pagetable with the DOMAIN_ATTR_PTBASE attribute so that it can use the
> > > address with whatever means it has to switch the pagetable base.
> > >
> > > Following is a pseudo code example of how a domain can be created
> > >
> > >   /* Check to see if aux domains are supported */
> > >   if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
> > >        iommu = iommu_domain_alloc(...);
> > >
> > >        if (iommu_aux_attach_device(domain, dev))
> > >                return FAIL;
> > >
> > >       /* Save the base address of the pagetable for use by the driver
> > >       iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
> > >   }
> > >
> > > Then 'domain' can be used like any other iommu domain to map and
> > > unmap iova addresses in the pagetable.
> > >
> > > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > > ---
> > >
> > >   drivers/iommu/arm-smmu.c | 219 ++++++++++++++++++++++++++++++++++++---
> > >   drivers/iommu/arm-smmu.h |   1 +
> > >   2 files changed, 204 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > > index 060139452c54..ce6d654301bf 100644
> > > --- a/drivers/iommu/arm-smmu.c
> > > +++ b/drivers/iommu/arm-smmu.c
> > > @@ -91,6 +91,7 @@ struct arm_smmu_cb {
> > >       u32                             tcr[2];
> > >       u32                             mair[2];
> > >       struct arm_smmu_cfg             *cfg;
> > > +     atomic_t                        aux;
> > >   };
> > >
> > >   struct arm_smmu_master_cfg {
> > > @@ -667,6 +668,86 @@ static void arm_smmu_write_context_bank(struct arm_smmu_device *smmu, int idx)
> > >       arm_smmu_cb_write(smmu, idx, ARM_SMMU_CB_SCTLR, reg);
> > >   }
> > >
> > > +/*
> > > + * Update the context bank to enable TTBR0. Assumes AARCH64 S1
> > > + * configuration.
> > > + */
> > > +static void arm_smmu_context_set_ttbr0(struct arm_smmu_cb *cb,
> > > +             struct io_pgtable_cfg *pgtbl_cfg)
> > > +{
> > > +     u32 tcr = cb->tcr[0];
> > > +
> > > +     /* Add the TCR configuration from the new pagetable config */
> > > +     tcr |= arm_smmu_lpae_tcr(pgtbl_cfg);
> > > +
> > > +     /* Make sure that both TTBR0 and TTBR1 are enabled */
> > > +     tcr &= ~(ARM_SMMU_TCR_EPD0 | ARM_SMMU_TCR_EPD1);
> > > +
> > > +     /* Update the TCR register */
> > > +     cb->tcr[0] = tcr;
> > > +
> > > +     /* Program the new TTBR0 */
> > > +     cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
> > > +     cb->ttbr[0] |= FIELD_PREP(ARM_SMMU_TTBRn_ASID, cb->cfg->asid);
> > > +}
> > > +
> > > +/*
> > > + * This function assumes that the current model only allows aux domains for
> > > + * AARCH64 S1 configurations
> > > + */
> > > +static int arm_smmu_aux_init_domain_context(struct iommu_domain *domain,
> > > +             struct arm_smmu_device *smmu, struct arm_smmu_cfg *master)
> > > +{
> > > +     struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > > +     struct io_pgtable_ops *pgtbl_ops;
> > > +     struct io_pgtable_cfg pgtbl_cfg;
> > > +
> > > +     mutex_lock(&smmu_domain->init_mutex);
> > > +
> > > +     /* Copy the configuration from the master */
> > > +     memcpy(&smmu_domain->cfg, master, sizeof(smmu_domain->cfg));
> > > +
> > > +     smmu_domain->flush_ops = &arm_smmu_s1_tlb_ops;
> > > +     smmu_domain->smmu = smmu;
> > > +
> > > +     pgtbl_cfg = (struct io_pgtable_cfg) {
> > > +             .pgsize_bitmap = smmu->pgsize_bitmap,
> > > +             .ias = smmu->va_size,
> > > +             .oas = smmu->ipa_size,
> > > +             .coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENT_WALK,
> > > +             .tlb = smmu_domain->flush_ops,
> > > +             .iommu_dev = smmu->dev,
> > > +             .quirks = 0,
> > > +     };
> > > +
> > > +     if (smmu_domain->non_strict)
> > > +             pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> > > +
> > > +     pgtbl_ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1, &pgtbl_cfg,
> > > +             smmu_domain);
> > > +     if (!pgtbl_ops) {
> > > +             mutex_unlock(&smmu_domain->init_mutex);
> > > +             return -ENOMEM;
> > > +     }
> > > +
> > > +     domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
> > > +
> > > +     domain->geometry.aperture_end = (1UL << smmu->va_size) - 1;
> > > +     domain->geometry.force_aperture = true;
> > > +
> > > +     /* enable TTBR0 when the first aux domain is attached */
> > > +     if (atomic_inc_return(&smmu->cbs[master->cbndx].aux) == 1) {
> > > +             arm_smmu_context_set_ttbr0(&smmu->cbs[master->cbndx],
> > > +                     &pgtbl_cfg);
> > > +             arm_smmu_write_context_bank(smmu, master->cbndx);
> > > +     }
> > > +
> > > +     smmu_domain->pgtbl_ops = pgtbl_ops;
> > > +     mutex_unlock(&smmu_domain->init_mutex);
> > > +
> > > +     return 0;
> > > +}
> > > +
> > >   static int arm_smmu_init_domain_context(struct iommu_domain *domain,
> > >                                       struct arm_smmu_device *smmu,
> > >                                       struct device *dev)
> > > @@ -871,36 +952,70 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
> > >       return ret;
> > >   }
> > >
> > > +static void
> > > +arm_smmu_destroy_aux_domain_context(struct arm_smmu_domain *smmu_domain)
> > > +{
> > > +     struct arm_smmu_device *smmu = smmu_domain->smmu;
> > > +     struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> > > +     int ret;
> > > +
> > > +     /*
> > > +      * If this is the last aux domain to be freed, disable TTBR0 by turning
> > > +      * off translations and clearing TTBR0
> > > +      */
> > > +     if (atomic_dec_return(&smmu->cbs[cfg->cbndx].aux) == 0) {
> > > +             /* Clear out the T0 region */
> > > +             smmu->cbs[cfg->cbndx].tcr[0] &= ~GENMASK(15, 0);
> > > +             /* Disable TTBR0 translations */
> > > +             smmu->cbs[cfg->cbndx].tcr[0] |= ARM_SMMU_TCR_EPD0;
> > > +             /* Clear the TTBR0 pagetable address */
> > > +             smmu->cbs[cfg->cbndx].ttbr[0] =
> > > +                     FIELD_PREP(ARM_SMMU_TTBRn_ASID, cfg->asid);
> > > +
> > > +             ret = arm_smmu_rpm_get(smmu);
> > > +             if (!ret) {
> > > +                     arm_smmu_write_context_bank(smmu, cfg->cbndx);
> > > +                     arm_smmu_rpm_put(smmu);
> > > +             }
> > > +     }
> > > +
> > > +}
> > > +
> > >   static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
> > >   {
> > >       struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > >       struct arm_smmu_device *smmu = smmu_domain->smmu;
> > >       struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> > > -     int ret, irq;
> > >
> > >       if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)
> > >               return;
> > >
> > > -     ret = arm_smmu_rpm_get(smmu);
> > > -     if (ret < 0)
> > > -             return;
> > > +     if (smmu_domain->aux)
> > > +             arm_smmu_destroy_aux_domain_context(smmu_domain);
> > >
> > > -     /*
> > > -      * Disable the context bank and free the page tables before freeing
> > > -      * it.
> > > -      */
> > > -     smmu->cbs[cfg->cbndx].cfg = NULL;
> > > -     arm_smmu_write_context_bank(smmu, cfg->cbndx);
> > > +     /* Check if the last user is done with the context bank */
> > > +     if (atomic_read(&smmu->cbs[cfg->cbndx].aux) == 0) {
> > > +             int ret = arm_smmu_rpm_get(smmu);
> > > +             int irq;
> > >
> > > -     if (cfg->irptndx != ARM_SMMU_INVALID_IRPTNDX) {
> > > -             irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
> > > -             devm_free_irq(smmu->dev, irq, domain);
> > > +             if (ret < 0)
> > > +                     return;
> > > +
> > > +             /* Disable the context bank */
> > > +             smmu->cbs[cfg->cbndx].cfg = NULL;
> > > +             arm_smmu_write_context_bank(smmu, cfg->cbndx);
> > > +
> > > +             if (cfg->irptndx != ARM_SMMU_INVALID_IRPTNDX) {
> > > +                     irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
> > > +                     devm_free_irq(smmu->dev, irq, domain);
> > > +             }
> > > +
> > > +             __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
> > > +             arm_smmu_rpm_put(smmu);
> > >       }
> > >
> > > +     /* Destroy the pagetable */
> > >       free_io_pgtable_ops(smmu_domain->pgtbl_ops);
> > > -     __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
> > > -
> > > -     arm_smmu_rpm_put(smmu);
> > >   }
> > >
> > >   static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
> > > @@ -1161,6 +1276,74 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
> > >       return 0;
> > >   }
> > >
> > > +static bool arm_smmu_dev_has_feat(struct device *dev,
> > > +             enum iommu_dev_features feat)
> > > +{
> > > +     if (feat != IOMMU_DEV_FEAT_AUX)
> > > +             return false;
> > > +
> > > +     return true;
> > > +}
> > > +
> > > +static int arm_smmu_dev_enable_feat(struct device *dev,
> > > +             enum iommu_dev_features feat)
> > > +{
> > > +     /* aux domain support is always available */
> > > +     if (feat == IOMMU_DEV_FEAT_AUX)
> > > +             return 0;
> > > +
> > > +     return -ENODEV;
> > > +}
> > > +
> > > +static int arm_smmu_dev_disable_feat(struct device *dev,
> > > +             enum iommu_dev_features feat)
> > > +{
> > > +     return -EBUSY;
> > > +}
> > > +
> > > +static int arm_smmu_aux_attach_dev(struct iommu_domain *domain,
> > > +             struct device *dev)
> > > +{
> > > +     struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> > > +     struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
> > > +     struct arm_smmu_device *smmu = cfg->smmu;
> > > +     struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > > +     struct arm_smmu_cb *cb;
> > > +     int idx, i, ret, cbndx = -1;
> > > +
> > > +     /* Try to find the context bank configured for this device */
> > > +     for_each_cfg_sme(cfg, fwspec, i, idx) {
> > > +             if (idx != INVALID_SMENDX) {
> > > +                     cbndx = smmu->s2crs[idx].cbndx;
> > > +                     break;
> > > +             }
> > > +     }
> > > +
> > > +     if (cbndx == -1)
> > > +             return -ENODEV;
> > > +
> > > +     cb = &smmu->cbs[cbndx];
> > > +
> > > +     /* Aux domains are only supported for AARCH64 configurations */
> > > +     if (cb->cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64)
> > > +             return -EINVAL;
> > > +
> > > +     /* Make sure that TTBR1 is enabled in the hardware */
> > > +     if ((cb->tcr[0] & ARM_SMMU_TCR_EPD1))
> > > +             return -EINVAL;
> > > +
> > > +     smmu_domain->aux = true;
> > > +
> > > +     ret = arm_smmu_rpm_get(smmu);
> > > +     if (ret < 0)
> > > +             return ret;
> > > +
> > > +     ret = arm_smmu_aux_init_domain_context(domain, smmu, cb->cfg);
> > > +
> > > +     arm_smmu_rpm_put(smmu);
> > > +     return ret;
> > > +}
> > > +
> > >   static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> > >   {
> > >       struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > > @@ -1653,6 +1836,10 @@ static struct iommu_ops arm_smmu_ops = {
> > >       .get_resv_regions       = arm_smmu_get_resv_regions,
> > >       .put_resv_regions       = generic_iommu_put_resv_regions,
> > >       .def_domain_type        = arm_smmu_def_domain_type,
> > > +     .dev_has_feat           = arm_smmu_dev_has_feat,
> > > +     .dev_enable_feat        = arm_smmu_dev_enable_feat,
> > > +     .dev_disable_feat       = arm_smmu_dev_disable_feat,
> > > +     .aux_attach_dev         = arm_smmu_aux_attach_dev,
> > >       .pgsize_bitmap          = -1UL, /* Restricted during device attach */
> > >   };
> > >
> > > diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> > > index c417814f1d98..79d441024043 100644
> > > --- a/drivers/iommu/arm-smmu.h
> > > +++ b/drivers/iommu/arm-smmu.h
> > > @@ -346,6 +346,7 @@ struct arm_smmu_domain {
> > >       spinlock_t                      cb_lock; /* Serialises ATS1* ops and TLB syncs */
> > >       struct iommu_domain             domain;
> > >       struct device                   *dev;   /* Device attached to this domain */
> > > +     bool                            aux;
> > >   };
> > >
> > >   static inline u32 arm_smmu_lpae_tcr(struct io_pgtable_cfg *cfg)
> > >
> > _______________________________________________
> > Freedreno mailing list
> > Freedreno@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/freedreno
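[Editor's note: the core of the patch quoted above is the per-context-bank `aux` counter: the first aux domain attached to a context bank enables TTBR0 translations, and only the last one detached disables them again. That refcounting can be modelled in a few lines of standalone C; the atomic_t is replaced by a plain counter and the register writes by a flag, purely for illustration.]

```c
/* Toy model of one arm_smmu_cb: aux stands in for atomic_t aux,
 * ttbr0_on for the EPD0 bit in TCR being clear. */
struct cb_state {
	int aux;
	int ttbr0_on;
};

static void aux_attach(struct cb_state *cb)
{
	if (++cb->aux == 1)	/* atomic_inc_return(&cb->aux) == 1 */
		cb->ttbr0_on = 1;	/* arm_smmu_context_set_ttbr0() */
}

static void aux_detach(struct cb_state *cb)
{
	if (--cb->aux == 0)	/* atomic_dec_return(&cb->aux) == 0 */
		cb->ttbr0_on = 0;	/* set EPD0, clear TTBR0 */
}

/* TTBR0 state after a sequence of attaches then detaches */
static int ttbr0_after(int attaches, int detaches)
{
	struct cb_state cb = { 0, 0 };
	for (int i = 0; i < attaches; i++)
		aux_attach(&cb);
	for (int i = 0; i < detaches; i++)
		aux_detach(&cb);
	return cb.ttbr0_on;
}
```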

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


end of thread, other threads:[~2020-07-13 17:36 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-26 20:04 [PATCH v2 0/6] iommu-arm-smmu: Add auxiliary domains and per-instance pagetables Jordan Crouse
2020-06-26 20:04 ` [PATCH v2 1/6] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
2020-07-07 10:48   ` Jean-Philippe Brucker
2020-07-07 12:34   ` Robin Murphy
2020-07-07 15:09     ` [Freedreno] " Rob Clark
2020-07-13 17:35       ` Jordan Crouse
2020-06-26 20:04 ` [PATCH v2 2/6] iommu/io-pgtable: Allow a pgtable implementation to skip TLB operations Jordan Crouse
2020-07-07 11:34   ` Robin Murphy
2020-07-07 14:25     ` [Freedreno] " Rob Clark
2020-07-07 14:58       ` Rob Clark
2020-07-08 19:19         ` Jordan Crouse
2020-06-26 20:04 ` [PATCH v2 3/6] iommu/arm-smmu: Add a domain attribute to pass the pagetable config Jordan Crouse
2020-06-26 20:04 ` [PATCH v2 4/6] drm/msm: Add support to create a local pagetable Jordan Crouse
2020-07-07 11:36   ` Robin Murphy
2020-07-07 14:41     ` [Freedreno] " Rob Clark
2020-07-08 19:35     ` Jordan Crouse
2020-06-26 20:04 ` [PATCH v2 5/6] drm/msm: Add support for address space instances Jordan Crouse
2020-06-26 20:04 ` [PATCH v2 6/6] drm/msm/a6xx: Add support for per-instance pagetables Jordan Crouse
2020-06-27 19:56   ` Rob Clark
2020-06-27 20:11     ` Rob Clark
2020-06-29 14:56       ` [Freedreno] " Jordan Crouse
