* [RFC v2 00/16] Private PASID and per-instance pagetables
From: Jordan Crouse @ 2018-05-18 21:34 UTC
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

This is v2 of a patchset implementing private PASID support for
arm-smmu-v2 targets and per-instance pagetables for MSM GPUs.

Per-instance pagetables allow the target GPU driver to create and manage
an individual pagetable for each file descriptor instance and switch
between them asynchronously using the GPU to reprogram the pagetable
registers on the fly.

This is done by expanding the shared PASID support from Jean-Philippe [1]
to create a "private" version of a PASID that enables an IOMMU driver
(arm-smmu-v2) to allocate a new pagetable and associate it with a PASID
identifier. That identifier can then be passed to a set of iommu
map/unmap functions to map entries into the new pagetable. Using a
set of side-band functions, the GPU driver can get the TTBR0 and
other information for each PASID and use that information to
reprogram the pagetables.
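
Roughly, the intended driver-facing flow looks like this (a sketch
based on the APIs added in patches 4-6; the side-band helper name is
illustrative, since that function is not quoted below):

	/* domain, dev, iova, paddr and size are set up elsewhere */
	int pasid;
	u64 ttbr0;

	/* Allocate a private PASID backed by a fresh pagetable */
	pasid = iommu_sva_alloc_pasid(domain, dev);

	/* Populate the per-instance pagetable manually */
	iommu_sva_map(pasid, iova, paddr, size, IOMMU_READ | IOMMU_WRITE);

	/*
	 * Side-band query (hypothetical name) so the GPU can reprogram
	 * the pagetable registers itself
	 */
	ttbr0 = arm_smmu_get_pasid_ttbr0(dev, pasid);

	/* ... and tear everything down when the instance goes away */
	iommu_sva_unmap(pasid, iova, size);
	iommu_sva_free_pasid(pasid, dev);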

The first three patches implement split pagetables for arm-smmu-v2
targets. This allows the GPU to map global buffers through TTBR1 so
that they are unaffected by the pagetable switch.

The next three patches implement private PASID support by adding a few
new API hooks and piggybacking on existing functions from the shared
PASID effort.

The next eight patches hook up the MSM GPU driver to implement and use
per-instance pagetables when they are available.

Finally, the last two patches are a re-post of changes I provided
a few weeks ago to let the GPU driver "opt out" of the DMA domain
and thus keep the context bank free for the GPU domain (which is
important because the per-instance mechanism only knows how to work
on context bank 0).

All of this is based on top of Jean-Philippe's latest tree, available
from:
	git://linux-arm.org/linux-jpb.git sva/v2

[changes from v1]:
 * Switch the domain attribute to SPLIT_TABLES (Robin Murphy)
 * Reuse existing mm hooks as much as possible (Jean-Philippe Brucker)
 * Consolidate iommu map/unmap code (Jean-Philippe Brucker)

[1] https://patchwork.kernel.org/patch/10394883/

Jordan Crouse (16):
  iommu: Add DOMAIN_ATTR_SPLIT_TABLES
  iommu/arm-smmu: Add split pagetable support for arm-smmu-v2
  iommu/io-pgtable-arm: Remove ttbr[1] from io_pgtable_cfg
  iommu: sva: Add support for private PASIDs
  iommu: arm-smmu: Add support for private PASIDs
  iommu: arm-smmu: Add side-band function for specific PASID callbacks
  drm/msm: Enable 64 bit mode by default
  drm/msm: Pass the MMU domain index in struct msm_file_private
  drm/msm/gpu: Support using split page tables for kernel buffer objects
  drm/msm: Add msm_mmu features
  drm/msm: Add support for iommu-sva PASIDs
  drm/msm: Add support for per-instance address spaces
  drm/msm/a5xx: Support per-instance pagetables
  drm/msm: Support per-instance address spaces
  iommu: Gracefully allow drivers to not attach to a default domain
  iommu/arm-smmu: Add list of devices to opt out of DMA domains

 drivers/gpu/drm/msm/Kconfig               |   1 +
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c     |  69 ++++
 drivers/gpu/drm/msm/adreno/a5xx_gpu.h     |  17 +
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c |  74 ++++-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c   |  11 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |   5 +
 drivers/gpu/drm/msm/msm_drv.c             |  45 ++-
 drivers/gpu/drm/msm/msm_drv.h             |   4 +
 drivers/gpu/drm/msm/msm_gem.h             |   1 +
 drivers/gpu/drm/msm/msm_gem_submit.c      |  11 +-
 drivers/gpu/drm/msm/msm_gem_vma.c         |  37 ++-
 drivers/gpu/drm/msm/msm_gpu.c             |  24 +-
 drivers/gpu/drm/msm/msm_gpu.h             |   4 +-
 drivers/gpu/drm/msm/msm_iommu.c           | 192 ++++++++++-
 drivers/gpu/drm/msm/msm_mmu.h             |  19 ++
 drivers/gpu/drm/msm/msm_ringbuffer.h      |   1 +
 drivers/iommu/arm-smmu-regs.h             |  18 ++
 drivers/iommu/arm-smmu-v3-context.c       |   2 +-
 drivers/iommu/arm-smmu.c                  | 370 ++++++++++++++++++++--
 drivers/iommu/io-pgtable-arm-v7s.c        |   3 +-
 drivers/iommu/io-pgtable-arm.c            |   8 +-
 drivers/iommu/io-pgtable.h                |  14 +-
 drivers/iommu/iommu-sva.c                 | 139 +++++++-
 drivers/iommu/iommu.c                     |  83 +++--
 drivers/iommu/ipmmu-vmsa.c                |   2 +-
 drivers/iommu/msm_iommu.c                 |   4 +-
 drivers/iommu/mtk_iommu.c                 |   4 +-
 drivers/iommu/qcom_iommu.c                |   3 +-
 include/linux/arm-smmu.h                  |  18 ++
 include/linux/iommu.h                     |  75 ++++-
 30 files changed, 1137 insertions(+), 121 deletions(-)
 create mode 100644 include/linux/arm-smmu.h

-- 
2.17.0

* [PATCH 01/16] iommu: Add DOMAIN_ATTR_SPLIT_TABLES
From: Jordan Crouse @ 2018-05-18 21:34 UTC
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Add a new domain attribute to enable split pagetable support for
devices that support it.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 include/linux/iommu.h | 1 +
 1 file changed, 1 insertion(+)
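
As a usage note (a sketch, not part of the patch): a client driver
would set the attribute before attaching its device, since arm-smmu
samples it while initializing the context bank at attach time (see
patch 2):

	int split = 1;
	int ret;

	/* Request TTBR0/TTBR1 split tables on this domain */
	ret = iommu_domain_set_attr(domain, DOMAIN_ATTR_SPLIT_TABLES,
			&split);
	if (!ret)
		ret = iommu_attach_device(domain, dev);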

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index d7e2f54086e4..366254e4b07f 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -153,6 +153,7 @@ enum iommu_attr {
 	DOMAIN_ATTR_FSL_PAMU_ENABLE,
 	DOMAIN_ATTR_FSL_PAMUV1,
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
+	DOMAIN_ATTR_SPLIT_TABLES,
 	DOMAIN_ATTR_MAX,
 };
 
-- 
2.17.0

* [PATCH 02/16] iommu/arm-smmu: Add split pagetable support for arm-smmu-v2
From: Jordan Crouse @ 2018-05-18 21:34 UTC
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Add support for a split pagetable (TTBR0/TTBR1) scheme for
arm-smmu-v2. If split pagetables are enabled, create a
pagetable for TTBR1 and set up the sign extension bit so
that all IOVAs with that bit set are mapped and translated
from the TTBR1 pagetable.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/iommu/arm-smmu-regs.h  |  18 ++++
 drivers/iommu/arm-smmu.c       | 148 +++++++++++++++++++++++++++++----
 drivers/iommu/io-pgtable-arm.c |   3 +-
 3 files changed, 153 insertions(+), 16 deletions(-)
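
To illustrate the resulting scheme (a sketch, assuming a 48-bit VA
space, for which this patch programs SEP to the upstream bit and sets
split_table_mask to 1ULL << 48): map and unmap calls are routed by
IOVA, which is also why the iova-vs-ias WARN_ON in io-pgtable-arm.c
is dropped at the end of this patch:

	/* Split bit set: handled by the TTBR1 ("global") pagetable */
	iommu_map(domain, (1ULL << 48) | 0x1000, paddr, SZ_4K,
			IOMMU_READ);

	/* Split bit clear: handled by TTBR0 as before */
	iommu_map(domain, 0x1000, paddr, SZ_4K,
			IOMMU_READ | IOMMU_WRITE);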

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
index a1226e4ab5f8..56f97093f46a 100644
--- a/drivers/iommu/arm-smmu-regs.h
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -193,7 +193,25 @@ enum arm_smmu_s2cr_privcfg {
 #define RESUME_RETRY			(0 << 0)
 #define RESUME_TERMINATE		(1 << 0)
 
+#define TTBCR_EPD1			(1 << 23)
+#define TTBCR_T1SZ_SHIFT		16
+#define TTBCR_IRGN1_SHIFT		24
+#define TTBCR_ORGN1_SHIFT		26
+#define TTBCR_RGN_WBWA			1
+#define TTBCR_SH1_SHIFT			28
+#define TTBCR_SH_IS			3
+
+#define TTBCR_TG1_16K			(1 << 30)
+#define TTBCR_TG1_4K			(2 << 30)
+#define TTBCR_TG1_64K			(3 << 30)
+
 #define TTBCR2_SEP_SHIFT		15
+#define TTBCR2_SEP_31			(0x0 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_35			(0x1 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_39			(0x2 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_41			(0x3 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_43			(0x4 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_47			(0x5 << TTBCR2_SEP_SHIFT)
 #define TTBCR2_SEP_UPSTREAM		(0x7 << TTBCR2_SEP_SHIFT)
 #define TTBCR2_AS			(1 << 4)
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 69e7c60792a8..3568e8b073ec 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -143,6 +143,7 @@ struct arm_smmu_cb {
 	u32				tcr[2];
 	u32				mair[2];
 	struct arm_smmu_cfg		*cfg;
+	u64				split_table_mask;
 };
 
 struct arm_smmu_master_cfg {
@@ -200,6 +201,7 @@ struct arm_smmu_device {
 	unsigned long			va_size;
 	unsigned long			ipa_size;
 	unsigned long			pa_size;
+	unsigned long			ubs_size;
 	unsigned long			pgsize_bitmap;
 
 	u32				num_global_irqs;
@@ -242,12 +244,13 @@ enum arm_smmu_domain_stage {
 
 struct arm_smmu_domain {
 	struct arm_smmu_device		*smmu;
-	struct io_pgtable_ops		*pgtbl_ops;
+	struct io_pgtable_ops		*pgtbl_ops[2];
 	const struct iommu_gather_ops	*tlb_ops;
 	struct arm_smmu_cfg		cfg;
 	enum arm_smmu_domain_stage	stage;
 	struct mutex			init_mutex; /* Protects smmu pointer */
 	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
+	u32 attributes;
 	struct iommu_domain		domain;
 };
 
@@ -582,6 +585,69 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
 	return IRQ_HANDLED;
 }
 
+static void arm_smmu_init_ttbr1(struct arm_smmu_domain *smmu_domain,
+		struct io_pgtable_cfg *pgtbl_cfg)
+{
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+	int pgsize = 1 << __ffs(pgtbl_cfg->pgsize_bitmap);
+
+	/* Enable speculative walks through the TTBR1 */
+	cb->tcr[0] &= ~TTBCR_EPD1;
+
+	cb->tcr[0] |= TTBCR_SH_IS << TTBCR_SH1_SHIFT;
+	cb->tcr[0] |= TTBCR_RGN_WBWA << TTBCR_IRGN1_SHIFT;
+	cb->tcr[0] |= TTBCR_RGN_WBWA << TTBCR_ORGN1_SHIFT;
+
+	switch (pgsize) {
+	case SZ_4K:
+		cb->tcr[0] |= TTBCR_TG1_4K;
+		break;
+	case SZ_16K:
+		cb->tcr[0] |= TTBCR_TG1_16K;
+		break;
+	case SZ_64K:
+		cb->tcr[0] |= TTBCR_TG1_64K;
+		break;
+	}
+
+	cb->tcr[0] |= (64ULL - smmu->va_size) << TTBCR_T1SZ_SHIFT;
+
+	/* Clear the existing SEP configuration */
+	cb->tcr[1] &= ~TTBCR2_SEP_UPSTREAM;
+
+	/* Set up the sign extend bit */
+	switch (smmu->va_size) {
+	case 32:
+		cb->tcr[1] |= TTBCR2_SEP_31;
+		cb->split_table_mask = (1ULL << 31);
+		break;
+	case 36:
+		cb->tcr[1] |= TTBCR2_SEP_35;
+		cb->split_table_mask = (1ULL << 35);
+		break;
+	case 40:
+		cb->tcr[1] |= TTBCR2_SEP_39;
+		cb->split_table_mask = (1ULL << 39);
+		break;
+	case 42:
+		cb->tcr[1] |= TTBCR2_SEP_41;
+		cb->split_table_mask = (1ULL << 41);
+		break;
+	case 44:
+		cb->tcr[1] |= TTBCR2_SEP_43;
+		cb->split_table_mask = (1ULL << 43);
+		break;
+	case 48:
+		cb->tcr[1] |= TTBCR2_SEP_UPSTREAM;
+		cb->split_table_mask = (1ULL << 48);
+	}
+
+	cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
+	cb->ttbr[1] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
+}
+
 static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 				       struct io_pgtable_cfg *pgtbl_cfg)
 {
@@ -614,8 +680,12 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 		} else {
 			cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
 			cb->ttbr[0] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
-			cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[1];
-			cb->ttbr[1] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
+
+			/*
+			 * Set TTBR1 to empty by default - it will get
+			 * programmed later if it is enabled
+			 */
+			cb->ttbr[1] = (u64)cfg->asid << TTBRn_ASID_SHIFT;
 		}
 	} else {
 		cb->ttbr[0] = pgtbl_cfg->arm_lpae_s2_cfg.vttbr;
@@ -724,11 +794,13 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 {
 	int irq, start, ret = 0;
 	unsigned long ias, oas;
-	struct io_pgtable_ops *pgtbl_ops;
+	struct io_pgtable_ops *pgtbl_ops[2];
 	struct io_pgtable_cfg pgtbl_cfg;
 	enum io_pgtable_fmt fmt;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	bool split_tables =
+		(smmu_domain->attributes & (1 << DOMAIN_ATTR_SPLIT_TABLES));
 
 	mutex_lock(&smmu_domain->init_mutex);
 	if (smmu_domain->smmu)
@@ -758,8 +830,11 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	 *
 	 * Note that you can't actually request stage-2 mappings.
 	 */
-	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
+	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) {
 		smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
+		/* FIXME: fail instead? */
+		split_tables = false;
+	}
 	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2))
 		smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 
@@ -776,8 +851,11 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	if (IS_ENABLED(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) &&
 	    !IS_ENABLED(CONFIG_64BIT) && !IS_ENABLED(CONFIG_ARM_LPAE) &&
 	    (smmu->features & ARM_SMMU_FEAT_FMT_AARCH32_S) &&
-	    (smmu_domain->stage == ARM_SMMU_DOMAIN_S1))
+	    (smmu_domain->stage == ARM_SMMU_DOMAIN_S1)) {
+		/* FIXME: fail instead? */
+		split_tables = false;
 		cfg->fmt = ARM_SMMU_CTX_FMT_AARCH32_S;
+	}
 	if ((IS_ENABLED(CONFIG_64BIT) || cfg->fmt == ARM_SMMU_CTX_FMT_NONE) &&
 	    (smmu->features & (ARM_SMMU_FEAT_FMT_AARCH64_64K |
 			       ARM_SMMU_FEAT_FMT_AARCH64_16K |
@@ -864,8 +942,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA;
 
 	smmu_domain->smmu = smmu;
-	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
-	if (!pgtbl_ops) {
+	pgtbl_ops[0] = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
+	if (!pgtbl_ops[0]) {
 		ret = -ENOMEM;
 		goto out_clear_smmu;
 	}
@@ -877,6 +955,22 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 
 	/* Initialise the context bank with our page table cfg */
 	arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
+
+	pgtbl_ops[1] = NULL;
+
+	if (split_tables) {
+		/* FIXME: I think it is safe to reuse pgtbl_cfg here */
+		pgtbl_ops[1] = alloc_io_pgtable_ops(fmt, &pgtbl_cfg,
+			smmu_domain);
+		if (!pgtbl_ops[1]) {
+			free_io_pgtable_ops(pgtbl_ops[0]);
+			ret = -ENOMEM;
+			goto out_clear_smmu;
+		}
+
+		arm_smmu_init_ttbr1(smmu_domain, &pgtbl_cfg);
+	}
+
 	arm_smmu_write_context_bank(smmu, cfg->cbndx);
 
 	/*
@@ -895,7 +989,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	mutex_unlock(&smmu_domain->init_mutex);
 
 	/* Publish page table ops for map/unmap */
-	smmu_domain->pgtbl_ops = pgtbl_ops;
+	smmu_domain->pgtbl_ops[0] = pgtbl_ops[0];
+	smmu_domain->pgtbl_ops[1] = pgtbl_ops[1];
+
 	return 0;
 
 out_clear_smmu:
@@ -927,7 +1023,9 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 		devm_free_irq(smmu->dev, irq, domain);
 	}
 
-	free_io_pgtable_ops(smmu_domain->pgtbl_ops);
+	free_io_pgtable_ops(smmu_domain->pgtbl_ops[0]);
+	free_io_pgtable_ops(smmu_domain->pgtbl_ops[1]);
+
 	__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
 }
 
@@ -1230,10 +1328,23 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	return arm_smmu_domain_add_master(smmu_domain, fwspec);
 }
 
+static struct io_pgtable_ops *
+arm_smmu_get_pgtbl_ops(struct iommu_domain *domain, unsigned long iova)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+
+	if (iova & cb->split_table_mask)
+		return smmu_domain->pgtbl_ops[1];
+
+	return smmu_domain->pgtbl_ops[0];
+}
+
 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 			phys_addr_t paddr, size_t size, int prot)
 {
-	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 
 	if (!ops)
 		return -ENODEV;
@@ -1244,7 +1355,7 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 			     size_t size)
 {
-	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 
 	if (!ops)
 		return 0;
@@ -1266,7 +1377,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-	struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 	struct device *dev = smmu->dev;
 	void __iomem *cb_base;
 	u32 tmp;
@@ -1307,7 +1418,7 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
 					dma_addr_t iova)
 {
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 
 	if (domain->type == IOMMU_DOMAIN_IDENTITY)
 		return iova;
@@ -1477,6 +1588,10 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 	case DOMAIN_ATTR_NESTING:
 		*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
 		return 0;
+	case DOMAIN_ATTR_SPLIT_TABLES:
+		*((int *)data) = !!(smmu_domain->attributes
+					& (1 << DOMAIN_ATTR_SPLIT_TABLES));
+		return 0;
 	default:
 		return -ENODEV;
 	}
@@ -1506,6 +1621,11 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
 			smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 
 		break;
+	case DOMAIN_ATTR_SPLIT_TABLES:
+		if (*((int *)data))
+			smmu_domain->attributes |=
+				1 << DOMAIN_ATTR_SPLIT_TABLES;
+		break;
 	default:
 		ret = -ENODEV;
 	}
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index fe851eae9057..920d9faa2a76 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -422,8 +422,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
 		return 0;
 
-	if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias) ||
-		    paddr >= (1ULL << data->iop.cfg.oas)))
+	if (WARN_ON(paddr >= (1ULL << data->iop.cfg.oas)))
 		return -ERANGE;
 
 	prot = arm_lpae_prot_to_pte(data, iommu_prot);
-- 
2.17.0

* [PATCH 03/16] iommu/io-pgtable-arm: Remove ttbr[1] from io_pgtable_cfg
From: Jordan Crouse @ 2018-05-18 21:34 UTC
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Now that we have a working example of an ARM driver that implements
split pagetables completely in the client driver, it is apparent that
we don't need to store an extra ttbr value in the io_pgtable_cfg
struct that will never get used.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/iommu/arm-smmu-v3-context.c | 2 +-
 drivers/iommu/arm-smmu.c            | 8 ++++----
 drivers/iommu/io-pgtable-arm-v7s.c  | 3 +--
 drivers/iommu/io-pgtable-arm.c      | 5 ++---
 drivers/iommu/io-pgtable.h          | 4 ++--
 drivers/iommu/ipmmu-vmsa.c          | 2 +-
 drivers/iommu/msm_iommu.c           | 4 ++--
 drivers/iommu/mtk_iommu.c           | 4 ++--
 drivers/iommu/qcom_iommu.c          | 3 +--
 9 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3-context.c b/drivers/iommu/arm-smmu-v3-context.c
index 22e7b80a7682..d23e0092f917 100644
--- a/drivers/iommu/arm-smmu-v3-context.c
+++ b/drivers/iommu/arm-smmu-v3-context.c
@@ -522,7 +522,7 @@ arm_smmu_alloc_priv_cd(struct iommu_pasid_table_ops *ops,
 
 	switch (fmt) {
 	case ARM_64_LPAE_S1:
-		cd->ttbr	= cfg->arm_lpae_s1_cfg.ttbr[0];
+		cd->ttbr	= cfg->arm_lpae_s1_cfg.ttbr;
 		cd->tcr		= cfg->arm_lpae_s1_cfg.tcr;
 		cd->mair	= cfg->arm_lpae_s1_cfg.mair[0];
 		break;
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 3568e8b073ec..d459909877c3 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -644,7 +644,7 @@ static void arm_smmu_init_ttbr1(struct arm_smmu_domain *smmu_domain,
 		cb->split_table_mask = (1ULL << 48);
 	}
 
-	cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
+	cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
 	cb->ttbr[1] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
 }
 
@@ -675,10 +675,10 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 	/* TTBRs */
 	if (stage1) {
 		if (cfg->fmt == ARM_SMMU_CTX_FMT_AARCH32_S) {
-			cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr[0];
-			cb->ttbr[1] = pgtbl_cfg->arm_v7s_cfg.ttbr[1];
+			cb->ttbr[0] = pgtbl_cfg->arm_v7s_cfg.ttbr;
+			cb->ttbr[1] = 0;
 		} else {
-			cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
+			cb->ttbr[0] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr;
 			cb->ttbr[0] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
 
 			/*
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 10e4a3d11c02..37d607ac5153 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -767,11 +767,10 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
 	wmb();
 
 	/* TTBRs */
-	cfg->arm_v7s_cfg.ttbr[0] = virt_to_phys(data->pgd) |
+	cfg->arm_v7s_cfg.ttbr = virt_to_phys(data->pgd) |
 				   ARM_V7S_TTBR_S | ARM_V7S_TTBR_NOS |
 				   ARM_V7S_TTBR_IRGN_ATTR(ARM_V7S_RGN_WBWA) |
 				   ARM_V7S_TTBR_ORGN_ATTR(ARM_V7S_RGN_WBWA);
-	cfg->arm_v7s_cfg.ttbr[1] = 0;
 	return &data->iop;
 
 out_free_data:
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 920d9faa2a76..5bba30d901b4 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -793,9 +793,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 	/* Ensure the empty pgd is visible before any actual TTBR write */
 	wmb();
 
-	/* TTBRs */
-	cfg->arm_lpae_s1_cfg.ttbr[0] = virt_to_phys(data->pgd);
-	cfg->arm_lpae_s1_cfg.ttbr[1] = 0;
+	/* TTBR */
+	cfg->arm_lpae_s1_cfg.ttbr = virt_to_phys(data->pgd);
 	return &data->iop;
 
 out_free_data:
diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
index 2df79093cad9..fd9f0fc4eb60 100644
--- a/drivers/iommu/io-pgtable.h
+++ b/drivers/iommu/io-pgtable.h
@@ -87,7 +87,7 @@ struct io_pgtable_cfg {
 	/* Low-level data specific to the table format */
 	union {
 		struct {
-			u64	ttbr[2];
+			u64	ttbr;
 			u64	tcr;
 			u64	mair[2];
 		} arm_lpae_s1_cfg;
@@ -98,7 +98,7 @@ struct io_pgtable_cfg {
 		} arm_lpae_s2_cfg;
 
 		struct {
-			u32	ttbr[2];
+			u32	ttbr;
 			u32	tcr;
 			u32	nmrr;
 			u32	prrr;
diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 40ae6e87cb88..7cdaa0fef85a 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -447,7 +447,7 @@ static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
 	}
 
 	/* TTBR0 */
-	ttbr = domain->cfg.arm_lpae_s1_cfg.ttbr[0];
+	ttbr = domain->cfg.arm_lpae_s1_cfg.ttbr;
 	ipmmu_ctx_write_root(domain, IMTTLBR0, ttbr);
 	ipmmu_ctx_write_root(domain, IMTTUBR0, ttbr >> 32);
 
diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index 0d3350463a3f..ef323301c574 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -281,8 +281,8 @@ static void __program_context(void __iomem *base, int ctx,
 	SET_V2PCFG(base, ctx, 0x3);
 
 	SET_TTBCR(base, ctx, priv->cfg.arm_v7s_cfg.tcr);
-	SET_TTBR0(base, ctx, priv->cfg.arm_v7s_cfg.ttbr[0]);
-	SET_TTBR1(base, ctx, priv->cfg.arm_v7s_cfg.ttbr[1]);
+	SET_TTBR0(base, ctx, priv->cfg.arm_v7s_cfg.ttbr);
+	SET_TTBR1(base, ctx, 0);
 
 	/* Set prrr and nmrr */
 	SET_PRRR(base, ctx, priv->cfg.arm_v7s_cfg.prrr);
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index f2832a10fcea..db710d99fc5f 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -344,7 +344,7 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
 	/* Update the pgtable base address register of the M4U HW */
 	if (!data->m4u_dom) {
 		data->m4u_dom = dom;
-		writel(dom->cfg.arm_v7s_cfg.ttbr[0],
+		writel(dom->cfg.arm_v7s_cfg.ttbr,
 		       data->base + REG_MMU_PT_BASE_ADDR);
 	}
 
@@ -725,7 +725,7 @@ static int __maybe_unused mtk_iommu_resume(struct device *dev)
 	writel_relaxed(reg->int_main_control, base + REG_MMU_INT_MAIN_CONTROL);
 	writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
 	if (data->m4u_dom)
-		writel(data->m4u_dom->cfg.arm_v7s_cfg.ttbr[0],
+		writel(data->m4u_dom->cfg.arm_v7s_cfg.ttbr,
 		       base + REG_MMU_PT_BASE_ADDR);
 	return 0;
 }
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 65b9c99707f8..b0f5fd5b33c0 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -257,10 +257,9 @@ static int qcom_iommu_init_domain(struct iommu_domain *domain,
 
 		/* TTBRs */
 		iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
-				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
+				pgtbl_cfg.arm_lpae_s1_cfg.ttbr |
 				((u64)ctx->asid << TTBRn_ASID_SHIFT));
 		iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
-				pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
 				((u64)ctx->asid << TTBRn_ASID_SHIFT));
 
 		/* TTBCR */
-- 
2.17.0

* [PATCH 04/16] iommu: sva: Add support for private PASIDs
From: Jordan Crouse @ 2018-05-18 21:34 UTC
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Some older SMMU implementations do not have fully featured hardware
PASID support but do have alternate workarounds for using multiple
pagetables. For example, MSM GPUs have logic to switch the user
pagetable automatically in hardware by writing the context bank
registers directly.

Support private PASIDs by creating a new io-pgtable instance, mapping
it to a PASID, and providing APIs for drivers to populate it manually.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/iommu/iommu-sva.c | 139 ++++++++++++++++++++++++++++++++++++--
 drivers/iommu/iommu.c     |  66 +++++++++++++-----
 include/linux/iommu.h     |  74 ++++++++++++++++++--
 3 files changed, 250 insertions(+), 29 deletions(-)
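
For orientation (a sketch, not part of the patch): an IOMMU driver
opts in to private PASIDs by providing the ops that
iommu_sva_alloc_pasid() and __iommu_map()/__iommu_unmap() check for;
the my_* names below are hypothetical driver internals:

	static const struct iommu_ops my_iommu_ops = {
		/* ... existing map/unmap/attach ops ... */
		.mm_alloc	= my_mm_alloc,	/* io_mm + pagetable */
		.mm_free	= my_mm_free,
		.mm_attach	= my_mm_attach,	/* install for a PASID */
		.mm_detach	= my_mm_detach,
		.sva_map	= my_sva_map,	/* map into PASID pgtable */
		.sva_unmap	= my_sva_unmap,
	};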

diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index e98b994c15f1..26f0da9692d4 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -156,6 +156,7 @@ io_mm_alloc(struct iommu_domain *domain, struct device *dev,
 	mmgrab(mm);
 
 	io_mm->flags		= flags;
+	io_mm->type		= IO_TYPE_SHARED;
 	io_mm->mm		= mm;
 	io_mm->notifier.ops	= &iommu_mmu_notifier;
 	io_mm->release		= domain->ops->mm_free;
@@ -544,13 +545,10 @@ int iommu_sva_device_init(struct device *dev, unsigned long features,
 			  unsigned int max_pasid,
 			  iommu_mm_exit_handler_t mm_exit)
 {
-	int ret;
+	int ret = 0;
 	struct iommu_sva_param *param;
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
 
-	if (!domain || !domain->ops->sva_device_init)
-		return -ENODEV;
-
 	if (features & ~IOMMU_SVA_FEAT_IOPF)
 		return -EINVAL;
 
@@ -576,9 +574,12 @@ int iommu_sva_device_init(struct device *dev, unsigned long features,
 	 * IOMMU driver updates the limits depending on the IOMMU and device
 	 * capabilities.
 	 */
-	ret = domain->ops->sva_device_init(dev, param);
-	if (ret)
-		goto err_free_param;
+
+	if (domain && domain->ops->sva_device_init) {
+		ret = domain->ops->sva_device_init(dev, param);
+		if (ret)
+			goto err_free_param;
+	}
 
 	mutex_lock(&dev->iommu_param->lock);
 	if (dev->iommu_param->sva_param)
@@ -790,3 +791,127 @@ struct mm_struct *iommu_sva_find(int pasid)
 	return mm;
 }
 EXPORT_SYMBOL_GPL(iommu_sva_find);
+
+int iommu_sva_alloc_pasid(struct iommu_domain *domain, struct device *dev)
+{
+	int ret, pasid;
+	struct io_mm *io_mm;
+	struct iommu_sva_param *param = dev->iommu_param->sva_param;
+
+	if (!domain->ops->mm_attach || !domain->ops->mm_detach)
+		return -ENODEV;
+
+	if (domain->ops->mm_alloc)
+		io_mm = domain->ops->mm_alloc(domain, NULL, 0);
+	else
+		io_mm = kzalloc(sizeof(*io_mm), GFP_KERNEL);
+
+	if (IS_ERR(io_mm))
+		return PTR_ERR(io_mm);
+	if (!io_mm)
+		return -ENOMEM;
+
+	io_mm->domain = domain;
+	io_mm->type = IO_TYPE_PRIVATE;
+
+	idr_preload(GFP_KERNEL);
+	spin_lock(&iommu_sva_lock);
+	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, param->min_pasid,
+		param->max_pasid + 1, GFP_ATOMIC);
+	io_mm->pasid = pasid;
+	spin_unlock(&iommu_sva_lock);
+	idr_preload_end();
+
+	if (pasid < 0) {
+		kfree(io_mm);
+		return pasid;
+	}
+
+	ret = domain->ops->mm_attach(domain, dev, io_mm, false);
+	if (!ret)
+		return pasid;
+
+	spin_lock(&iommu_sva_lock);
+	idr_remove(&iommu_pasid_idr, io_mm->pasid);
+	spin_unlock(&iommu_sva_lock);
+
+	if (domain->ops->mm_free)
+		domain->ops->mm_free(io_mm);
+	else
+		kfree(io_mm);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_sva_alloc_pasid);
+
+static struct io_mm *get_io_mm(int pasid)
+{
+	struct io_mm *io_mm;
+
+	spin_lock(&iommu_sva_lock);
+	io_mm = idr_find(&iommu_pasid_idr, pasid);
+	spin_unlock(&iommu_sva_lock);
+
+	return io_mm;
+}
+
+int iommu_sva_map(int pasid, unsigned long iova,
+	      phys_addr_t paddr, size_t size, int prot)
+{
+	struct io_mm *io_mm = get_io_mm(pasid);
+
+	if (!io_mm || io_mm->type != IO_TYPE_PRIVATE)
+		return -ENODEV;
+
+	return __iommu_map(io_mm->domain, &pasid, iova, paddr, size, prot);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_map);
+
+size_t iommu_sva_map_sg(int pasid, unsigned long iova, struct scatterlist *sg,
+		unsigned int nents, int prot)
+{
+	struct io_mm *io_mm = get_io_mm(pasid);
+	struct iommu_domain *domain;
+
+	if (!io_mm || io_mm->type != IO_TYPE_PRIVATE)
+		return -ENODEV;
+
+	domain = io_mm->domain;
+
+	return domain->ops->map_sg(domain, &pasid, iova, sg, nents, prot);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_map_sg);
+
+size_t iommu_sva_unmap(int pasid, unsigned long iova, size_t size)
+{
+	struct io_mm *io_mm = get_io_mm(pasid);
+
+	if (!io_mm || io_mm->type != IO_TYPE_PRIVATE)
+		return -ENODEV;
+
+	return __iommu_unmap(io_mm->domain, &pasid, iova, size, false);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_unmap);
+
+void iommu_sva_free_pasid(int pasid, struct device *dev)
+{
+	struct io_mm *io_mm = get_io_mm(pasid);
+	struct iommu_domain *domain;
+
+	if (!io_mm || io_mm->type != IO_TYPE_PRIVATE)
+		return;
+
+	domain = io_mm->domain;
+
+	domain->ops->mm_detach(domain, dev, io_mm, false);
+
+	spin_lock(&iommu_sva_lock);
+	idr_remove(&iommu_pasid_idr, io_mm->pasid);
+	spin_unlock(&iommu_sva_lock);
+
+	if (domain->ops->mm_free)
+		domain->ops->mm_free(io_mm);
+	else
+		kfree(io_mm);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_free_pasid);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 13f705df0725..0ba3d27f2300 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1792,7 +1792,7 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
 	return pgsize;
 }
 
-int iommu_map(struct iommu_domain *domain, unsigned long iova,
+int __iommu_map(struct iommu_domain *domain, int *pasid, unsigned long iova,
 	      phys_addr_t paddr, size_t size, int prot)
 {
 	unsigned long orig_iova = iova;
@@ -1801,10 +1801,17 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
 	phys_addr_t orig_paddr = paddr;
 	int ret = 0;
 
-	if (unlikely(domain->ops->map == NULL ||
-		     domain->pgsize_bitmap == 0UL))
+	if (unlikely(domain->pgsize_bitmap == 0UL))
 		return -ENODEV;
 
+	if (pasid) {
+		if (unlikely(domain->ops->sva_map == NULL))
+			return -ENODEV;
+	} else {
+		if (unlikely(domain->ops->map == NULL))
+			return -ENODEV;
+	}
+
 	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
 		return -EINVAL;
 
@@ -1830,7 +1837,13 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
 		pr_debug("mapping: iova 0x%lx pa %pa pgsize 0x%zx\n",
 			 iova, &paddr, pgsize);
 
-		ret = domain->ops->map(domain, iova, paddr, pgsize, prot);
+		if (pasid)
+			ret = domain->ops->sva_map(domain, *pasid, iova, paddr,
+				pgsize, prot);
+		else
+			ret = domain->ops->map(domain, iova, paddr, pgsize,
+				prot);
+
 		if (ret)
 			break;
 
@@ -1841,16 +1854,23 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
 
 	/* unroll mapping in case something went wrong */
 	if (ret)
-		iommu_unmap(domain, orig_iova, orig_size - size);
+		__iommu_unmap(domain, pasid, orig_iova, orig_size - size,
+			pasid ? false : true);
 	else
 		trace_map(orig_iova, orig_paddr, orig_size);
 
 	return ret;
 }
+
+int iommu_map(struct iommu_domain *domain, unsigned long iova,
+	      phys_addr_t paddr, size_t size, int prot)
+{
+	return __iommu_map(domain, NULL, iova, paddr, size, prot);
+}
 EXPORT_SYMBOL_GPL(iommu_map);
 
-static size_t __iommu_unmap(struct iommu_domain *domain,
-			    unsigned long iova, size_t size,
+size_t __iommu_unmap(struct iommu_domain *domain,
+			    int *pasid, unsigned long iova, size_t size,
 			    bool sync)
 {
 	const struct iommu_ops *ops = domain->ops;
@@ -1858,9 +1878,16 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
 	unsigned long orig_iova = iova;
 	unsigned int min_pagesz;
 
-	if (unlikely(ops->unmap == NULL ||
-		     domain->pgsize_bitmap == 0UL))
-		return 0;
+	if (unlikely(domain->pgsize_bitmap == 0UL))
+		return 0;
+
+	if (pasid) {
+		if (unlikely(domain->ops->sva_unmap == NULL))
+			return 0;
+	} else {
+		if (unlikely(domain->ops->unmap == NULL))
+			return 0;
+	}
 
 	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
 		return 0;
@@ -1888,7 +1915,12 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
 	while (unmapped < size) {
 		size_t pgsize = iommu_pgsize(domain, iova, size - unmapped);
 
-		unmapped_page = ops->unmap(domain, iova, pgsize);
+		if (pasid)
+			unmapped_page = ops->sva_unmap(domain, *pasid, iova,
+				pgsize);
+		else
+			unmapped_page = ops->unmap(domain, iova, pgsize);
+
 		if (!unmapped_page)
 			break;
 
@@ -1912,19 +1944,20 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
 size_t iommu_unmap(struct iommu_domain *domain,
 		   unsigned long iova, size_t size)
 {
-	return __iommu_unmap(domain, iova, size, true);
+	return __iommu_unmap(domain, NULL, iova, size, true);
 }
 EXPORT_SYMBOL_GPL(iommu_unmap);
 
 size_t iommu_unmap_fast(struct iommu_domain *domain,
 			unsigned long iova, size_t size)
 {
-	return __iommu_unmap(domain, iova, size, false);
+	return __iommu_unmap(domain, NULL, iova, size, false);
 }
 EXPORT_SYMBOL_GPL(iommu_unmap_fast);
 
-size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
-			 struct scatterlist *sg, unsigned int nents, int prot)
+size_t default_iommu_map_sg(struct iommu_domain *domain, int *pasid,
+			 unsigned long iova, struct scatterlist *sg,
+			 unsigned int nents, int prot)
 {
 	struct scatterlist *s;
 	size_t mapped = 0;
@@ -1948,7 +1981,8 @@ size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 		if (!IS_ALIGNED(s->offset, min_pagesz))
 			goto out_err;
 
-		ret = iommu_map(domain, iova + mapped, phys, s->length, prot);
+		ret = __iommu_map(domain, pasid, iova + mapped, phys, s->length,
+			prot);
 		if (ret)
 			goto out_err;
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 366254e4b07f..3d72d636c13d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -108,7 +108,13 @@ struct iommu_domain {
 	struct list_head mm_list;
 };
 
+enum iommu_io_type {
+	IO_TYPE_SHARED,
+	IO_TYPE_PRIVATE,
+};
+
 struct io_mm {
+	enum iommu_io_type	type;
 	int			pasid;
 	/* IOMMU_SVA_FEAT_* */
 	unsigned long		flags;
@@ -123,6 +129,9 @@ struct io_mm {
 	void (*release)(struct io_mm *io_mm);
 	/* For postponed release */
 	struct rcu_head		rcu;
+
+	/* This is used by private entries */
+	struct iommu_domain *domain;
 };
 
 enum iommu_cap {
@@ -315,8 +324,9 @@ struct iommu_ops {
 		   phys_addr_t paddr, size_t size, int prot);
 	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 		     size_t size);
-	size_t (*map_sg)(struct iommu_domain *domain, unsigned long iova,
-			 struct scatterlist *sg, unsigned int nents, int prot);
+	size_t (*map_sg)(struct iommu_domain *domain, int *pasid,
+			 unsigned long iova, struct scatterlist *sg,
+			 unsigned int nents, int prot);
 	void (*flush_iotlb_all)(struct iommu_domain *domain);
 	void (*iotlb_range_add)(struct iommu_domain *domain,
 				unsigned long iova, size_t size);
@@ -358,6 +368,12 @@ struct iommu_ops {
 		struct device *dev, struct tlb_invalidate_info *inv_info);
 	int (*page_response)(struct device *dev, struct page_response_msg *msg);
 
+	int (*sva_map)(struct iommu_domain *domain, int pasid,
+		       unsigned long iova, phys_addr_t paddr, size_t size,
+		       int prot);
+	size_t (*sva_unmap)(struct iommu_domain *domain, int pasid,
+			    unsigned long iova, size_t size);
+
 	unsigned long pgsize_bitmap;
 };
 
@@ -548,9 +564,9 @@ extern size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova,
 			  size_t size);
 extern size_t iommu_unmap_fast(struct iommu_domain *domain,
 			       unsigned long iova, size_t size);
-extern size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
-				struct scatterlist *sg,unsigned int nents,
-				int prot);
+extern size_t default_iommu_map_sg(struct iommu_domain *domain, int *pasid,
+				   unsigned long iova, struct scatterlist *sg,
+				   unsigned int nents, int prot);
 extern phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova);
 extern void iommu_set_fault_handler(struct iommu_domain *domain,
 			iommu_fault_handler_t handler, void *token);
@@ -636,7 +652,7 @@ static inline size_t iommu_map_sg(struct iommu_domain *domain,
 				  unsigned long iova, struct scatterlist *sg,
 				  unsigned int nents, int prot)
 {
-	return domain->ops->map_sg(domain, iova, sg, nents, prot);
+	return domain->ops->map_sg(domain, NULL, iova, sg, nents, prot);
 }
 
 /* PCI device grouping function */
@@ -676,6 +692,14 @@ extern int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm,
 				int *pasid, unsigned long flags, void *drvdata);
 extern int iommu_sva_unbind_device(struct device *dev, int pasid);
 
+/* Common map and unmap functions */
+extern int __iommu_map(struct iommu_domain *domain, int *pasid,
+		unsigned long iova, phys_addr_t paddr, size_t size, int prot);
+
+extern size_t __iommu_unmap(struct iommu_domain *domain,
+			    int *pasid, unsigned long iova, size_t size,
+			    bool sync);
+
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -1027,6 +1051,16 @@ extern int __iommu_sva_unbind_device(struct device *dev, int pasid);
 extern void __iommu_sva_unbind_dev_all(struct device *dev);
 
 extern struct mm_struct *iommu_sva_find(int pasid);
+
+extern int iommu_sva_alloc_pasid(struct iommu_domain *domain,
+		struct device *dev);
+extern int iommu_sva_map(int pasid, unsigned long iova, phys_addr_t physaddr,
+		size_t size, int prot);
+extern size_t iommu_sva_map_sg(int pasid, unsigned long iova,
+		struct scatterlist *sg, unsigned int nents, int prot);
+extern size_t iommu_sva_unmap(int pasid, unsigned long iova, size_t size);
+extern void iommu_sva_free_pasid(int pasid, struct device *dev);
+
 #else /* CONFIG_IOMMU_SVA */
 static inline int iommu_sva_device_init(struct device *dev,
 					unsigned long features,
@@ -1061,6 +1095,34 @@ static inline struct mm_struct *iommu_sva_find(int pasid)
 {
 	return NULL;
 }
+
+static inline int iommu_sva_alloc_pasid(struct iommu_domain *domain,
+		struct device *dev)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int iommu_sva_map(int pasid, unsigned long iova,
+		phys_addr_t physaddr, size_t size, int prot)
+{
+	return -ENODEV;
+}
+
+static inline size_t iommu_sva_map_sg(int pasid, unsigned long iova,
+		struct scatterlist *sg, unsigned int nents, int prot)
+{
+	return 0;
+}
+
+static inline size_t iommu_sva_unmap(int pasid, unsigned long iova, size_t size)
+{
+	return size;
+}
+
+static inline void iommu_sva_free_pasid(int pasid, struct device *dev) { }
+
 #endif /* CONFIG_IOMMU_SVA */
 
 #ifdef CONFIG_IOMMU_PAGE_FAULT
-- 
2.17.0


* [PATCH 04/16] iommu: sva: Add support for private PASIDs
@ 2018-05-18 21:34     ` Jordan Crouse
  0 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: linux-arm-kernel

Some older SMMU implementations that do not have fully featured
hardware PASID support have alternate workarounds for using multiple
pagetables. For example, MSM GPUs have logic to switch the user
pagetable from the GPU itself by writing the context bank registers
directly.

Support private PASIDs by creating a new io-pgtable instance, mapping
it to a PASID, and providing the APIs for drivers to populate it
manually.
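
For reference, a minimal usage sketch of the interface this patch adds
(not part of the patch itself; the device setup, addresses, sizes and
max_pasid value are hypothetical, and it assumes the device's domain
provides the mm_attach/mm_detach ops):

static int example_private_pasid(struct device *dev)
{
	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
	int pasid, ret;

	/* Set up the SVA parameters for the device (no I/O page faults) */
	ret = iommu_sva_device_init(dev, 0, 16, NULL);
	if (ret)
		return ret;

	/* Allocate a private PASID backed by a brand new pagetable */
	pasid = iommu_sva_alloc_pasid(domain, dev);
	if (pasid < 0)
		return pasid;

	/* Populate the private pagetable manually */
	ret = iommu_sva_map(pasid, 0x100000, 0x80000000, SZ_4K,
			IOMMU_READ | IOMMU_WRITE);
	if (ret) {
		iommu_sva_free_pasid(pasid, dev);
		return ret;
	}

	/* ... hardware uses the new pagetable ... */

	iommu_sva_unmap(pasid, 0x100000, SZ_4K);
	iommu_sva_free_pasid(pasid, dev);

	return 0;
}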

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/iommu/iommu-sva.c | 139 ++++++++++++++++++++++++++++++++++++--
 drivers/iommu/iommu.c     |  66 +++++++++++++-----
 include/linux/iommu.h     |  74 ++++++++++++++++++--
 3 files changed, 250 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index e98b994c15f1..26f0da9692d4 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -156,6 +156,7 @@ io_mm_alloc(struct iommu_domain *domain, struct device *dev,
 	mmgrab(mm);
 
 	io_mm->flags		= flags;
+	io_mm->type		= IO_TYPE_SHARED;
 	io_mm->mm		= mm;
 	io_mm->notifier.ops	= &iommu_mmu_notifier;
 	io_mm->release		= domain->ops->mm_free;
@@ -544,13 +545,10 @@ int iommu_sva_device_init(struct device *dev, unsigned long features,
 			  unsigned int max_pasid,
 			  iommu_mm_exit_handler_t mm_exit)
 {
-	int ret;
+	int ret = 0;
 	struct iommu_sva_param *param;
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
 
-	if (!domain || !domain->ops->sva_device_init)
-		return -ENODEV;
-
 	if (features & ~IOMMU_SVA_FEAT_IOPF)
 		return -EINVAL;
 
@@ -576,9 +574,12 @@ int iommu_sva_device_init(struct device *dev, unsigned long features,
 	 * IOMMU driver updates the limits depending on the IOMMU and device
 	 * capabilities.
 	 */
-	ret = domain->ops->sva_device_init(dev, param);
-	if (ret)
-		goto err_free_param;
+
+	if (domain && domain->ops->sva_device_init) {
+		ret = domain->ops->sva_device_init(dev, param);
+		if (ret)
+			goto err_free_param;
+	}
 
 	mutex_lock(&dev->iommu_param->lock);
 	if (dev->iommu_param->sva_param)
@@ -790,3 +791,127 @@ struct mm_struct *iommu_sva_find(int pasid)
 	return mm;
 }
 EXPORT_SYMBOL_GPL(iommu_sva_find);
+
+int iommu_sva_alloc_pasid(struct iommu_domain *domain, struct device *dev)
+{
+	int ret, pasid;
+	struct io_mm *io_mm;
+	struct iommu_sva_param *param = dev->iommu_param->sva_param;
+
+	if (!domain->ops->mm_attach || !domain->ops->mm_detach)
+		return -ENODEV;
+
+	if (domain->ops->mm_alloc)
+		io_mm = domain->ops->mm_alloc(domain, NULL, 0);
+	else
+		io_mm = kzalloc(sizeof(*io_mm), GFP_KERNEL);
+
+	if (IS_ERR(io_mm))
+		return PTR_ERR(io_mm);
+	if (!io_mm)
+		return -ENOMEM;
+
+	io_mm->domain = domain;
+	io_mm->type = IO_TYPE_PRIVATE;
+
+	idr_preload(GFP_KERNEL);
+	spin_lock(&iommu_sva_lock);
+	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, param->min_pasid,
+		param->max_pasid + 1, GFP_ATOMIC);
+	io_mm->pasid = pasid;
+	spin_unlock(&iommu_sva_lock);
+	idr_preload_end();
+
+	if (pasid < 0) {
+		if (domain->ops->mm_free)
+			domain->ops->mm_free(io_mm);
+		else
+			kfree(io_mm);
+		return pasid;
+	}
+
+	ret = domain->ops->mm_attach(domain, dev, io_mm, false);
+	if (!ret)
+		return pasid;
+
+	spin_lock(&iommu_sva_lock);
+	idr_remove(&iommu_pasid_idr, io_mm->pasid);
+	spin_unlock(&iommu_sva_lock);
+
+	if (domain->ops->mm_free)
+		domain->ops->mm_free(io_mm);
+	else
+		kfree(io_mm);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_sva_alloc_pasid);
+
+static struct io_mm *get_io_mm(int pasid)
+{
+	struct io_mm *io_mm;
+
+	spin_lock(&iommu_sva_lock);
+	io_mm = idr_find(&iommu_pasid_idr, pasid);
+	spin_unlock(&iommu_sva_lock);
+
+	return io_mm;
+}
+
+int iommu_sva_map(int pasid, unsigned long iova,
+	      phys_addr_t paddr, size_t size, int prot)
+{
+	struct io_mm *io_mm = get_io_mm(pasid);
+
+	if (!io_mm || io_mm->type != IO_TYPE_PRIVATE)
+		return -ENODEV;
+
+	return __iommu_map(io_mm->domain, &pasid, iova, paddr, size, prot);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_map);
+
+size_t iommu_sva_map_sg(int pasid, unsigned long iova, struct scatterlist *sg,
+		unsigned int nents, int prot)
+{
+	struct io_mm *io_mm = get_io_mm(pasid);
+	struct iommu_domain *domain;
+
+	if (!io_mm || io_mm->type != IO_TYPE_PRIVATE)
+		return 0;
+
+	domain = io_mm->domain;
+
+	return domain->ops->map_sg(domain, &pasid, iova, sg, nents, prot);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_map_sg);
+
+size_t iommu_sva_unmap(int pasid, unsigned long iova, size_t size)
+{
+	struct io_mm *io_mm = get_io_mm(pasid);
+
+	if (!io_mm || io_mm->type != IO_TYPE_PRIVATE)
+		return 0;
+
+	return __iommu_unmap(io_mm->domain, &pasid, iova, size, false);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_unmap);
+
+void iommu_sva_free_pasid(int pasid, struct device *dev)
+{
+	struct io_mm *io_mm = get_io_mm(pasid);
+	struct iommu_domain *domain;
+
+	if (!io_mm || io_mm->type != IO_TYPE_PRIVATE)
+		return;
+
+	domain = io_mm->domain;
+
+	domain->ops->mm_detach(domain, dev, io_mm, false);
+
+	spin_lock(&iommu_sva_lock);
+	idr_remove(&iommu_pasid_idr, io_mm->pasid);
+	spin_unlock(&iommu_sva_lock);
+
+	if (domain->ops->mm_free)
+		domain->ops->mm_free(io_mm);
+	else
+		kfree(io_mm);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_free_pasid);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 13f705df0725..0ba3d27f2300 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1792,7 +1792,7 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
 	return pgsize;
 }
 
-int iommu_map(struct iommu_domain *domain, unsigned long iova,
+int __iommu_map(struct iommu_domain *domain, int *pasid, unsigned long iova,
 	      phys_addr_t paddr, size_t size, int prot)
 {
 	unsigned long orig_iova = iova;
@@ -1801,10 +1801,17 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
 	phys_addr_t orig_paddr = paddr;
 	int ret = 0;
 
-	if (unlikely(domain->ops->map == NULL ||
-		     domain->pgsize_bitmap == 0UL))
+	if (unlikely(domain->pgsize_bitmap == 0UL))
 		return -ENODEV;
 
+	if (pasid) {
+		if (unlikely(domain->ops->sva_map == NULL))
+			return -ENODEV;
+	} else {
+		if (unlikely(domain->ops->map == NULL))
+			return -ENODEV;
+	}
+
 	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
 		return -EINVAL;
 
@@ -1830,7 +1837,13 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
 		pr_debug("mapping: iova 0x%lx pa %pa pgsize 0x%zx\n",
 			 iova, &paddr, pgsize);
 
-		ret = domain->ops->map(domain, iova, paddr, pgsize, prot);
+		if (pasid)
+			ret = domain->ops->sva_map(domain, *pasid, iova, paddr,
+				pgsize, prot);
+		else
+			ret = domain->ops->map(domain, iova, paddr, pgsize,
+				prot);
+
 		if (ret)
 			break;
 
@@ -1841,16 +1854,23 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
 
 	/* unroll mapping in case something went wrong */
 	if (ret)
-		iommu_unmap(domain, orig_iova, orig_size - size);
+		__iommu_unmap(domain, pasid, orig_iova, orig_size - size,
+			pasid ? false : true);
 	else
 		trace_map(orig_iova, orig_paddr, orig_size);
 
 	return ret;
 }
+
+int iommu_map(struct iommu_domain *domain, unsigned long iova,
+	      phys_addr_t paddr, size_t size, int prot)
+{
+	return __iommu_map(domain, NULL, iova, paddr, size, prot);
+}
 EXPORT_SYMBOL_GPL(iommu_map);
 
-static size_t __iommu_unmap(struct iommu_domain *domain,
-			    unsigned long iova, size_t size,
+size_t __iommu_unmap(struct iommu_domain *domain,
+			    int *pasid, unsigned long iova, size_t size,
 			    bool sync)
 {
 	const struct iommu_ops *ops = domain->ops;
@@ -1858,9 +1878,16 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
 	unsigned long orig_iova = iova;
 	unsigned int min_pagesz;
 
-	if (unlikely(ops->unmap == NULL ||
-		     domain->pgsize_bitmap == 0UL))
-		return 0;
+	if (unlikely(domain->pgsize_bitmap == 0UL))
+		return 0;
+
+	if (pasid) {
+		if (unlikely(domain->ops->sva_unmap == NULL))
+			return 0;
+	} else {
+		if (unlikely(domain->ops->unmap == NULL))
+			return 0;
+	}
 
 	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
 		return 0;
@@ -1888,7 +1915,12 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
 	while (unmapped < size) {
 		size_t pgsize = iommu_pgsize(domain, iova, size - unmapped);
 
-		unmapped_page = ops->unmap(domain, iova, pgsize);
+		if (pasid)
+			unmapped_page = ops->sva_unmap(domain, *pasid, iova,
+				pgsize);
+		else
+			unmapped_page = ops->unmap(domain, iova, pgsize);
+
 		if (!unmapped_page)
 			break;
 
@@ -1912,19 +1944,20 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
 size_t iommu_unmap(struct iommu_domain *domain,
 		   unsigned long iova, size_t size)
 {
-	return __iommu_unmap(domain, iova, size, true);
+	return __iommu_unmap(domain, NULL, iova, size, true);
 }
 EXPORT_SYMBOL_GPL(iommu_unmap);
 
 size_t iommu_unmap_fast(struct iommu_domain *domain,
 			unsigned long iova, size_t size)
 {
-	return __iommu_unmap(domain, iova, size, false);
+	return __iommu_unmap(domain, NULL, iova, size, false);
 }
 EXPORT_SYMBOL_GPL(iommu_unmap_fast);
 
-size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
-			 struct scatterlist *sg, unsigned int nents, int prot)
+size_t default_iommu_map_sg(struct iommu_domain *domain, int *pasid,
+			 unsigned long iova, struct scatterlist *sg,
+			 unsigned int nents, int prot)
 {
 	struct scatterlist *s;
 	size_t mapped = 0;
@@ -1948,7 +1981,8 @@ size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 		if (!IS_ALIGNED(s->offset, min_pagesz))
 			goto out_err;
 
-		ret = iommu_map(domain, iova + mapped, phys, s->length, prot);
+		ret = __iommu_map(domain, pasid, iova + mapped, phys, s->length,
+			prot);
 		if (ret)
 			goto out_err;
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 366254e4b07f..3d72d636c13d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -108,7 +108,13 @@ struct iommu_domain {
 	struct list_head mm_list;
 };
 
+enum iommu_io_type {
+	IO_TYPE_SHARED,
+	IO_TYPE_PRIVATE,
+};
+
 struct io_mm {
+	enum iommu_io_type	type;
 	int			pasid;
 	/* IOMMU_SVA_FEAT_* */
 	unsigned long		flags;
@@ -123,6 +129,9 @@ struct io_mm {
 	void (*release)(struct io_mm *io_mm);
 	/* For postponed release */
 	struct rcu_head		rcu;
+
+	/* This is used by private entries */
+	struct iommu_domain *domain;
 };
 
 enum iommu_cap {
@@ -315,8 +324,9 @@ struct iommu_ops {
 		   phys_addr_t paddr, size_t size, int prot);
 	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 		     size_t size);
-	size_t (*map_sg)(struct iommu_domain *domain, unsigned long iova,
-			 struct scatterlist *sg, unsigned int nents, int prot);
+	size_t (*map_sg)(struct iommu_domain *domain, int *pasid,
+			 unsigned long iova, struct scatterlist *sg,
+			 unsigned int nents, int prot);
 	void (*flush_iotlb_all)(struct iommu_domain *domain);
 	void (*iotlb_range_add)(struct iommu_domain *domain,
 				unsigned long iova, size_t size);
@@ -358,6 +368,12 @@ struct iommu_ops {
 		struct device *dev, struct tlb_invalidate_info *inv_info);
 	int (*page_response)(struct device *dev, struct page_response_msg *msg);
 
+	int (*sva_map)(struct iommu_domain *domain, int pasid,
+		       unsigned long iova, phys_addr_t paddr, size_t size,
+		       int prot);
+	size_t (*sva_unmap)(struct iommu_domain *domain, int pasid,
+			    unsigned long iova, size_t size);
+
 	unsigned long pgsize_bitmap;
 };
 
@@ -548,9 +564,9 @@ extern size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova,
 			  size_t size);
 extern size_t iommu_unmap_fast(struct iommu_domain *domain,
 			       unsigned long iova, size_t size);
-extern size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
-				struct scatterlist *sg,unsigned int nents,
-				int prot);
+extern size_t default_iommu_map_sg(struct iommu_domain *domain, int *pasid,
+				   unsigned long iova, struct scatterlist *sg,
+				   unsigned int nents, int prot);
 extern phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova);
 extern void iommu_set_fault_handler(struct iommu_domain *domain,
 			iommu_fault_handler_t handler, void *token);
@@ -636,7 +652,7 @@ static inline size_t iommu_map_sg(struct iommu_domain *domain,
 				  unsigned long iova, struct scatterlist *sg,
 				  unsigned int nents, int prot)
 {
-	return domain->ops->map_sg(domain, iova, sg, nents, prot);
+	return domain->ops->map_sg(domain, NULL, iova, sg, nents, prot);
 }
 
 /* PCI device grouping function */
@@ -676,6 +692,14 @@ extern int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm,
 				int *pasid, unsigned long flags, void *drvdata);
 extern int iommu_sva_unbind_device(struct device *dev, int pasid);
 
+/* Common map and unmap functions */
+extern int __iommu_map(struct iommu_domain *domain, int *pasid,
+		unsigned long iova, phys_addr_t paddr, size_t size, int prot);
+
+extern size_t __iommu_unmap(struct iommu_domain *domain,
+			    int *pasid, unsigned long iova, size_t size,
+			    bool sync);
+
 #else /* CONFIG_IOMMU_API */
 
 struct iommu_ops {};
@@ -1027,6 +1051,16 @@ extern int __iommu_sva_unbind_device(struct device *dev, int pasid);
 extern void __iommu_sva_unbind_dev_all(struct device *dev);
 
 extern struct mm_struct *iommu_sva_find(int pasid);
+
+extern int iommu_sva_alloc_pasid(struct iommu_domain *domain,
+		struct device *dev);
+extern int iommu_sva_map(int pasid, unsigned long iova, phys_addr_t physaddr,
+		size_t size, int prot);
+extern size_t iommu_sva_map_sg(int pasid, unsigned long iova,
+		struct scatterlist *sg, unsigned int nents, int prot);
+extern size_t iommu_sva_unmap(int pasid, unsigned long iova, size_t size);
+extern void iommu_sva_free_pasid(int pasid, struct device *dev);
+
 #else /* CONFIG_IOMMU_SVA */
 static inline int iommu_sva_device_init(struct device *dev,
 					unsigned long features,
@@ -1061,6 +1095,34 @@ static inline struct mm_struct *iommu_sva_find(int pasid)
 {
 	return NULL;
 }
+
+static inline int iommu_sva_alloc_pasid(struct iommu_domain *domain,
+		struct device *dev)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int iommu_sva_map(int pasid, unsigned long iova,
+		phys_addr_t physaddr, size_t size, int prot)
+{
+	return -ENODEV;
+}
+
+static inline size_t iommu_sva_map_sg(int pasid, unsigned long iova,
+		struct scatterlist *sg, unsigned int nents, int prot)
+{
+	return 0;
+}
+
+static inline size_t iommu_sva_unmap(int pasid, unsigned long iova, size_t size)
+{
+	return size;
+}
+
+static inline void iommu_sva_free_pasid(int pasid, struct device *dev) { }
+
 #endif /* CONFIG_IOMMU_SVA */
 
 #ifdef CONFIG_IOMMU_PAGE_FAULT
-- 
2.17.0


* [PATCH 05/16] iommu: arm-smmu: Add support for private PASIDs
  2018-05-18 21:34 ` Jordan Crouse
@ 2018-05-18 21:34     ` Jordan Crouse
  -1 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Add support for allocating and populating pagetables
indexed by private PASIDs. Each new PASID is allocated a pagetable
with the same parameters and format as the parent domain.
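
Worth noting, based on the .tlb = NULL configuration in the mm_attach
hunk below: no TLB ops are registered for these private pagetables, so
unmapping through a private PASID performs no TLB maintenance and
invalidating stale entries is left to the client. A rough sketch of
the expected pattern; gpu_invalidate_tlb() is a hypothetical
client-side hook, not something this series provides:

static void example_unmap_buffer(int pasid, unsigned long iova, size_t size)
{
	/* Tears down the mappings but does not touch the TLB */
	iommu_sva_unmap(pasid, iova, size);

	/* Flush the now-stale translations from the client side */
	gpu_invalidate_tlb(pasid, iova, size);
}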

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/iommu/arm-smmu.c   | 154 +++++++++++++++++++++++++++++++++++--
 drivers/iommu/io-pgtable.h |  10 ++-
 2 files changed, 155 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index d459909877c3..5c7c135bbb44 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -201,7 +201,6 @@ struct arm_smmu_device {
 	unsigned long			va_size;
 	unsigned long			ipa_size;
 	unsigned long			pa_size;
-	unsigned long			ubs_size;
 	unsigned long			pgsize_bitmap;
 
 	u32				num_global_irqs;
@@ -252,6 +251,9 @@ struct arm_smmu_domain {
 	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
 	u32 attributes;
 	struct iommu_domain		domain;
+
+	spinlock_t			pasid_lock;
+	struct list_head		pasid_list;
 };
 
 struct arm_smmu_option_prop {
@@ -259,6 +261,144 @@ struct arm_smmu_option_prop {
 	const char *prop;
 };
 
+static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct arm_smmu_domain, domain);
+}
+
+struct arm_smmu_pasid {
+	struct iommu_domain *domain;
+	struct io_pgtable_ops		*pgtbl_ops;
+	struct list_head node;
+	int pasid;
+};
+
+static struct arm_smmu_pasid *
+arm_smmu_get_pasid(struct arm_smmu_domain *smmu_domain, int pasid)
+{
+	struct arm_smmu_pasid *node, *obj = NULL;
+
+	spin_lock(&smmu_domain->pasid_lock);
+	list_for_each_entry(node, &smmu_domain->pasid_list, node) {
+		if (node->pasid == pasid) {
+			obj = node;
+			break;
+		}
+	}
+	spin_unlock(&smmu_domain->pasid_lock);
+
+	return obj;
+}
+
+static void arm_smmu_mm_detach(struct iommu_domain *domain, struct device *dev,
+		struct io_mm *io_mm, bool unused)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_pasid *node, *obj = NULL;
+
+	spin_lock(&smmu_domain->pasid_lock);
+	list_for_each_entry(node, &smmu_domain->pasid_list, node) {
+		if (node->pasid == io_mm->pasid) {
+			obj = node;
+			list_del(&obj->node);
+			break;
+		}
+	}
+	spin_unlock(&smmu_domain->pasid_lock);
+
+	if (obj)
+		free_io_pgtable_ops(obj->pgtbl_ops);
+
+	kfree(obj);
+}
+
+static size_t arm_smmu_sva_unmap(struct iommu_domain *domain, int pasid,
+		unsigned long iova, size_t size)
+{
+	struct arm_smmu_pasid *obj =
+		arm_smmu_get_pasid(to_smmu_domain(domain), pasid);
+
+	if (!obj)
+		return 0;
+
+	return obj->pgtbl_ops->unmap(obj->pgtbl_ops, iova, size);
+}
+
+static int arm_smmu_sva_map(struct iommu_domain *domain, int pasid,
+		unsigned long iova, phys_addr_t paddr, size_t size, int prot)
+{
+	struct arm_smmu_pasid *obj =
+		arm_smmu_get_pasid(to_smmu_domain(domain), pasid);
+
+	if (!obj)
+		return -ENODEV;
+
+	return obj->pgtbl_ops->map(obj->pgtbl_ops, iova, paddr, size, prot);
+}
+
+static int arm_smmu_mm_attach(struct iommu_domain *domain, struct device *dev,
+		struct io_mm *io_mm, bool unused)
+{
+	struct arm_smmu_pasid *obj;
+	struct io_pgtable_cfg pgtbl_cfg;
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	enum io_pgtable_fmt fmt;
+	unsigned long ias, oas;
+
+	/* Only allow private pasids */
+	if (io_mm->type != IO_TYPE_PRIVATE || io_mm->mm)
+		return -ENODEV;
+
+	/* Only allow pasid backed tables to be created on S1 domains */
+	if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
+		return -ENODEV;
+
+	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+	if (!obj)
+		return -ENOMEM;
+
+	/* Get the same exact format as the parent domain */
+	ias = smmu->va_size;
+	oas = smmu->ipa_size;
+
+	if (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64)
+		fmt = ARM_64_LPAE_S1;
+	else if (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH32_L) {
+		fmt = ARM_32_LPAE_S1;
+		ias = min(ias, 32UL);
+		oas = min(oas, 40UL);
+	} else {
+		fmt = ARM_V7S;
+		ias = min(ias, 32UL);
+		oas = min(oas, 32UL);
+	}
+
+	pgtbl_cfg = (struct io_pgtable_cfg) {
+		.pgsize_bitmap = smmu->pgsize_bitmap,
+		.ias = ias,
+		.oas = oas,
+		.tlb = NULL,
+		.iommu_dev = smmu->dev
+	};
+
+	obj->pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
+	if (!obj->pgtbl_ops) {
+		kfree(obj);
+		return -ENOMEM;
+	}
+
+	obj->domain = domain;
+	obj->pasid = io_mm->pasid;
+
+	spin_lock(&smmu_domain->pasid_lock);
+	list_add_tail(&obj->node, &smmu_domain->pasid_list);
+	spin_unlock(&smmu_domain->pasid_lock);
+
+	return 0;
+}
+
 static atomic_t cavium_smmu_context_count = ATOMIC_INIT(0);
 
 static bool using_legacy_binding, using_generic_binding;
@@ -268,11 +408,6 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
 	{ 0, NULL},
 };
 
-static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
-{
-	return container_of(dom, struct arm_smmu_domain, domain);
-}
-
 static void parse_driver_options(struct arm_smmu_device *smmu)
 {
 	int i = 0;
@@ -1055,6 +1190,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 	mutex_init(&smmu_domain->init_mutex);
 	spin_lock_init(&smmu_domain->cb_lock);
 
+	spin_lock_init(&smmu_domain->pasid_lock);
+	INIT_LIST_HEAD(&smmu_domain->pasid_list);
+
 	return &smmu_domain->domain;
 }
 
@@ -1694,6 +1832,10 @@ static struct iommu_ops arm_smmu_ops = {
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
 	.put_resv_regions	= arm_smmu_put_resv_regions,
+	.mm_attach		= arm_smmu_mm_attach,
+	.sva_map		= arm_smmu_sva_map,
+	.sva_unmap		= arm_smmu_sva_unmap,
+	.mm_detach		= arm_smmu_mm_detach,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 };
 
diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
index fd9f0fc4eb60..69fcee763446 100644
--- a/drivers/iommu/io-pgtable.h
+++ b/drivers/iommu/io-pgtable.h
@@ -173,18 +173,22 @@ struct io_pgtable {
 
 static inline void io_pgtable_tlb_flush_all(struct io_pgtable *iop)
 {
-	iop->cfg.tlb->tlb_flush_all(iop->cookie);
+	if (iop->cfg.tlb)
+		iop->cfg.tlb->tlb_flush_all(iop->cookie);
 }
 
 static inline void io_pgtable_tlb_add_flush(struct io_pgtable *iop,
 		unsigned long iova, size_t size, size_t granule, bool leaf)
 {
-	iop->cfg.tlb->tlb_add_flush(iova, size, granule, leaf, iop->cookie);
+	if (iop->cfg.tlb)
+		iop->cfg.tlb->tlb_add_flush(iova, size, granule, leaf,
+			iop->cookie);
 }
 
 static inline void io_pgtable_tlb_sync(struct io_pgtable *iop)
 {
-	iop->cfg.tlb->tlb_sync(iop->cookie);
+	if (iop->cfg.tlb)
+		iop->cfg.tlb->tlb_sync(iop->cookie);
 }
 
 /**
-- 
2.17.0


* [PATCH 05/16] iommu: arm-smmu: Add support for private PASIDs
@ 2018-05-18 21:34     ` Jordan Crouse
  0 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: linux-arm-kernel

Add support for allocating and populating pagetables
indexed by private PASIDs. Each new PASID is allocated a pagetable
with the same parameters and format as the parent domain.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/iommu/arm-smmu.c   | 154 +++++++++++++++++++++++++++++++++++--
 drivers/iommu/io-pgtable.h |  10 ++-
 2 files changed, 155 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index d459909877c3..5c7c135bbb44 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -201,7 +201,6 @@ struct arm_smmu_device {
 	unsigned long			va_size;
 	unsigned long			ipa_size;
 	unsigned long			pa_size;
-	unsigned long			ubs_size;
 	unsigned long			pgsize_bitmap;
 
 	u32				num_global_irqs;
@@ -252,6 +251,9 @@ struct arm_smmu_domain {
 	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
 	u32 attributes;
 	struct iommu_domain		domain;
+
+	spinlock_t			pasid_lock;
+	struct list_head		pasid_list;
 };
 
 struct arm_smmu_option_prop {
@@ -259,6 +261,144 @@ struct arm_smmu_option_prop {
 	const char *prop;
 };
 
+static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct arm_smmu_domain, domain);
+}
+
+struct arm_smmu_pasid {
+	struct iommu_domain *domain;
+	struct io_pgtable_ops		*pgtbl_ops;
+	struct list_head node;
+	int pasid;
+};
+
+static struct arm_smmu_pasid *
+arm_smmu_get_pasid(struct arm_smmu_domain *smmu_domain, int pasid)
+{
+	struct arm_smmu_pasid *node, *obj = NULL;
+
+	spin_lock(&smmu_domain->pasid_lock);
+	list_for_each_entry(node, &smmu_domain->pasid_list, node) {
+		if (node->pasid == pasid) {
+			obj = node;
+			break;
+		}
+	}
+	spin_unlock(&smmu_domain->pasid_lock);
+
+	return obj;
+}
+
+static void arm_smmu_mm_detach(struct iommu_domain *domain, struct device *dev,
+		struct io_mm *io_mm, bool unused)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_pasid *node, *obj = NULL;
+
+	spin_lock(&smmu_domain->pasid_lock);
+	list_for_each_entry(node, &smmu_domain->pasid_list, node) {
+		if (node->pasid == io_mm->pasid) {
+			obj = node;
+			list_del(&obj->node);
+			break;
+		}
+	}
+	spin_unlock(&smmu_domain->pasid_lock);
+
+	if (obj)
+		free_io_pgtable_ops(obj->pgtbl_ops);
+
+	kfree(obj);
+}
+
+static size_t arm_smmu_sva_unmap(struct iommu_domain *domain, int pasid,
+		unsigned long iova, size_t size)
+{
+	struct arm_smmu_pasid *obj =
+		arm_smmu_get_pasid(to_smmu_domain(domain), pasid);
+
+	if (!obj)
+		return 0;
+
+	return obj->pgtbl_ops->unmap(obj->pgtbl_ops, iova, size);
+}
+
+static int arm_smmu_sva_map(struct iommu_domain *domain, int pasid,
+		unsigned long iova, phys_addr_t paddr, size_t size, int prot)
+{
+	struct arm_smmu_pasid *obj =
+		arm_smmu_get_pasid(to_smmu_domain(domain), pasid);
+
+	if (!obj)
+		return -ENODEV;
+
+	return obj->pgtbl_ops->map(obj->pgtbl_ops, iova, paddr, size, prot);
+}
+
+static int arm_smmu_mm_attach(struct iommu_domain *domain, struct device *dev,
+		struct io_mm *io_mm, bool unused)
+{
+	struct arm_smmu_pasid *obj;
+	struct io_pgtable_cfg pgtbl_cfg;
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	enum io_pgtable_fmt fmt;
+	unsigned long ias, oas;
+
+	/* Only allow private pasids */
+	if (io_mm->type != IO_TYPE_PRIVATE || io_mm->mm)
+		return -ENODEV;
+
+	/* Only allow pasid backed tables to be created on S1 domains */
+	if (smmu_domain->stage != ARM_SMMU_DOMAIN_S1)
+		return -ENODEV;
+
+	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+	if (!obj)
+		return -ENOMEM;
+
+	/* Get the same exact format as the parent domain */
+	ias = smmu->va_size;
+	oas = smmu->ipa_size;
+
+	if (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH64)
+		fmt = ARM_64_LPAE_S1;
+	else if (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH32_L) {
+		fmt = ARM_32_LPAE_S1;
+		ias = min(ias, 32UL);
+		oas = min(oas, 40UL);
+	} else {
+		fmt = ARM_V7S;
+		ias = min(ias, 32UL);
+		oas = min(oas, 32UL);
+	}
+
+	pgtbl_cfg = (struct io_pgtable_cfg) {
+		.pgsize_bitmap = smmu->pgsize_bitmap,
+		.ias = ias,
+		.oas = oas,
+		.tlb = NULL,
+		.iommu_dev = smmu->dev
+	};
+
+	obj->pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
+	if (!obj->pgtbl_ops) {
+		kfree(obj);
+		return -ENOMEM;
+	}
+
+	obj->domain = domain;
+	obj->pasid = io_mm->pasid;
+
+	spin_lock(&smmu_domain->pasid_lock);
+	list_add_tail(&obj->node, &smmu_domain->pasid_list);
+	spin_unlock(&smmu_domain->pasid_lock);
+
+	return 0;
+}
+
 static atomic_t cavium_smmu_context_count = ATOMIC_INIT(0);
 
 static bool using_legacy_binding, using_generic_binding;
@@ -268,11 +408,6 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
 	{ 0, NULL},
 };
 
-static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
-{
-	return container_of(dom, struct arm_smmu_domain, domain);
-}
-
 static void parse_driver_options(struct arm_smmu_device *smmu)
 {
 	int i = 0;
@@ -1055,6 +1190,9 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 	mutex_init(&smmu_domain->init_mutex);
 	spin_lock_init(&smmu_domain->cb_lock);
 
+	spin_lock_init(&smmu_domain->pasid_lock);
+	INIT_LIST_HEAD(&smmu_domain->pasid_list);
+
 	return &smmu_domain->domain;
 }
 
@@ -1694,6 +1832,10 @@ static struct iommu_ops arm_smmu_ops = {
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
 	.put_resv_regions	= arm_smmu_put_resv_regions,
+	.mm_attach		= arm_smmu_mm_attach,
+	.sva_map		= arm_smmu_sva_map,
+	.sva_unmap		= arm_smmu_sva_unmap,
+	.mm_detach		= arm_smmu_mm_detach,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 };
 
diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
index fd9f0fc4eb60..69fcee763446 100644
--- a/drivers/iommu/io-pgtable.h
+++ b/drivers/iommu/io-pgtable.h
@@ -173,18 +173,22 @@ struct io_pgtable {
 
 static inline void io_pgtable_tlb_flush_all(struct io_pgtable *iop)
 {
-	iop->cfg.tlb->tlb_flush_all(iop->cookie);
+	if (iop->cfg.tlb)
+		iop->cfg.tlb->tlb_flush_all(iop->cookie);
 }
 
 static inline void io_pgtable_tlb_add_flush(struct io_pgtable *iop,
 		unsigned long iova, size_t size, size_t granule, bool leaf)
 {
-	iop->cfg.tlb->tlb_add_flush(iova, size, granule, leaf, iop->cookie);
+	if (iop->cfg.tlb)
+		iop->cfg.tlb->tlb_add_flush(iova, size, granule, leaf,
+			iop->cookie);
 }
 
 static inline void io_pgtable_tlb_sync(struct io_pgtable *iop)
 {
-	iop->cfg.tlb->tlb_sync(iop->cookie);
+	if (iop->cfg.tlb)
+		iop->cfg.tlb->tlb_sync(iop->cookie);
 }
 
 /**
-- 
2.17.0


* [PATCH 06/16] iommu: arm-smmu: Add side-band function for specific PASID callbacks
  2018-05-18 21:34 ` Jordan Crouse
@ 2018-05-18 21:34     ` Jordan Crouse
  -1 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Just allowing a client driver to create and manage a private PASID
isn't very useful if the client driver doesn't have enough information
about the pagetable to be able to use it. Add a side-band function for
arm-smmu that lets the client driver register PASID operations that
pass the relevant pagetable information to the client driver whenever
a new PASID is created or destroyed.
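
For illustration, a sketch of how a client driver might use this
side-band interface; struct my_gpu and the my_gpu_*() helpers are
hypothetical placeholders, not part of this series:

static int example_install_pasid(int pasid, u64 ttbr, u32 asid, void *data)
{
	struct my_gpu *gpu = data;

	/* Stash the TTBR0 value and ASID so the GPU can reprogram the
	 * context bank itself when it switches to this pagetable */
	return my_gpu_store_pasid(gpu, pasid, ttbr, asid);
}

static void example_remove_pasid(int pasid, void *data)
{
	struct my_gpu *gpu = data;

	my_gpu_drop_pasid(gpu, pasid);
}

static const struct arm_smmu_pasid_ops example_pasid_ops = {
	.install_pasid = example_install_pasid,
	.remove_pasid = example_remove_pasid,
};

/* Registered once the domain for the device is available, e.g.: */
arm_smmu_add_pasid_ops(iommu_get_domain_for_dev(dev), &example_pasid_ops, gpu);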

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/iommu/arm-smmu.c | 40 ++++++++++++++++++++++++++++++++++++++++
 include/linux/arm-smmu.h | 18 ++++++++++++++++++
 2 files changed, 58 insertions(+)
 create mode 100644 include/linux/arm-smmu.h

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 5c7c135bbb44..100797a07be0 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -50,6 +50,7 @@
 #include <linux/platform_device.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
+#include <linux/arm-smmu.h>
 
 #include <linux/amba/bus.h>
 
@@ -254,6 +255,8 @@ struct arm_smmu_domain {
 
 	spinlock_t			pasid_lock;
 	struct list_head		pasid_list;
+	const struct arm_smmu_pasid_ops	*pasid_ops;
+	void				*pasid_data;
 };
 
 struct arm_smmu_option_prop {
@@ -296,6 +299,10 @@ static void arm_smmu_mm_detach(struct iommu_domain *domain, struct device *dev,
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_pasid *node, *obj = NULL;
 
+	if (smmu_domain->pasid_ops && smmu_domain->pasid_ops->remove_pasid)
+		smmu_domain->pasid_ops->remove_pasid(io_mm->pasid,
+			smmu_domain->pasid_data);
+
 	spin_lock(&smmu_domain->pasid_lock);
 	list_for_each_entry(node, &smmu_domain->pasid_list, node) {
 		if (node->pasid == io_mm->pasid) {
@@ -392,6 +399,26 @@ static int arm_smmu_mm_attach(struct iommu_domain *domain, struct device *dev,
 	obj->domain = domain;
 	obj->pasid = io_mm->pasid;
 
+	if (smmu_domain->pasid_ops && smmu_domain->pasid_ops->install_pasid) {
+		int ret;
+		u64 ttbr;
+
+		if (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH32_S)
+			ttbr = pgtbl_cfg.arm_v7s_cfg.ttbr;
+		else
+			ttbr = pgtbl_cfg.arm_lpae_s1_cfg.ttbr;
+
+		ret = smmu_domain->pasid_ops->install_pasid(io_mm->pasid, ttbr,
+			smmu_domain->cfg.asid, smmu_domain->pasid_data);
+
+		if (ret) {
+			free_io_pgtable_ops(obj->pgtbl_ops);
+			kfree(obj);
+
+			return ret;
+		}
+	}
+
 	spin_lock(&smmu_domain->pasid_lock);
 	list_add_tail(&obj->node, &smmu_domain->pasid_list);
 	spin_unlock(&smmu_domain->pasid_lock);
@@ -2156,6 +2183,19 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	return 0;
 }
 
+void arm_smmu_add_pasid_ops(struct iommu_domain *domain,
+	const struct arm_smmu_pasid_ops *ops, void *data)
+{
+	struct arm_smmu_domain *smmu_domain;
+
+	if (domain) {
+		smmu_domain = to_smmu_domain(domain);
+		smmu_domain->pasid_ops = ops;
+		smmu_domain->pasid_data = data;
+	}
+}
+EXPORT_SYMBOL_GPL(arm_smmu_add_pasid_ops);
+
 struct arm_smmu_match_data {
 	enum arm_smmu_arch_version version;
 	enum arm_smmu_implementation model;
diff --git a/include/linux/arm-smmu.h b/include/linux/arm-smmu.h
new file mode 100644
index 000000000000..c14ca52231bf
--- /dev/null
+++ b/include/linux/arm-smmu.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved. */
+
+#ifndef ARM_SMMU_H_
+#define ARM_SMMU_H_
+
+struct iommu_domain;
+
+struct arm_smmu_pasid_ops {
+	int (*install_pasid)(int pasid, u64 ttbr, u32 asid, void *data);
+	void (*remove_pasid)(int pasid, void *data);
+};
+
+void arm_smmu_add_pasid_ops(struct iommu_domain *domain,
+	const struct arm_smmu_pasid_ops *ops, void *data);
+
+#endif
-- 
2.17.0


* [PATCH 06/16] iommu: arm-smmu: Add side-band function for specific PASID callbacks
@ 2018-05-18 21:34     ` Jordan Crouse
  0 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: linux-arm-kernel

Just allowing a client driver to create and manage a private PASID
isn't very useful if the client driver doesn't have enough information
about the pagetable to be able to use it. Add a side-band function for
arm-smmu that lets the client driver register PASID operations that
pass the relevant pagetable information to the client driver whenever
a new PASID is created or destroyed.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/iommu/arm-smmu.c | 40 ++++++++++++++++++++++++++++++++++++++++
 include/linux/arm-smmu.h | 18 ++++++++++++++++++
 2 files changed, 58 insertions(+)
 create mode 100644 include/linux/arm-smmu.h

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 5c7c135bbb44..100797a07be0 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -50,6 +50,7 @@
 #include <linux/platform_device.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
+#include <linux/arm-smmu.h>
 
 #include <linux/amba/bus.h>
 
@@ -254,6 +255,8 @@ struct arm_smmu_domain {
 
 	spinlock_t			pasid_lock;
 	struct list_head		pasid_list;
+	const struct arm_smmu_pasid_ops	*pasid_ops;
+	void				*pasid_data;
 };
 
 struct arm_smmu_option_prop {
@@ -296,6 +299,10 @@ static void arm_smmu_mm_detach(struct iommu_domain *domain, struct device *dev,
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_pasid *node, *obj = NULL;
 
+	if (smmu_domain->pasid_ops && smmu_domain->pasid_ops->remove_pasid)
+		smmu_domain->pasid_ops->remove_pasid(io_mm->pasid,
+			smmu_domain->pasid_data);
+
 	spin_lock(&smmu_domain->pasid_lock);
 	list_for_each_entry(node, &smmu_domain->pasid_list, node) {
 		if (node->pasid == io_mm->pasid) {
@@ -392,6 +399,26 @@ static int arm_smmu_mm_attach(struct iommu_domain *domain, struct device *dev,
 	obj->domain = domain;
 	obj->pasid = io_mm->pasid;
 
+	if (smmu_domain->pasid_ops && smmu_domain->pasid_ops->install_pasid) {
+		int ret;
+		u64 ttbr;
+
+		if (smmu_domain->cfg.fmt == ARM_SMMU_CTX_FMT_AARCH32_S)
+			ttbr = pgtbl_cfg.arm_v7s_cfg.ttbr;
+		else
+			ttbr = pgtbl_cfg.arm_lpae_s1_cfg.ttbr;
+
+		ret = smmu_domain->pasid_ops->install_pasid(io_mm->pasid, ttbr,
+			smmu_domain->cfg.asid, smmu_domain->pasid_data);
+
+		if (ret) {
+			free_io_pgtable_ops(obj->pgtbl_ops);
+			kfree(obj);
+
+			return ret;
+		}
+	}
+
 	spin_lock(&smmu_domain->pasid_lock);
 	list_add_tail(&obj->node, &smmu_domain->pasid_list);
 	spin_unlock(&smmu_domain->pasid_lock);
@@ -2156,6 +2183,19 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	return 0;
 }
 
+void arm_smmu_add_pasid_ops(struct iommu_domain *domain,
+	const struct arm_smmu_pasid_ops *ops, void *data)
+{
+	struct arm_smmu_domain *smmu_domain;
+
+	if (domain) {
+		smmu_domain = to_smmu_domain(domain);
+		smmu_domain->pasid_ops = ops;
+		smmu_domain->pasid_data = data;
+	}
+}
+EXPORT_SYMBOL_GPL(arm_smmu_add_pasid_ops);
+
 struct arm_smmu_match_data {
 	enum arm_smmu_arch_version version;
 	enum arm_smmu_implementation model;
diff --git a/include/linux/arm-smmu.h b/include/linux/arm-smmu.h
new file mode 100644
index 000000000000..c14ca52231bf
--- /dev/null
+++ b/include/linux/arm-smmu.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved. */
+
+#ifndef ARM_SMMU_H_
+#define ARM_SMMU_H_
+
+struct iommu_domain;
+
+struct arm_smmu_pasid_ops {
+	int (*install_pasid)(int pasid, u64 ttbr, u32 asid, void *data);
+	void (*remove_pasid)(int pasid, void *data);
+};
+
+void arm_smmu_add_pasid_ops(struct iommu_domain *domain,
+	const struct arm_smmu_pasid_ops *ops, void *data);
+
+#endif
-- 
2.17.0


* [PATCH 07/16] drm/msm/gpu: Enable 64 bit mode by default
  2018-05-18 21:34 ` Jordan Crouse
@ 2018-05-18 21:34     ` Jordan Crouse
  -1 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

A5XX GPUs can run in either 32 or 64 bit mode. The GPU registers
and the microcode use 64 bit virtual addressing in either case, but
the upper 32 bits are ignored if the GPU is in 32 bit mode. There is
no performance disadvantage to remaining in 64 bit mode even when we
are only generating 32 bit addresses, so switch over now to prepare
for using addresses above 4G on targets that support them.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 14 ++++++++++++++
 drivers/gpu/drm/msm/msm_iommu.c       |  2 +-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index d39400e5bc42..b2c0370072dd 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -741,6 +741,20 @@ static int a5xx_hw_init(struct msm_gpu *gpu)
 		REG_A5XX_RBBM_SECVID_TSB_TRUSTED_BASE_HI, 0x00000000);
 	gpu_write(gpu, REG_A5XX_RBBM_SECVID_TSB_TRUSTED_SIZE, 0x00000000);
 
+	/* Put the GPU into 64 bit by default */
+	gpu_write(gpu, REG_A5XX_CP_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_VSC_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_GRAS_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_RB_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_PC_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_HLSQ_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_VFD_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_VPC_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_UCHE_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_SP_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_TPL1_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_RBBM_SECVID_TSB_ADDR_MODE_CNTL, 0x1);
+
 	ret = adreno_hw_init(gpu);
 	if (ret)
 		return ret;
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index b23d33622f37..fdbe1a8372f0 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -30,7 +30,7 @@ static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
 	struct msm_iommu *iommu = arg;
 	if (iommu->base.handler)
 		return iommu->base.handler(iommu->base.arg, iova, flags);
-	pr_warn_ratelimited("*** fault: iova=%08lx, flags=%d\n", iova, flags);
+	pr_warn_ratelimited("*** fault: iova=%016lx, flags=%d\n", iova, flags);
 	return 0;
 }
 
-- 
2.17.0


* [PATCH 07/16] drm/msm/gpu: Enable 64 bit mode by default
@ 2018-05-18 21:34     ` Jordan Crouse
  0 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: linux-arm-kernel

A5XX GPUs can run in either 32 or 64 bit mode. The GPU registers
and the microcode use 64 bit virtual addressing in either case, but
the upper 32 bits are ignored if the GPU is in 32 bit mode. There is
no performance disadvantage to remaining in 64 bit mode even when we
are only generating 32 bit addresses, so switch over now to prepare
for using addresses above 4G on targets that support them.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 14 ++++++++++++++
 drivers/gpu/drm/msm/msm_iommu.c       |  2 +-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index d39400e5bc42..b2c0370072dd 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -741,6 +741,20 @@ static int a5xx_hw_init(struct msm_gpu *gpu)
 		REG_A5XX_RBBM_SECVID_TSB_TRUSTED_BASE_HI, 0x00000000);
 	gpu_write(gpu, REG_A5XX_RBBM_SECVID_TSB_TRUSTED_SIZE, 0x00000000);
 
+	/* Put the GPU into 64 bit by default */
+	gpu_write(gpu, REG_A5XX_CP_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_VSC_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_GRAS_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_RB_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_PC_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_HLSQ_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_VFD_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_VPC_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_UCHE_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_SP_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_TPL1_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_RBBM_SECVID_TSB_ADDR_MODE_CNTL, 0x1);
+
 	ret = adreno_hw_init(gpu);
 	if (ret)
 		return ret;
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index b23d33622f37..fdbe1a8372f0 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -30,7 +30,7 @@ static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
 	struct msm_iommu *iommu = arg;
 	if (iommu->base.handler)
 		return iommu->base.handler(iommu->base.arg, iova, flags);
-	pr_warn_ratelimited("*** fault: iova=%08lx, flags=%d\n", iova, flags);
+	pr_warn_ratelimited("*** fault: iova=%016lx, flags=%d\n", iova, flags);
 	return 0;
 }
 
-- 
2.17.0


* [PATCH 08/16] drm/msm: Pass the MMU domain index in struct msm_file_private
  2018-05-18 21:34 ` Jordan Crouse
@ 2018-05-18 21:34   ` Jordan Crouse
  -1 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, dri-devel, tfiga, iommu,
	vivek.gautam, linux-arm-kernel

Pass the index of the MMU domain in struct msm_file_private instead
of assuming gpu->id throughout the submit path. This clears the way
to change ctx->aspace to a per-instance pagetable.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/msm_drv.c        | 16 ++++------------
 drivers/gpu/drm/msm/msm_drv.h        |  1 +
 drivers/gpu/drm/msm/msm_gem.h        |  1 +
 drivers/gpu/drm/msm/msm_gem_submit.c | 11 ++++++-----
 drivers/gpu/drm/msm/msm_gpu.c        |  5 ++---
 5 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 30cd514d8f7c..2b663435a3f7 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -502,6 +502,7 @@ static void load_gpu(struct drm_device *dev)
 
 static int context_init(struct drm_device *dev, struct drm_file *file)
 {
+	struct msm_drm_private *priv = dev->dev_private;
 	struct msm_file_private *ctx;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
@@ -510,6 +511,7 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
 
 	msm_submitqueue_init(dev, ctx);
 
+	ctx->aspace = priv->gpu->aspace;
 	file->driver_priv = ctx;
 
 	return 0;
@@ -683,17 +685,6 @@ static int msm_ioctl_gem_cpu_fini(struct drm_device *dev, void *data,
 	return ret;
 }
 
-static int msm_ioctl_gem_info_iova(struct drm_device *dev,
-		struct drm_gem_object *obj, uint64_t *iova)
-{
-	struct msm_drm_private *priv = dev->dev_private;
-
-	if (!priv->gpu)
-		return -EINVAL;
-
-	return msm_gem_get_iova(obj, priv->gpu->aspace, iova);
-}
-
 static int msm_ioctl_gem_info(struct drm_device *dev, void *data,
 		struct drm_file *file)
 {
@@ -709,9 +700,10 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void *data,
 		return -ENOENT;
 
 	if (args->flags & MSM_INFO_IOVA) {
+		struct msm_file_private *ctx = file->driver_priv;
 		uint64_t iova;
 
-		ret = msm_ioctl_gem_info_iova(dev, obj, &iova);
+		ret = msm_gem_get_iova(obj, ctx->aspace, &iova);
 		if (!ret)
 			args->offset = iova;
 	} else {
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 48ed5b9a8580..897b08135927 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -58,6 +58,7 @@ struct msm_file_private {
 	rwlock_t queuelock;
 	struct list_head submitqueues;
 	int queueid;
+	struct msm_gem_address_space *aspace;
 };
 
 enum msm_mdp_plane_property {
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index c5d9bd3e47a8..fe8b3aa7d76f 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -138,6 +138,7 @@ void msm_gem_vunmap(struct drm_gem_object *obj, enum msm_gem_lock subclass);
 struct msm_gem_submit {
 	struct drm_device *dev;
 	struct msm_gpu *gpu;
+	struct msm_gem_address_space *aspace;
 	struct list_head node;   /* node in ring submit list */
 	struct list_head bo_list;
 	struct ww_acquire_ctx ticket;
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index 7bd83e0afa97..d5dffcba9919 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -31,8 +31,8 @@
 #define BO_PINNED   0x2000
 
 static struct msm_gem_submit *submit_create(struct drm_device *dev,
-		struct msm_gpu *gpu, struct msm_gpu_submitqueue *queue,
-		uint32_t nr_bos, uint32_t nr_cmds)
+		struct msm_gpu *gpu, struct msm_gem_address_space *aspace,
+		struct msm_gpu_submitqueue *queue, uint32_t nr_bos, uint32_t nr_cmds)
 {
 	struct msm_gem_submit *submit;
 	uint64_t sz = sizeof(*submit) + ((u64)nr_bos * sizeof(submit->bos[0])) +
@@ -46,6 +46,7 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
 		return NULL;
 
 	submit->dev = dev;
+	submit->aspace = aspace;
 	submit->gpu = gpu;
 	submit->fence = NULL;
 	submit->pid = get_pid(task_pid(current));
@@ -167,7 +168,7 @@ static void submit_unlock_unpin_bo(struct msm_gem_submit *submit,
 	struct msm_gem_object *msm_obj = submit->bos[i].obj;
 
 	if (submit->bos[i].flags & BO_PINNED)
-		msm_gem_put_iova(&msm_obj->base, submit->gpu->aspace);
+		msm_gem_put_iova(&msm_obj->base, submit->aspace);
 
 	if (submit->bos[i].flags & BO_LOCKED)
 		ww_mutex_unlock(&msm_obj->resv->lock);
@@ -270,7 +271,7 @@ static int submit_pin_objects(struct msm_gem_submit *submit)
 
 		/* if locking succeeded, pin bo: */
 		ret = msm_gem_get_iova(&msm_obj->base,
-				submit->gpu->aspace, &iova);
+				submit->aspace, &iova);
 
 		if (ret)
 			break;
@@ -471,7 +472,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
 		}
 	}
 
-	submit = submit_create(dev, gpu, queue, args->nr_bos, args->nr_cmds);
+	submit = submit_create(dev, gpu, ctx->aspace, queue, args->nr_bos, args->nr_cmds);
 	if (!submit) {
 		ret = -ENOMEM;
 		goto out_unlock;
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 1c09acfb4028..2f45bea04221 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -551,7 +551,7 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 		struct msm_gem_object *msm_obj = submit->bos[i].obj;
 		/* move to inactive: */
 		msm_gem_move_to_inactive(&msm_obj->base);
-		msm_gem_put_iova(&msm_obj->base, gpu->aspace);
+		msm_gem_put_iova(&msm_obj->base, submit->aspace);
 		drm_gem_object_put(&msm_obj->base);
 	}
 
@@ -635,8 +635,7 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 
 		/* submit takes a reference to the bo and iova until retired: */
 		drm_gem_object_get(&msm_obj->base);
-		msm_gem_get_iova(&msm_obj->base,
-				submit->gpu->aspace, &iova);
+		msm_gem_get_iova(&msm_obj->base, submit->aspace, &iova);
 
 		if (submit->bos[i].flags & MSM_SUBMIT_BO_WRITE)
 			msm_gem_move_to_active(&msm_obj->base, gpu, true, submit->fence);
-- 
2.17.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 08/16] drm/msm: Pass the MMU domain index in struct msm_file_private
@ 2018-05-18 21:34   ` Jordan Crouse
  0 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: linux-arm-kernel

Pass the MMU address space in struct msm_file_private instead of
assuming gpu->aspace throughout the submit path. This clears the way
to change ctx->aspace to a per-instance pagetable.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/msm_drv.c        | 16 ++++------------
 drivers/gpu/drm/msm/msm_drv.h        |  1 +
 drivers/gpu/drm/msm/msm_gem.h        |  1 +
 drivers/gpu/drm/msm/msm_gem_submit.c | 11 ++++++-----
 drivers/gpu/drm/msm/msm_gpu.c        |  5 ++---
 5 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 30cd514d8f7c..2b663435a3f7 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -502,6 +502,7 @@ static void load_gpu(struct drm_device *dev)
 
 static int context_init(struct drm_device *dev, struct drm_file *file)
 {
+	struct msm_drm_private *priv = dev->dev_private;
 	struct msm_file_private *ctx;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
@@ -510,6 +511,7 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
 
 	msm_submitqueue_init(dev, ctx);
 
+	ctx->aspace = priv->gpu->aspace;
 	file->driver_priv = ctx;
 
 	return 0;
@@ -683,17 +685,6 @@ static int msm_ioctl_gem_cpu_fini(struct drm_device *dev, void *data,
 	return ret;
 }
 
-static int msm_ioctl_gem_info_iova(struct drm_device *dev,
-		struct drm_gem_object *obj, uint64_t *iova)
-{
-	struct msm_drm_private *priv = dev->dev_private;
-
-	if (!priv->gpu)
-		return -EINVAL;
-
-	return msm_gem_get_iova(obj, priv->gpu->aspace, iova);
-}
-
 static int msm_ioctl_gem_info(struct drm_device *dev, void *data,
 		struct drm_file *file)
 {
@@ -709,9 +700,10 @@ static int msm_ioctl_gem_info(struct drm_device *dev, void *data,
 		return -ENOENT;
 
 	if (args->flags & MSM_INFO_IOVA) {
+		struct msm_file_private *ctx = file->driver_priv;
 		uint64_t iova;
 
-		ret = msm_ioctl_gem_info_iova(dev, obj, &iova);
+		ret = msm_gem_get_iova(obj, ctx->aspace, &iova);
 		if (!ret)
 			args->offset = iova;
 	} else {
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 48ed5b9a8580..897b08135927 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -58,6 +58,7 @@ struct msm_file_private {
 	rwlock_t queuelock;
 	struct list_head submitqueues;
 	int queueid;
+	struct msm_gem_address_space *aspace;
 };
 
 enum msm_mdp_plane_property {
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index c5d9bd3e47a8..fe8b3aa7d76f 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -138,6 +138,7 @@ void msm_gem_vunmap(struct drm_gem_object *obj, enum msm_gem_lock subclass);
 struct msm_gem_submit {
 	struct drm_device *dev;
 	struct msm_gpu *gpu;
+	struct msm_gem_address_space *aspace;
 	struct list_head node;   /* node in ring submit list */
 	struct list_head bo_list;
 	struct ww_acquire_ctx ticket;
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index 7bd83e0afa97..d5dffcba9919 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -31,8 +31,8 @@
 #define BO_PINNED   0x2000
 
 static struct msm_gem_submit *submit_create(struct drm_device *dev,
-		struct msm_gpu *gpu, struct msm_gpu_submitqueue *queue,
-		uint32_t nr_bos, uint32_t nr_cmds)
+		struct msm_gpu *gpu, struct msm_gem_address_space *aspace,
+		struct msm_gpu_submitqueue *queue, uint32_t nr_bos, uint32_t nr_cmds)
 {
 	struct msm_gem_submit *submit;
 	uint64_t sz = sizeof(*submit) + ((u64)nr_bos * sizeof(submit->bos[0])) +
@@ -46,6 +46,7 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
 		return NULL;
 
 	submit->dev = dev;
+	submit->aspace = aspace;
 	submit->gpu = gpu;
 	submit->fence = NULL;
 	submit->pid = get_pid(task_pid(current));
@@ -167,7 +168,7 @@ static void submit_unlock_unpin_bo(struct msm_gem_submit *submit,
 	struct msm_gem_object *msm_obj = submit->bos[i].obj;
 
 	if (submit->bos[i].flags & BO_PINNED)
-		msm_gem_put_iova(&msm_obj->base, submit->gpu->aspace);
+		msm_gem_put_iova(&msm_obj->base, submit->aspace);
 
 	if (submit->bos[i].flags & BO_LOCKED)
 		ww_mutex_unlock(&msm_obj->resv->lock);
@@ -270,7 +271,7 @@ static int submit_pin_objects(struct msm_gem_submit *submit)
 
 		/* if locking succeeded, pin bo: */
 		ret = msm_gem_get_iova(&msm_obj->base,
-				submit->gpu->aspace, &iova);
+				submit->aspace, &iova);
 
 		if (ret)
 			break;
@@ -471,7 +472,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
 		}
 	}
 
-	submit = submit_create(dev, gpu, queue, args->nr_bos, args->nr_cmds);
+	submit = submit_create(dev, gpu, ctx->aspace, queue, args->nr_bos, args->nr_cmds);
 	if (!submit) {
 		ret = -ENOMEM;
 		goto out_unlock;
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 1c09acfb4028..2f45bea04221 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -551,7 +551,7 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 		struct msm_gem_object *msm_obj = submit->bos[i].obj;
 		/* move to inactive: */
 		msm_gem_move_to_inactive(&msm_obj->base);
-		msm_gem_put_iova(&msm_obj->base, gpu->aspace);
+		msm_gem_put_iova(&msm_obj->base, submit->aspace);
 		drm_gem_object_put(&msm_obj->base);
 	}
 
@@ -635,8 +635,7 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 
 		/* submit takes a reference to the bo and iova until retired: */
 		drm_gem_object_get(&msm_obj->base);
-		msm_gem_get_iova(&msm_obj->base,
-				submit->gpu->aspace, &iova);
+		msm_gem_get_iova(&msm_obj->base, submit->aspace, &iova);
 
 		if (submit->bos[i].flags & MSM_SUBMIT_BO_WRITE)
 			msm_gem_move_to_active(&msm_obj->base, gpu, true, submit->fence);
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 09/16] drm/msm/gpu: Support using split page tables for kernel buffer objects
  2018-05-18 21:34 ` Jordan Crouse
@ 2018-05-18 21:34     ` Jordan Crouse
  -1 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

arm-smmu based targets can support split pagetables (TTBR0/TTBR1).
This is most useful for implementing per-instance pagetables so that
the "user" pagetable can be swapped out while the "kernel" or
"global" pagetable remains intact.

If the target specifies a global virtual memory range, try to
enable TTBR1 (the "global" pagetable) on the domain and, if
successful, use the global virtual memory range for allocations
on the default GPU address space - this ensures that the global
allocations land in the right space. Per-instance pagetables
still need additional support to be enabled, but even if they
aren't set up it isn't harmful to use TTBR1 for all
virtual memory regions and leave the other pagetable unused.

If TTBR1 support isn't enabled, fall back to the "legacy"
virtual address space for both kernel and user.
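
As a usage sketch, a target that wants split pagetables would populate
the new global range alongside the existing one (the values below
mirror the a5xx settings later in this series and are illustrative
only):

  struct msm_gpu_config config = {
  	.ioname  = "kgsl_3d0_reg_memory",
  	.irqname = "kgsl_3d0_irq",
  	/* TTBR0 ("user") aperture */
  	.va_start = SZ_16M,
  	.va_end   = 0xffffffff,
  	/* TTBR1 ("global") aperture - a non-zero range here asks
  	 * msm_gpu_create_address_space() to try split pagetables
  	 */
  	.va_start_global = 0xfffffff800000000ULL,
  	.va_end_global   = 0xfffffff8ffffffffULL,
  	.nr_rings = 4,
  };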

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/msm_gpu.c | 19 +++++++++++++++++--
 drivers/gpu/drm/msm/msm_gpu.h |  4 ++--
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 2f45bea04221..78e8e56d2499 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -703,7 +703,8 @@ static int get_clocks(struct platform_device *pdev, struct msm_gpu *gpu)
 
 static struct msm_gem_address_space *
 msm_gpu_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev,
-		uint64_t va_start, uint64_t va_end)
+		u64 va_start, u64 va_end,
+		u64 va_global_start, u64 va_global_end)
 {
 	struct iommu_domain *iommu;
 	struct msm_gem_address_space *aspace;
@@ -721,6 +722,19 @@ msm_gpu_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev,
 	iommu->geometry.aperture_start = va_start;
 	iommu->geometry.aperture_end = va_end;
 
+	/* If a va_global range was specified then try to set up split tables */
+	if (va_global_start && va_global_end) {
+		int val = 1;
+
+		ret = iommu_domain_set_attr(iommu, DOMAIN_ATTR_SPLIT_TABLES,
+			&val);
+
+		if (!WARN(ret, "Unable to enable split pagetables for the IOMMU\n")) {
+			iommu->geometry.aperture_start = va_global_start;
+			iommu->geometry.aperture_end = va_global_end;
+		}
+	}
+
 	dev_info(gpu->dev->dev, "%s: using IOMMU\n", gpu->name);
 
 	aspace = msm_gem_address_space_create(&pdev->dev, iommu, "gpu");
@@ -813,7 +827,8 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 	msm_devfreq_init(gpu);
 
 	gpu->aspace = msm_gpu_create_address_space(gpu, pdev,
-		config->va_start, config->va_end);
+		config->va_start, config->va_end, config->va_start_global,
+		config->va_end_global);
 
 	if (gpu->aspace == NULL)
 		dev_info(drm->dev, "%s: no IOMMU, fallback to VRAM carveout!\n", name);
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index b8241179175a..da58aa6c12c8 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -31,8 +31,8 @@ struct msm_gpu_perfcntr;
 struct msm_gpu_config {
 	const char *ioname;
 	const char *irqname;
-	uint64_t va_start;
-	uint64_t va_end;
+	uint64_t va_start, va_end;
+	uint64_t va_start_global, va_end_global;
 	unsigned int nr_rings;
 };
 
-- 
2.17.0

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 09/16] drm/msm/gpu: Support using split page tables for kernel buffer objects
@ 2018-05-18 21:34     ` Jordan Crouse
  0 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: linux-arm-kernel

arm-smmu based targets can support split pagetables (TTBR0/TTBR1).
This is most useful for implementing per-instance pagetables so that
the "user" pagetable can be swapped out while the "kernel" or
"global" pagetable remains intact.

If the target specifies a global virtual memory range, try to
enable TTBR1 (the "global" pagetable) on the domain and, if
successful, use the global virtual memory range for allocations
on the default GPU address space - this ensures that the global
allocations land in the right space. Per-instance pagetables
still need additional support to be enabled, but even if they
aren't set up it isn't harmful to use TTBR1 for all
virtual memory regions and leave the other pagetable unused.

If TTBR1 support isn't enabled, fall back to the "legacy"
virtual address space for both kernel and user.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/msm_gpu.c | 19 +++++++++++++++++--
 drivers/gpu/drm/msm/msm_gpu.h |  4 ++--
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 2f45bea04221..78e8e56d2499 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -703,7 +703,8 @@ static int get_clocks(struct platform_device *pdev, struct msm_gpu *gpu)
 
 static struct msm_gem_address_space *
 msm_gpu_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev,
-		uint64_t va_start, uint64_t va_end)
+		u64 va_start, u64 va_end,
+		u64 va_global_start, u64 va_global_end)
 {
 	struct iommu_domain *iommu;
 	struct msm_gem_address_space *aspace;
@@ -721,6 +722,19 @@ msm_gpu_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev,
 	iommu->geometry.aperture_start = va_start;
 	iommu->geometry.aperture_end = va_end;
 
+	/* If a va_global range was specified then try to set up split tables */
+	if (va_global_start && va_global_end) {
+		int val = 1;
+
+		ret = iommu_domain_set_attr(iommu, DOMAIN_ATTR_SPLIT_TABLES,
+			&val);
+
+		if (!WARN(ret, "Unable to enable split pagetables for the IOMMU\n")) {
+			iommu->geometry.aperture_start = va_global_start;
+			iommu->geometry.aperture_end = va_global_end;
+		}
+	}
+
 	dev_info(gpu->dev->dev, "%s: using IOMMU\n", gpu->name);
 
 	aspace = msm_gem_address_space_create(&pdev->dev, iommu, "gpu");
@@ -813,7 +827,8 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 	msm_devfreq_init(gpu);
 
 	gpu->aspace = msm_gpu_create_address_space(gpu, pdev,
-		config->va_start, config->va_end);
+		config->va_start, config->va_end, config->va_start_global,
+		config->va_end_global);
 
 	if (gpu->aspace == NULL)
 		dev_info(drm->dev, "%s: no IOMMU, fallback to VRAM carveout!\n", name);
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index b8241179175a..da58aa6c12c8 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -31,8 +31,8 @@ struct msm_gpu_perfcntr;
 struct msm_gpu_config {
 	const char *ioname;
 	const char *irqname;
-	uint64_t va_start;
-	uint64_t va_end;
+	uint64_t va_start, va_end;
+	uint64_t va_start_global, va_end_global;
 	unsigned int nr_rings;
 };
 
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 10/16] drm/msm: Add msm_mmu features
  2018-05-18 21:34 ` Jordan Crouse
@ 2018-05-18 21:34     ` Jordan Crouse
  -1 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Add a few simple helper functions to track a bitmask of features
that a specific MMU implementation supports. The first feature will
be per-instance pagetables, coming in the following patch.
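
A usage sketch, assuming an IOMMU backend that can advertise the
capability (the feature bit and setup_per_instance_pagetables() are
stand-ins until the next patch defines the real names):

  /* Backend: advertise per-instance pagetable support */
  msm_mmu_set_feature(mmu, MMU_FEATURE_PER_INSTANCE_TABLES);

  /* Consumer: test for the feature before relying on it */
  if (msm_mmu_has_feature(mmu, MMU_FEATURE_PER_INSTANCE_TABLES))
  	setup_per_instance_pagetables();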

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/msm_mmu.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
index aa2c5d4580c8..85df78d71398 100644
--- a/drivers/gpu/drm/msm/msm_mmu.h
+++ b/drivers/gpu/drm/msm/msm_mmu.h
@@ -35,6 +35,7 @@ struct msm_mmu {
 	struct device *dev;
 	int (*handler)(void *arg, unsigned long iova, int flags);
 	void *arg;
+	unsigned long features;
 };
 
 static inline void msm_mmu_init(struct msm_mmu *mmu, struct device *dev,
@@ -54,4 +55,16 @@ static inline void msm_mmu_set_fault_handler(struct msm_mmu *mmu, void *arg,
 	mmu->handler = handler;
 }
 
+static inline void msm_mmu_set_feature(struct msm_mmu *mmu,
+		unsigned long feature)
+{
+	mmu->features |= feature;
+}
+
+static inline bool msm_mmu_has_feature(struct msm_mmu *mmu,
+		unsigned long feature)
+{
+	return (mmu->features & feature) ? true : false;
+}
+
 #endif /* __MSM_MMU_H__ */
-- 
2.17.0

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 10/16] drm/msm: Add msm_mmu features
@ 2018-05-18 21:34     ` Jordan Crouse
  0 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: linux-arm-kernel

Add a few simple helper functions to track a bitmask of features
that a specific MMU implementation supports. The first feature will
be per-instance pagetables, coming in the following patch.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/msm_mmu.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
index aa2c5d4580c8..85df78d71398 100644
--- a/drivers/gpu/drm/msm/msm_mmu.h
+++ b/drivers/gpu/drm/msm/msm_mmu.h
@@ -35,6 +35,7 @@ struct msm_mmu {
 	struct device *dev;
 	int (*handler)(void *arg, unsigned long iova, int flags);
 	void *arg;
+	unsigned long features;
 };
 
 static inline void msm_mmu_init(struct msm_mmu *mmu, struct device *dev,
@@ -54,4 +55,16 @@ static inline void msm_mmu_set_fault_handler(struct msm_mmu *mmu, void *arg,
 	mmu->handler = handler;
 }
 
+static inline void msm_mmu_set_feature(struct msm_mmu *mmu,
+		unsigned long feature)
+{
+	mmu->features |= feature;
+}
+
+static inline bool msm_mmu_has_feature(struct msm_mmu *mmu,
+		unsigned long feature)
+{
+	return (mmu->features & feature) ? true : false;
+}
+
 #endif /* __MSM_MMU_H__ */
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 11/16] drm/msm: Add support for iommu-sva PASIDs
  2018-05-18 21:34 ` Jordan Crouse
@ 2018-05-18 21:34     ` Jordan Crouse
  -1 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

The IOMMU core can support creating multiple pagetables
for a specific domain and making them available to a client
driver that has the means to manage the pagetable itself.

PASIDs are unique indexes to a software-created pagetable with
the same format and characteristics as the parent IOMMU device.
The IOMMU driver allocates the pagetable and tracks it with a
unique token (PASID) - it does not touch the actual hardware.
The client driver is expected to be able to manage the pagetables
and do something interesting with them.

Some flavors of the MSM GPU are able to allow each DRM instance
to have its own pagetable (and virtual memory space) and switch
between them asynchronously at the beginning of a command. This
protects against accidental or malicious corruption or copying of
buffers from other instances.

The first step is to add an MMU implementation that can allocate a
PASID and set up a msm_mmu struct to abstract most of the details
from the rest of the system.
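
A sketch of the intended client flow, using the functions added in
this patch (error handling mostly elided; iova/sgt/prot are assumed
to come from the caller):

  static int example_use_pasid(struct msm_mmu *parent, uint64_t iova,
  		struct sg_table *sgt, unsigned int size, int prot)
  {
  	struct msm_mmu *mmu = msm_iommu_pasid_new(parent);
  	u64 ttbr;
  	u32 asid;

  	if (IS_ERR(mmu))
  		return PTR_ERR(mmu);

  	/* Fill the private pagetable through the normal mmu ops */
  	mmu->funcs->map(mmu, iova, sgt, size, prot);

  	/* Sideband: recover the TTBR0/ASID for this PASID so the
  	 * GPU can program the pagetable switch itself
  	 */
  	msm_iommu_pasid_info(mmu, &ttbr, &asid);

  	/* ... hand ttbr/asid to the GPU ... */

  	mmu->funcs->destroy(mmu);	/* frees the PASID */
  	return 0;
  }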

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/msm_iommu.c | 190 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/msm/msm_mmu.h   |   6 +
 2 files changed, 196 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index fdbe1a8372f0..99e6611969d4 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -15,6 +15,9 @@
  * this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/hashtable.h>
+#include <linux/arm-smmu.h>
+
 #include "msm_drv.h"
 #include "msm_mmu.h"
 
@@ -34,12 +37,32 @@ static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
 	return 0;
 }
 
+static bool msm_iommu_check_per_instance(struct msm_iommu *iommu)
+{
+	int val;
+
+	if (!IS_ENABLED(CONFIG_IOMMU_SVA))
+		return false;
+
+	if (iommu_domain_get_attr(iommu->domain, DOMAIN_ATTR_SPLIT_TABLES,
+		&val))
+		return false;
+
+	return val ? true : false;
+}
+
 static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names,
 			    int cnt)
 {
 	struct msm_iommu *iommu = to_msm_iommu(mmu);
 	int ret;
 
+	if (msm_iommu_check_per_instance(iommu)) {
+		if (!iommu_sva_device_init(mmu->dev, 0, (1 << 31), NULL))
+			msm_mmu_set_feature(mmu,
+				MMU_FEATURE_PER_INSTANCE_TABLES);
+	}
+
 	pm_runtime_get_sync(mmu->dev);
 	ret = iommu_attach_device(iommu->domain, mmu->dev);
 	pm_runtime_put_sync(mmu->dev);
@@ -112,3 +135,170 @@ struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain)
 
 	return &iommu->base;
 }
+
+struct pasid_entry {
+	int pasid;
+	u64 ttbr;
+	u32 asid;
+	struct hlist_node node;
+};
+
+static DEFINE_HASHTABLE(pasid_table, 4);
+
+static int install_pasid_cb(int pasid, u64 ttbr, u32 asid, void *data)
+{
+	struct pasid_entry *entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+
+	if (!entry)
+		return -ENOMEM;
+
+	entry->pasid = pasid;
+	entry->ttbr = ttbr;
+	entry->asid = asid;
+
+	/* FIXME: Assume that we'll never have a pasid conflict? */
+	/* FIXME: locks? RCU? */
+	hash_add(pasid_table, &entry->node, pasid);
+	return 0;
+}
+
+static void remove_pasid_cb(int pasid, void *data)
+{
+	struct pasid_entry *entry;
+
+	hash_for_each_possible(pasid_table, entry, node, pasid) {
+		if (pasid == entry->pasid) {
+			hash_del(&entry->node);
+			kfree(entry);
+			return;
+		}
+	}
+}
+
+struct msm_iommu_pasid {
+	struct msm_mmu base;
+	struct device *dev;
+	int pasid;
+	u64 ttbr;
+	u32 asid;
+};
+#define to_msm_iommu_pasid(x) container_of(x, struct msm_iommu_pasid, base)
+
+static int msm_iommu_pasid_attach(struct msm_mmu *mmu,
+		const char * const *names, int cnt)
+{
+	return 0;
+}
+
+static int msm_iommu_pasid_map(struct msm_mmu *mmu, uint64_t iova,
+		struct sg_table *sgt, unsigned len, int prot)
+{
+	struct msm_iommu_pasid *pasid = to_msm_iommu_pasid(mmu);
+	int ret;
+
+	ret = iommu_sva_map_sg(pasid->pasid, iova, sgt->sgl, sgt->nents, prot);
+	WARN_ON(ret < 0);
+
+	return (ret == len) ? 0 : -EINVAL;
+}
+
+static int msm_iommu_pasid_unmap(struct msm_mmu *mmu, uint64_t iova,
+		struct sg_table *sgt, unsigned len)
+{
+	struct msm_iommu_pasid *pasid = to_msm_iommu_pasid(mmu);
+
+	iommu_sva_unmap(pasid->pasid, iova, len);
+
+	return 0;
+}
+
+static void msm_iommu_pasid_detach(struct msm_mmu *mmu,
+		const char * const *names, int cnt)
+{
+}
+
+static void msm_iommu_pasid_destroy(struct msm_mmu *mmu)
+{
+	struct msm_iommu_pasid *pasid = to_msm_iommu_pasid(mmu);
+
+	iommu_sva_free_pasid(pasid->pasid, pasid->dev);
+	kfree(pasid);
+}
+
+static const struct msm_mmu_funcs pasid_funcs = {
+		.attach = msm_iommu_pasid_attach,
+		.detach = msm_iommu_pasid_detach,
+		.map = msm_iommu_pasid_map,
+		.unmap = msm_iommu_pasid_unmap,
+		.destroy = msm_iommu_pasid_destroy,
+};
+
+static const struct arm_smmu_pasid_ops msm_iommu_pasid_ops = {
+	.install_pasid = install_pasid_cb,
+	.remove_pasid = remove_pasid_cb,
+};
+
+struct msm_mmu *msm_iommu_pasid_new(struct msm_mmu *parent)
+{
+	struct msm_iommu *parent_iommu = to_msm_iommu(parent);
+	struct msm_iommu_pasid *pasid;
+	int id;
+
+	if (!msm_mmu_has_feature(parent, MMU_FEATURE_PER_INSTANCE_TABLES))
+		return ERR_PTR(-EOPNOTSUPP);
+
+	pasid = kzalloc(sizeof(*pasid), GFP_KERNEL);
+	if (!pasid)
+		return ERR_PTR(-ENOMEM);
+
+	arm_smmu_add_pasid_ops(parent_iommu->domain, &msm_iommu_pasid_ops,
+		NULL);
+
+	id = iommu_sva_alloc_pasid(parent_iommu->domain, parent->dev);
+	if (id < 0) {
+		kfree(pasid);
+		return ERR_PTR(id);
+	}
+
+	pasid->pasid = id;
+	pasid->dev = parent->dev;
+
+	msm_mmu_init(&pasid->base, parent->dev, &pasid_funcs);
+
+	return &pasid->base;
+}
+
+/* Given a pasid return the TTBR and ASID associated with it */
+int msm_iommu_pasid_info(struct msm_mmu *mmu, u64 *ttbr, u32 *asid)
+{
+	struct msm_iommu_pasid *pasid;
+	struct pasid_entry *entry;
+
+	if (mmu->funcs->map != msm_iommu_pasid_map)
+		return -ENODEV;
+
+	pasid = to_msm_iommu_pasid(mmu);
+
+	if (!pasid->ttbr) {
+		/* Find the pasid entry in the hash */
+		hash_for_each_possible(pasid_table, entry, node, pasid->pasid) {
+			if (pasid->pasid == entry->pasid) {
+				pasid->ttbr = entry->ttbr;
+				pasid->asid = entry->asid;
+				goto out;
+			}
+		}
+
+		WARN(1, "Couldn't find the entry for pasid %d\n", pasid->pasid);
+		return -EINVAL;
+	}
+
+out:
+	if (ttbr)
+		*ttbr = pasid->ttbr;
+
+	if (asid)
+		*asid = pasid->asid;
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
index 85df78d71398..29436b9daa73 100644
--- a/drivers/gpu/drm/msm/msm_mmu.h
+++ b/drivers/gpu/drm/msm/msm_mmu.h
@@ -30,6 +30,9 @@ struct msm_mmu_funcs {
 	void (*destroy)(struct msm_mmu *mmu);
 };
 
+/* MMU features */
+#define MMU_FEATURE_PER_INSTANCE_TABLES (1 << 0)
+
 struct msm_mmu {
 	const struct msm_mmu_funcs *funcs;
 	struct device *dev;
@@ -48,6 +51,9 @@ static inline void msm_mmu_init(struct msm_mmu *mmu, struct device *dev,
 struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain);
 struct msm_mmu *msm_gpummu_new(struct device *dev, struct msm_gpu *gpu);
 
+struct msm_mmu *msm_iommu_pasid_new(struct msm_mmu *parent);
+int msm_iommu_pasid_info(struct msm_mmu *mmu, u64 *ttbr, u32 *asid);
+
 static inline void msm_mmu_set_fault_handler(struct msm_mmu *mmu, void *arg,
 		int (*handler)(void *arg, unsigned long iova, int flags))
 {
-- 
2.17.0

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 11/16] drm/msm: Add support for iommu-sva PASIDs
@ 2018-05-18 21:34     ` Jordan Crouse
  0 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: linux-arm-kernel

The IOMMU core can support creating multiple pagetables
for a specific domain and making them available to a client
driver that has the means to manage the pagetable itself.

PASIDs are unique indexes to a software-created pagetable with
the same format and characteristics as the parent IOMMU device.
The IOMMU driver allocates the pagetable and tracks it with a
unique token (PASID) - it does not touch the actual hardware.
The client driver is expected to be able to manage the pagetables
and do something interesting with them.

Some flavors of the MSM GPU are able to allow each DRM instance
to have its own pagetable (and virtual memory space) and switch
between them asynchronously at the beginning of a command. This
protects against accidental or malicious corruption or copying of
buffers from other instances.

The first step is to add an MMU implementation that can allocate a
PASID and set up a msm_mmu struct to abstract most of the details
from the rest of the system.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/msm_iommu.c | 190 ++++++++++++++++++++++++++++++++
 drivers/gpu/drm/msm/msm_mmu.h   |   6 +
 2 files changed, 196 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index fdbe1a8372f0..99e6611969d4 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -15,6 +15,9 @@
  * this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/hashtable.h>
+#include <linux/arm-smmu.h>
+
 #include "msm_drv.h"
 #include "msm_mmu.h"
 
@@ -34,12 +37,32 @@ static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
 	return 0;
 }
 
+static bool msm_iommu_check_per_instance(struct msm_iommu *iommu)
+{
+	int val;
+
+	if (!IS_ENABLED(CONFIG_IOMMU_SVA))
+		return false;
+
+	if (iommu_domain_get_attr(iommu->domain, DOMAIN_ATTR_SPLIT_TABLES,
+		&val))
+		return false;
+
+	return val ? true : false;
+}
+
 static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names,
 			    int cnt)
 {
 	struct msm_iommu *iommu = to_msm_iommu(mmu);
 	int ret;
 
+	if (msm_iommu_check_per_instance(iommu)) {
+		if (!iommu_sva_device_init(mmu->dev, 0, (1 << 31), NULL))
+			msm_mmu_set_feature(mmu,
+				MMU_FEATURE_PER_INSTANCE_TABLES);
+	}
+
 	pm_runtime_get_sync(mmu->dev);
 	ret = iommu_attach_device(iommu->domain, mmu->dev);
 	pm_runtime_put_sync(mmu->dev);
@@ -112,3 +135,170 @@ struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain)
 
 	return &iommu->base;
 }
+
+struct pasid_entry {
+	int pasid;
+	u64 ttbr;
+	u32 asid;
+	struct hlist_node node;
+};
+
+static DEFINE_HASHTABLE(pasid_table, 4);
+
+static int install_pasid_cb(int pasid, u64 ttbr, u32 asid, void *data)
+{
+	struct pasid_entry *entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+
+	if (!entry)
+		return -ENOMEM;
+
+	entry->pasid = pasid;
+	entry->ttbr = ttbr;
+	entry->asid = asid;
+
+	/* FIXME: Assume that we'll never have a pasid conflict? */
+	/* FIXME: locks? RCU? */
+	hash_add(pasid_table, &entry->node, pasid);
+	return 0;
+}
+
+static void remove_pasid_cb(int pasid, void *data)
+{
+	struct pasid_entry *entry;
+
+	hash_for_each_possible(pasid_table, entry, node, pasid) {
+		if (pasid == entry->pasid) {
+			hash_del(&entry->node);
+			kfree(entry);
+			return;
+		}
+	}
+}
+
+struct msm_iommu_pasid {
+	struct msm_mmu base;
+	struct device *dev;
+	int pasid;
+	u64 ttbr;
+	u32 asid;
+};
+#define to_msm_iommu_pasid(x) container_of(x, struct msm_iommu_pasid, base)
+
+static int msm_iommu_pasid_attach(struct msm_mmu *mmu,
+		const char * const *names, int cnt)
+{
+	return 0;
+}
+
+static int msm_iommu_pasid_map(struct msm_mmu *mmu, uint64_t iova,
+		struct sg_table *sgt, unsigned len, int prot)
+{
+	struct msm_iommu_pasid *pasid = to_msm_iommu_pasid(mmu);
+	int ret;
+
+	ret = iommu_sva_map_sg(pasid->pasid, iova, sgt->sgl, sgt->nents, prot);
+	WARN_ON(ret < 0);
+
+	return (ret == len) ? 0 : -EINVAL;
+}
+
+static int msm_iommu_pasid_unmap(struct msm_mmu *mmu, uint64_t iova,
+		struct sg_table *sgt, unsigned len)
+{
+	struct msm_iommu_pasid *pasid = to_msm_iommu_pasid(mmu);
+
+	iommu_sva_unmap(pasid->pasid, iova, len);
+
+	return 0;
+}
+
+static void msm_iommu_pasid_detach(struct msm_mmu *mmu,
+		const char * const *names, int cnt)
+{
+}
+
+static void msm_iommu_pasid_destroy(struct msm_mmu *mmu)
+{
+	struct msm_iommu_pasid *pasid = to_msm_iommu_pasid(mmu);
+
+	iommu_sva_free_pasid(pasid->pasid, pasid->dev);
+	kfree(pasid);
+}
+
+static const struct msm_mmu_funcs pasid_funcs = {
+		.attach = msm_iommu_pasid_attach,
+		.detach = msm_iommu_pasid_detach,
+		.map = msm_iommu_pasid_map,
+		.unmap = msm_iommu_pasid_unmap,
+		.destroy = msm_iommu_pasid_destroy,
+};
+
+static const struct arm_smmu_pasid_ops msm_iommu_pasid_ops = {
+	.install_pasid = install_pasid_cb,
+	.remove_pasid = remove_pasid_cb,
+};
+
+struct msm_mmu *msm_iommu_pasid_new(struct msm_mmu *parent)
+{
+	struct msm_iommu *parent_iommu = to_msm_iommu(parent);
+	struct msm_iommu_pasid *pasid;
+	int id;
+
+	if (!msm_mmu_has_feature(parent, MMU_FEATURE_PER_INSTANCE_TABLES))
+		return ERR_PTR(-EOPNOTSUPP);
+
+	pasid = kzalloc(sizeof(*pasid), GFP_KERNEL);
+	if (!pasid)
+		return ERR_PTR(-ENOMEM);
+
+	arm_smmu_add_pasid_ops(parent_iommu->domain, &msm_iommu_pasid_ops,
+		NULL);
+
+	id = iommu_sva_alloc_pasid(parent_iommu->domain, parent->dev);
+	if (id < 0) {
+		kfree(pasid);
+		return ERR_PTR(id);
+	}
+
+	pasid->pasid = id;
+	pasid->dev = parent->dev;
+
+	msm_mmu_init(&pasid->base, parent->dev, &pasid_funcs);
+
+	return &pasid->base;
+}
+
+/* Given a pasid return the TTBR and ASID associated with it */
+int msm_iommu_pasid_info(struct msm_mmu *mmu, u64 *ttbr, u32 *asid)
+{
+	struct msm_iommu_pasid *pasid;
+	struct pasid_entry *entry;
+
+	if (mmu->funcs->map != msm_iommu_pasid_map)
+		return -ENODEV;
+
+	pasid = to_msm_iommu_pasid(mmu);
+
+	if (!pasid->ttbr) {
+		/* Find the pasid entry in the hash */
+		hash_for_each_possible(pasid_table, entry, node, pasid->pasid) {
+			if (pasid->pasid == entry->pasid) {
+				pasid->ttbr = entry->ttbr;
+				pasid->asid = entry->asid;
+				goto out;
+			}
+		}
+
+		WARN(1, "Couldn't find the entry for pasid %d\n", pasid->pasid);
+		return -EINVAL;
+	}
+
+out:
+	if (ttbr)
+		*ttbr = pasid->ttbr;
+
+	if (asid)
+		*asid = pasid->asid;
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
index 85df78d71398..29436b9daa73 100644
--- a/drivers/gpu/drm/msm/msm_mmu.h
+++ b/drivers/gpu/drm/msm/msm_mmu.h
@@ -30,6 +30,9 @@ struct msm_mmu_funcs {
 	void (*destroy)(struct msm_mmu *mmu);
 };
 
+/* MMU features */
+#define MMU_FEATURE_PER_INSTANCE_TABLES (1 << 0)
+
 struct msm_mmu {
 	const struct msm_mmu_funcs *funcs;
 	struct device *dev;
@@ -48,6 +51,9 @@ static inline void msm_mmu_init(struct msm_mmu *mmu, struct device *dev,
 struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain);
 struct msm_mmu *msm_gpummu_new(struct device *dev, struct msm_gpu *gpu);
 
+struct msm_mmu *msm_iommu_pasid_new(struct msm_mmu *parent);
+int msm_iommu_pasid_info(struct msm_mmu *mmu, u64 *ttbr, u32 *asid);
+
 static inline void msm_mmu_set_fault_handler(struct msm_mmu *mmu, void *arg,
 		int (*handler)(void *arg, unsigned long iova, int flags))
 {
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 12/16] drm/msm: Add support for per-instance address spaces
  2018-05-18 21:34 ` Jordan Crouse
@ 2018-05-18 21:34     ` Jordan Crouse
  -1 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Add a function to allocate a new PASID from an existing
MMU domain and create a per-instance address space.
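
A usage sketch (the range is illustrative - in this series the
per-instance range is the TTBR0 "user" aperture below the global
TTBR1 range):

  struct msm_gem_address_space *aspace =
  	msm_gem_address_space_create_instance(gpu->aspace->mmu,
  		"gpu-instance", SZ_16M, 0xffffffff);

  if (IS_ERR(aspace))
  	/* e.g. the parent lacks MMU_FEATURE_PER_INSTANCE_TABLES */
  	aspace = NULL;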

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/msm_drv.h     |  3 +++
 drivers/gpu/drm/msm/msm_gem_vma.c | 37 +++++++++++++++++++++++++------
 2 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 897b08135927..d92b009dfef4 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -177,6 +177,9 @@ void msm_gem_address_space_put(struct msm_gem_address_space *aspace);
 struct msm_gem_address_space *
 msm_gem_address_space_create(struct device *dev, struct iommu_domain *domain,
 		const char *name);
+struct msm_gem_address_space *
+msm_gem_address_space_create_instance(struct msm_mmu *parent, const char *name,
+		u64 start, u64 end);
 
 void msm_gem_submit_free(struct msm_gem_submit *submit);
 int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c
index ffbec224551b..d75b56119752 100644
--- a/drivers/gpu/drm/msm/msm_gem_vma.c
+++ b/drivers/gpu/drm/msm/msm_gem_vma.c
@@ -92,12 +92,11 @@ msm_gem_map_vma(struct msm_gem_address_space *aspace,
 }
 
 struct msm_gem_address_space *
-msm_gem_address_space_create(struct device *dev, struct iommu_domain *domain,
-		const char *name)
+msm_gem_address_space_new(struct msm_mmu *mmu, const char *name,
+		u64 start, u64 end)
 {
 	struct msm_gem_address_space *aspace;
-	u64 size = domain->geometry.aperture_end -
-		domain->geometry.aperture_start;
+	u64 size = end - start;
 
 	aspace = kzalloc(sizeof(*aspace), GFP_KERNEL);
 	if (!aspace)
@@ -105,12 +104,36 @@ msm_gem_address_space_create(struct device *dev, struct iommu_domain *domain,
 
 	spin_lock_init(&aspace->lock);
 	aspace->name = name;
-	aspace->mmu = msm_iommu_new(dev, domain);
+	aspace->mmu = mmu;
 
-	drm_mm_init(&aspace->mm, (domain->geometry.aperture_start >> PAGE_SHIFT),
-		size >> PAGE_SHIFT);
+	drm_mm_init(&aspace->mm, (start >> PAGE_SHIFT), size >> PAGE_SHIFT);
 
 	kref_init(&aspace->kref);
 
 	return aspace;
 }
+
+struct msm_gem_address_space *
+msm_gem_address_space_create(struct device *dev, struct iommu_domain *domain,
+		const char *name)
+{
+	struct msm_mmu *mmu = msm_iommu_new(dev, domain);
+
+	if (IS_ERR(mmu))
+		return ERR_CAST(mmu);
+
+	return msm_gem_address_space_new(mmu, name,
+		domain->geometry.aperture_start,
+		domain->geometry.aperture_end);
+}
+
+struct msm_gem_address_space *
+msm_gem_address_space_create_instance(struct msm_mmu *parent, const char *name,
+		u64 start, u64 end)
+{
+	struct msm_mmu *instance = msm_iommu_pasid_new(parent);
+	if (IS_ERR(instance))
+		return ERR_CAST(instance);
+
+	return msm_gem_address_space_new(instance, name, start, end);
+}
-- 
2.17.0

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 12/16] drm/msm: Add support for per-instance address spaces
@ 2018-05-18 21:34     ` Jordan Crouse
  0 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: linux-arm-kernel

Add a function to allocate a new PASID from an existing
MMU domain and create a per-instance address space.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/msm_drv.h     |  3 +++
 drivers/gpu/drm/msm/msm_gem_vma.c | 37 +++++++++++++++++++++++++------
 2 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 897b08135927..d92b009dfef4 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -177,6 +177,9 @@ void msm_gem_address_space_put(struct msm_gem_address_space *aspace);
 struct msm_gem_address_space *
 msm_gem_address_space_create(struct device *dev, struct iommu_domain *domain,
 		const char *name);
+struct msm_gem_address_space *
+msm_gem_address_space_create_instance(struct msm_mmu *parent, const char *name,
+		u64 start, u64 end);
 
 void msm_gem_submit_free(struct msm_gem_submit *submit);
 int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c
index ffbec224551b..d75b56119752 100644
--- a/drivers/gpu/drm/msm/msm_gem_vma.c
+++ b/drivers/gpu/drm/msm/msm_gem_vma.c
@@ -92,12 +92,11 @@ msm_gem_map_vma(struct msm_gem_address_space *aspace,
 }
 
 struct msm_gem_address_space *
-msm_gem_address_space_create(struct device *dev, struct iommu_domain *domain,
-		const char *name)
+msm_gem_address_space_new(struct msm_mmu *mmu, const char *name,
+		u64 start, u64 end)
 {
 	struct msm_gem_address_space *aspace;
-	u64 size = domain->geometry.aperture_end -
-		domain->geometry.aperture_start;
+	u64 size = end - start;
 
 	aspace = kzalloc(sizeof(*aspace), GFP_KERNEL);
 	if (!aspace)
@@ -105,12 +104,36 @@ msm_gem_address_space_create(struct device *dev, struct iommu_domain *domain,
 
 	spin_lock_init(&aspace->lock);
 	aspace->name = name;
-	aspace->mmu = msm_iommu_new(dev, domain);
+	aspace->mmu = mmu;
 
-	drm_mm_init(&aspace->mm, (domain->geometry.aperture_start >> PAGE_SHIFT),
-		size >> PAGE_SHIFT);
+	drm_mm_init(&aspace->mm, (start >> PAGE_SHIFT), size >> PAGE_SHIFT);
 
 	kref_init(&aspace->kref);
 
 	return aspace;
 }
+
+struct msm_gem_address_space *
+msm_gem_address_space_create(struct device *dev, struct iommu_domain *domain,
+		const char *name)
+{
+	struct msm_mmu *mmu = msm_iommu_new(dev, domain);
+
+	if (IS_ERR(mmu))
+		return ERR_CAST(mmu);
+
+	return msm_gem_address_space_new(mmu, name,
+		domain->geometry.aperture_start,
+		domain->geometry.aperture_end);
+}
+
+struct msm_gem_address_space *
+msm_gem_address_space_create_instance(struct msm_mmu *parent, const char *name,
+		u64 start, u64 end)
+{
+	struct msm_mmu *instance = msm_iommu_pasid_new(parent);
+	if (IS_ERR(instance))
+		return ERR_CAST(instance);
+
+	return msm_gem_address_space_new(instance, name, start, end);
+}
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 13/16] drm/msm/a5xx: Support per-instance pagetables
  2018-05-18 21:34 ` Jordan Crouse
@ 2018-05-18 21:34     ` Jordan Crouse
  -1 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Add support for per-instance pagetables for 5XX targets. Create a
support buffer for preemption to hold the SMMU pagetable information
for a preempted ring, enable TTBR1 to support split pagetables, and
add the necessary PM4 commands to trigger a pagetable switch at the
beginning of a user command.
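
For reference, the value the switch sequence programs is TTBR0 with
the ASID packed into its upper bits, matching the SMMU context bank
TTBR layout - a condensed sketch of what a5xx_set_pagetable() below
does with the sideband info:

  u64 ttbr;
  u32 asid;

  if (!msm_iommu_pasid_info(ctx->aspace->mmu, &ttbr, &asid)) {
  	ttbr |= (u64)asid << 48;	/* ASID lives in bits [63:48] */
  	/* emit CP_SMMU_TABLE_UPDATE with lower/upper_32_bits(ttbr) */
  }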

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/Kconfig               |  1 +
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c     | 55 +++++++++++++++++
 drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 17 ++++++
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 74 +++++++++++++++++++----
 drivers/gpu/drm/msm/adreno/adreno_gpu.c   | 11 ++++
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |  5 ++
 drivers/gpu/drm/msm/msm_ringbuffer.h      |  1 +
 7 files changed, 152 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 38cbde971b48..e69cbf88bb3d 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -15,6 +15,7 @@ config DRM_MSM
 	select SND_SOC_HDMI_CODEC if SND_SOC
 	select SYNC_FILE
 	select PM_OPP
+	select IOMMU_SVA
 	default y
 	help
 	  DRM/KMS driver for MSM/snapdragon.
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index b2c0370072dd..f4be2536441b 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -199,6 +199,59 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit
 	msm_gpu_retire(gpu);
 }
 
+static void a5xx_set_pagetable(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
+	struct msm_file_private *ctx)
+{
+	u64 ttbr;
+	u32 asid;
+
+	if (msm_iommu_pasid_info(ctx->aspace->mmu, &ttbr, &asid))
+		return;
+
+	ttbr = ttbr | ((u64) asid) << 48;
+
+	/* Turn off protected mode */
+	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+	OUT_RING(ring, 0);
+
+	/* Turn on APRIV mode to access critical regions */
+	OUT_PKT4(ring, REG_A5XX_CP_CNTL, 1);
+	OUT_RING(ring, 1);
+
+	/* Make sure the ME is synchronized before starting the update */
+	OUT_PKT7(ring, CP_WAIT_FOR_ME, 0);
+
+	/* Execute the table update */
+	OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 3);
+	OUT_RING(ring, lower_32_bits(ttbr));
+	OUT_RING(ring, upper_32_bits(ttbr));
+	OUT_RING(ring, 0);
+
+	/*
+	 * Write the new TTBR0 to the preemption records - this will be used to
+	 * reload the pagetable if the current ring gets preempted out.
+	 */
+	OUT_PKT7(ring, CP_MEM_WRITE, 4);
+	OUT_RING(ring, lower_32_bits(rbmemptr(ring, ttbr0)));
+	OUT_RING(ring, upper_32_bits(rbmemptr(ring, ttbr0)));
+	OUT_RING(ring, lower_32_bits(ttbr));
+	OUT_RING(ring, upper_32_bits(ttbr));
+
+	/* Invalidate the draw state so we start off fresh */
+	OUT_PKT7(ring, CP_SET_DRAW_STATE, 3);
+	OUT_RING(ring, 0x40000);
+	OUT_RING(ring, 1);
+	OUT_RING(ring, 0);
+
+	/* Turn off APRIV */
+	OUT_PKT4(ring, REG_A5XX_CP_CNTL, 1);
+	OUT_RING(ring, 0);
+
+	/* Turn protected mode back on */
+	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+	OUT_RING(ring, 1);
+}
+
 static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 	struct msm_file_private *ctx)
 {
@@ -214,6 +267,8 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 		return;
 	}
 
+	a5xx_set_pagetable(gpu, ring, ctx);
+
 	OUT_PKT7(ring, CP_PREEMPT_ENABLE_GLOBAL, 1);
 	OUT_RING(ring, 0x02);
 
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
index 7d71860c4bee..9387d6085576 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
@@ -45,6 +45,9 @@ struct a5xx_gpu {
 
 	atomic_t preempt_state;
 	struct timer_list preempt_timer;
+	struct a5xx_smmu_info *smmu_info;
+	struct drm_gem_object *smmu_info_bo;
+	uint64_t smmu_info_iova;
 };
 
 #define to_a5xx_gpu(x) container_of(x, struct a5xx_gpu, base)
@@ -132,6 +135,20 @@ struct a5xx_preempt_record {
  */
 #define A5XX_PREEMPT_COUNTER_SIZE (16 * 4)
 
+/*
+ * This is a global structure that the preemption code uses to switch in the
+ * pagetable for the preempted process - the CP switches in whatever
+ * pagetable we last wrote here when preempting into a new ring.
+ */
+struct a5xx_smmu_info {
+	uint32_t  magic;
+	uint32_t  _pad4;
+	uint64_t  ttbr0;
+	uint32_t  asid;
+	uint32_t  contextidr;
+};
+
+#define A5XX_SMMU_INFO_MAGIC 0x3618CDA3UL
 
 int a5xx_power_init(struct msm_gpu *gpu);
 void a5xx_gpmu_ucode_init(struct msm_gpu *gpu);
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
index 970c7963ae29..d5dbcbd494f3 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
@@ -12,6 +12,7 @@
  */
 
 #include "msm_gem.h"
+#include "msm_mmu.h"
 #include "a5xx_gpu.h"
 
 /*
@@ -145,6 +146,15 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
 	a5xx_gpu->preempt[ring->id]->wptr = get_wptr(ring);
 	spin_unlock_irqrestore(&ring->lock, flags);
 
+	/* Read barrier to make sure we have the updated pagetable info */
+	rmb();
+
+	/* Set the SMMU info for the preemption */
+	if (a5xx_gpu->smmu_info) {
+		a5xx_gpu->smmu_info->ttbr0 = ring->memptrs->ttbr0;
+		a5xx_gpu->smmu_info->contextidr = 0;
+	}
+
 	/* Set the address of the incoming preemption record */
 	gpu_write64(gpu, REG_A5XX_CP_CONTEXT_SWITCH_RESTORE_ADDR_LO,
 		REG_A5XX_CP_CONTEXT_SWITCH_RESTORE_ADDR_HI,
@@ -214,9 +224,10 @@ void a5xx_preempt_hw_init(struct msm_gpu *gpu)
 		a5xx_gpu->preempt[i]->rbase = gpu->rb[i]->iova;
 	}
 
-	/* Write a 0 to signal that we aren't switching pagetables */
+	/* Tell the CP where to find the smmu_info buffer */
 	gpu_write64(gpu, REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_LO,
-		REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_HI, 0);
+		REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_HI,
+		a5xx_gpu->smmu_info_iova);
 
 	/* Reset the preemption state */
 	set_preempt_state(a5xx_gpu, PREEMPT_NONE);
@@ -275,8 +286,43 @@ void a5xx_preempt_fini(struct msm_gpu *gpu)
 		drm_gem_object_unreference(a5xx_gpu->preempt_bo[i]);
 		a5xx_gpu->preempt_bo[i] = NULL;
 	}
+
+	if (a5xx_gpu->smmu_info_bo) {
+		if (a5xx_gpu->smmu_info_iova)
+			msm_gem_put_iova(a5xx_gpu->smmu_info_bo, gpu->aspace);
+		drm_gem_object_unreference_unlocked(a5xx_gpu->smmu_info_bo);
+		a5xx_gpu->smmu_info_bo = NULL;
+	}
 }
 
+static int a5xx_smmu_info_init(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
+	struct a5xx_smmu_info *ptr;
+	struct drm_gem_object *bo;
+	u64 iova;
+
+	if (!msm_mmu_has_feature(gpu->aspace->mmu,
+			MMU_FEATURE_PER_INSTANCE_TABLES))
+		return 0;
+
+	ptr = msm_gem_kernel_new(gpu->dev, sizeof(struct a5xx_smmu_info),
+		MSM_BO_UNCACHED, gpu->aspace, &bo, &iova);
+
+	if (IS_ERR(ptr))
+		return PTR_ERR(ptr);
+
+	ptr->magic = A5XX_SMMU_INFO_MAGIC;
+
+	a5xx_gpu->smmu_info_bo = bo;
+	a5xx_gpu->smmu_info_iova = iova;
+	a5xx_gpu->smmu_info = ptr;
+
+	return 0;
+}
+
+
 void a5xx_preempt_init(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -288,17 +334,21 @@ void a5xx_preempt_init(struct msm_gpu *gpu)
 		return;
 
 	for (i = 0; i < gpu->nr_rings; i++) {
-		if (preempt_init_ring(a5xx_gpu, gpu->rb[i])) {
-			/*
-			 * On any failure our adventure is over. Clean up and
-			 * set nr_rings to 1 to force preemption off
-			 */
-			a5xx_preempt_fini(gpu);
-			gpu->nr_rings = 1;
-
-			return;
-		}
+		if (preempt_init_ring(a5xx_gpu, gpu->rb[i]))
+			goto fail;
 	}
 
+	if (a5xx_smmu_info_init(gpu))
+		goto fail;
+
 	timer_setup(&a5xx_gpu->preempt_timer, a5xx_preempt_timer, 0);
+
+	return;
+fail:
+	/*
+	 * On any failure our adventure is over. Clean up and
+	 * set nr_rings to 1 to force preemption off
+	 */
+	a5xx_preempt_fini(gpu);
+	gpu->nr_rings = 1;
 }
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 17d0506d058c..b681edec4560 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -558,6 +558,17 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 	adreno_gpu_config.ioname = "kgsl_3d0_reg_memory";
 	adreno_gpu_config.irqname = "kgsl_3d0_irq";
 
+	if (adreno_is_a5xx(adreno_gpu)) {
+		/*
+		 * If possible use the TTBR1 virtual address space for all the
+		 * "global" buffer objects which are shared between processes.
+		 * This leaves the lower virtual address space open for
+		 * per-instance pagetables if they are available.
+		 */
+		adreno_gpu_config.va_start_global = 0xfffffff800000000ULL;
+		adreno_gpu_config.va_end_global = 0xfffffff8ffffffffULL;
+	}
+
 	adreno_gpu_config.va_start = SZ_16M;
 	adreno_gpu_config.va_end = 0xffffffff;
 
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index d6b0e7b813f4..dc4b21ea3e65 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -203,6 +203,11 @@ static inline int adreno_is_a530(struct adreno_gpu *gpu)
 	return gpu->revn == 530;
 }
 
+static inline bool adreno_is_a5xx(struct adreno_gpu *gpu)
+{
+	return ((gpu->revn >= 500) && (gpu->revn < 600));
+}
+
 int adreno_get_param(struct msm_gpu *gpu, uint32_t param, uint64_t *value);
 const struct firmware *adreno_request_fw(struct adreno_gpu *adreno_gpu,
 		const char *fwname);
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
index cffce094aecb..fd71484d5894 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.h
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
@@ -26,6 +26,7 @@
 struct msm_rbmemptrs {
 	volatile uint32_t rptr;
 	volatile uint32_t fence;
+	volatile uint64_t ttbr0;
 };
 
 struct msm_ringbuffer {
-- 
2.17.0

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 13/16] drm/msm/a5xx: Support per-instance pagetables
@ 2018-05-18 21:34     ` Jordan Crouse
  0 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: linux-arm-kernel

Add support for per-instance pagetables for 5XX targets. Create a
support buffer for preemption to hold the SMMU pagetable information
for a preempted ring, enable TTBR1 to support split pagetables, and
add the necessary PM4 commands to trigger a pagetable switch at the
beginning of a user command.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/Kconfig               |  1 +
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c     | 55 +++++++++++++++++
 drivers/gpu/drm/msm/adreno/a5xx_gpu.h     | 17 ++++++
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 74 +++++++++++++++++++----
 drivers/gpu/drm/msm/adreno/adreno_gpu.c   | 11 ++++
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |  5 ++
 drivers/gpu/drm/msm/msm_ringbuffer.h      |  1 +
 7 files changed, 152 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 38cbde971b48..e69cbf88bb3d 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -15,6 +15,7 @@ config DRM_MSM
 	select SND_SOC_HDMI_CODEC if SND_SOC
 	select SYNC_FILE
 	select PM_OPP
+	select IOMMU_SVA
 	default y
 	help
 	  DRM/KMS driver for MSM/snapdragon.
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index b2c0370072dd..f4be2536441b 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -199,6 +199,59 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit
 	msm_gpu_retire(gpu);
 }
 
+static void a5xx_set_pagetable(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
+	struct msm_file_private *ctx)
+{
+	u64 ttbr;
+	u32 asid;
+
+	if (msm_iommu_pasid_info(ctx->aspace->mmu, &ttbr, &asid))
+		return;
+
+	ttbr = ttbr | ((u64) asid) << 48;
+
+	/* Turn off protected mode */
+	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+	OUT_RING(ring, 0);
+
+	/* Turn on APRIV mode to access critical regions */
+	OUT_PKT4(ring, REG_A5XX_CP_CNTL, 1);
+	OUT_RING(ring, 1);
+
+	/* Make sure the ME is synchronized before starting the update */
+	OUT_PKT7(ring, CP_WAIT_FOR_ME, 0);
+
+	/* Execute the table update */
+	OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 3);
+	OUT_RING(ring, lower_32_bits(ttbr));
+	OUT_RING(ring, upper_32_bits(ttbr));
+	OUT_RING(ring, 0);
+
+	/*
+	 * Write the new TTBR0 to the preemption records - this will be used to
+	 * reload the pagetable if the current ring gets preempted out.
+	 */
+	OUT_PKT7(ring, CP_MEM_WRITE, 4);
+	OUT_RING(ring, lower_32_bits(rbmemptr(ring, ttbr0)));
+	OUT_RING(ring, upper_32_bits(rbmemptr(ring, ttbr0)));
+	OUT_RING(ring, lower_32_bits(ttbr));
+	OUT_RING(ring, upper_32_bits(ttbr));
+
+	/* Invalidate the draw state so we start off fresh */
+	OUT_PKT7(ring, CP_SET_DRAW_STATE, 3);
+	OUT_RING(ring, 0x40000);
+	OUT_RING(ring, 1);
+	OUT_RING(ring, 0);
+
+	/* Turn off APRIV */
+	OUT_PKT4(ring, REG_A5XX_CP_CNTL, 1);
+	OUT_RING(ring, 0);
+
+	/* Turn protected mode back on */
+	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+	OUT_RING(ring, 1);
+}
+
 static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 	struct msm_file_private *ctx)
 {
@@ -214,6 +267,8 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 		return;
 	}
 
+	a5xx_set_pagetable(gpu, ring, ctx);
+
 	OUT_PKT7(ring, CP_PREEMPT_ENABLE_GLOBAL, 1);
 	OUT_RING(ring, 0x02);
 
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
index 7d71860c4bee..9387d6085576 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
@@ -45,6 +45,9 @@ struct a5xx_gpu {
 
 	atomic_t preempt_state;
 	struct timer_list preempt_timer;
+	struct a5xx_smmu_info *smmu_info;
+	struct drm_gem_object *smmu_info_bo;
+	uint64_t smmu_info_iova;
 };
 
 #define to_a5xx_gpu(x) container_of(x, struct a5xx_gpu, base)
@@ -132,6 +135,20 @@ struct a5xx_preempt_record {
  */
 #define A5XX_PREEMPT_COUNTER_SIZE (16 * 4)
 
+/*
+ * This is a global structure that the preemption code uses to switch in the
+ * pagetable for the preempted process - the CP switches in whatever
+ * pagetable is described here after preempting to a new ring.
+ */
+struct a5xx_smmu_info {
+	uint32_t  magic;
+	uint32_t  _pad4;
+	uint64_t  ttbr0;
+	uint32_t  asid;
+	uint32_t  contextidr;
+};
+
+#define A5XX_SMMU_INFO_MAGIC 0x3618CDA3UL
 
 int a5xx_power_init(struct msm_gpu *gpu);
 void a5xx_gpmu_ucode_init(struct msm_gpu *gpu);
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
index 970c7963ae29..d5dbcbd494f3 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
@@ -12,6 +12,7 @@
  */
 
 #include "msm_gem.h"
+#include "msm_mmu.h"
 #include "a5xx_gpu.h"
 
 /*
@@ -145,6 +146,15 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
 	a5xx_gpu->preempt[ring->id]->wptr = get_wptr(ring);
 	spin_unlock_irqrestore(&ring->lock, flags);
 
+	/* Do a read barrier to make sure we see the updated pagetable info */
+	rmb();
+
+	/* Set the SMMU info for the preemption */
+	if (a5xx_gpu->smmu_info) {
+		a5xx_gpu->smmu_info->ttbr0 = ring->memptrs->ttbr0;
+		a5xx_gpu->smmu_info->contextidr = 0;
+	}
+
 	/* Set the address of the incoming preemption record */
 	gpu_write64(gpu, REG_A5XX_CP_CONTEXT_SWITCH_RESTORE_ADDR_LO,
 		REG_A5XX_CP_CONTEXT_SWITCH_RESTORE_ADDR_HI,
@@ -214,9 +224,10 @@ void a5xx_preempt_hw_init(struct msm_gpu *gpu)
 		a5xx_gpu->preempt[i]->rbase = gpu->rb[i]->iova;
 	}
 
-	/* Write a 0 to signal that we aren't switching pagetables */
+	/* Tell the CP where to find the smmu_info buffer */
 	gpu_write64(gpu, REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_LO,
-		REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_HI, 0);
+		REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_HI,
+		a5xx_gpu->smmu_info_iova);
 
 	/* Reset the preemption state */
 	set_preempt_state(a5xx_gpu, PREEMPT_NONE);
@@ -275,8 +286,43 @@ void a5xx_preempt_fini(struct msm_gpu *gpu)
 		drm_gem_object_unreference(a5xx_gpu->preempt_bo[i]);
 		a5xx_gpu->preempt_bo[i] = NULL;
 	}
+
+	if (a5xx_gpu->smmu_info_bo) {
+		if (a5xx_gpu->smmu_info_iova)
+			msm_gem_put_iova(a5xx_gpu->smmu_info_bo, gpu->aspace);
+		drm_gem_object_unreference_unlocked(a5xx_gpu->smmu_info_bo);
+		a5xx_gpu->smmu_info_bo = NULL;
+	}
 }
 
+static int a5xx_smmu_info_init(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
+	struct a5xx_smmu_info *ptr;
+	struct drm_gem_object *bo;
+	u64 iova;
+
+	if (!msm_mmu_has_feature(gpu->aspace->mmu,
+			MMU_FEATURE_PER_INSTANCE_TABLES))
+		return 0;
+
+	ptr = msm_gem_kernel_new(gpu->dev, sizeof(struct a5xx_smmu_info),
+		MSM_BO_UNCACHED, gpu->aspace, &bo, &iova);
+
+	if (IS_ERR(ptr))
+		return PTR_ERR(ptr);
+
+	ptr->magic = A5XX_SMMU_INFO_MAGIC;
+
+	a5xx_gpu->smmu_info_bo = bo;
+	a5xx_gpu->smmu_info_iova = iova;
+	a5xx_gpu->smmu_info = ptr;
+
+	return 0;
+}
+
 void a5xx_preempt_init(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -288,17 +334,21 @@ void a5xx_preempt_init(struct msm_gpu *gpu)
 		return;
 
 	for (i = 0; i < gpu->nr_rings; i++) {
-		if (preempt_init_ring(a5xx_gpu, gpu->rb[i])) {
-			/*
-			 * On any failure our adventure is over. Clean up and
-			 * set nr_rings to 1 to force preemption off
-			 */
-			a5xx_preempt_fini(gpu);
-			gpu->nr_rings = 1;
-
-			return;
-		}
+		if (preempt_init_ring(a5xx_gpu, gpu->rb[i]))
+			goto fail;
 	}
 
+	if (a5xx_smmu_info_init(gpu))
+		goto fail;
+
 	timer_setup(&a5xx_gpu->preempt_timer, a5xx_preempt_timer, 0);
+
+	return;
+fail:
+	/*
+	 * On any failure our adventure is over. Clean up and
+	 * set nr_rings to 1 to force preemption off
+	 */
+	a5xx_preempt_fini(gpu);
+	gpu->nr_rings = 1;
 }
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 17d0506d058c..b681edec4560 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -558,6 +558,17 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 	adreno_gpu_config.ioname = "kgsl_3d0_reg_memory";
 	adreno_gpu_config.irqname = "kgsl_3d0_irq";
 
+	if (adreno_is_a5xx(adreno_gpu)) {
+		/*
+		 * If possible use the TTBR1 virtual address space for all the
+		 * "global" buffer objects which are shared between processes.
+		 * This leaves the lower virtual address space open for
+		 * per-instance pagetables, if they are available.
+		 */
+		adreno_gpu_config.va_start_global = 0xfffffff800000000ULL;
+		adreno_gpu_config.va_end_global = 0xfffffff8ffffffffULL;
+	}
+
 	adreno_gpu_config.va_start = SZ_16M;
 	adreno_gpu_config.va_end = 0xffffffff;
 
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index d6b0e7b813f4..dc4b21ea3e65 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -203,6 +203,11 @@ static inline int adreno_is_a530(struct adreno_gpu *gpu)
 	return gpu->revn == 530;
 }
 
+static inline bool adreno_is_a5xx(struct adreno_gpu *gpu)
+{
+	return ((gpu->revn >= 500) && (gpu->revn < 600));
+}
+
 int adreno_get_param(struct msm_gpu *gpu, uint32_t param, uint64_t *value);
 const struct firmware *adreno_request_fw(struct adreno_gpu *adreno_gpu,
 		const char *fwname);
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
index cffce094aecb..fd71484d5894 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.h
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
@@ -26,6 +26,7 @@
 struct msm_rbmemptrs {
 	volatile uint32_t rptr;
 	volatile uint32_t fence;
+	volatile uint64_t ttbr0;
 };
 
 struct msm_ringbuffer {
-- 
2.17.0

* [PATCH 14/16] drm/msm: Support per-instance address spaces
  2018-05-18 21:34 ` Jordan Crouse
@ 2018-05-18 21:34     ` Jordan Crouse
  -1 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Create a per-instance address space when a new DRM file instance is
opened, assuming the target supports it and the underlying
infrastructure exists. If the operation is unsupported, fall back
quietly to using the global pagetable.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/msm_drv.c | 31 ++++++++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 2b663435a3f7..31d1e7589892 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -22,6 +22,7 @@
 #include "msm_fence.h"
 #include "msm_gpu.h"
 #include "msm_kms.h"
+#include "msm_gem.h"
 
 
 /*
@@ -511,7 +512,27 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
 
 	msm_submitqueue_init(dev, ctx);
 
-	ctx->aspace = priv->gpu->aspace;
+	/* FIXME: Do we want a dynamic name of some sort? */
+	/* FIXME: We need a smarter way to set the range based on target */
+
+	ctx->aspace = msm_gem_address_space_create_instance(
+		priv->gpu->aspace->mmu, "gpu", 0x100000000, 0x1ffffffff);
+
+	if (IS_ERR(ctx->aspace)) {
+		int ret = PTR_ERR(ctx->aspace);
+
+		/*
+		 * if per-instance pagetables are not supported, fall back to
+		 * using the generic address space
+		 */
+		if (ret == -EOPNOTSUPP) {
+			ctx->aspace = priv->gpu->aspace;
+		} else {
+			kfree(ctx);
+			return ret;
+		}
+	}
+
 	file->driver_priv = ctx;
 
 	return 0;
@@ -527,8 +548,12 @@ static int msm_open(struct drm_device *dev, struct drm_file *file)
 	return context_init(dev, file);
 }
 
-static void context_close(struct msm_file_private *ctx)
+static void context_close(struct msm_drm_private *priv,
+		struct msm_file_private *ctx)
 {
+	if (ctx && ctx->aspace != priv->gpu->aspace)
+		msm_gem_address_space_put(ctx->aspace);
+
 	msm_submitqueue_close(ctx);
 	kfree(ctx);
 }
@@ -543,7 +568,7 @@ static void msm_postclose(struct drm_device *dev, struct drm_file *file)
 		priv->lastctx = NULL;
 	mutex_unlock(&dev->struct_mutex);
 
-	context_close(ctx);
+	context_close(priv, ctx);
 }
 
 static irqreturn_t msm_irq(int irq, void *arg)
-- 
2.17.0

* [PATCH 15/16] iommu: Gracefully allow drivers to not attach to a default domain
  2018-05-18 21:34 ` Jordan Crouse
@ 2018-05-18 21:34     ` Jordan Crouse
  -1 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:34 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Provide individual device drivers the chance to gracefully refuse
to attach a device to the default domain. If the attach_device
op returns -ENOTSUPP, don't print an error message and don't set
group->domain, but still return success from iommu_group_add_device().

This allows all the usual APIs to work; the next domain that tries
to attach will take group->domain for itself and everything will
proceed as normal.
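
Roughly, the client driver can then take over with something like this
(just a sketch, not part of this patch):

	domain = iommu_domain_alloc(&platform_bus_type);
	if (domain)
		iommu_attach_device(domain, dev);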

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/iommu/iommu.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0ba3d27f2300..a255b5d6c495 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -599,7 +599,7 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 	if (group->domain)
 		ret = __iommu_attach_device(group->domain, dev);
 	mutex_unlock(&group->mutex);
-	if (ret)
+	if (ret && ret != -ENOTSUPP)
 		goto err_put_group;
 
 	/* Notify any listeners about change to group. */
@@ -625,7 +625,8 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 	sysfs_remove_link(&dev->kobj, "iommu_group");
 err_free_device:
 	kfree(device);
-	pr_err("Failed to add device %s to group %d: %d\n", dev_name(dev), group->id, ret);
+	if (ret != -ENOTSUPP)
+		pr_err("Failed to add device %s to group %d: %d\n", dev_name(dev), group->id, ret);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_group_add_device);
@@ -1238,8 +1239,16 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
 
 	ret = iommu_group_add_device(group, dev);
 	if (ret) {
-		iommu_group_put(group);
-		return ERR_PTR(ret);
+		/*
+		 * If the driver chooses not to bind the device, reset
+		 * group->domain so a new domain can be added later
+		 */
+		if (ret == -ENOTSUPP) {
+			group->domain = NULL;
+		} else {
+			iommu_group_put(group);
+			return ERR_PTR(ret);
+		}
 	}
 
 	return group;
-- 
2.17.0

* [PATCH 16/16] iommu/arm-smmu: Add list of devices to opt out of DMA domains
  2018-05-18 21:34 ` Jordan Crouse
@ 2018-05-18 21:35     ` Jordan Crouse
  -1 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-05-18 21:35 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: jean-philippe.brucker-5wv7dgnIgG8,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Add a list of compatible strings for devices that wish to opt out
of attaching to a DMA domain.  This is for devices that prefer to
manage their own IOMMU space for any number of reasons. Returning
-ENOTSUPP from the attach_dev op will filter down and force
arch_setup_dma_ops() to not set up the IOMMU DMA ops. The client
device in question can later set up and attach its own domain.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/iommu/arm-smmu.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 100797a07be0..df6e4eacf727 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1080,6 +1080,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		goto out_unlock;
 
 	cfg->cbndx = ret;
+
 	if (smmu->version < ARM_SMMU_V2) {
 		cfg->irptndx = atomic_inc_return(&smmu->irptndx);
 		cfg->irptndx %= smmu->num_context_irqs;
@@ -1450,6 +1451,15 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 	return 0;
 }
 
+/*
+ * This is a list of compatible strings for devices that wish to manage their
+ * own IOMMU space instead of the DMA IOMMU ops. Devices on this list will not
+ * allow themselves to be attached to an IOMMU_DOMAIN_DMA domain.
+ */
+static const char *arm_smmu_dma_blacklist[] = {
+	"qcom,adreno",
+};
+
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret;
@@ -1472,6 +1482,20 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	if (!fwspec->iommu_priv)
 		return -ENODEV;
 
+	/*
+	 * If this is the default DMA domain, check to see if the device is on
+	 * the blacklist and reject it if so
+	 */
+	if (domain->type == IOMMU_DOMAIN_DMA && dev->of_node) {
+		int i;
+
+		for (i = 0; i < ARRAY_SIZE(arm_smmu_dma_blacklist); i++) {
+			if (of_device_is_compatible(dev->of_node,
+				arm_smmu_dma_blacklist[i]))
+				return -ENOTSUPP;
+		}
+	}
+
 	smmu = fwspec_smmu(fwspec);
 	/* Ensure that the domain is finalised */
 	ret = arm_smmu_init_domain_context(domain, smmu);
-- 
2.17.0

* Re: [PATCH 04/16] iommu: sva: Add support for private PASIDs
  2018-05-18 21:34     ` Jordan Crouse
@ 2018-07-17 11:21         ` Jean-Philippe Brucker
  -1 siblings, 0 replies; 38+ messages in thread
From: Jean-Philippe Brucker @ 2018-07-17 11:21 UTC (permalink / raw)
  To: Jordan Crouse, freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	joro-zLv9SwRftAIdnm+yROfE0A,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	tfiga-F7+t8E8rja9g9hUCZPvPmw,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

Hi Jordan,

Thanks for the patches, I finally got around to testing them with SMMUv3.
It's an important feature, arguably more than SVA itself. I could pick
this one as part of the SVA series, what do you think?

Although I probably would have done the same, I dislike the interface
because it forces us to duplicate functions and IOMMU ops. The list is
small but growing:

iommu_map
iommu_map_sg
iommu_unmap
iommu_unmap_fast
iommu_iova_to_phys
iommu_tlb_range_add
iommu_flush_tlb_all

Each of these and their associated IOMMU op will have an iommu_sva_X
counterpart that takes one different argument. Modifying these functions
to take both a domain and a PASID argument would be more elegant. Or as
an intermediate solution, perhaps we could only change the IOMMU ops to
take an additional argument, like you did for map_sg?

In any case it requires invasive changes in lots of drivers and we can
always tidy up later, so unless Joerg has a preference I'd keep the
duplicates for now.

However, having to look up pasid-to-io_mm on every map/unmap call is
cumbersome, especially since map/unmap are supposed to be as fast as
possible. iommu_sva_alloc_pasid should return a structure representing
the PASID instead of the value alone. The io_mm structure seems like a
good fit, and the device driver can access io_mm->pasid directly or via
an io_mm_get_pasid() function.

The new functions would then be:

struct io_mm *iommu_sva_alloc_pasid(domain, dev)
void iommu_sva_free_pasid(domain, io_mm)

int iommu_sva_map(io_mm, iova, paddr, size, prot)
size_t iommu_sva_map_sg(io_mm, iova, sg, nents, prot)
size_t iommu_sva_unmap(io_mm, iova, size)
size_t iommu_sva_unmap_fast(io_mm, iova, size)
phys_addr_t iommu_sva_iova_to_phys(io_mm, iova)
void iommu_sva_flush_tlb_all(io_mm)
void iommu_sva_tlb_range_add(io_mm, iova, size)
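
For illustration, a rough sketch of the driver side with that interface
(untested, and the names are only the proposal above; hw_set_pasid() is a
made-up device-specific hook):

	struct io_mm *io_mm;

	io_mm = iommu_sva_alloc_pasid(domain, dev);
	if (IS_ERR(io_mm))
		return PTR_ERR(io_mm);

	ret = iommu_sva_map(io_mm, iova, paddr, size,
			    IOMMU_READ | IOMMU_WRITE);
	...
	hw_set_pasid(dev, io_mm_get_pasid(io_mm));
	...
	iommu_sva_unmap(io_mm, iova, size);
	iommu_sva_free_pasid(domain, io_mm);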

A few more comments inline

On 18/05/18 22:34, Jordan Crouse wrote:
> Some older SMMU implementations that do not have fully featured
> hardware PASID support have alternate workarounds for using multiple
> pagetables. For example, MSM GPUs have logic to automatically switch the
> user pagetable from hardware by writing the context bank directly.

The comment may be a bit too specific; sva_map/sva_unmap is also useful
for PASID-capable IOMMUs

> Support private PASIDs by creating a new io-pgtable instance, mapping it
> to a PASID and providing the APIs for drivers to populate it manually.
> 
> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
[...]
> +int iommu_sva_alloc_pasid(struct iommu_domain *domain, struct device *dev)
> +{
> +	int ret, pasid;
> +	struct io_mm *io_mm;
> +	struct iommu_sva_param *param = dev->iommu_param->sva_param;

We need a NULL check on the param, to ensure that the driver called
sva_device_init first.
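
Something like this, assuming -EINVAL is the error we want:

	if (!param)
		return -EINVAL;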

> +
> +	if (!domain->ops->mm_attach || !domain->ops->mm_detach)
> +		return -ENODEV;
> +
> +	if (domain->ops->mm_alloc)

I'd rather make mm_alloc and mm_free mandatory, but if we do make them
optional, then we need to check that both mm_alloc and mm_free are
present, or both absent.
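
e.g. something like (untested):

	/* mm_alloc and mm_free must come as a pair */
	if (!domain->ops->mm_alloc != !domain->ops->mm_free)
		return -EINVAL;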

> +		io_mm = domain->ops->mm_alloc(domain, NULL, 0);
> +	else
> +		io_mm = kzalloc(sizeof(*io_mm), GFP_KERNEL);
> +
> +	if (IS_ERR(io_mm))
> +		return PTR_ERR(io_mm);
> +	if (!io_mm)
> +		return -ENOMEM;
> +
> +	io_mm->domain = domain;
> +	io_mm->type = IO_TYPE_PRIVATE;

This could be an IOMMU_SVA_FEAT_PRIVATE flag

> +
> +	idr_preload(GFP_KERNEL);
> +	spin_lock(&iommu_sva_lock);
> +	pasid = idr_alloc_cyclic(&iommu_pasid_idr, io_mm, param->min_pasid,
> +		param->max_pasid + 1, GFP_ATOMIC);
> +	io_mm->pasid = pasid;
> +	spin_unlock(&iommu_sva_lock);
> +	idr_preload_end();
> +
> +	if (pasid < 0) {
> +		kfree(io_mm);
> +		return pasid;
> +	}
> +
> +	ret = domain->ops->mm_attach(domain, dev, io_mm, false);

attach_domain should be true, otherwise the SMMUv3 driver won't write
the PASID table. But we should probably go through io_mm_attach here, to
make sure that PASID contexts are added to the mm list and cleaned up by
unbind_dev_all()

> +size_t iommu_sva_unmap(int pasid, unsigned long iova, size_t size)
> +{
> +	struct io_mm *io_mm = get_io_mm(pasid);
> +
> +	if (!io_mm || io_mm->type != IO_TYPE_PRIVATE)
> +		return -ENODEV;
> +
> +	return __iommu_unmap(io_mm->domain, &pasid, iova, size, false);

sync must be true here, and false in the unmap_fast() variant

> +}
> +EXPORT_SYMBOL_GPL(iommu_sva_unmap);
> +
> +void iommu_sva_free_pasid(int pasid, struct device *dev)
> +{
> +	struct io_mm *io_mm = get_io_mm(pasid);
> +	struct iommu_domain *domain;
> +
> +	if (!io_mm || io_mm->type != IO_TYPE_PRIVATE)
> +		return;
> +
> +	domain = io_mm->domain;
> +
> +	domain->ops->mm_detach(domain, dev, io_mm, false);

Here too detach_domain should be true

> @@ -1841,16 +1854,23 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova,
>  
>  	/* unroll mapping in case something went wrong */
>  	if (ret)
> -		iommu_unmap(domain, orig_iova, orig_size - size);
> +		__iommu_unmap(domain, pasid, orig_iova, orig_size - size,
> +			pasid ? false : true);

sync should be true

> -	if (unlikely(ops->unmap == NULL ||
> -		     domain->pgsize_bitmap == 0UL))
> -		return 0;
> +	if (unlikely(domain->pgsize_bitmap == 0UL))
> +		return -0;

spurious '-'

Thanks,
Jean

* Re: [PATCH 04/16] iommu: sva: Add support for private PASIDs
  2018-07-17 11:21         ` Jean-Philippe Brucker
@ 2018-07-17 20:19             ` Jordan Crouse
  -1 siblings, 0 replies; 38+ messages in thread
From: Jordan Crouse @ 2018-07-17 20:19 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r

On Tue, Jul 17, 2018 at 12:21:03PM +0100, Jean-Philippe Brucker wrote:
> Hi Jordan,
> 
> Thanks for the patches, I finally got around to testing them with SMMUv3.
> It's an important feature, arguably more than SVA itself. I could pick
> this one as part of the SVA series, what do you think?

I'm good with whatever is the easiest.

> Although I probably would have done the same, I dislike the interface
> because it forces us to duplicate functions and IOMMU ops. The list is
> small but growing:
> 
> iommu_map
> iommu_map_sg
> iommu_unmap
> iommu_unmap_fast
> iommu_iova_to_phys
> iommu_tlb_range_add
> iommu_flush_tlb_all
> 
> Each of these and their associated IOMMU op will have an iommu_sva_X
> counterpart that takes one different argument. Modifying these functions
> to take both a domain and a PASID argument would be more elegant. Or as
> an intermediate solution, perhaps we could only change the IOMMU ops to
> take an additional argument, like you did for map_sg?
> 
> In any case it requires invasive changes in lots of drivers and we can
> always tidy up later, so unless Joerg has a preference I'd keep the
> duplicates for now.

I agree.

> However, having to look up pasid-to-io_mm on every map/unmap call is
> cumbersome, especially since map/unmap are supposed to be as fast as
> possible. iommu_sva_alloc_pasid should return a structure representing
> the PASID instead of the value alone. The io_mm structure seems like a
> good fit, and the device driver can access io_mm->pasid directly or via
> an io_mm_get_pasid() function.
> 
> The new functions would then be:
> 
> struct io_mm *iommu_sva_alloc_pasid(domain, dev)
> void iommu_sva_free_pasid(domain, io_mm)
> 
> int iommu_sva_map(io_mm, iova, paddr, size, prot)
> size_t iommu_sva_map_sg(io_mm, iova, sg, nents, prot)
> size_t iommu_sva_unmap(io_mm, iova, size)
> size_t iommu_sva_unmap_fast(io_mm, iova, size)
> phys_addr_t iommu_sva_iova_to_phys(io_mm, iova)
> void iommu_sva_flush_tlb_all(io_mm)
> void iommu_sva_tlb_range_add(io_mm, iova, size)

Okay - this sounds reasonable. A simplification like that could
even let us make all the new functions static inlines, which
would cut down on the exported symbols.
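
Roughly (just a sketch, reusing the __iommu_unmap() helper from this
patch and assuming io_mm keeps the domain and pasid, with sync forced
to true per your earlier comment):

	static inline size_t iommu_sva_unmap(struct io_mm *io_mm,
			unsigned long iova, size_t size)
	{
		return __iommu_unmap(io_mm->domain, &io_mm->pasid, iova,
				size, true);
	}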

> A few more comments inline

All those sound like good ideas to me. I'll take a bit of time to bash on this
and send out an updated revision soonish.

Jordan

<snip>

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

end of thread, other threads:[~2018-07-17 20:19 UTC | newest]

Thread overview: 38+ messages
2018-05-18 21:34 [RFC v2 00/16] Private PASID and per-instance pagetables Jordan Crouse
2018-05-18 21:34 ` Jordan Crouse
2018-05-18 21:34 ` [PATCH 08/16] drm/msm: Pass the MMU domain index in struct msm_file_private Jordan Crouse
2018-05-18 21:34   ` Jordan Crouse
     [not found] ` <20180518213500.31595-1-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2018-05-18 21:34   ` [PATCH 01/16] iommu: Add DOMAIN_ATTR_SPLIT_TABLES Jordan Crouse
2018-05-18 21:34     ` Jordan Crouse
2018-05-18 21:34   ` [PATCH 02/16] iommu/arm-smmu: Add split pagetable support for arm-smmu-v2 Jordan Crouse
2018-05-18 21:34     ` Jordan Crouse
2018-05-18 21:34   ` [PATCH 03/16] iommu/io-pgtable-arm: Remove ttbr[1] from io_pgtbl_cfg Jordan Crouse
2018-05-18 21:34     ` Jordan Crouse
2018-05-18 21:34   ` [PATCH 04/16] iommu: sva: Add support for private PASIDs Jordan Crouse
2018-05-18 21:34     ` Jordan Crouse
     [not found]     ` <20180518213500.31595-5-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2018-07-17 11:21       ` Jean-Philippe Brucker
2018-07-17 11:21         ` Jean-Philippe Brucker
     [not found]         ` <c87ae21f-ac02-17fe-6d11-48d2840911e1-5wv7dgnIgG8@public.gmane.org>
2018-07-17 20:19           ` Jordan Crouse
2018-07-17 20:19             ` Jordan Crouse
2018-05-18 21:34   ` [PATCH 05/16] iommu: arm-smmu: " Jordan Crouse
2018-05-18 21:34     ` Jordan Crouse
2018-05-18 21:34   ` [PATCH 06/16] iommu: arm-smmu: Add side-band function for specific PASID callbacks Jordan Crouse
2018-05-18 21:34     ` Jordan Crouse
2018-05-18 21:34   ` [PATCH 07/16] drm/msm/gpu: Enable 64 bit mode by default Jordan Crouse
2018-05-18 21:34     ` Jordan Crouse
2018-05-18 21:34   ` [PATCH 09/16] drm/msm/gpu: Support using split page tables for kernel buffer objects Jordan Crouse
2018-05-18 21:34     ` Jordan Crouse
2018-05-18 21:34   ` [PATCH 10/16] drm/msm: Add msm_mmu features Jordan Crouse
2018-05-18 21:34     ` Jordan Crouse
2018-05-18 21:34   ` [PATCH 11/16] drm/msm: Add support for iommu-sva PASIDs Jordan Crouse
2018-05-18 21:34     ` Jordan Crouse
2018-05-18 21:34   ` [PATCH 12/16] drm/msm: Add support for per-instance address spaces Jordan Crouse
2018-05-18 21:34     ` Jordan Crouse
2018-05-18 21:34   ` [PATCH 13/16] drm/msm/a5xx: Support per-instance pagetables Jordan Crouse
2018-05-18 21:34     ` Jordan Crouse
2018-05-18 21:34   ` [PATCH 14/16] drm/msm: Support per-instance address spaces Jordan Crouse
2018-05-18 21:34     ` Jordan Crouse
2018-05-18 21:34   ` [PATCH 15/16] iommu: Gracefully allow drivers to not attach to a default domain Jordan Crouse
2018-05-18 21:34     ` Jordan Crouse
2018-05-18 21:35   ` [PATCH 16/16] iommu/arm-smmu: Add list of devices to opt out of DMA domains Jordan Crouse
2018-05-18 21:35     ` Jordan Crouse
