* [PATCH 0/7] RFC: iommu/arm-smmu-v2: Dynamic domains
@ 2017-03-07 16:39 Jordan Crouse
       [not found] ` <1488904795-870-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  0 siblings, 1 reply; 11+ messages in thread
From: Jordan Crouse @ 2017-03-07 16:39 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA

Pursuant to the arm-smmu-v3 SVM support:

https://lists.linuxfoundation.org/pipermail/iommu/2017-February/020599.html

I thought it would be helpful to demonstrate how Qualcomm has implemented
per-process pagetables across several generations of SoCs and GPUs, focusing
here on the Adreno A540 GPU and an arm-smmu-v2 IOMMU on the Snapdragon 820 SoC.

The requirement is to implement per-process GPU address spaces for security
reasons. Though some very crude SVM support is possible, we focus mainly on
individual address spaces that are maintained and mapped by the GPU driver.

In a nutshell, the solution is to create special virtual or "dynamic" domains
that are associated with a real domain. The dynamic domains allocate pagetables
but do not reprogram the hardware. When a command is submitted, the kernel
driver provides the physical address of the pagetable (TTBR0) to the GPU, which
reprograms the TTBR0 register in context bank 0 of the GPMU SMMU on the fly (and
does the requisite flushing and stalling).

The TTBR1 address space is used to maintain a split between the process and the
global GPU buffers (ringbuffers, etc). This greatly facilitates the switching
process.

In more detail this is the workflow:

 - The kernel driver attaches an UNMANAGED domain to context bank 0

 - Global GPU buffers are allocated in the TTBR1 address space
 
 - Each new process creates a dynamic domain cloned from the "real" domain

 - New buffers for the process are mapped into the dynamic domain

 - The kernel driver gets the TTBR0/ASID register value from the dynamic domain
   via an attribute

 - At command submission time, the kernel driver sends the TTBR0/ASID value to
   the GPU before the command. The GPU switches the pagetable by programming
   the SMMU hardware before executing the command.
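
For illustration only, and not part of the series: the workflow above can be
modelled in a few lines of userspace C. Every name below (model_domain,
model_attach, model_create_dynamic, model_gpu_switch) is hypothetical, the
TTBR0 value is treated as an opaque number, and the flushing and stalling the
GPU has to do are left out. It is a sketch of the idea under those
assumptions, not of the actual driver code.

```c
#include <stdint.h>

/* Hypothetical userspace model; these names are not kernel API. */
struct model_domain {
	uint64_t ttbr0;   /* opaque TTBR0/ASID value for this pagetable */
	int dynamic;      /* nonzero: owns pagetables, never programs hardware */
};

struct model_context_bank {
	uint64_t ttbr0;   /* stand-in for the TTBR0 register of context bank 0 */
};

/* Attaching programs the hardware, so only a real domain may attach. */
static int model_attach(struct model_context_bank *cb,
			const struct model_domain *d)
{
	if (d->dynamic)
		return -1;
	cb->ttbr0 = d->ttbr0;
	return 0;
}

/* A new process gets a dynamic domain backed by its own pagetable. */
static struct model_domain model_create_dynamic(uint64_t ttbr0)
{
	struct model_domain child = { .ttbr0 = ttbr0, .dynamic = 1 };

	return child;
}

/* At submission time the GPU, not the kernel, rewrites TTBR0. */
static void model_gpu_switch(struct model_context_bank *cb, uint64_t ttbr0)
{
	cb->ttbr0 = ttbr0;
}
```

In this model the kernel side only ever calls model_attach() for the real
domain; a per-process pagetable reaches the hardware solely through
model_gpu_switch(), which mirrors the split described in the list above.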

I'll be uploading the series to implement this in the MSM DRM driver to show how
it works from the GPU perspective. I'm adding it as a separate thread to avoid
crossing the streams and confusing folks - I'll reply to this email with a link.

Obviously there are some similarities with Jean-Philippe's code and I think it's
worth having the discussion about ways we can merge the concepts on that thread.
There are a few barriers to overcome but in general I think we can find a way
forward.

Please review and comment if you like, or just follow along.

Thanks!
Jordan

Jeremy Gebben (2):
  iommu: introduce TTBR0 domain attribute
  iommu/arm-smmu: add support for TTBR0 attribute

Jordan Crouse (4):
  iommu: Add DOMAIN_ATTR_ENABLE_TTBR1
  iommu/arm-smmu: Add support for TTBR1
  iommu: Add dynamic domains
  iommu/arm-smmu: add support for dynamic domains

Mitchel Humpherys (1):
  iommu/arm-smmu: save the pgtbl_cfg in the domain

 drivers/iommu/arm-smmu.c       | 198 +++++++++++++++++++++++++++++++++++------
 drivers/iommu/io-pgtable-arm.c | 168 ++++++++++++++++++++++++++++++----
 drivers/iommu/io-pgtable.h     |   6 ++
 drivers/iommu/iommu.c          |  37 ++++++++
 include/linux/iommu.h          |  19 +++-
 5 files changed, 382 insertions(+), 46 deletions(-)

-- 
1.9.1

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno


* [PATCH 1/7] iommu/arm-smmu: save the pgtbl_cfg in the domain
       [not found] ` <1488904795-870-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2017-03-07 16:39   ` Jordan Crouse
  2017-03-07 16:39   ` [PATCH 2/7] iommu: Add DOMAIN_ATTR_ENABLE_TTBR1 Jordan Crouse
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Jordan Crouse @ 2017-03-07 16:39 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA, Mitchel Humpherys

From: Mitchel Humpherys <mitchelh@codeaurora.org>

The pgtbl_cfg object has a few handy properties that we'd like to make
use of later (returning the pgd in a domain attribute, for example).
Keep track of the domain pgtbl_cfg in the domain structure.

Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
---
 drivers/iommu/arm-smmu.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 57bcf14..c47f883 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -409,6 +409,7 @@ enum arm_smmu_domain_stage {
 struct arm_smmu_domain {
 	struct arm_smmu_device		*smmu;
 	struct io_pgtable_ops		*pgtbl_ops;
+	struct io_pgtable_cfg		pgtbl_cfg;
 	spinlock_t			pgtbl_lock;
 	struct arm_smmu_cfg		cfg;
 	enum arm_smmu_domain_stage	stage;
@@ -840,7 +841,6 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	int irq, start, ret = 0;
 	unsigned long ias, oas;
 	struct io_pgtable_ops *pgtbl_ops;
-	struct io_pgtable_cfg pgtbl_cfg;
 	enum io_pgtable_fmt fmt;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
@@ -952,7 +952,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		cfg->irptndx = cfg->cbndx;
 	}
 
-	pgtbl_cfg = (struct io_pgtable_cfg) {
+	smmu_domain->pgtbl_cfg = (struct io_pgtable_cfg) {
 		.pgsize_bitmap	= smmu->pgsize_bitmap,
 		.ias		= ias,
 		.oas		= oas,
@@ -961,19 +961,20 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	};
 
 	smmu_domain->smmu = smmu;
-	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
+	pgtbl_ops = alloc_io_pgtable_ops(fmt, &smmu_domain->pgtbl_cfg,
+					 smmu_domain);
 	if (!pgtbl_ops) {
 		ret = -ENOMEM;
 		goto out_clear_smmu;
 	}
 
 	/* Update the domain's page sizes to reflect the page table format */
-	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
+	domain->pgsize_bitmap = smmu_domain->pgtbl_cfg.pgsize_bitmap;
 	domain->geometry.aperture_end = (1UL << ias) - 1;
 	domain->geometry.force_aperture = true;
 
 	/* Initialise the context bank with our page table cfg */
-	arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
+	arm_smmu_init_context_bank(smmu_domain, &smmu_domain->pgtbl_cfg);
 
 	/*
 	 * Request context fault interrupt. Do this last to avoid the
-- 
1.9.1


* [PATCH 2/7] iommu: Add DOMAIN_ATTR_ENABLE_TTBR1
       [not found] ` <1488904795-870-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  2017-03-07 16:39   ` [PATCH 1/7] iommu/arm-smmu: save the pgtbl_cfg in the domain Jordan Crouse
@ 2017-03-07 16:39   ` Jordan Crouse
  2017-03-07 16:39   ` [PATCH 3/7] iommu/arm-smmu: Add support for TTBR1 Jordan Crouse
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Jordan Crouse @ 2017-03-07 16:39 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA

Add a new domain attribute to enable the TTBR1 pagetable for drivers
and devices that support it.  This enables a TTBR1 (otherwise known as
a "global" or "system") pagetable for devices that support a split
pagetable scheme for switching pagetables quickly and safely.

Signed-off-by: Jordan Crouse <jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
---
 include/linux/iommu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 436dc21..d537cc9 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -114,6 +114,7 @@ enum iommu_attr {
 	DOMAIN_ATTR_FSL_PAMU_ENABLE,
 	DOMAIN_ATTR_FSL_PAMUV1,
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
+	DOMAIN_ATTR_ENABLE_TTBR1,
 	DOMAIN_ATTR_MAX,
 };
 
-- 
1.9.1


* [PATCH 3/7] iommu/arm-smmu: Add support for TTBR1
       [not found] ` <1488904795-870-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  2017-03-07 16:39   ` [PATCH 1/7] iommu/arm-smmu: save the pgtbl_cfg in the domain Jordan Crouse
  2017-03-07 16:39   ` [PATCH 2/7] iommu: Add DOMAIN_ATTR_ENABLE_TTBR1 Jordan Crouse
@ 2017-03-07 16:39   ` Jordan Crouse
  2017-03-07 16:39   ` [PATCH 4/7] iommu: introduce TTBR0 domain attribute Jordan Crouse
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Jordan Crouse @ 2017-03-07 16:39 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA

Allow an SMMU device to opt into allocating a TTBR1 pagetable.

The TTBR1 region will be the same size as the TTBR0 region, with
the sign extension bit set on the highest bit in the region. The
exception is an upstream size of 49 bits, in which case the
sign-extension bit is the 49th bit (bit 48).

The map/unmap operations will automatically use the appropriate
pagetable based on the specified iova and the existing mask.
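
As a rough illustration of that selection rule (a standalone sketch, not the
code from this patch): the helpers below pick TTBR0 or TTBR1 from the
sign-extension bit of the iova. The sketch_ names are made up for the example;
the rule itself (bit 48 when ias is 48, otherwise bit ias - 1) follows
arm_lpae_get_table() in the diff below.

```c
#include <stdint.h>

/*
 * Index of the sign-extension bit for a given input address size.
 * Follows the rule used in the patch: an ias of 48 uses bit 48,
 * anything else uses the top bit of the region (ias - 1).
 */
static inline unsigned int sketch_sep_bit(unsigned int ias)
{
	return (ias == 48) ? 48 : ias - 1;
}

/*
 * Returns 0 if the iova belongs to the TTBR0 pagetable and 1 if it
 * belongs to the TTBR1 pagetable. With TTBR1 disabled everything
 * goes to TTBR0.
 */
static inline int sketch_select_table(uint64_t iova, unsigned int ias,
				      int ttbr1_enabled)
{
	if (!ttbr1_enabled)
		return 0;

	return (int)((iova >> sketch_sep_bit(ias)) & 1);
}
```

With a 32-bit ias this puts 0x80000000 and above in the TTBR1 (global)
pagetable and everything below it in the TTBR0 (per-process) pagetable.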

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/iommu/arm-smmu.c       |  19 ++++-
 drivers/iommu/io-pgtable-arm.c | 168 +++++++++++++++++++++++++++++++++++++----
 drivers/iommu/io-pgtable.h     |   6 ++
 3 files changed, 173 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index c47f883..2e3879f 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -256,9 +256,6 @@ enum arm_smmu_s2cr_privcfg {
 #define RESUME_RETRY			(0 << 0)
 #define RESUME_TERMINATE		(1 << 0)
 
-#define TTBCR2_SEP_SHIFT		15
-#define TTBCR2_SEP_UPSTREAM		(0x7 << TTBCR2_SEP_SHIFT)
-
 #define TTBRn_ASID_SHIFT		48
 
 #define FSR_MULTI			(1 << 31)
@@ -414,6 +411,7 @@ struct arm_smmu_domain {
 	struct arm_smmu_cfg		cfg;
 	enum arm_smmu_domain_stage	stage;
 	struct mutex			init_mutex; /* Protects smmu pointer */
+	u32 attributes;
 	struct iommu_domain		domain;
 };
 
@@ -803,7 +801,6 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 		} else {
 			reg = pgtbl_cfg->arm_lpae_s1_cfg.tcr;
 			reg2 = pgtbl_cfg->arm_lpae_s1_cfg.tcr >> 32;
-			reg2 |= TTBCR2_SEP_UPSTREAM;
 		}
 		if (smmu->version > ARM_SMMU_V1)
 			writel_relaxed(reg2, cb_base + ARM_SMMU_CB_TTBCR2);
@@ -844,6 +841,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	enum io_pgtable_fmt fmt;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	unsigned int quirks =
+		smmu_domain->attributes & (1 << DOMAIN_ATTR_ENABLE_TTBR1) ?
+			IO_PGTABLE_QUIRK_ARM_TTBR1 : 0;
 
 	mutex_lock(&smmu_domain->init_mutex);
 	if (smmu_domain->smmu)
@@ -953,6 +953,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	}
 
 	smmu_domain->pgtbl_cfg = (struct io_pgtable_cfg) {
+		.quirks		= quirks,
 		.pgsize_bitmap	= smmu->pgsize_bitmap,
 		.ias		= ias,
 		.oas		= oas,
@@ -1539,6 +1540,10 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 	case DOMAIN_ATTR_NESTING:
 		*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
 		return 0;
+	case DOMAIN_ATTR_ENABLE_TTBR1:
+		*((int *)data) = !!(smmu_domain->attributes
+					& (1 << DOMAIN_ATTR_ENABLE_TTBR1));
+		return 0;
 	default:
 		return -ENODEV;
 	}
@@ -1565,6 +1570,12 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
 			smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 
 		break;
+	case DOMAIN_ATTR_ENABLE_TTBR1:
+		if (*((int *)data))
+			smmu_domain->attributes |=
+				1 << DOMAIN_ATTR_ENABLE_TTBR1;
+		ret = 0;
+		break;
 	default:
 		ret = -ENODEV;
 	}
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index f5c90e1..110a691 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -124,14 +124,21 @@
 #define ARM_LPAE_TCR_TG0_64K		(1 << 14)
 #define ARM_LPAE_TCR_TG0_16K		(2 << 14)
 
+#define ARM_LPAE_TCR_TG1_16K            1ULL
+#define ARM_LPAE_TCR_TG1_4K             2ULL
+#define ARM_LPAE_TCR_TG1_64K            3ULL
+
 #define ARM_LPAE_TCR_SH0_SHIFT		12
 #define ARM_LPAE_TCR_SH0_MASK		0x3
+#define ARM_LPAE_TCR_SH1_SHIFT		28
 #define ARM_LPAE_TCR_SH_NS		0
 #define ARM_LPAE_TCR_SH_OS		2
 #define ARM_LPAE_TCR_SH_IS		3
 
 #define ARM_LPAE_TCR_ORGN0_SHIFT	10
+#define ARM_LPAE_TCR_ORGN1_SHIFT	26
 #define ARM_LPAE_TCR_IRGN0_SHIFT	8
+#define ARM_LPAE_TCR_IRGN1_SHIFT	24
 #define ARM_LPAE_TCR_RGN_MASK		0x3
 #define ARM_LPAE_TCR_RGN_NC		0
 #define ARM_LPAE_TCR_RGN_WBWA		1
@@ -144,6 +151,9 @@
 #define ARM_LPAE_TCR_T0SZ_SHIFT		0
 #define ARM_LPAE_TCR_SZ_MASK		0xf
 
+#define ARM_LPAE_TCR_T1SZ_SHIFT         16
+#define ARM_LPAE_TCR_T1SZ_MASK          0x3f
+
 #define ARM_LPAE_TCR_PS_SHIFT		16
 #define ARM_LPAE_TCR_PS_MASK		0x7
 
@@ -157,6 +167,16 @@
 #define ARM_LPAE_TCR_PS_44_BIT		0x4ULL
 #define ARM_LPAE_TCR_PS_48_BIT		0x5ULL
 
+#define ARM_LPAE_TCR_SEP_SHIFT		(15 + 32)
+
+#define ARM_LPAE_TCR_SEP_31		0x0ULL
+#define ARM_LPAE_TCR_SEP_35		0x1ULL
+#define ARM_LPAE_TCR_SEP_39		0x2ULL
+#define ARM_LPAE_TCR_SEP_41		0x3ULL
+#define ARM_LPAE_TCR_SEP_43		0x4ULL
+#define ARM_LPAE_TCR_SEP_47		0x5ULL
+#define ARM_LPAE_TCR_SEP_UPSTREAM	0x7ULL
+
 #define ARM_LPAE_MAIR_ATTR_SHIFT(n)	((n) << 3)
 #define ARM_LPAE_MAIR_ATTR_MASK		0xff
 #define ARM_LPAE_MAIR_ATTR_DEVICE	0x04
@@ -195,7 +215,7 @@ struct arm_lpae_io_pgtable {
 	unsigned long		pg_shift;
 	unsigned long		bits_per_level;
 
-	void			*pgd;
+	void			*pgd[2];
 };
 
 typedef u64 arm_lpae_iopte;
@@ -381,14 +401,38 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
 	return pte;
 }
 
+static inline arm_lpae_iopte *arm_lpae_get_table(
+		struct arm_lpae_io_pgtable *data, unsigned long iova)
+{
+	struct io_pgtable_cfg *cfg = &data->iop.cfg;
+
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)  {
+		unsigned long mask;
+
+		/*
+		 * IAS of 48 means sign extension bit 48, otherwise it is the
+		 * ias - 1 (32 -> bit 31, etc)
+		 */
+		mask = (cfg->ias == 48) ? (1UL << 48) :
+			(1UL << (cfg->ias - 1));
+
+		if (iova & mask)
+			return data->pgd[1];
+	}
+
+	return data->pgd[0];
+}
+
 static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 			phys_addr_t paddr, size_t size, int iommu_prot)
 {
 	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
-	arm_lpae_iopte *ptep = data->pgd;
+	arm_lpae_iopte *ptep;
 	int ret, lvl = ARM_LPAE_START_LVL(data);
 	arm_lpae_iopte prot;
 
+	ptep = arm_lpae_get_table(data, iova);
+
 	/* If no access, then nothing to do */
 	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
 		return 0;
@@ -439,7 +483,10 @@ static void arm_lpae_free_pgtable(struct io_pgtable *iop)
 {
 	struct arm_lpae_io_pgtable *data = io_pgtable_to_data(iop);
 
-	__arm_lpae_free_pgtable(data, ARM_LPAE_START_LVL(data), data->pgd);
+	__arm_lpae_free_pgtable(data, ARM_LPAE_START_LVL(data), data->pgd[0]);
+	if (data->pgd[1])
+		__arm_lpae_free_pgtable(data, ARM_LPAE_START_LVL(data),
+			data->pgd[1]);
 	kfree(data);
 }
 
@@ -535,9 +582,11 @@ static int arm_lpae_unmap(struct io_pgtable_ops *ops, unsigned long iova,
 {
 	size_t unmapped;
 	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
-	arm_lpae_iopte *ptep = data->pgd;
+	arm_lpae_iopte *ptep;
 	int lvl = ARM_LPAE_START_LVL(data);
 
+	ptep = arm_lpae_get_table(data, iova);
+
 	unmapped = __arm_lpae_unmap(data, iova, size, lvl, ptep);
 	if (unmapped)
 		io_pgtable_tlb_sync(&data->iop);
@@ -549,9 +598,11 @@ static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
 					 unsigned long iova)
 {
 	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
-	arm_lpae_iopte pte, *ptep = data->pgd;
+	arm_lpae_iopte pte, *ptep;
 	int lvl = ARM_LPAE_START_LVL(data);
 
+	ptep = arm_lpae_get_table(data, iova);
+
 	do {
 		/* Valid IOPTE pointer? */
 		if (!ptep)
@@ -660,13 +711,81 @@ static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
 	return data;
 }
 
+static u64 arm64_lpae_setup_ttbr1(struct io_pgtable_cfg *cfg,
+		struct arm_lpae_io_pgtable *data)
+
+{
+	u64 reg;
+
+	/* If TTBR1 is disabled, disable speculative walks through the TTBR1 */
+	if (!(cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)) {
+		reg = ARM_LPAE_TCR_EPD1;
+		reg |= (ARM_LPAE_TCR_SEP_UPSTREAM << ARM_LPAE_TCR_SEP_SHIFT);
+		return reg;
+	}
+
+	reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH1_SHIFT) |
+	      (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN1_SHIFT) |
+	      (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN1_SHIFT);
+
+	switch (1 << data->pg_shift) {
+	case SZ_4K:
+		reg |= (ARM_LPAE_TCR_TG1_4K << 30);
+		break;
+	case SZ_16K:
+		reg |= (ARM_LPAE_TCR_TG1_16K << 30);
+		break;
+	case SZ_64K:
+		reg |= (ARM_LPAE_TCR_TG1_64K << 30);
+		break;
+	}
+
+	/* Set T1SZ */
+	reg |= (64ULL - cfg->ias) << ARM_LPAE_TCR_T1SZ_SHIFT;
+
+	/* Set the SEP bit based on the size */
+	switch (cfg->ias) {
+	case 32:
+		reg |= (ARM_LPAE_TCR_SEP_31 << ARM_LPAE_TCR_SEP_SHIFT);
+		break;
+	case 36:
+		reg |= (ARM_LPAE_TCR_SEP_35 << ARM_LPAE_TCR_SEP_SHIFT);
+		break;
+	case 40:
+		reg |= (ARM_LPAE_TCR_SEP_39 << ARM_LPAE_TCR_SEP_SHIFT);
+		break;
+	case 42:
+		reg |= (ARM_LPAE_TCR_SEP_41 << ARM_LPAE_TCR_SEP_SHIFT);
+		break;
+	case 44:
+		reg |= (ARM_LPAE_TCR_SEP_43 << ARM_LPAE_TCR_SEP_SHIFT);
+		break;
+	case 48:
+		/*
+		 * If ias is 48 then that probably means that the UBS on the
+		 * device was 0101b (49) which is a special case that assumes
+		 * bit 48 is the sign extension bit. In this case we are
+		 * expected to use ARM_LPAE_TCR_SEP_UPSTREAM to use bit 48 as
+		 * the extension bit. One might be confused because there is
+		 * also an option to set the SEP to bit 47 but this is probably
+		 * not what the arm-smmu driver intended.
+		 */
+	default:
+		reg |= (ARM_LPAE_TCR_SEP_UPSTREAM << ARM_LPAE_TCR_SEP_SHIFT);
+		break;
+	}
+
+	return reg;
+}
+
 static struct io_pgtable *
 arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
 {
 	u64 reg;
 	struct arm_lpae_io_pgtable *data;
 
-	if (cfg->quirks & ~IO_PGTABLE_QUIRK_ARM_NS)
+	if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
+		  IO_PGTABLE_QUIRK_ARM_TTBR1))
 		return NULL;
 
 	data = arm_lpae_alloc_pgtable(cfg);
@@ -715,8 +834,9 @@ static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
 
 	reg |= (64ULL - cfg->ias) << ARM_LPAE_TCR_T0SZ_SHIFT;
 
-	/* Disable speculative walks through TTBR1 */
-	reg |= ARM_LPAE_TCR_EPD1;
+	/* Bring in the TTBR1 configuration */
+	reg |= arm64_lpae_setup_ttbr1(cfg, data);
+
 	cfg->arm_lpae_s1_cfg.tcr = reg;
 
 	/* MAIRs */
@@ -731,16 +851,32 @@ static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
 	cfg->arm_lpae_s1_cfg.mair[1] = 0;
 
 	/* Looking good; allocate a pgd */
-	data->pgd = __arm_lpae_alloc_pages(data->pgd_size, GFP_KERNEL, cfg);
-	if (!data->pgd)
+	data->pgd[0] = __arm_lpae_alloc_pages(data->pgd_size, GFP_KERNEL, cfg);
+	if (!data->pgd[0])
 		goto out_free_data;
 
+
+	if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1) {
+		data->pgd[1] = __arm_lpae_alloc_pages(data->pgd_size,
+			GFP_KERNEL, cfg);
+		if (!data->pgd[1]) {
+			__arm_lpae_free_pages(data->pgd[0], data->pgd_size,
+				cfg);
+			goto out_free_data;
+		}
+	} else {
+		data->pgd[1] = NULL;
+	}
+
 	/* Ensure the empty pgd is visible before any actual TTBR write */
 	wmb();
 
 	/* TTBRs */
-	cfg->arm_lpae_s1_cfg.ttbr[0] = virt_to_phys(data->pgd);
-	cfg->arm_lpae_s1_cfg.ttbr[1] = 0;
+	cfg->arm_lpae_s1_cfg.ttbr[0] = virt_to_phys(data->pgd[0]);
+
+	if (data->pgd[1])
+		cfg->arm_lpae_s1_cfg.ttbr[1] = virt_to_phys(data->pgd[1]);
+
 	return &data->iop;
 
 out_free_data:
@@ -825,15 +961,15 @@ static void arm_lpae_restrict_pgsizes(struct io_pgtable_cfg *cfg)
 	cfg->arm_lpae_s2_cfg.vtcr = reg;
 
 	/* Allocate pgd pages */
-	data->pgd = __arm_lpae_alloc_pages(data->pgd_size, GFP_KERNEL, cfg);
-	if (!data->pgd)
+	data->pgd[0] = __arm_lpae_alloc_pages(data->pgd_size, GFP_KERNEL, cfg);
+	if (!data->pgd[0])
 		goto out_free_data;
 
 	/* Ensure the empty pgd is visible before any actual TTBR write */
 	wmb();
 
 	/* VTTBR */
-	cfg->arm_lpae_s2_cfg.vttbr = virt_to_phys(data->pgd);
+	cfg->arm_lpae_s2_cfg.vttbr = virt_to_phys(data->pgd[0]);
 	return &data->iop;
 
 out_free_data:
@@ -931,7 +1067,7 @@ static void __init arm_lpae_dump_ops(struct io_pgtable_ops *ops)
 		cfg->pgsize_bitmap, cfg->ias);
 	pr_err("data: %d levels, 0x%zx pgd_size, %lu pg_shift, %lu bits_per_level, pgd @ %p\n",
 		data->levels, data->pgd_size, data->pg_shift,
-		data->bits_per_level, data->pgd);
+		data->bits_per_level, data->pgd[0]);
 }
 
 #define __FAIL(ops, i)	({						\
diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
index 969d82c..fd2cd34 100644
--- a/drivers/iommu/io-pgtable.h
+++ b/drivers/iommu/io-pgtable.h
@@ -65,11 +65,17 @@ struct io_pgtable_cfg {
 	 *	PTEs, for Mediatek IOMMUs which treat it as a 33rd address bit
 	 *	when the SoC is in "4GB mode" and they can only access the high
 	 *	remap of DRAM (0x1_00000000 to 0x1_ffffffff).
+	 *
+	 * IO_PGTABLE_QUIRK_ARM_TTBR1: Specifies that TTBR1 has been enabled on
+	 *	this domain. Set up the configuration registers and dynamically
+	 *	choose which pagetable (TTBR0 or TTBR1) a mapping should go into
+	 *	based on the address.
 	 */
 	#define IO_PGTABLE_QUIRK_ARM_NS		BIT(0)
 	#define IO_PGTABLE_QUIRK_NO_PERMS	BIT(1)
 	#define IO_PGTABLE_QUIRK_TLBI_ON_MAP	BIT(2)
 	#define IO_PGTABLE_QUIRK_ARM_MTK_4GB	BIT(3)
+	#define IO_PGTABLE_QUIRK_ARM_TTBR1      BIT(4)
 	unsigned long			quirks;
 	unsigned long			pgsize_bitmap;
 	unsigned int			ias;
-- 
1.9.1


* [PATCH 4/7] iommu: introduce TTBR0 domain attribute
       [not found] ` <1488904795-870-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
                     ` (2 preceding siblings ...)
  2017-03-07 16:39   ` [PATCH 3/7] iommu/arm-smmu: Add support for TTBR1 Jordan Crouse
@ 2017-03-07 16:39   ` Jordan Crouse
  2017-03-07 16:39   ` [PATCH 5/7] iommu/arm-smmu: add support for TTBR0 attribute Jordan Crouse
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Jordan Crouse @ 2017-03-07 16:39 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA, Jeremy Gebben

From: Jeremy Gebben <jgebben@codeaurora.org>

In the ARM SMMU architecture, pagetable programming is controlled
by the TTBR0 register. The layout of this
register varies depending on the pagetable format in use.
In particular, the ASID (address space ID) field is found in
CONTEXTIDR when using V7S format and in the top bits of TTBR0
for V7L and V8L.

Some drivers need to program hardware to switch domains on the
fly. This attribute allows the correct setting to be determined
by querying the domain rather than directly reading registers and
making assumptions about the pagetable format. The domain must be
attached before TTBR0 may be queried.
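
To make the layout difference concrete, here is a small standalone sketch (not
the driver code) of how a caller-visible TTBR0 value could be composed. The
sketch_ names and the two-value format enum are assumptions for illustration;
the shift of 48 matches TTBRn_ASID_SHIFT in arm-smmu.c, and the ASID is
treated as up to 16 bits wide.

```c
#include <stdint.h>

/* Same shift as TTBRn_ASID_SHIFT in arm-smmu.c. */
#define SKETCH_TTBRN_ASID_SHIFT	48

/* Hypothetical format selector for the sketch. */
enum sketch_fmt {
	SKETCH_FMT_V7S,		/* ASID lives in CONTEXTIDR */
	SKETCH_FMT_LPAE,	/* V7L/V8L: ASID in the top bits of TTBR0 */
};

/* Compose the value a driver would hand to its hardware. */
static inline uint64_t sketch_ttbr0_value(enum sketch_fmt fmt,
					  uint64_t pgd_phys, uint16_t asid)
{
	if (fmt == SKETCH_FMT_V7S)
		return pgd_phys;

	return pgd_phys | ((uint64_t)asid << SKETCH_TTBRN_ASID_SHIFT);
}

/* Recover the ASID from an LPAE-style TTBR0 value. */
static inline uint16_t sketch_ttbr0_asid(uint64_t ttbr0)
{
	return (uint16_t)(ttbr0 >> SKETCH_TTBRN_ASID_SHIFT);
}
```

A consumer that queries the attribute gets one opaque 64-bit value and does
not need to know which of the two cases applied.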

Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 include/linux/iommu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index d537cc9..544cfc6 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -115,6 +115,7 @@ enum iommu_attr {
 	DOMAIN_ATTR_FSL_PAMUV1,
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
 	DOMAIN_ATTR_ENABLE_TTBR1,
+	DOMAIN_ATTR_TTBR0,
 	DOMAIN_ATTR_MAX,
 };
 
-- 
1.9.1


* [PATCH 5/7] iommu/arm-smmu: add support for TTBR0 attribute
       [not found] ` <1488904795-870-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
                     ` (3 preceding siblings ...)
  2017-03-07 16:39   ` [PATCH 4/7] iommu: introduce TTBR0 domain attribute Jordan Crouse
@ 2017-03-07 16:39   ` Jordan Crouse
  2017-03-07 16:39   ` [PATCH 6/7] iommu: Add dynamic domains Jordan Crouse
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Jordan Crouse @ 2017-03-07 16:39 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA, Jeremy Gebben

From: Jeremy Gebben <jgebben-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>

Add support to return the value of the TTBR0 register in response
to a request via DOMAIN_ATTR_TTBR0.

Signed-off-by: Jeremy Gebben <jgebben-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
Signed-off-by: Jordan Crouse <jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
---
 drivers/iommu/arm-smmu.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 2e3879f..e051750 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1544,6 +1544,19 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 		*((int *)data) = !!(smmu_domain->attributes
 					& (1 << DOMAIN_ATTR_ENABLE_TTBR1));
 		return 0;
+	case DOMAIN_ATTR_TTBR0: {
+		u64 val;
+		/* not valid until we are attached */
+		if (smmu_domain->smmu == NULL)
+			return -ENODEV;
+
+		val = smmu_domain->pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0];
+		if (smmu_domain->cfg.cbar != CBAR_TYPE_S2_TRANS)
+			val |= (u64)ARM_SMMU_CB_ASID(smmu_domain->smmu,
+				&smmu_domain->cfg) << TTBRn_ASID_SHIFT;
+		*((u64 *)data) = val;
+		return 0;
+	}
 	default:
 		return -ENODEV;
 	}
-- 
1.9.1


* [PATCH 6/7] iommu: Add dynamic domains
       [not found] ` <1488904795-870-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
                     ` (4 preceding siblings ...)
  2017-03-07 16:39   ` [PATCH 5/7] iommu/arm-smmu: add support for TTBR0 attribute Jordan Crouse
@ 2017-03-07 16:39   ` Jordan Crouse
  2017-03-07 16:39   ` [PATCH 7/7] iommu/arm-smmu: add support for " Jordan Crouse
  2017-03-07 17:22   ` [Freedreno] [PATCH 0/7] RFC: iommu/arm-smmu-v2: Dynamic domains Jordan Crouse
  7 siblings, 0 replies; 11+ messages in thread
From: Jordan Crouse @ 2017-03-07 16:39 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA

Add an API to create a dynamic domain from an existing domain.
A dynamic domain is a special IOMMU domain that is attached to
the same device as the parent domain but is backed by separate
pagetables. Devices such as GPUs that support asynchronous
methods for switching pagetables can create dynamic domains for
each individual instance and map memory into them.

The hardware can use the physical address of the pagetable
(as queried by DOMAIN_ATTR_TTBR0) to asynchronously switch the
hardware to the desired pagetable when needed.

Dynamic domains must be created from existing attached
non-dynamic domains.  The domains will share configuration
(pagetable format, context bank, etc). Dynamic domains do not
modify the hardware directly - they are typically a
wrapper for the pagetable memory and facilitate using the other
IOMMU APIs to map and unmap buffers.
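
A minimal standalone sketch of the type encoding, for illustration: a dynamic
domain is a paging domain with one extra flag, attach is refused for that
exact type, and the "is dynamic" query only looks at the flag. The sketch_ and
SKETCH_ names are invented for the example; the bit positions mirror the
__IOMMU_DOMAIN_* flags in include/linux/iommu.h as modified by this patch.

```c
/* Flag bits, mirroring __IOMMU_DOMAIN_* in include/linux/iommu.h. */
#define SKETCH_DOMAIN_PAGING	(1U << 0)
#define SKETCH_DOMAIN_DMA_API	(1U << 1)
#define SKETCH_DOMAIN_PT	(1U << 2)
#define SKETCH_DOMAIN_DYN	(1U << 3)

/* Domain types built from the flags. */
#define SKETCH_TYPE_UNMANAGED	(SKETCH_DOMAIN_PAGING)
#define SKETCH_TYPE_DMA		(SKETCH_DOMAIN_PAGING | SKETCH_DOMAIN_DMA_API)
#define SKETCH_TYPE_DYNAMIC	(SKETCH_DOMAIN_PAGING | SKETCH_DOMAIN_DYN)

/* Same test the DOMAIN_ATTR_DYNAMIC query performs. */
static inline int sketch_is_dynamic(unsigned int type)
{
	return !!(type & SKETCH_DOMAIN_DYN);
}

/* Dynamic domains are never attached to a device directly. */
static inline int sketch_attach_allowed(unsigned int type)
{
	return type != SKETCH_TYPE_DYNAMIC;
}
```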

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/iommu/iommu.c | 37 +++++++++++++++++++++++++++++++++++++
 include/linux/iommu.h | 17 ++++++++++++++++-
 2 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 9a2f196..4ba593b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1079,6 +1079,31 @@ void iommu_domain_free(struct iommu_domain *domain)
 }
 EXPORT_SYMBOL_GPL(iommu_domain_free);
 
+struct iommu_domain *iommu_domain_create_dynamic(struct iommu_domain *parent)
+{
+	struct iommu_domain *child;
+	int ret;
+
+	if (!parent || !parent->ops || !parent->ops->domain_init_dynamic)
+		return NULL;
+
+	child = parent->ops->domain_alloc(IOMMU_DOMAIN_DYNAMIC);
+	if (child == NULL)
+		return NULL;
+
+	child->ops = parent->ops;
+	child->type = IOMMU_DOMAIN_DYNAMIC;
+	child->pgsize_bitmap = parent->pgsize_bitmap;
+
+	ret = child->ops->domain_init_dynamic(parent, child);
+	if (!ret)
+		return child;
+
+	child->ops->domain_free(child);
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(iommu_domain_create_dynamic);
+
 static int __iommu_attach_device(struct iommu_domain *domain,
 				 struct device *dev)
 {
@@ -1097,6 +1122,10 @@ int iommu_attach_device(struct iommu_domain *domain, struct device *dev)
 	struct iommu_group *group;
 	int ret;
 
+	/* Don't try to attach dynamic domains */
+	if (!domain || domain->type == IOMMU_DOMAIN_DYNAMIC)
+		return -EINVAL;
+
 	group = iommu_group_get(dev);
 	/* FIXME: Remove this when groups a mandatory for iommu drivers */
 	if (group == NULL)
@@ -1135,6 +1164,10 @@ void iommu_detach_device(struct iommu_domain *domain, struct device *dev)
 {
 	struct iommu_group *group;
 
+	/* Don't try to detach dynamic domains */
+	if (!domain || domain->type == IOMMU_DOMAIN_DYNAMIC)
+		return;
+
 	group = iommu_group_get(dev);
 	/* FIXME: Remove this when groups a mandatory for iommu drivers */
 	if (group == NULL)
@@ -1508,6 +1541,10 @@ int iommu_domain_get_attr(struct iommu_domain *domain,
 			ret = -ENODEV;
 
 		break;
+	case DOMAIN_ATTR_DYNAMIC:
+		*((unsigned int *) data) =
+			!!(domain->type & __IOMMU_DOMAIN_DYNAMIC);
+		break;
 	default:
 		if (!domain->ops->domain_get_attr)
 			return -EINVAL;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 544cfc6..5b538d0 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -57,7 +57,7 @@ struct iommu_domain_geometry {
 #define __IOMMU_DOMAIN_DMA_API	(1U << 1)  /* Domain for use in DMA-API
 					      implementation              */
 #define __IOMMU_DOMAIN_PT	(1U << 2)  /* Domain is identity mapped   */
-
+#define __IOMMU_DOMAIN_DYNAMIC  (1U << 3)  /* Domain is dynamic */
 /*
  * This are the possible domain-types
  *
@@ -69,12 +69,18 @@ struct iommu_domain_geometry {
  *	IOMMU_DOMAIN_DMA	- Internally used for DMA-API implementations.
  *				  This flag allows IOMMU drivers to implement
  *				  certain optimizations for these domains
+ *	IOMMU_DOMAIN_DYNAMIC	- The domain is dynamic and bound to a parent
+ *				  domain. This allows the driver to implement
+ *				  multiple domains on one device with different
+ *				  attributes
  */
 #define IOMMU_DOMAIN_BLOCKED	(0U)
 #define IOMMU_DOMAIN_IDENTITY	(__IOMMU_DOMAIN_PT)
 #define IOMMU_DOMAIN_UNMANAGED	(__IOMMU_DOMAIN_PAGING)
 #define IOMMU_DOMAIN_DMA	(__IOMMU_DOMAIN_PAGING |	\
 				 __IOMMU_DOMAIN_DMA_API)
+#define IOMMU_DOMAIN_DYNAMIC	(__IOMMU_DOMAIN_PAGING |	\
+				 __IOMMU_DOMAIN_DYNAMIC)
 
 struct iommu_domain {
 	unsigned type;
@@ -116,6 +122,7 @@ enum iommu_attr {
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
 	DOMAIN_ATTR_ENABLE_TTBR1,
 	DOMAIN_ATTR_TTBR0,
+	DOMAIN_ATTR_DYNAMIC,
 	DOMAIN_ATTR_MAX,
 };
 
@@ -160,6 +167,7 @@ struct iommu_dm_region {
  * @domain_set_windows: Set the number of windows for a domain
  * @domain_get_windows: Return the number of windows for a domain
  * @of_xlate: add OF master IDs to iommu grouping
+ * @domain_init_dynamic: Initialize a dynamic domain based on a parent domain
  * @pgsize_bitmap: bitmap of all possible supported page sizes
  */
 struct iommu_ops {
@@ -203,6 +211,9 @@ struct iommu_ops {
 
 	int (*of_xlate)(struct device *dev, struct of_phandle_args *args);
 
+	int (*domain_init_dynamic)(struct iommu_domain *parent,
+			struct iommu_domain *child);
+
 	unsigned long pgsize_bitmap;
 };
 
@@ -280,6 +291,10 @@ extern int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr,
 				      phys_addr_t offset, u64 size,
 				      int prot);
 extern void iommu_domain_window_disable(struct iommu_domain *domain, u32 wnd_nr);
+
+extern struct iommu_domain *iommu_domain_create_dynamic(
+		struct iommu_domain *parent);
+
 /**
  * report_iommu_fault() - report about an IOMMU fault to the IOMMU framework
  * @domain: the iommu domain where the fault has happened
-- 
1.9.1

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 7/7] iommu/arm-smmu: add support for dynamic domains
       [not found] ` <1488904795-870-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
                     ` (5 preceding siblings ...)
  2017-03-07 16:39   ` [PATCH 6/7] iommu: Add dynamic domains Jordan Crouse
@ 2017-03-07 16:39   ` Jordan Crouse
       [not found]     ` <1488904795-870-8-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  2017-03-07 17:22   ` [Freedreno] [PATCH 0/7] RFC: iommu/arm-smmu-v2: Dynamic domains Jordan Crouse
  7 siblings, 1 reply; 11+ messages in thread
From: Jordan Crouse @ 2017-03-07 16:39 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA, Jeremy Gebben

Implement support for dynamic domain switching. This feature is
only enabled when the qcom,dynamic device tree attribute is set for an
smmu instance.

In order to use dynamic domains, a non-dynamic domain must first
be created and attached.  The non-dynamic domain must remain
attached while the device is in use.

The dynamic domain is cloned from the non-dynamic domain. Important
configuration information is copied from the non-dynamic domain and
the dynamic domain is automatically "attached" (though it doesn't
program the hardware).

To switch domains dynamically the hardware must program the TTBR0 register
with the value from the DOMAIN_ATTR_TTBR0 attribute for the dynamic domain.
The upstream driver may also need to do other hardware specific register
programming to properly synchronize the domain switch. It must ensure that
all register state except for the TTBR0 register is restored
at the end of the switch operation.

Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/iommu/arm-smmu.c | 157 ++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 136 insertions(+), 21 deletions(-)
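
For readers skimming the diff, the cloning rules described in the commit
message can be approximated by the user-space sketch below. It is only a
model under stated assumptions, not the driver code: every dd_* name is
hypothetical, the kernel IDA is replaced by a plain bitmap, and the type
bits only mirror the flag values used in this series. It shows that a
dynamic domain cannot be set up before its parent is attached, that it
reuses the parent's context bank, and that it takes a "virtual" ASID from
the range num_context_banks + 2 .. MAX_ASID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model-only copies of the domain type bits used in this series. */
#define DD_DOMAIN_PAGING	(1U << 0)
#define DD_DOMAIN_DYNAMIC	(1U << 3)

#define DD_MAX_ASID		0xff
#define DD_INVALID_ASID		0xffff

/* Hypothetical stand-in for struct arm_smmu_device. */
struct dd_smmu {
	unsigned int num_context_banks;
	bool asid_used[DD_MAX_ASID + 1];	/* replaces the kernel IDA */
};

/* Hypothetical stand-in for struct arm_smmu_domain. */
struct dd_domain {
	unsigned int type;
	struct dd_smmu *smmu;	/* NULL until attached or initialised */
	uint8_t cbndx;
	uint16_t asid;
};

/* Regular domain: the ASID is simply the context bank index. */
void dd_attach_parent(struct dd_domain *d, struct dd_smmu *smmu, uint8_t cbndx)
{
	d->type = DD_DOMAIN_PAGING;
	d->smmu = smmu;
	d->cbndx = cbndx;
	d->asid = cbndx;
}

/* Clone a dynamic domain from an attached parent. Returns 0 or -1. */
int dd_init_dynamic(struct dd_domain *parent, struct dd_domain *child)
{
	struct dd_smmu *smmu = parent->smmu;
	unsigned int id;

	/* Nothing can be done until the parent is attached. */
	if (!smmu)
		return -1;

	child->type = DD_DOMAIN_PAGING | DD_DOMAIN_DYNAMIC;
	child->cbndx = parent->cbndx;	/* inherit the context bank */
	child->asid = DD_INVALID_ASID;

	/* Pick the lowest free ASID above the context bank range. */
	for (id = smmu->num_context_banks + 2; id <= DD_MAX_ASID; id++) {
		if (!smmu->asid_used[id]) {
			smmu->asid_used[id] = true;
			child->asid = (uint16_t)id;
			child->smmu = smmu;
			return 0;
		}
	}
	return -1;
}

/* Release the ASID of a dynamic domain; regular domains own none. */
void dd_free_dynamic(struct dd_domain *child)
{
	if ((child->type & DD_DOMAIN_DYNAMIC) &&
	    child->asid != DD_INVALID_ASID) {
		child->smmu->asid_used[child->asid] = false;
		child->asid = DD_INVALID_ASID;
	}
}
```

With eight context banks in the model, the first two clones of a parent on
context bank 0 would get ASIDs 10 and 11 while both keep cbndx 0.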

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index e051750..34943f0 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -349,6 +349,7 @@ struct arm_smmu_device {
 	u32				features;
 
 #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
+#define ARM_SMMU_OPT_DYNAMIC		(1 << 1)
 	u32				options;
 	enum arm_smmu_arch_version	version;
 	enum arm_smmu_implementation	model;
@@ -377,6 +378,8 @@ struct arm_smmu_device {
 	struct clk                      **clocks;
 
 	u32				cavium_id_base; /* Specific to Cavium */
+
+	struct ida			asid_ida;
 };
 
 enum arm_smmu_context_fmt {
@@ -391,11 +394,17 @@ struct arm_smmu_cfg {
 	u8				irptndx;
 	u32				cbar;
 	enum arm_smmu_context_fmt	fmt;
+	u16                             asid;
+	u8                              vmid;
 };
 #define INVALID_IRPTNDX			0xff
+#define INVALID_ASID                   0xffff
+
+/* 0xff is a reasonable limit that works for all targets */
+#define MAX_ASID			0xff
 
-#define ARM_SMMU_CB_ASID(smmu, cfg) ((u16)(smmu)->cavium_id_base + (cfg)->cbndx)
-#define ARM_SMMU_CB_VMID(smmu, cfg) ((u16)(smmu)->cavium_id_base + (cfg)->cbndx + 1)
+#define ARM_SMMU_CB_ASID(smmu, cfg) ((u16)(smmu)->cavium_id_base + (cfg)->asid)
+#define ARM_SMMU_CB_VMID(smmu, cfg) ((u16)(smmu)->cavium_id_base + (cfg)->vmid)
 
 enum arm_smmu_domain_stage {
 	ARM_SMMU_DOMAIN_S1 = 0,
@@ -426,6 +435,7 @@ struct arm_smmu_option_prop {
 
 static struct arm_smmu_option_prop arm_smmu_options[] = {
 	{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
+	{ ARM_SMMU_OPT_DYNAMIC, "qcom,dynamic" },
 	{ 0, NULL},
 };
 
@@ -473,6 +483,11 @@ static void parse_driver_options(struct arm_smmu_device *smmu)
 	} while (arm_smmu_options[++i].opt);
 }
 
+static bool is_dynamic_domain(struct iommu_domain *domain)
+{
+	return !!(domain->type & (__IOMMU_DOMAIN_DYNAMIC));
+}
+
 static struct device_node *dev_get_dev_node(struct device *dev)
 {
 	if (dev_is_pci(dev)) {
@@ -602,6 +617,10 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
 static void arm_smmu_tlb_sync(void *cookie)
 {
 	struct arm_smmu_domain *smmu_domain = cookie;
+
+	if (!smmu_domain->smmu)
+		return;
+
 	__arm_smmu_tlb_sync(smmu_domain->smmu);
 }
 
@@ -832,6 +851,44 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 	writel_relaxed(reg, cb_base + ARM_SMMU_CB_SCTLR);
 }
 
+static int arm_smmu_init_asid(struct iommu_domain *domain,
+				struct arm_smmu_device *smmu)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	int ret;
+
+	/* For regular domains the asid is the context bank id */
+	if (likely(!is_dynamic_domain(domain))) {
+		cfg->asid = cfg->cbndx;
+		return 0;
+	}
+
+	/*
+	 * For dynamic domains, allocate a unique asid from our pool of virtual
+	 * values
+	 */
+	ret = ida_simple_get(&smmu->asid_ida, smmu->num_context_banks + 2,
+		MAX_ASID + 1, GFP_KERNEL);
+	if (ret < 0) {
+		dev_err(smmu->dev, "dynamic ASID allocation failed: %d\n", ret);
+		return ret;
+	}
+
+	cfg->asid = ret;
+	return 0;
+}
+
+static void arm_smmu_free_asid(struct iommu_domain *domain)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+
+	if (cfg->asid != INVALID_ASID && is_dynamic_domain(domain))
+		ida_simple_remove(&smmu->asid_ida, cfg->asid);
+}
+
 static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 					struct arm_smmu_device *smmu)
 {
@@ -841,6 +898,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	enum io_pgtable_fmt fmt;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	bool dynamic = is_dynamic_domain(domain);
 	unsigned int quirks =
 		smmu_domain->attributes & (1 << DOMAIN_ATTR_ENABLE_TTBR1) ?
 			IO_PGTABLE_QUIRK_ARM_TTBR1 : 0;
@@ -849,6 +907,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	if (smmu_domain->smmu)
 		goto out_unlock;
 
+	smmu_domain->cfg.asid = INVALID_ASID;
+
 	/*
 	 * Mapping the requested stage onto what we support is surprisingly
 	 * complicated, mainly because the spec allows S1+S2 SMMUs without
@@ -939,12 +999,14 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		goto out_unlock;
 	}
 
-	ret = __arm_smmu_alloc_bitmap(smmu->context_map, start,
-				      smmu->num_context_banks);
-	if (ret < 0)
-		goto out_unlock;
-
-	cfg->cbndx = ret;
+	/* Dynamic domains will inherit cbndx from the parent */
+	if (!dynamic) {
+		ret = __arm_smmu_alloc_bitmap(smmu->context_map, start,
+					      smmu->num_context_banks);
+		if (ret < 0)
+			goto out_unlock;
+		cfg->cbndx = ret;
+	}
 	if (smmu->version < ARM_SMMU_V2) {
 		cfg->irptndx = atomic_inc_return(&smmu->irptndx);
 		cfg->irptndx %= smmu->num_context_irqs;
@@ -961,6 +1023,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		.iommu_dev	= smmu->dev,
 	};
 
+	cfg->vmid = cfg->cbndx + 1;
+
 	smmu_domain->smmu = smmu;
 	pgtbl_ops = alloc_io_pgtable_ops(fmt, &smmu_domain->pgtbl_cfg,
 					 smmu_domain);
@@ -974,19 +1038,30 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	domain->geometry.aperture_end = (1UL << ias) - 1;
 	domain->geometry.force_aperture = true;
 
-	/* Initialise the context bank with our page table cfg */
-	arm_smmu_init_context_bank(smmu_domain, &smmu_domain->pgtbl_cfg);
+	/* Assign an asid */
+	ret = arm_smmu_init_asid(domain, smmu);
+	if (ret)
+		goto out_clear_smmu;
 
-	/*
-	 * Request context fault interrupt. Do this last to avoid the
-	 * handler seeing a half-initialised domain state.
-	 */
-	irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
-	ret = devm_request_irq(smmu->dev, irq, arm_smmu_context_fault,
-			       IRQF_SHARED, "arm-smmu-context-fault", domain);
-	if (ret < 0) {
-		dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
-			cfg->irptndx, irq);
+	if (!dynamic) {
+		/* Initialise the context bank with our page table cfg */
+		arm_smmu_init_context_bank(smmu_domain,
+						&smmu_domain->pgtbl_cfg);
+
+		/*
+		 * Request context fault interrupt. Do this last to avoid the
+		 * handler seeing a half-initialised domain state.
+		 */
+		irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
+		ret = devm_request_irq(smmu->dev, irq, arm_smmu_context_fault,
+				IRQF_SHARED, "arm-smmu-context-fault", domain);
+		if (ret < 0) {
+			dev_err(smmu->dev, "failed to request context IRQ %d (%u)\n",
+				cfg->irptndx, irq);
+			cfg->irptndx = INVALID_IRPTNDX;
+			goto out_clear_smmu;
+		}
+	} else {
 		cfg->irptndx = INVALID_IRPTNDX;
 	}
 
@@ -1014,6 +1089,12 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 	if (!smmu)
 		return;
 
+	if (is_dynamic_domain(domain)) {
+		arm_smmu_free_asid(domain);
+		free_io_pgtable_ops(smmu_domain->pgtbl_ops);
+		return;
+	}
+
 	/*
 	 * Disable the context bank and free the page tables before freeing
 	 * it.
@@ -1021,6 +1102,8 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 	cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
 	writel_relaxed(0, cb_base + ARM_SMMU_CB_SCTLR);
 
+	arm_smmu_tlb_inv_context(smmu_domain);
+
 	if (cfg->irptndx != INVALID_IRPTNDX) {
 		irq = smmu->irqs[smmu->num_global_irqs + cfg->irptndx];
 		devm_free_irq(smmu->dev, irq, domain);
@@ -1034,7 +1117,8 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
 {
 	struct arm_smmu_domain *smmu_domain;
 
-	if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
+	if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA &&
+		type != IOMMU_DOMAIN_DYNAMIC)
 		return NULL;
 	/*
 	 * Allocate the domain and initialise some of its data structures.
@@ -1257,6 +1341,10 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	struct arm_smmu_device *smmu;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 
+	/* Dynamic domains do not need to be attached */
+	if (is_dynamic_domain(domain))
+		return 0;
+
 	if (!fwspec || fwspec->ops != &arm_smmu_ops) {
 		dev_err(dev, "cannot attach to SMMU, is it on the same bus?\n");
 		return -ENXIO;
@@ -1531,6 +1619,29 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
 	return group;
 }
 
+static int arm_smmu_domain_init_dynamic(struct iommu_domain *parent,
+		struct iommu_domain *child)
+{
+	struct arm_smmu_domain *parent_domain = to_smmu_domain(parent);
+	struct arm_smmu_domain *child_domain = to_smmu_domain(child);
+	struct arm_smmu_device *smmu = parent_domain->smmu;
+
+	/* We can't do any of this until the parent is attached */
+	if (!smmu)
+		return -ENODEV;
+
+	if (!(smmu->options & ARM_SMMU_OPT_DYNAMIC)) {
+		dev_err(smmu->dev, "dynamic domains are not supported\n");
+		return -EPERM;
+	}
+
+	/* Copy the context bank from the parent */
+	child_domain->cfg.cbndx = parent_domain->cfg.cbndx;
+
+	/* Initialize the context and create all the useful stuff */
+	return arm_smmu_init_domain_context(child, smmu);
+}
+
 static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 				    enum iommu_attr attr, void *data)
 {
@@ -1626,6 +1737,7 @@ static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 	.domain_get_attr	= arm_smmu_domain_get_attr,
 	.domain_set_attr	= arm_smmu_domain_set_attr,
 	.of_xlate		= arm_smmu_of_xlate,
+	.domain_init_dynamic	= arm_smmu_domain_init_dynamic,
 	.pgsize_bitmap		= -1UL, /* Restricted during device attach */
 };
 
@@ -2037,6 +2149,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev)
 		return -ENOMEM;
 	}
 	smmu->dev = dev;
+	ida_init(&smmu->asid_ida);
 
 	data = of_device_get_match_data(dev);
 	smmu->version = data->version;
@@ -2148,6 +2261,8 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 	if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
 		dev_err(&pdev->dev, "removing device with active domains!\n");
 
+	ida_destroy(&smmu->asid_ida);
+
 	/* Turn the thing off */
 	writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0);
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [Freedreno] [PATCH 0/7] RFC: iommu/arm-smmu-v2: Dynamic domains
       [not found] ` <1488904795-870-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
                     ` (6 preceding siblings ...)
  2017-03-07 16:39   ` [PATCH 7/7] iommu/arm-smmu: add support for " Jordan Crouse
@ 2017-03-07 17:22   ` Jordan Crouse
  7 siblings, 0 replies; 11+ messages in thread
From: Jordan Crouse @ 2017-03-07 17:22 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA

On Tue, Mar 07, 2017 at 09:39:48AM -0700, Jordan Crouse wrote:
> Pursuant to the arm-smmu-v3 SVM support:
> 
> https://lists.linuxfoundation.org/pipermail/iommu/2017-February/020599.html
> 
> I felt it would be helpful if I would demonstrate how Qualcomm implements
> per-process pagetables for several generations of SoCs and GPUs focusing on the
> Adreno A540 GPU and an arm-smmu-v2 IOMMU on the Snapdragon 820 SoC.
> 
> The requirement is to implement per-process GPU address spaces for security
> reasons. Though some very crude SVM support is possible we focus mainly on
> individual address spaces that are maintained and mapped by the GPU driver.
> 
> In a nutshell, the solution is to create special virtual or "dynamic" domains
> that are associated with a real domain. The dynamic domains allocate pagetables
> but do not reprogram the hardware. When a command is submitted, the kernel
> driver provides the physical address of the pagetable (TTBR0) to the GPU which
> reprograms the TTBR0 register in context bank 0 of the GPMU SMMU on the fly (and
> does the requisite flushing and stalling).
> 
> The TTBR1 address space is used to maintain a split between the process and the
> global GPU buffers (ringbuffers, etc). This greatly facilitates the switching
> process.
> 
> In more detail this is the workflow:
> 
>  - The kernel driver attaches a UNMANAGED domain to context bank 0
> 
>  - Global GPU buffers are allocated in the TTBR1 address space
>  
>  - Each new process creates a dynamic domain cloned from the "real" domain
> 
>  - New buffers for the process are mapped into the dynamic domain
> 
>  - The kernel driver gets the TTBR0/ASID register value from the dynamic domain
>    via an attribute
> 
>  - At command submission time, the kernel driver sends the TTBR0/ASID value to
>    the GPU before the command. The GPU switches the pagetable by programming
>    the SMMU hardware before executing the command.
> 
> I'll be uploading the series to implement this in the MSM DRM driver to show how
> it works from the GPU perspective. I'm adding it as a separate thread to avoid
> crossing the streams and confusing folks - I'll reply to this email with a link.

Here is a link to the GPU implementation. I'll leave the link here, but if
enough folks think it is useful I can also append the actual patches to this
thread in a reply.

https://lists.freedesktop.org/archives/freedreno/2017-March/001049.html
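
As a rough illustration of the submission step in the workflow quoted above,
the sketch below models a ringbuffer in plain C. It is not taken from the
MSM DRM series: the sub_* names, the packet layout and the opcodes are all
hypothetical, and the "GPU" is just a function that walks the ring. The only
point it makes is the ordering: the TTBR0/ASID pair of the submitting process
is queued ahead of its command, so the pagetable switch happens before the
command executes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes; the real packet format is not shown here. */
enum sub_opcode { SUB_OP_SWITCH_PT = 1, SUB_OP_EXEC = 2 };

struct sub_packet {
	enum sub_opcode op;
	uint64_t payload;	/* TTBR0 value or command address */
	uint16_t asid;		/* only meaningful for SUB_OP_SWITCH_PT */
};

#define SUB_RING_SIZE 16

struct sub_ring {
	struct sub_packet pkts[SUB_RING_SIZE];
	size_t count;
};

/* Values the driver would read from the process's dynamic domain. */
struct sub_process {
	uint64_t ttbr0;
	uint16_t asid;
};

/* Model of the state the GPU programs into the SMMU context bank. */
struct sub_gpu {
	uint64_t ttbr0;
	uint16_t asid;
	unsigned int executed;
};

/* Queue the pagetable switch first, then the command itself. */
int sub_submit(struct sub_ring *ring, const struct sub_process *proc,
	       uint64_t cmd_addr)
{
	if (ring->count + 2 > SUB_RING_SIZE)
		return -1;

	ring->pkts[ring->count++] = (struct sub_packet){
		.op = SUB_OP_SWITCH_PT, .payload = proc->ttbr0,
		.asid = proc->asid };
	ring->pkts[ring->count++] = (struct sub_packet){
		.op = SUB_OP_EXEC, .payload = cmd_addr, .asid = 0 };
	return 0;
}

/* Walk the ring in order, as the GPU would. */
void sub_gpu_consume(struct sub_gpu *gpu, const struct sub_ring *ring)
{
	size_t i;

	for (i = 0; i < ring->count; i++) {
		const struct sub_packet *pkt = &ring->pkts[i];

		if (pkt->op == SUB_OP_SWITCH_PT) {
			gpu->ttbr0 = pkt->payload;
			gpu->asid = pkt->asid;
		} else {
			gpu->executed++;
		}
	}
}
```

In this model, submitting one command each for two processes leaves four
packets in the ring, and after the walk the "GPU" holds the TTBR0/ASID of
whichever process submitted last.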

Jordan
-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 7/7] iommu/arm-smmu: add support for dynamic domains
       [not found]     ` <1488904795-870-8-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2017-03-07 18:11       ` Mark Rutland
  2017-03-07 20:40         ` Jordan Crouse
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Rutland @ 2017-03-07 18:11 UTC (permalink / raw)
  To: Jordan Crouse
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Jeremy Gebben

On Tue, Mar 07, 2017 at 09:39:55AM -0700, Jordan Crouse wrote:
> Implement support for dynamic domain switching. This feature is
> only enabled when the qcom,dynamic device tree attribute is set for an
> smmu instance.
> 
> In order to use dynamic domains, a non-dynamic domain must first
> be created and attached.  The non-dynamic domain must remain
> attached while the device is in use.
> 
> The dynamic domain is cloned from the non-dynamic domain. Important
> configuration information is copied from the non-dynamic domain and
> the dynamic domain is automatically "attached" (though it doesn't
> program the hardware).
> 
> To switch domains dynamically the hardware must program the TTBR0 register
> with the value from the DOMAIN_ATTR_TTBR0 attribute for the dynamic domain.
> The upstream driver may also need to do other hardware specific register
> programming to properly synchronize the domain switch. It must ensure that
> all register state except for the TTBR0 register is restored
> at the end of the switch operation.

> +	{ ARM_SMMU_OPT_DYNAMIC, "qcom,dynamic" },

What *precisely* is the intended semantic of this property?

It's not clear to me what a dynamic domain is, there's no documentation
in this series for this property, and from a glance it sounds like a
pure SW detail rather than a hardware/system detail (i.e. it shouldn't
be in the DT at all).

This needs documentation. In future, please also Cc the devicetree list
(devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org) when adding new properties.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 7/7] iommu/arm-smmu: add support for dynamic domains
  2017-03-07 18:11       ` Mark Rutland
@ 2017-03-07 20:40         ` Jordan Crouse
  0 siblings, 0 replies; 11+ messages in thread
From: Jordan Crouse @ 2017-03-07 20:40 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Jeremy Gebben

On Tue, Mar 07, 2017 at 06:11:38PM +0000, Mark Rutland wrote:
> On Tue, Mar 07, 2017 at 09:39:55AM -0700, Jordan Crouse wrote:
> > Implement support for dynamic domain switching. This feature is
> > only enabled when the qcom,dynamic device tree attribute is set for an
> > smmu instance.
> > 
> > In order to use dynamic domains, a non-dynamic domain must first
> > be created and attached.  The non-dynamic domain must remain
> > attached while the device is in use.
> > 
> > The dynamic domain is cloned from the non-dynamic domain. Important
> > configuration information is copied from the non-dynamic domain and
> > the dynamic domain is automatically "attached" (though it doesn't
> > program the hardware).
> > 
> > To switch domains dynamically the hardware must program the TTBR0 register
> > with the value from the DOMAIN_ATTR_TTBR0 attribute for the dynamic domain.
> > The upstream driver may also need to do other hardware specific register
> > programming to properly synchronize the domain switch. It must ensure that
> > all register state except for the TTBR0 register is restored
> > at the end of the switch operation.
> 
> > +	{ ARM_SMMU_OPT_DYNAMIC, "qcom,dynamic" },
> 
> What *precisely* is the intended semantic of this property?
> 
> It's not clear to me what a dynamic domain is, there's no documentation
> in this series for this property, and from a glance it sounds like a
> pure SW detail rather than a hardware/system detail (i.e. it shouldn't
> be in the DT at all).

Yep, I agree.  The original intent was to try to keep other clients from getting
themselves into "trouble" but we are all adults here. I'll zap it if we do
another refresh.

Thanks,
Jordan

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-03-07 20:40 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-07 16:39 [PATCH 0/7] RFC: iommu/arm-smmu-v2: Dynamic domains Jordan Crouse
     [not found] ` <1488904795-870-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2017-03-07 16:39   ` [PATCH 1/7] iommu/arm-smmu: save the pgtbl_cfg in the domain Jordan Crouse
2017-03-07 16:39   ` [PATCH 2/7] iommu: Add DOMAIN_ATTR_ENABLE_TTBR1 Jordan Crouse
2017-03-07 16:39   ` [PATCH 3/7] iommu/arm-smmu: Add support for TTBR1 Jordan Crouse
2017-03-07 16:39   ` [PATCH 4/7] iommu: introduce TTBR0 domain attribute Jordan Crouse
2017-03-07 16:39   ` [PATCH 5/7] iommu/arm-smmu: add support for TTBR0 attribute Jordan Crouse
2017-03-07 16:39   ` [PATCH 6/7] iommu: Add dynamic domains Jordan Crouse
2017-03-07 16:39   ` [PATCH 7/7] iommu/arm-smmu: add support for " Jordan Crouse
     [not found]     ` <1488904795-870-8-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2017-03-07 18:11       ` Mark Rutland
2017-03-07 20:40         ` Jordan Crouse
2017-03-07 17:22   ` [Freedreno] [PATCH 0/7] RFC: iommu/arm-smmu-v2: Dynamic domains Jordan Crouse