* [PATCH v2 00/15] drm/msm: Per-instance pagetable support
@ 2019-05-21 16:13 ` Jordan Crouse
  0 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:13 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	Sean Paul, Jonathan Marek, Kees Cook, Thomas Zimmermann,
	Wen Yang, linux-kernel, iommu, Sharat Masetty, Robin Murphy,
	Rob Clark, dri-devel, Daniel Vetter, Will Deacon, Joerg Roedel,
	Mamta Shukla, linux-arm-kernel, David Airlie

This is a refresh of the per-instance pagetable support for arm-smmu-v2 and the
MSM GPU driver. I think this is pretty mature at this point, so I've dropped the
RFC designation and ask for consideration for 5.3.

Per-instance pagetables allow the target GPU driver to create and manage
an individual pagetable for each file descriptor instance and switch
between them asynchronously using the GPU to reprogram the pagetable
registers on the fly.
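
To sketch the idea (the function and field names here are purely
illustrative, not the final API): at submit time, if the incoming instance
uses a different pagetable than the one currently programmed, the driver
emits commands into the ring that make the GPU itself rewrite TTBR0 before
the submit executes:

/*
 * Illustrative pseudo-code only -- see the individual patches for the
 * real mechanics. emit_ttbr0_write() stands in for the target specific
 * packet sequence that updates the SMMU registers from the CP, and
 * gpu->cur_aspace is a made-up field tracking the active pagetable.
 */
static void switch_pagetable(struct msm_gpu *gpu,
		struct msm_ringbuffer *ring,
		struct msm_gem_address_space *aspace)
{
	if (aspace == gpu->cur_aspace)
		return;

	/* Point TTBR0 at the pagetable for this instance */
	emit_ttbr0_write(ring, aspace->ttbr0, aspace->asid);
	gpu->cur_aspace = aspace;
}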

Most of the heavy lifting for this is done in the arm-smmu-v2 driver by
taking advantage of the newly added multiple domain API. The first patch in
the series allows opted-in clients to create a default identity domain when
the IOMMU group for the SMMU device is created. This bypasses the DMA domain
creation in the IOMMU core, which serves two purposes for the GPU: it skips
the otherwise unused DMA domain, and it keeps context bank 0 unused on the
hardware (for better or worse, the GPU is hardcoded to only use context bank
0 for switching).

The next two patches enable split pagetable support. This is used to map
global buffers for the GPU so we can safely switch the TTBR0 pagetable for the
instance.
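
The net effect is that a single bit of the IOVA selects the pagetable and
the bits below it index within it. A minimal sketch of the selection logic,
assuming a 48 bit VA where the sign extension bit is bit 48 (the patch
handles the other VA sizes as well):

#define SPLIT_TABLE_MASK	(1ULL << 48)	/* SEP bit for a 48 bit VA */

/* IOVAs with the sign extension bit set translate through TTBR1 */
static bool iova_uses_ttbr1(u64 iova)
{
	return !!(iova & SPLIT_TABLE_MASK);
}

/* Strip the select bit so the offset fits the programmed T0SZ/T1SZ */
static u64 adjust_iova(u64 iova)
{
	return iova & (SPLIT_TABLE_MASK - 1);
}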

The last two arm-smmu-v2 patches enable auxiliary domain support. Again, the
SMMU client can opt in to allow auxiliary domains; if enabled, attaching an
auxiliary domain will create a pagetable but not otherwise touch the hardware.
The client can then query the address of the pagetable through an attribute
and perform its own switching.
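
From the client's side, the flow for a device 'dev' looks roughly like this
(error handling omitted; the msm patches wrap this in their own helpers):

	struct iommu_domain *domain;
	u64 ptbase;

	/* Opt in to auxiliary domain support on this device */
	iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_AUX);

	/*
	 * Attach an unmanaged domain as an auxiliary domain; this
	 * allocates a pagetable but doesn't touch the hardware
	 */
	domain = iommu_domain_alloc(&platform_bus_type);
	iommu_aux_attach_device(domain, dev);

	/* Read back the pagetable base address for GPU-side switching */
	iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);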

Following the arm-smmu-v2 patches is a series of msm/gpu patches that allow
for target-specific address spaces, enable 64-bit virtual addressing and
implement the mechanics of pagetable switching.

For the purposes of merging, all the patches between

drm/msm/adreno: Enable 64 bit mode by default on a5xx and a6xx targets

and

drm/msm: Add support to create target specific address spaces

can be merged to the msm-next tree without dependencies on the IOMMU changes.
Only the last three patches will require coordination between the two areas.

Jordan Crouse (15):
  iommu/arm-smmu: Allow IOMMU enabled devices to skip DMA domains
  iommu: Add DOMAIN_ATTR_SPLIT_TABLES
  iommu/arm-smmu: Add split pagetable support for arm-smmu-v2
  iommu: Add DOMAIN_ATTR_PTBASE
  iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2
  drm/msm/adreno: Enable 64 bit mode by default on a5xx and a6xx targets
  drm/msm: Print all 64 bits of the faulting IOMMU address
  drm/msm: Pass the MMU domain index in struct msm_file_private
  drm/msm/gpu: Move address space setup to the GPU targets
  drm/msm: Add a helper function for a per-instance address space
  drm/msm/gpu: Add ttbr0 to the memptrs
  drm/msm: Add support to create target specific address spaces
  drm/msm: Add support for IOMMU auxiliary domains
  drm/msm/a6xx: Support per-instance pagetables
  drm/msm/a5xx: Support per-instance pagetables

 drivers/gpu/drm/msm/adreno/a2xx_gpu.c     |  37 +++-
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c     |  50 +++--
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c     |  51 +++--
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c     | 163 +++++++++++++-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.h     |  19 ++
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c |  70 ++++--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c     | 166 +++++++++++++-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h     |   1 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c   |   7 -
 drivers/gpu/drm/msm/msm_drv.c             |  25 ++-
 drivers/gpu/drm/msm/msm_drv.h             |   5 +
 drivers/gpu/drm/msm/msm_gem.h             |   2 +
 drivers/gpu/drm/msm/msm_gem_submit.c      |  13 +-
 drivers/gpu/drm/msm/msm_gem_vma.c         |  53 +++--
 drivers/gpu/drm/msm/msm_gpu.c             |  59 +----
 drivers/gpu/drm/msm/msm_gpu.h             |   3 +
 drivers/gpu/drm/msm/msm_iommu.c           |  99 ++++++++-
 drivers/gpu/drm/msm/msm_mmu.h             |   4 +
 drivers/gpu/drm/msm/msm_ringbuffer.h      |   1 +
 drivers/iommu/arm-smmu-regs.h             |  19 ++
 drivers/iommu/arm-smmu.c                  | 352 +++++++++++++++++++++++++++---
 drivers/iommu/io-pgtable-arm.c            |   3 +-
 drivers/iommu/iommu.c                     |  29 ++-
 include/linux/iommu.h                     |   5 +
 24 files changed, 1052 insertions(+), 184 deletions(-)

-- 
2.7.4


* [PATCH v2 01/15] iommu/arm-smmu: Allow IOMMU enabled devices to skip DMA domains
  2019-05-21 16:13 ` Jordan Crouse
@ 2019-05-21 16:13   ` Jordan Crouse
  -1 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:13 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	linux-kernel, iommu, Robin Murphy, Will Deacon, Joerg Roedel,
	linux-arm-kernel

Allow IOMMU enabled devices specified on an opt-in list to create a
default identity domain for a new IOMMU group and bypass the DMA
domain created by the IOMMU core. This allows the group to be properly
set up but otherwise skips touching the hardware until the client
device attaches an unmanaged domain of its own.
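
For example, a device on the opt-in list gets an identity default domain
when its group is created, and nothing further is programmed until the
client does something like (error handling omitted):

	struct iommu_domain *domain;

	/* Replaces the placeholder identity domain for the group */
	domain = iommu_domain_alloc(&platform_bus_type);
	iommu_attach_device(domain, dev);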

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/iommu/arm-smmu.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 drivers/iommu/iommu.c    | 29 +++++++++++++++++++++++------
 include/linux/iommu.h    |  3 +++
 3 files changed, 68 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 5e54cc0..a795ada 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1235,6 +1235,35 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 	return 0;
 }
 
+struct arm_smmu_client_match_data {
+	bool use_identity_domain;
+};
+
+static const struct arm_smmu_client_match_data qcom_adreno = {
+	.use_identity_domain = true,
+};
+
+static const struct arm_smmu_client_match_data qcom_mdss = {
+	.use_identity_domain = true,
+};
+
+static const struct of_device_id arm_smmu_client_of_match[] = {
+	{ .compatible = "qcom,adreno", .data = &qcom_adreno },
+	{ .compatible = "qcom,mdp4", .data = &qcom_mdss },
+	{ .compatible = "qcom,mdss", .data = &qcom_mdss },
+	{ .compatible = "qcom,sdm845-mdss", .data = &qcom_mdss },
+	{},
+};
+
+static const struct arm_smmu_client_match_data *
+arm_smmu_client_data(struct device *dev)
+{
+	const struct of_device_id *match =
+		of_match_device(arm_smmu_client_of_match, dev);
+
+	return match ? match->data : NULL;
+}
+
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret;
@@ -1552,6 +1581,7 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
 {
 	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
 	struct arm_smmu_device *smmu = fwspec_smmu(fwspec);
+	const struct arm_smmu_client_match_data *client;
 	struct iommu_group *group = NULL;
 	int i, idx;
 
@@ -1573,6 +1603,18 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
 	else
 		group = generic_device_group(dev);
 
+	client = arm_smmu_client_data(dev);
+
+	/*
+	 * If the client chooses to bypass the DMA domain, create an identity
+	 * domain as a default placeholder. This will give the device a
+	 * default domain but skip DMA operations and not consume a context
+	 * bank.
+	 */
+	if (client && client->use_identity_domain)
+		iommu_group_set_default_domain(group, dev,
+			IOMMU_DOMAIN_IDENTITY);
+
 	return group;
 }
 
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 67ee662..af3e1ed 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1062,6 +1062,24 @@ struct iommu_group *fsl_mc_device_group(struct device *dev)
 	return group;
 }
 
+struct iommu_domain *iommu_group_set_default_domain(struct iommu_group *group,
+		struct device *dev, unsigned int type)
+{
+	struct iommu_domain *dom;
+
+	dom = __iommu_domain_alloc(dev->bus, type);
+	if (!dom)
+		return NULL;
+
+	/* FIXME: Error if the default domain is already set? */
+	group->default_domain = dom;
+	if (!group->domain)
+		group->domain = dom;
+
+	return dom;
+}
+EXPORT_SYMBOL_GPL(iommu_group_set_default_domain);
+
 /**
  * iommu_group_get_for_dev - Find or create the IOMMU group for a device
  * @dev: target device
@@ -1099,9 +1117,12 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
 	if (!group->default_domain) {
 		struct iommu_domain *dom;
 
-		dom = __iommu_domain_alloc(dev->bus, iommu_def_domain_type);
+		dom = iommu_group_set_default_domain(group, dev,
+			iommu_def_domain_type);
+
 		if (!dom && iommu_def_domain_type != IOMMU_DOMAIN_DMA) {
-			dom = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_DMA);
+			dom = iommu_group_set_default_domain(group, dev,
+				IOMMU_DOMAIN_DMA);
 			if (dom) {
 				dev_warn(dev,
 					 "failed to allocate default IOMMU domain of type %u; falling back to IOMMU_DOMAIN_DMA",
@@ -1109,10 +1130,6 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
 			}
 		}
 
-		group->default_domain = dom;
-		if (!group->domain)
-			group->domain = dom;
-
 		if (dom && !iommu_dma_strict) {
 			int attr = 1;
 			iommu_domain_set_attr(dom,
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a815cf6..4ef8bd5 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -394,6 +394,9 @@ extern int iommu_group_id(struct iommu_group *group);
 extern struct iommu_group *iommu_group_get_for_dev(struct device *dev);
 extern struct iommu_domain *iommu_group_default_domain(struct iommu_group *);
 
+struct iommu_domain *iommu_group_set_default_domain(struct iommu_group *group,
+		struct device *dev, unsigned int type);
+
 extern int iommu_domain_get_attr(struct iommu_domain *domain, enum iommu_attr,
 				 void *data);
 extern int iommu_domain_set_attr(struct iommu_domain *domain, enum iommu_attr,
-- 
2.7.4


* [PATCH v2 02/15] iommu: Add DOMAIN_ATTR_SPLIT_TABLES
  2019-05-21 16:13 ` Jordan Crouse
@ 2019-05-21 16:13   ` Jordan Crouse
  -1 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:13 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders, iommu,
	Joerg Roedel, linux-kernel

Add a new domain attribute to enable split pagetable support for devices
that support it.
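
A client requests split pagetables by setting the attribute on an unmanaged
domain before attaching it, e.g.:

	int val = 1;

	/* Ask for a TTBR0/TTBR1 split on this domain */
	iommu_domain_set_attr(domain, DOMAIN_ATTR_SPLIT_TABLES, &val);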

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 include/linux/iommu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 4ef8bd5..204acd8 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -128,6 +128,7 @@ enum iommu_attr {
 	DOMAIN_ATTR_FSL_PAMUV1,
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
 	DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
+	DOMAIN_ATTR_SPLIT_TABLES,
 	DOMAIN_ATTR_MAX,
 };
 
-- 
2.7.4


* [PATCH v2 03/15] iommu/arm-smmu: Add split pagetable support for arm-smmu-v2
  2019-05-21 16:13 ` Jordan Crouse
@ 2019-05-21 16:13   ` Jordan Crouse
  -1 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:13 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	linux-kernel, iommu, Robin Murphy, Will Deacon, Joerg Roedel,
	linux-arm-kernel

Add support for a split pagetable (TTBR0/TTBR1) scheme for arm-smmu-v2.
If split pagetables are enabled, create a pagetable for TTBR1 and set
up the sign extension bit so that all IOVAs with that bit set are mapped
and translated from the TTBR1 pagetable.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/iommu/arm-smmu-regs.h  |  19 +++++
 drivers/iommu/arm-smmu.c       | 179 ++++++++++++++++++++++++++++++++++++++---
 drivers/iommu/io-pgtable-arm.c |   3 +-
 3 files changed, 186 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
index e9132a9..23f27c2 100644
--- a/drivers/iommu/arm-smmu-regs.h
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -195,7 +195,26 @@ enum arm_smmu_s2cr_privcfg {
 #define RESUME_RETRY			(0 << 0)
 #define RESUME_TERMINATE		(1 << 0)
 
+#define TTBCR_EPD1			(1 << 23)
+#define TTBCR_T0SZ_SHIFT		0
+#define TTBCR_T1SZ_SHIFT		16
+#define TTBCR_IRGN1_SHIFT		24
+#define TTBCR_ORGN1_SHIFT		26
+#define TTBCR_RGN_WBWA			1
+#define TTBCR_SH1_SHIFT			28
+#define TTBCR_SH_IS			3
+
+#define TTBCR_TG1_16K			(1 << 30)
+#define TTBCR_TG1_4K			(2 << 30)
+#define TTBCR_TG1_64K			(3 << 30)
+
 #define TTBCR2_SEP_SHIFT		15
+#define TTBCR2_SEP_31			(0x0 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_35			(0x1 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_39			(0x2 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_41			(0x3 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_43			(0x4 << TTBCR2_SEP_SHIFT)
+#define TTBCR2_SEP_47			(0x5 << TTBCR2_SEP_SHIFT)
 #define TTBCR2_SEP_UPSTREAM		(0x7 << TTBCR2_SEP_SHIFT)
 #define TTBCR2_AS			(1 << 4)
 
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index a795ada..e09c0e6 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -152,6 +152,7 @@ struct arm_smmu_cb {
 	u32				tcr[2];
 	u32				mair[2];
 	struct arm_smmu_cfg		*cfg;
+	unsigned long			split_table_mask;
 };
 
 struct arm_smmu_master_cfg {
@@ -253,13 +254,14 @@ enum arm_smmu_domain_stage {
 
 struct arm_smmu_domain {
 	struct arm_smmu_device		*smmu;
-	struct io_pgtable_ops		*pgtbl_ops;
+	struct io_pgtable_ops		*pgtbl_ops[2];
 	const struct iommu_gather_ops	*tlb_ops;
 	struct arm_smmu_cfg		cfg;
 	enum arm_smmu_domain_stage	stage;
 	bool				non_strict;
 	struct mutex			init_mutex; /* Protects smmu pointer */
 	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
+	u32 attributes;
 	struct iommu_domain		domain;
 };
 
@@ -621,6 +623,85 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
 	return IRQ_HANDLED;
 }
 
+/* Adjust the context bank settings to support TTBR1 */
+static void arm_smmu_init_ttbr1(struct arm_smmu_domain *smmu_domain,
+		struct io_pgtable_cfg *pgtbl_cfg)
+{
+	struct arm_smmu_device *smmu = smmu_domain->smmu;
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+	int pgsize = 1 << __ffs(pgtbl_cfg->pgsize_bitmap);
+
+	/* Enable speculative walks through the TTBR1 */
+	cb->tcr[0] &= ~TTBCR_EPD1;
+
+	cb->tcr[0] |= TTBCR_SH_IS << TTBCR_SH1_SHIFT;
+	cb->tcr[0] |= TTBCR_RGN_WBWA << TTBCR_IRGN1_SHIFT;
+	cb->tcr[0] |= TTBCR_RGN_WBWA << TTBCR_ORGN1_SHIFT;
+
+	switch (pgsize) {
+	case SZ_4K:
+		cb->tcr[0] |= TTBCR_TG1_4K;
+		break;
+	case SZ_16K:
+		cb->tcr[0] |= TTBCR_TG1_16K;
+		break;
+	case SZ_64K:
+		cb->tcr[0] |= TTBCR_TG1_64K;
+		break;
+	}
+
+	/*
+	 * Outside of the special 49 bit UBS case that has a dedicated sign
+	 * extension bit, setting the SEP for any other va_size will force us to
+	 * shrink the size of the T0/T1 regions by one bit to accommodate the
+	 * SEP
+	 */
+	if (smmu->va_size != 48) {
+		/* Replace the T0 size */
+		cb->tcr[0] &= ~(0x3f << TTBCR_T0SZ_SHIFT);
+		cb->tcr[0] |= (64ULL - smmu->va_size - 1) << TTBCR_T0SZ_SHIFT;
+		/* Set the T1 size */
+		cb->tcr[0] |= (64ULL - smmu->va_size - 1) << TTBCR_T1SZ_SHIFT;
+	} else {
+		/* Set the T1 size to the full available UBS */
+		cb->tcr[0] |= (64ULL - smmu->va_size) << TTBCR_T1SZ_SHIFT;
+	}
+
+	/* Clear the existing SEP configuration */
+	cb->tcr[1] &= ~TTBCR2_SEP_UPSTREAM;
+
+	/* Set up the sign extend bit */
+	switch (smmu->va_size) {
+	case 32:
+		cb->tcr[1] |= TTBCR2_SEP_31;
+		cb->split_table_mask = (1UL << 31);
+		break;
+	case 36:
+		cb->tcr[1] |= TTBCR2_SEP_35;
+		cb->split_table_mask = (1UL << 35);
+		break;
+	case 40:
+		cb->tcr[1] |= TTBCR2_SEP_39;
+		cb->split_table_mask = (1UL << 39);
+		break;
+	case 42:
+		cb->tcr[1] |= TTBCR2_SEP_41;
+		cb->split_table_mask = (1UL << 41);
+		break;
+	case 44:
+		cb->tcr[1] |= TTBCR2_SEP_43;
+		cb->split_table_mask = (1UL << 43);
+		break;
+	case 48:
+		cb->tcr[1] |= TTBCR2_SEP_UPSTREAM;
+		cb->split_table_mask = (1UL << 48);
+	}
+
+	cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
+	cb->ttbr[1] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
+}
+
 static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 				       struct io_pgtable_cfg *pgtbl_cfg)
 {
@@ -763,11 +844,13 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 {
 	int irq, start, ret = 0;
 	unsigned long ias, oas;
-	struct io_pgtable_ops *pgtbl_ops;
+	struct io_pgtable_ops *pgtbl_ops[2] = { NULL, NULL };
 	struct io_pgtable_cfg pgtbl_cfg;
 	enum io_pgtable_fmt fmt;
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	bool split_tables =
+		(smmu_domain->attributes & (1 << DOMAIN_ATTR_SPLIT_TABLES));
 
 	mutex_lock(&smmu_domain->init_mutex);
 	if (smmu_domain->smmu)
@@ -797,8 +880,15 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	 *
 	 * Note that you can't actually request stage-2 mappings.
 	 */
-	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
+	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) {
 		smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
+
+		/* Only allow split pagetables on stage 1 tables */
+		if (split_tables) {
+			ret = -EINVAL;
+			goto out_unlock;
+		}
+	}
 	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2))
 		smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 
@@ -817,6 +907,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	    (smmu->features & ARM_SMMU_FEAT_FMT_AARCH32_S) &&
 	    (smmu_domain->stage == ARM_SMMU_DOMAIN_S1))
 		cfg->fmt = ARM_SMMU_CTX_FMT_AARCH32_S;
+
 	if ((IS_ENABLED(CONFIG_64BIT) || cfg->fmt == ARM_SMMU_CTX_FMT_NONE) &&
 	    (smmu->features & (ARM_SMMU_FEAT_FMT_AARCH64_64K |
 			       ARM_SMMU_FEAT_FMT_AARCH64_16K |
@@ -828,6 +919,12 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		goto out_unlock;
 	}
 
+	/* For now, only allow split tables for AARCH64 formats */
+	if (split_tables && cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
 	switch (smmu_domain->stage) {
 	case ARM_SMMU_DOMAIN_S1:
 		cfg->cbar = CBAR_TYPE_S1_TRANS_S2_BYPASS;
@@ -906,8 +1003,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
 
 	smmu_domain->smmu = smmu;
-	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
-	if (!pgtbl_ops) {
+	pgtbl_ops[0] = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
+	if (!pgtbl_ops[0]) {
 		ret = -ENOMEM;
 		goto out_clear_smmu;
 	}
@@ -919,6 +1016,20 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 
 	/* Initialise the context bank with our page table cfg */
 	arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
+
+	if (split_tables) {
+		/* It is safe to reuse pgtbl_cfg here */
+		pgtbl_ops[1] = alloc_io_pgtable_ops(fmt, &pgtbl_cfg,
+			smmu_domain);
+		if (!pgtbl_ops[1]) {
+			free_io_pgtable_ops(pgtbl_ops[0]);
+			ret = -ENOMEM;
+			goto out_clear_smmu;
+		}
+
+		arm_smmu_init_ttbr1(smmu_domain, &pgtbl_cfg);
+	}
+
 	arm_smmu_write_context_bank(smmu, cfg->cbndx);
 
 	/*
@@ -937,7 +1048,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	mutex_unlock(&smmu_domain->init_mutex);
 
 	/* Publish page table ops for map/unmap */
-	smmu_domain->pgtbl_ops = pgtbl_ops;
+	smmu_domain->pgtbl_ops[0] = pgtbl_ops[0];
+	smmu_domain->pgtbl_ops[1] = pgtbl_ops[1];
+
 	return 0;
 
 out_clear_smmu:
@@ -973,7 +1086,9 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 		devm_free_irq(smmu->dev, irq, domain);
 	}
 
-	free_io_pgtable_ops(smmu_domain->pgtbl_ops);
+	free_io_pgtable_ops(smmu_domain->pgtbl_ops[0]);
+	free_io_pgtable_ops(smmu_domain->pgtbl_ops[1]);
+
 	__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
 
 	arm_smmu_rpm_put(smmu);
@@ -1317,10 +1432,37 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	return ret;
 }
 
+static struct io_pgtable_ops *
+arm_smmu_get_pgtbl_ops(struct iommu_domain *domain, unsigned long iova)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+
+	if (iova & cb->split_table_mask)
+		return smmu_domain->pgtbl_ops[1];
+
+	return smmu_domain->pgtbl_ops[0];
+}
+
+/*
+ * If split pagetables are enabled adjust the iova so that it
+ * matches the T0SZ/T1SZ that has been programmed
+ */
+unsigned long arm_smmu_adjust_iova(struct iommu_domain *domain,
+		unsigned long iova)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+
+	return cb->split_table_mask ? iova & (cb->split_table_mask - 1) : iova;
+}
+
 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 			phys_addr_t paddr, size_t size, int prot)
 {
-	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
 	int ret;
 
@@ -1328,7 +1470,8 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 		return -ENODEV;
 
 	arm_smmu_rpm_get(smmu);
-	ret = ops->map(ops, iova, paddr, size, prot);
+	ret = ops->map(ops, arm_smmu_adjust_iova(domain, iova),
+		paddr, size, prot);
 	arm_smmu_rpm_put(smmu);
 
 	return ret;
@@ -1337,7 +1480,7 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 			     size_t size)
 {
-	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
 	size_t ret;
 
@@ -1345,7 +1488,7 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 		return 0;
 
 	arm_smmu_rpm_get(smmu);
-	ret = ops->unmap(ops, iova, size);
+	ret = ops->unmap(ops, arm_smmu_adjust_iova(domain, iova), size);
 	arm_smmu_rpm_put(smmu);
 
 	return ret;
@@ -1381,7 +1524,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-	struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 	struct device *dev = smmu->dev;
 	void __iomem *cb_base;
 	u32 tmp;
@@ -1429,7 +1572,7 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
 					dma_addr_t iova)
 {
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 
 	if (domain->type == IOMMU_DOMAIN_IDENTITY)
 		return iova;
@@ -1629,6 +1772,11 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 		case DOMAIN_ATTR_NESTING:
 			*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
 			return 0;
+		case DOMAIN_ATTR_SPLIT_TABLES:
+			*((int *)data) =
+				!!(smmu_domain->attributes &
+				   (1 << DOMAIN_ATTR_SPLIT_TABLES));
+			return 0;
 		default:
 			return -ENODEV;
 		}
@@ -1669,6 +1817,11 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
 			else
 				smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 			break;
+		case DOMAIN_ATTR_SPLIT_TABLES:
+			if (*((int *)data))
+				smmu_domain->attributes |=
+					(1 << DOMAIN_ATTR_SPLIT_TABLES);
+			break;
 		default:
 			ret = -ENODEV;
 		}
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 4e21efb..71ecb08 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -490,8 +490,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
 		return 0;
 
-	if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias) ||
-		    paddr >= (1ULL << data->iop.cfg.oas)))
+	if (WARN_ON(paddr >= (1ULL << data->iop.cfg.oas)))
 		return -ERANGE;
 
 	prot = arm_lpae_prot_to_pte(data, iommu_prot);
-- 
2.7.4


+	}
+
 	arm_smmu_write_context_bank(smmu, cfg->cbndx);
 
 	/*
@@ -937,7 +1048,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	mutex_unlock(&smmu_domain->init_mutex);
 
 	/* Publish page table ops for map/unmap */
-	smmu_domain->pgtbl_ops = pgtbl_ops;
+	smmu_domain->pgtbl_ops[0] = pgtbl_ops[0];
+	smmu_domain->pgtbl_ops[1] = pgtbl_ops[1];
+
 	return 0;
 
 out_clear_smmu:
@@ -973,7 +1086,9 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 		devm_free_irq(smmu->dev, irq, domain);
 	}
 
-	free_io_pgtable_ops(smmu_domain->pgtbl_ops);
+	free_io_pgtable_ops(smmu_domain->pgtbl_ops[0]);
+	free_io_pgtable_ops(smmu_domain->pgtbl_ops[1]);
+
 	__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
 
 	arm_smmu_rpm_put(smmu);
@@ -1317,10 +1432,37 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 	return ret;
 }
 
+static struct io_pgtable_ops *
+arm_smmu_get_pgtbl_ops(struct iommu_domain *domain, unsigned long iova)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+
+	if (iova & cb->split_table_mask)
+		return smmu_domain->pgtbl_ops[1];
+
+	return smmu_domain->pgtbl_ops[0];
+}
+
+/*
+ * If split pagetables are enabled, adjust the iova so that it
+ * matches the T0SZ/T1SZ that has been programmed
+ */
+unsigned long arm_smmu_adjust_iova(struct iommu_domain *domain,
+		unsigned long iova)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+
+	return cb->split_table_mask ? iova & (cb->split_table_mask - 1) : iova;
+}
+
 static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 			phys_addr_t paddr, size_t size, int prot)
 {
-	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
 	int ret;
 
@@ -1328,7 +1470,8 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 		return -ENODEV;
 
 	arm_smmu_rpm_get(smmu);
-	ret = ops->map(ops, iova, paddr, size, prot);
+	ret = ops->map(ops, arm_smmu_adjust_iova(domain, iova),
+		paddr, size, prot);
 	arm_smmu_rpm_put(smmu);
 
 	return ret;
@@ -1337,7 +1480,7 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
 static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 			     size_t size)
 {
-	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
 	size_t ret;
 
@@ -1345,7 +1488,7 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
 		return 0;
 
 	arm_smmu_rpm_get(smmu);
-	ret = ops->unmap(ops, iova, size);
+	ret = ops->unmap(ops, arm_smmu_adjust_iova(domain, iova), size);
 	arm_smmu_rpm_put(smmu);
 
 	return ret;
@@ -1381,7 +1524,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_device *smmu = smmu_domain->smmu;
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-	struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 	struct device *dev = smmu->dev;
 	void __iomem *cb_base;
 	u32 tmp;
@@ -1429,7 +1572,7 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
 					dma_addr_t iova)
 {
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
-	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
 
 	if (domain->type == IOMMU_DOMAIN_IDENTITY)
 		return iova;
@@ -1629,6 +1772,11 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 		case DOMAIN_ATTR_NESTING:
 			*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
 			return 0;
+		case DOMAIN_ATTR_SPLIT_TABLES:
+			*((int *)data) =
+				!!(smmu_domain->attributes &
+				   (1 << DOMAIN_ATTR_SPLIT_TABLES));
+			return 0;
 		default:
 			return -ENODEV;
 		}
@@ -1669,6 +1817,11 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
 			else
 				smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 			break;
+		case DOMAIN_ATTR_SPLIT_TABLES:
+			if (*((int *)data))
+				smmu_domain->attributes |=
+					(1 << DOMAIN_ATTR_SPLIT_TABLES);
+			break;
 		default:
 			ret = -ENODEV;
 		}
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 4e21efb..71ecb08 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -490,8 +490,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
 	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
 		return 0;
 
-	if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias) ||
-		    paddr >= (1ULL << data->iop.cfg.oas)))
+	if (WARN_ON(paddr >= (1ULL << data->iop.cfg.oas)))
 		return -ERANGE;
 
 	prot = arm_lpae_prot_to_pte(data, iommu_prot);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 04/15] iommu: Add DOMAIN_ATTR_PTBASE
  2019-05-21 16:13 ` Jordan Crouse
@ 2019-05-21 16:13   ` Jordan Crouse
  -1 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:13 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders, iommu,
	Joerg Roedel, linux-kernel

Add an attribute to return the base address of the pagetable. This is
used by arm-smmu auxiliary domains to return the pagetable base address
to the leaf driver so that it can program the appropriate pagetable
through its own means.
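
A rough sketch of the intended usage (not part of this patch; 'domain'
is an aux domain that has already been attached to the device):

 u64 ptbase;

 /* Query the physical base address of the aux domain's pagetable */
 if (iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase))
	 return FAIL;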

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 include/linux/iommu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 204acd8..2252edc 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -129,6 +129,7 @@ enum iommu_attr {
 	DOMAIN_ATTR_NESTING,	/* two stages of translation */
 	DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
 	DOMAIN_ATTR_SPLIT_TABLES,
+	DOMAIN_ATTR_PTBASE,
 	DOMAIN_ATTR_MAX,
 };
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 05/15] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2
  2019-05-21 16:13 ` Jordan Crouse
@ 2019-05-21 16:13   ` Jordan Crouse
  -1 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:13 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	linux-kernel, iommu, Robin Murphy, Will Deacon, Joerg Roedel,
	linux-arm-kernel

Support auxiliary domains for arm-smmu-v2 to initialize and support
multiple pagetables for a single SMMU context bank. Since the smmu-v2
hardware doesn't have any built-in support for switching the pagetable,
it is left to the caller to actually use the pagetable; aux domains in
the IOMMU driver are only concerned with creating and managing the
pagetable memory.

The following is a pseudocode example of how a domain can be created:

 /* Check to see if aux domains are supported */
 if (iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)) {
	 domain = iommu_domain_alloc(...);

	 if (iommu_aux_attach_device(domain, dev))
		 return FAIL;

	/* Save the base address of the pagetable for use by the driver */
	iommu_domain_get_attr(domain, DOMAIN_ATTR_PTBASE, &ptbase);
 }

Then 'domain' can be used like any other iommu domain to map and
unmap iova addresses in the pagetable. The driver/hardware then switches
to the pagetable through its own implementation-specific mechanism.
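
As a further illustrative sketch (not from this patch; 'iova', 'paddr',
'size' and 'ptbase' are placeholders), the driver would populate the aux
domain with the usual map API and then program the pagetable itself:

 /* Map a buffer into the per-instance pagetable */
 ret = iommu_map(domain, iova, paddr, size, IOMMU_READ | IOMMU_WRITE);

 /* Point the hardware at 'ptbase' by driver/GPU specific means */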

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/iommu/arm-smmu.c | 133 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 117 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index e09c0e6..27ff554 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -263,6 +263,8 @@ struct arm_smmu_domain {
 	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
 	u32 attributes;
 	struct iommu_domain		domain;
+	bool				is_aux;
+	u64				ttbr0;
 };
 
 struct arm_smmu_option_prop {
@@ -892,6 +894,12 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2))
 		smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 
+	/* Aux domains can only be created for stage-1 tables */
+	if (smmu_domain->is_aux && smmu_domain->stage != ARM_SMMU_DOMAIN_S1) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
 	/*
 	 * Choosing a suitable context format is even more fiddly. Until we
 	 * grow some way for the caller to express a preference, and/or move
@@ -942,6 +950,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 			ias = min(ias, 32UL);
 			oas = min(oas, 32UL);
 		}
+
 		smmu_domain->tlb_ops = &arm_smmu_s1_tlb_ops;
 		break;
 	case ARM_SMMU_DOMAIN_NESTED:
@@ -961,6 +970,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 			ias = min(ias, 40UL);
 			oas = min(oas, 40UL);
 		}
+
 		if (smmu->version == ARM_SMMU_V2)
 			smmu_domain->tlb_ops = &arm_smmu_s2_tlb_ops_v2;
 		else
@@ -970,23 +980,30 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		ret = -EINVAL;
 		goto out_unlock;
 	}
-	ret = __arm_smmu_alloc_bitmap(smmu->context_map, start,
-				      smmu->num_context_banks);
-	if (ret < 0)
-		goto out_unlock;
 
-	cfg->cbndx = ret;
-	if (smmu->version < ARM_SMMU_V2) {
-		cfg->irptndx = atomic_inc_return(&smmu->irptndx);
-		cfg->irptndx %= smmu->num_context_irqs;
-	} else {
-		cfg->irptndx = cfg->cbndx;
-	}
+	/*
+	 * Aux domains will use the same context bank assigned to the master
+	 * domain for the device
+	 */
+	if (!smmu_domain->is_aux) {
+		ret = __arm_smmu_alloc_bitmap(smmu->context_map, start,
+					      smmu->num_context_banks);
+		if (ret < 0)
+			goto out_unlock;
 
-	if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2)
-		cfg->vmid = cfg->cbndx + 1 + smmu->cavium_id_base;
-	else
-		cfg->asid = cfg->cbndx + smmu->cavium_id_base;
+		cfg->cbndx = ret;
+		if (smmu->version < ARM_SMMU_V2) {
+			cfg->irptndx = atomic_inc_return(&smmu->irptndx);
+			cfg->irptndx %= smmu->num_context_irqs;
+		} else {
+			cfg->irptndx = cfg->cbndx;
+		}
+
+		if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2)
+			cfg->vmid = cfg->cbndx + 1 + smmu->cavium_id_base;
+		else
+			cfg->asid = cfg->cbndx + smmu->cavium_id_base;
+	}
 
 	pgtbl_cfg = (struct io_pgtable_cfg) {
 		.pgsize_bitmap	= smmu->pgsize_bitmap,
@@ -1009,11 +1026,21 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		goto out_clear_smmu;
 	}
 
+	/* Cache the TTBR0 for the aux domain */
+	smmu_domain->ttbr0 = pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0];
+
 	/* Update the domain's page sizes to reflect the page table format */
 	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
 	domain->geometry.aperture_end = (1UL << ias) - 1;
 	domain->geometry.force_aperture = true;
 
+	/*
+	 * Aux domains don't use split tables or program the hardware, so we're
+	 * done setting it up
+	 */
+	if (smmu_domain->is_aux)
+		goto out;
+
 	/* Initialise the context bank with our page table cfg */
 	arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
 
@@ -1045,6 +1072,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
 		cfg->irptndx = INVALID_IRPTNDX;
 	}
 
+out:
 	mutex_unlock(&smmu_domain->init_mutex);
 
 	/* Publish page table ops for map/unmap */
@@ -1070,6 +1098,12 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
 	if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY)
 		return;
 
+	/* All we need to do for aux devices is destroy the pagetable */
+	if (smmu_domain->is_aux) {
+		free_io_pgtable_ops(smmu_domain->pgtbl_ops[0]);
+		return;
+	}
+
 	ret = arm_smmu_rpm_get(smmu);
 	if (ret < 0)
 		return;
@@ -1352,14 +1386,17 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
 
 struct arm_smmu_client_match_data {
 	bool use_identity_domain;
+	bool allow_aux_domain;
 };
 
 static const struct arm_smmu_client_match_data qcom_adreno = {
 	.use_identity_domain = true,
+	.allow_aux_domain = true,
 };
 
 static const struct arm_smmu_client_match_data qcom_mdss = {
 	.use_identity_domain = true,
+	.allow_aux_domain = false,
 };
 
 static const struct of_device_id arm_smmu_client_of_match[] = {
@@ -1379,6 +1416,55 @@ arm_smmu_client_data(struct device *dev)
 	return match ? match->data : NULL;
 }
 
+static bool arm_smmu_supports_aux(struct device *dev)
+{
+	const struct arm_smmu_client_match_data *data =
+		arm_smmu_client_data(dev);
+
+	return (data && data->allow_aux_domain);
+}
+
+static bool arm_smmu_dev_has_feat(struct device *dev,
+		enum iommu_dev_features feat)
+{
+	if (feat != IOMMU_DEV_FEAT_AUX)
+		return false;
+
+	return arm_smmu_supports_aux(dev);
+}
+
+static int arm_smmu_dev_enable_feat(struct device *dev,
+		enum iommu_dev_features feat)
+{
+	/* If supported, aux domain support is always "on" */
+	if (feat == IOMMU_DEV_FEAT_AUX && arm_smmu_supports_aux(dev))
+		return 0;
+
+	return -ENODEV;
+}
+
+static int arm_smmu_dev_disable_feat(struct device *dev,
+		enum iommu_dev_features feat)
+{
+	return -EBUSY;
+}
+
+/* Set up a new aux domain and create a new pagetable with the same
+ * characteristics as the master
+ */
+static int arm_smmu_aux_attach_dev(struct iommu_domain *domain,
+		struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+	struct arm_smmu_device *smmu = fwspec_smmu(fwspec);
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+	smmu_domain->is_aux = true;
+
+	/* No power is needed because aux domain doesn't touch the hardware */
+	return arm_smmu_init_domain_context(domain, smmu);
+}
+
 static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 {
 	int ret;
@@ -1437,7 +1523,13 @@ arm_smmu_get_pgtbl_ops(struct iommu_domain *domain, unsigned long iova)
 {
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
-	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
+	struct arm_smmu_cb *cb;
+
+	/* quick escape for domains that don't have split pagetables enabled */
+	if (!smmu_domain->pgtbl_ops[1])
+		return smmu_domain->pgtbl_ops[0];
+
+	cb = &smmu_domain->smmu->cbs[cfg->cbndx];
 
 	if (iova & cb->split_table_mask)
 		return smmu_domain->pgtbl_ops[1];
@@ -1777,6 +1869,11 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
 				!!(smmu_domain->attributes &
 				   (1 << DOMAIN_ATTR_SPLIT_TABLES));
 			return 0;
+		case DOMAIN_ATTR_PTBASE:
+			if (!smmu_domain->is_aux)
+				return -ENODEV;
+			*((u64 *)data) = smmu_domain->ttbr0;
+			return 0;
 		default:
 			return -ENODEV;
 		}
@@ -1887,7 +1984,11 @@ static struct iommu_ops arm_smmu_ops = {
 	.capable		= arm_smmu_capable,
 	.domain_alloc		= arm_smmu_domain_alloc,
 	.domain_free		= arm_smmu_domain_free,
+	.dev_has_feat		= arm_smmu_dev_has_feat,
+	.dev_enable_feat	= arm_smmu_dev_enable_feat,
+	.dev_disable_feat	= arm_smmu_dev_disable_feat,
 	.attach_dev		= arm_smmu_attach_dev,
+	.aux_attach_dev		= arm_smmu_aux_attach_dev,
 	.map			= arm_smmu_map,
 	.unmap			= arm_smmu_unmap,
 	.flush_iotlb_all	= arm_smmu_flush_iotlb_all,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 06/15] drm/msm/adreno: Enable 64 bit mode by default on a5xx and a6xx targets
@ 2019-05-21 16:13   ` Jordan Crouse
  0 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:13 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	Sean Paul, Wen Yang, Kees Cook, Sharat Masetty, dri-devel,
	linux-kernel, Rob Clark, David Airlie, Mamta Shukla,
	Daniel Vetter

A5XX and newer GPUs can be run in either 32 or 64 bit mode. The GPU
registers and the microcode use 64 bit virtual addressing in either
case, but the upper 32 bits are ignored if the GPU is in 32 bit mode.
There is no performance disadvantage to remaining in 64 bit mode even
if we are only generating 32 bit addresses, so switch over now to
prepare for using addresses above 4G for targets that support them.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 14 ++++++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 14 ++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index e5fcefa..43a2b4a 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -642,6 +642,20 @@ static int a5xx_hw_init(struct msm_gpu *gpu)
 		REG_A5XX_RBBM_SECVID_TSB_TRUSTED_BASE_HI, 0x00000000);
 	gpu_write(gpu, REG_A5XX_RBBM_SECVID_TSB_TRUSTED_SIZE, 0x00000000);
 
+	/* Put the GPU into 64 bit by default */
+	gpu_write(gpu, REG_A5XX_CP_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_VSC_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_GRAS_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_RB_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_PC_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_HLSQ_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_VFD_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_VPC_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_UCHE_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_SP_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_TPL1_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A5XX_RBBM_SECVID_TSB_ADDR_MODE_CNTL, 0x1);
+
 	ret = adreno_hw_init(gpu);
 	if (ret)
 		return ret;
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index e74dce4..0b0dcd7 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -391,6 +391,20 @@ static int a6xx_hw_init(struct msm_gpu *gpu)
 		REG_A6XX_RBBM_SECVID_TSB_TRUSTED_BASE_HI, 0x00000000);
 	gpu_write(gpu, REG_A6XX_RBBM_SECVID_TSB_TRUSTED_SIZE, 0x00000000);
 
+	/* Turn on 64 bit addressing for all blocks */
+	gpu_write(gpu, REG_A6XX_CP_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A6XX_VSC_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A6XX_GRAS_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A6XX_RB_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A6XX_PC_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A6XX_HLSQ_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A6XX_VFD_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A6XX_VPC_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A6XX_UCHE_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A6XX_SP_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A6XX_TPL1_ADDR_MODE_CNTL, 0x1);
+	gpu_write(gpu, REG_A6XX_RBBM_SECVID_TSB_ADDR_MODE_CNTL, 0x1);
+
 	/* enable hardware clockgating */
 	a6xx_set_hwcg(gpu, true);
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 07/15] drm/msm: Print all 64 bits of the faulting IOMMU address
@ 2019-05-21 16:13   ` Jordan Crouse
  0 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:13 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	Sean Paul, linux-kernel, dri-devel, Rob Clark, David Airlie,
	Daniel Vetter

When we move to 64 bit addressing for a5xx and a6xx targets we will
start seeing pagefaults at larger addresses, so format them as fixed
width 64 bit values in the log message for easier debugging.
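
Worth noting: "%08lx" is only a minimum field width, so even the old
format printed every significant digit; the switch to a zero padded 16
character field is about keeping 64 bit iovas aligned and visibly 64
bit in the log. A standalone illustration (plain C, assuming a 64 bit
long as on arm64):

	#include <stdio.h>

	int main(void)
	{
		printf("%08lx\n",  0x1ffffffffUL);  /* "1ffffffff"        */
		printf("%016lx\n", 0x1ffffffffUL);  /* "00000001ffffffff" */
		return 0;
	}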

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/msm_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 12bb54c..1926329 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -30,7 +30,7 @@ static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
 	struct msm_iommu *iommu = arg;
 	if (iommu->base.handler)
 		return iommu->base.handler(iommu->base.arg, iova, flags);
-	pr_warn_ratelimited("*** fault: iova=%08lx, flags=%d\n", iova, flags);
+	pr_warn_ratelimited("*** fault: iova=%016lx, flags=%d\n", iova, flags);
 	return 0;
 }
 
-- 
2.7.4



* [PATCH v2 08/15] drm/msm: Pass the MMU domain index in struct msm_file_private
@ 2019-05-21 16:13 ` Jordan Crouse
  -1 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:13 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	Sean Paul, linux-kernel, dri-devel, Rob Clark, David Airlie,
	Daniel Vetter

Pass the MMU address space in struct msm_file_private instead of
assuming gpu->aspace throughout the submit path. This clears the way
to change ctx->aspace to a per-instance pagetable.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/msm_drv.c        |  2 ++
 drivers/gpu/drm/msm/msm_drv.h        |  1 +
 drivers/gpu/drm/msm/msm_gem.h        |  1 +
 drivers/gpu/drm/msm/msm_gem_submit.c | 13 ++++++++-----
 drivers/gpu/drm/msm/msm_gpu.c        |  5 ++---
 5 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 31deb87..4c51063 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -611,6 +611,7 @@ static void load_gpu(struct drm_device *dev)
 
 static int context_init(struct drm_device *dev, struct drm_file *file)
 {
+	struct msm_drm_private *priv = dev->dev_private;
 	struct msm_file_private *ctx;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
@@ -619,6 +620,7 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
 
 	msm_submitqueue_init(dev, ctx);
 
+	ctx->aspace = priv->gpu->aspace;
 	file->driver_priv = ctx;
 
 	return 0;
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index e20e6b4..d9aa7ba 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -68,6 +68,7 @@ struct msm_file_private {
 	rwlock_t queuelock;
 	struct list_head submitqueues;
 	int queueid;
+	struct msm_gem_address_space *aspace;
 };
 
 enum msm_mdp_plane_property {
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index 812d1b1..36aeb58 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -141,6 +141,7 @@ void msm_gem_free_work(struct work_struct *work);
 struct msm_gem_submit {
 	struct drm_device *dev;
 	struct msm_gpu *gpu;
+	struct msm_gem_address_space *aspace;
 	struct list_head node;   /* node in ring submit list */
 	struct list_head bo_list;
 	struct ww_acquire_ctx ticket;
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
index 1b68130..d3801bf 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -32,8 +32,9 @@
 #define BO_PINNED   0x2000
 
 static struct msm_gem_submit *submit_create(struct drm_device *dev,
-		struct msm_gpu *gpu, struct msm_gpu_submitqueue *queue,
-		uint32_t nr_bos, uint32_t nr_cmds)
+		struct msm_gpu *gpu, struct msm_gem_address_space *aspace,
+		struct msm_gpu_submitqueue *queue, uint32_t nr_bos,
+		uint32_t nr_cmds)
 {
 	struct msm_gem_submit *submit;
 	uint64_t sz = sizeof(*submit) + ((u64)nr_bos * sizeof(submit->bos[0])) +
@@ -47,6 +48,7 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
 		return NULL;
 
 	submit->dev = dev;
+	submit->aspace = aspace;
 	submit->gpu = gpu;
 	submit->fence = NULL;
 	submit->cmd = (void *)&submit->bos[nr_bos];
@@ -160,7 +162,7 @@ static void submit_unlock_unpin_bo(struct msm_gem_submit *submit,
 	struct msm_gem_object *msm_obj = submit->bos[i].obj;
 
 	if (submit->bos[i].flags & BO_PINNED)
-		msm_gem_unpin_iova(&msm_obj->base, submit->gpu->aspace);
+		msm_gem_unpin_iova(&msm_obj->base, submit->aspace);
 
 	if (submit->bos[i].flags & BO_LOCKED)
 		ww_mutex_unlock(&msm_obj->base.resv->lock);
@@ -264,7 +266,7 @@ static int submit_pin_objects(struct msm_gem_submit *submit)
 
 		/* if locking succeeded, pin bo: */
 		ret = msm_gem_get_and_pin_iova(&msm_obj->base,
-				submit->gpu->aspace, &iova);
+				submit->aspace, &iova);
 
 		if (ret)
 			break;
@@ -477,7 +479,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
 		}
 	}
 
-	submit = submit_create(dev, gpu, queue, args->nr_bos, args->nr_cmds);
+	submit = submit_create(dev, gpu, ctx->aspace, queue, args->nr_bos,
+		args->nr_cmds);
 	if (!submit) {
 		ret = -ENOMEM;
 		goto out_unlock;
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index bf4ee27..0a4c77f 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -684,7 +684,7 @@ static void retire_submit(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
 		struct msm_gem_object *msm_obj = submit->bos[i].obj;
 		/* move to inactive: */
 		msm_gem_move_to_inactive(&msm_obj->base);
-		msm_gem_unpin_iova(&msm_obj->base, gpu->aspace);
+		msm_gem_unpin_iova(&msm_obj->base, submit->aspace);
 		drm_gem_object_put(&msm_obj->base);
 	}
 
@@ -768,8 +768,7 @@ void msm_gpu_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 
 		/* submit takes a reference to the bo and iova until retired: */
 		drm_gem_object_get(&msm_obj->base);
-		msm_gem_get_and_pin_iova(&msm_obj->base,
-				submit->gpu->aspace, &iova);
+		msm_gem_get_and_pin_iova(&msm_obj->base, submit->aspace, &iova);
 
 		if (submit->bos[i].flags & MSM_SUBMIT_BO_WRITE)
 			msm_gem_move_to_active(&msm_obj->base, gpu, true, submit->fence);
-- 
2.7.4



* [PATCH v2 09/15] drm/msm/gpu: Move address space setup to the GPU targets
@ 2019-05-21 16:13 ` Jordan Crouse
  -1 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:13 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	Sean Paul, Kees Cook, Thomas Zimmermann, Sharat Masetty,
	dri-devel, linux-kernel, Rob Clark, David Airlie, Jonathan Marek,
	Mamta Shukla, Daniel Vetter

Move the address space setup code out of the generic msm GPU code to
the individual GPU targets. This allows us to do target specific
setup such as gpummu for a2xx or split pagetables and per-instance
pagetables for newer a5xx and a6xx targets. All this is at the
expense of duplicated code in some of the target files, but I think
it pays for itself in improved code flow and flexibility.
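
After this change the core no longer needs to know about iommu vs
gpummu at all; the setup in msm_gpu_init() collapses to the function
pointer call shown in the msm_gpu.c hunk below:

	gpu->aspace = gpu->funcs->create_address_space(gpu);
	if (IS_ERR(gpu->aspace)) {
		ret = PTR_ERR(gpu->aspace);
		goto fail;
	}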

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/adreno/a2xx_gpu.c   | 37 ++++++++++++++++------
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c   | 50 ++++++++++++++++++++++--------
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c   | 51 +++++++++++++++++++++++--------
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c   | 37 +++++++++++++++++++---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 37 +++++++++++++++++++---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c |  7 -----
 drivers/gpu/drm/msm/msm_gem.h           |  1 +
 drivers/gpu/drm/msm/msm_gpu.c           | 54 ++-------------------------------
 drivers/gpu/drm/msm/msm_gpu.h           |  2 ++
 9 files changed, 173 insertions(+), 103 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
index 1f83bc1..49241d0 100644
--- a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
@@ -401,6 +401,30 @@ static struct msm_gpu_state *a2xx_gpu_state_get(struct msm_gpu *gpu)
 	return state;
 }
 
+static struct msm_gem_address_space *
+a2xx_create_address_space(struct msm_gpu *gpu)
+{
+	struct msm_gem_address_space *aspace;
+	int ret;
+
+	aspace = msm_gem_address_space_create_a2xx(&gpu->pdev->dev, gpu,
+		"gpu", SZ_16M, SZ_16M + 0xfff * SZ_64K);
+	if (IS_ERR(aspace)) {
+		DRM_DEV_ERROR(gpu->dev->dev,
+			"No memory protection without MMU\n");
+		return ERR_PTR(-ENXIO);
+	}
+
+	ret = aspace->mmu->funcs->attach(aspace->mmu, NULL, 0);
+	if (ret) {
+		msm_gem_address_space_put(aspace);
+		return ERR_PTR(ret);
+	}
+
+	return aspace;
+}
+
+
 /* Register offset defines for A2XX - copy of A3XX */
 static const unsigned int a2xx_register_offsets[REG_ADRENO_REGISTER_MAX] = {
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_BASE, REG_AXXX_CP_RB_BASE),
@@ -429,6 +453,7 @@ static const struct adreno_gpu_funcs funcs = {
 #endif
 		.gpu_state_get = a2xx_gpu_state_get,
 		.gpu_state_put = adreno_gpu_state_put,
+		.create_address_space = a2xx_create_address_space,
 	},
 };
 
@@ -473,16 +498,8 @@ struct msm_gpu *a2xx_gpu_init(struct drm_device *dev)
 	adreno_gpu->reg_offsets = a2xx_register_offsets;
 
 	ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, 1);
-	if (ret)
-		goto fail;
-
-	if (!gpu->aspace) {
-		dev_err(dev->dev, "No memory protection without MMU\n");
-		ret = -ENXIO;
-		goto fail;
-	}
-
-	return gpu;
+	if (!ret)
+		return gpu;
 
 fail:
 	if (a2xx_gpu)
diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
index c3b4bc6..33ab5e8 100644
--- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
@@ -21,6 +21,7 @@
 #  include <mach/ocmem.h>
 #endif
 
+#include "msm_gem.h"
 #include "a3xx_gpu.h"
 
 #define A3XX_INT0_MASK \
@@ -433,6 +434,41 @@ static struct msm_gpu_state *a3xx_gpu_state_get(struct msm_gpu *gpu)
 	return state;
 }
 
+static struct msm_gem_address_space *
+a3xx_create_address_space(struct msm_gpu *gpu)
+{
+	struct msm_gem_address_space *aspace;
+	struct iommu_domain *iommu;
+	int ret;
+
+	iommu = iommu_domain_alloc(&platform_bus_type);
+	if (!iommu) {
+		DRM_DEV_ERROR(gpu->dev->dev,
+			"No memory protection without IOMMU\n");
+		return ERR_PTR(-ENXIO);
+	}
+
+	iommu->geometry.aperture_start = SZ_16M;
+	iommu->geometry.aperture_end = 0xffffffff;
+
+	aspace = msm_gem_address_space_create(&gpu->pdev->dev, iommu, "gpu");
+	if (IS_ERR(aspace)) {
+		iommu_domain_free(iommu);
+		DRM_DEV_ERROR(gpu->dev->dev, "failed to init mmu: %ld\n",
+			PTR_ERR(aspace));
+		return aspace;
+	}
+
+	ret = aspace->mmu->funcs->attach(aspace->mmu, NULL, 0);
+	if (ret) {
+		msm_gem_address_space_put(aspace);
+		return ERR_PTR(ret);
+	}
+
+	return aspace;
+}
+
+
 /* Register offset defines for A3XX */
 static const unsigned int a3xx_register_offsets[REG_ADRENO_REGISTER_MAX] = {
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_BASE, REG_AXXX_CP_RB_BASE),
@@ -461,6 +497,7 @@ static const struct adreno_gpu_funcs funcs = {
 #endif
 		.gpu_state_get = a3xx_gpu_state_get,
 		.gpu_state_put = adreno_gpu_state_put,
+		.create_address_space = a3xx_create_address_space,
 	},
 };
 
@@ -520,19 +557,6 @@ struct msm_gpu *a3xx_gpu_init(struct drm_device *dev)
 #endif
 	}
 
-	if (!gpu->aspace) {
-		/* TODO we think it is possible to configure the GPU to
-		 * restrict access to VRAM carveout.  But the required
-		 * registers are unknown.  For now just bail out and
-		 * limp along with just modesetting.  If it turns out
-		 * to not be possible to restrict access, then we must
-		 * implement a cmdstream validator.
-		 */
-		DRM_DEV_ERROR(dev->dev, "No memory protection without IOMMU\n");
-		ret = -ENXIO;
-		goto fail;
-	}
-
 	return gpu;
 
 fail:
diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index 18f9a8e..08a5729 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -15,6 +15,8 @@
 #  include <soc/qcom/ocmem.h>
 #endif
 
+#include "msm_gem.h"
+
 #define A4XX_INT0_MASK \
 	(A4XX_INT0_RBBM_AHB_ERROR |        \
 	 A4XX_INT0_RBBM_ATB_BUS_OVERFLOW | \
@@ -530,6 +532,41 @@ static int a4xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
 	return 0;
 }
 
+static struct msm_gem_address_space *
+a4xx_create_address_space(struct msm_gpu *gpu)
+{
+	struct msm_gem_address_space *aspace;
+	struct iommu_domain *iommu;
+	int ret;
+
+	iommu = iommu_domain_alloc(&platform_bus_type);
+	if (!iommu) {
+		DRM_DEV_ERROR(gpu->dev->dev,
+			"No memory protection without IOMMU\n");
+		return ERR_PTR(-ENXIO);
+	}
+
+	iommu->geometry.aperture_start = SZ_16M;
+	iommu->geometry.aperture_end = 0xffffffff;
+
+	aspace = msm_gem_address_space_create(&gpu->pdev->dev, iommu, "gpu");
+	if (IS_ERR(aspace)) {
+		iommu_domain_free(iommu);
+		DRM_DEV_ERROR(gpu->dev->dev, "failed to init mmu: %ld\n",
+			PTR_ERR(aspace));
+		return aspace;
+	}
+
+	ret = aspace->mmu->funcs->attach(aspace->mmu, NULL, 0);
+	if (ret) {
+		msm_gem_address_space_put(aspace);
+		return ERR_PTR(ret);
+	}
+
+	return aspace;
+}
+
+
 static const struct adreno_gpu_funcs funcs = {
 	.base = {
 		.get_param = adreno_get_param,
@@ -547,6 +584,7 @@ static const struct adreno_gpu_funcs funcs = {
 #endif
 		.gpu_state_get = a4xx_gpu_state_get,
 		.gpu_state_put = adreno_gpu_state_put,
+		.create_address_space = a4xx_create_address_space,
 	},
 	.get_timestamp = a4xx_get_timestamp,
 };
@@ -600,19 +638,6 @@ struct msm_gpu *a4xx_gpu_init(struct drm_device *dev)
 #endif
 	}
 
-	if (!gpu->aspace) {
-		/* TODO we think it is possible to configure the GPU to
-		 * restrict access to VRAM carveout.  But the required
-		 * registers are unknown.  For now just bail out and
-		 * limp along with just modesetting.  If it turns out
-		 * to not be possible to restrict access, then we must
-		 * implement a cmdstream validator.
-		 */
-		DRM_DEV_ERROR(dev->dev, "No memory protection without IOMMU\n");
-		ret = -ENXIO;
-		goto fail;
-	}
-
 	return gpu;
 
 fail:
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 43a2b4a..c243334 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1349,6 +1349,38 @@ static unsigned long a5xx_gpu_busy(struct msm_gpu *gpu)
 	return (unsigned long)busy_time;
 }
 
+static struct msm_gem_address_space *
+a5xx_create_address_space(struct msm_gpu *gpu)
+{
+	struct msm_gem_address_space *aspace;
+	struct iommu_domain *iommu;
+	int ret;
+
+	iommu = iommu_domain_alloc(&platform_bus_type);
+	if (!iommu)
+		return NULL;
+
+	iommu->geometry.aperture_start = 0x100000000ULL;
+	iommu->geometry.aperture_end = 0x1ffffffffULL;
+
+	aspace = msm_gem_address_space_create(&gpu->pdev->dev, iommu, "gpu");
+	if (IS_ERR(aspace)) {
+		iommu_domain_free(iommu);
+		DRM_DEV_ERROR(gpu->dev->dev, "failed to init mmu: %ld\n",
+			PTR_ERR(aspace));
+		return aspace;
+	}
+
+	ret = aspace->mmu->funcs->attach(aspace->mmu, NULL, 0);
+	if (ret) {
+		msm_gem_address_space_put(aspace);
+		return ERR_PTR(ret);
+	}
+
+	msm_mmu_set_fault_handler(aspace->mmu, gpu, a5xx_fault_handler);
+	return aspace;
+}
+
 static const struct adreno_gpu_funcs funcs = {
 	.base = {
 		.get_param = adreno_get_param,
@@ -1370,6 +1402,7 @@ static const struct adreno_gpu_funcs funcs = {
 		.gpu_busy = a5xx_gpu_busy,
 		.gpu_state_get = a5xx_gpu_state_get,
 		.gpu_state_put = a5xx_gpu_state_put,
+		.create_address_space = a5xx_create_address_space,
 	},
 	.get_timestamp = a5xx_get_timestamp,
 };
@@ -1416,7 +1449,6 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
 
 	adreno_gpu->registers = a5xx_registers;
 	adreno_gpu->reg_offsets = a5xx_register_offsets;
-
 	a5xx_gpu->lm_leakage = 0x4E001A;
 
 	check_speed_bin(&pdev->dev);
@@ -1427,9 +1459,6 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
 		return ERR_PTR(ret);
 	}
 
-	if (gpu->aspace)
-		msm_mmu_set_fault_handler(gpu->aspace->mmu, gpu, a5xx_fault_handler);
-
 	/* Set up the preemption specific bits and pieces for each ringbuffer */
 	a5xx_preempt_init(gpu);
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 0b0dcd7..f22385c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -810,6 +810,38 @@ static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
 	return (unsigned long)busy_time;
 }
 
+static struct msm_gem_address_space *
+a6xx_create_address_space(struct msm_gpu *gpu)
+{
+	struct msm_gem_address_space *aspace;
+	struct iommu_domain *iommu;
+	int ret;
+
+	iommu = iommu_domain_alloc(&platform_bus_type);
+	if (!iommu)
+		return NULL;
+
+	iommu->geometry.aperture_start = 0x100000000ULL;
+	iommu->geometry.aperture_end = 0x1ffffffffULL;
+
+	aspace = msm_gem_address_space_create(&gpu->pdev->dev, iommu, "gpu");
+	if (IS_ERR(aspace)) {
+		iommu_domain_free(iommu);
+		DRM_DEV_ERROR(gpu->dev->dev, "failed to init mmu: %ld\n",
+			PTR_ERR(aspace));
+		return aspace;
+	}
+
+	ret = aspace->mmu->funcs->attach(aspace->mmu, NULL, 0);
+	if (ret) {
+		msm_gem_address_space_put(aspace);
+		return ERR_PTR(ret);
+	}
+
+	msm_mmu_set_fault_handler(aspace->mmu, gpu, a6xx_fault_handler);
+	return aspace;
+}
+
 static const struct adreno_gpu_funcs funcs = {
 	.base = {
 		.get_param = adreno_get_param,
@@ -832,6 +864,7 @@ static const struct adreno_gpu_funcs funcs = {
 		.gpu_state_get = a6xx_gpu_state_get,
 		.gpu_state_put = a6xx_gpu_state_put,
 #endif
+		.create_address_space = a6xx_create_address_space,
 	},
 	.get_timestamp = a6xx_get_timestamp,
 };
@@ -874,9 +907,5 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
 		return ERR_PTR(ret);
 	}
 
-	if (gpu->aspace)
-		msm_mmu_set_fault_handler(gpu->aspace->mmu, gpu,
-				a6xx_fault_handler);
-
 	return gpu;
 }
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 6f7f411..3ba7141 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -912,13 +912,6 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 	adreno_gpu->rev = config->rev;
 
 	adreno_gpu_config.ioname = "kgsl_3d0_reg_memory";
-
-	adreno_gpu_config.va_start = SZ_16M;
-	adreno_gpu_config.va_end = 0xffffffff;
-	/* maximum range of a2xx mmu */
-	if (adreno_is_a2xx(adreno_gpu))
-		adreno_gpu_config.va_end = SZ_16M + 0xfff * SZ_64K;
-
 	adreno_gpu_config.nr_rings = nr_rings;
 
 	adreno_get_pwrlevels(&pdev->dev, gpu);
diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
index 36aeb58..fd67153 100644
--- a/drivers/gpu/drm/msm/msm_gem.h
+++ b/drivers/gpu/drm/msm/msm_gem.h
@@ -21,6 +21,7 @@
 #include <linux/kref.h>
 #include <linux/reservation.h>
 #include "msm_drv.h"
+#include "msm_mmu.h"
 
 /* Additional internal-use only BO flags: */
 #define MSM_BO_STOLEN        0x10000000    /* try to use stolen/splash memory */
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 0a4c77f..e8a14b0 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -20,7 +20,6 @@
 #include "msm_mmu.h"
 #include "msm_fence.h"
 #include "msm_gpu_trace.h"
-#include "adreno/adreno_gpu.h"
 
 #include <generated/utsrelease.h>
 #include <linux/string_helpers.h>
@@ -812,51 +811,6 @@ static int get_clocks(struct platform_device *pdev, struct msm_gpu *gpu)
 	return 0;
 }
 
-static struct msm_gem_address_space *
-msm_gpu_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev,
-		uint64_t va_start, uint64_t va_end)
-{
-	struct msm_gem_address_space *aspace;
-	int ret;
-
-	/*
-	 * Setup IOMMU.. eventually we will (I think) do this once per context
-	 * and have separate page tables per context.  For now, to keep things
-	 * simple and to get something working, just use a single address space:
-	 */
-	if (!adreno_is_a2xx(to_adreno_gpu(gpu))) {
-		struct iommu_domain *iommu = iommu_domain_alloc(&platform_bus_type);
-		if (!iommu)
-			return NULL;
-
-		iommu->geometry.aperture_start = va_start;
-		iommu->geometry.aperture_end = va_end;
-
-		DRM_DEV_INFO(gpu->dev->dev, "%s: using IOMMU\n", gpu->name);
-
-		aspace = msm_gem_address_space_create(&pdev->dev, iommu, "gpu");
-		if (IS_ERR(aspace))
-			iommu_domain_free(iommu);
-	} else {
-		aspace = msm_gem_address_space_create_a2xx(&pdev->dev, gpu, "gpu",
-			va_start, va_end);
-	}
-
-	if (IS_ERR(aspace)) {
-		DRM_DEV_ERROR(gpu->dev->dev, "failed to init mmu: %ld\n",
-			PTR_ERR(aspace));
-		return ERR_CAST(aspace);
-	}
-
-	ret = aspace->mmu->funcs->attach(aspace->mmu, NULL, 0);
-	if (ret) {
-		msm_gem_address_space_put(aspace);
-		return ERR_PTR(ret);
-	}
-
-	return aspace;
-}
-
 int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 		struct msm_gpu *gpu, const struct msm_gpu_funcs *funcs,
 		const char *name, struct msm_gpu_config *config)
@@ -929,12 +883,8 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 
 	msm_devfreq_init(gpu);
 
-	gpu->aspace = msm_gpu_create_address_space(gpu, pdev,
-		config->va_start, config->va_end);
-
-	if (gpu->aspace == NULL)
-		DRM_DEV_INFO(drm->dev, "%s: no IOMMU, fallback to VRAM carveout!\n", name);
-	else if (IS_ERR(gpu->aspace)) {
+	gpu->aspace = gpu->funcs->create_address_space(gpu);
+	if (IS_ERR(gpu->aspace)) {
 		ret = PTR_ERR(gpu->aspace);
 		goto fail;
 	}
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index f2739cd..d4bf051 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -75,6 +75,8 @@ struct msm_gpu_funcs {
 	int (*gpu_state_put)(struct msm_gpu_state *state);
 	unsigned long (*gpu_get_freq)(struct msm_gpu *gpu);
 	void (*gpu_set_freq)(struct msm_gpu *gpu, unsigned long freq);
+	struct msm_gem_address_space *(*create_address_space)
+		(struct msm_gpu *gpu);
 };
 
 struct msm_gpu {
-- 
2.7.4



* [PATCH v2 10/15] drm/msm: Add a helper function for a per-instance address space
@ 2019-05-21 16:13   ` Jordan Crouse
  -1 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:13 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	Sean Paul, linux-kernel, dri-devel, Rob Clark, David Airlie,
	Daniel Vetter

Add a helper function to create a GEM address space attached to
an iommu auxiliary domain for a per-instance pagetable.
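
For reference, a target would use the helper roughly like this (a
sketch based on the a6xx usage later in this series; the 4G window is
the example range used there):

	struct msm_gem_address_space *aspace;
	int ret;

	aspace = msm_gem_address_space_create_instance(&gpu->pdev->dev,
		"gpu", 0x100000000ULL, 0x1ffffffffULL);
	if (IS_ERR(aspace))
		return aspace;

	/* attach() is what triggers the aux domain setup in msm_iommu */
	ret = aspace->mmu->funcs->attach(aspace->mmu, NULL, 0);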

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/msm_drv.h     |  4 +++
 drivers/gpu/drm/msm/msm_gem_vma.c | 53 +++++++++++++++++++++++----------------
 2 files changed, 36 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index d9aa7ba..1d4b45a 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -262,6 +262,10 @@ struct msm_gem_address_space *
 msm_gem_address_space_create_a2xx(struct device *dev, struct msm_gpu *gpu,
 		const char *name, uint64_t va_start, uint64_t va_end);
 
+struct msm_gem_address_space *
+msm_gem_address_space_create_instance(struct device *dev, const char *name,
+		u64 va_start, u64 va_end);
+
 int msm_register_mmu(struct drm_device *dev, struct msm_mmu *mmu);
 void msm_unregister_mmu(struct drm_device *dev, struct msm_mmu *mmu);
 
diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c
index fcf7a83..0ee11b4 100644
--- a/drivers/gpu/drm/msm/msm_gem_vma.c
+++ b/drivers/gpu/drm/msm/msm_gem_vma.c
@@ -136,14 +136,12 @@ int msm_gem_init_vma(struct msm_gem_address_space *aspace,
 	return 0;
 }
 
-
-struct msm_gem_address_space *
-msm_gem_address_space_create(struct device *dev, struct iommu_domain *domain,
-		const char *name)
+static struct msm_gem_address_space *
+msm_gem_address_space_new(struct msm_mmu *mmu, const char *name,
+		u64 va_start, u64 va_end)
 {
 	struct msm_gem_address_space *aspace;
-	u64 size = domain->geometry.aperture_end -
-		domain->geometry.aperture_start;
+	u64 size = va_end - va_start;
 
 	aspace = kzalloc(sizeof(*aspace), GFP_KERNEL);
 	if (!aspace)
@@ -151,10 +149,9 @@ msm_gem_address_space_create(struct device *dev, struct iommu_domain *domain,
 
 	spin_lock_init(&aspace->lock);
 	aspace->name = name;
-	aspace->mmu = msm_iommu_new(dev, domain);
+	aspace->mmu = mmu;
 
-	drm_mm_init(&aspace->mm, (domain->geometry.aperture_start >> PAGE_SHIFT),
-		size >> PAGE_SHIFT);
+	drm_mm_init(&aspace->mm, (va_start >> PAGE_SHIFT), size >> PAGE_SHIFT);
 
 	kref_init(&aspace->kref);
 
@@ -162,24 +159,38 @@ msm_gem_address_space_create(struct device *dev, struct iommu_domain *domain,
 }
 
 struct msm_gem_address_space *
+msm_gem_address_space_create(struct device *dev, struct iommu_domain *domain,
+		const char *name)
+{
+	struct msm_mmu *mmu = msm_iommu_new(dev, domain);
+
+	if (IS_ERR(mmu))
+		return ERR_CAST(mmu);
+
+	return msm_gem_address_space_new(mmu, name,
+		domain->geometry.aperture_start, domain->geometry.aperture_end);
+}
+
+struct msm_gem_address_space *
 msm_gem_address_space_create_a2xx(struct device *dev, struct msm_gpu *gpu,
 		const char *name, uint64_t va_start, uint64_t va_end)
 {
-	struct msm_gem_address_space *aspace;
-	u64 size = va_end - va_start;
+	struct msm_mmu *mmu = msm_gpummu_new(dev, gpu);
 
-	aspace = kzalloc(sizeof(*aspace), GFP_KERNEL);
-	if (!aspace)
-		return ERR_PTR(-ENOMEM);
+	if (IS_ERR(mmu))
+		return ERR_CAST(mmu);
 
-	spin_lock_init(&aspace->lock);
-	aspace->name = name;
-	aspace->mmu = msm_gpummu_new(dev, gpu);
+	return msm_gem_address_space_new(mmu, name, va_start, va_end);
+}
 
-	drm_mm_init(&aspace->mm, (va_start >> PAGE_SHIFT),
-		size >> PAGE_SHIFT);
+struct msm_gem_address_space *
+msm_gem_address_space_create_instance(struct device *dev, const char *name,
+		u64 va_start, u64 va_end)
+{
+	struct msm_mmu *mmu = msm_iommu_new_instance(dev);
 
-	kref_init(&aspace->kref);
+	if (IS_ERR(mmu))
+		return ERR_CAST(mmu);
 
-	return aspace;
+	return msm_gem_address_space_new(mmu, name, va_start, va_end);
 }
-- 
2.7.4



* [PATCH v2 11/15] drm/msm/gpu: Add ttbr0 to the memptrs
@ 2019-05-21 16:13 ` Jordan Crouse
  -1 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:13 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	Sean Paul, linux-kernel, dri-devel, Rob Clark, David Airlie,
	Daniel Vetter

Targets that support per-instance pagetable switching will have to keep
track of which pagetable belongs to each instance to be able to restore
it when a ring is preempted.
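
The new field is written from the ring itself so the CP can reload the
pagetable when a preempted ring is restored; the a6xx patch later in
this series does exactly that with a CP_MEM_WRITE:

	OUT_PKT7(ring, CP_MEM_WRITE, 4);
	OUT_RING(ring, lower_32_bits(rbmemptr(ring, ttbr0)));
	OUT_RING(ring, upper_32_bits(rbmemptr(ring, ttbr0)));
	OUT_RING(ring, lower_32_bits(ttbr));
	OUT_RING(ring, upper_32_bits(ttbr));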

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/msm_ringbuffer.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
index 6434ebb..493fa89 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.h
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
@@ -40,6 +40,7 @@ struct msm_gpu_submit_stats {
 struct msm_rbmemptrs {
 	volatile uint32_t rptr;
 	volatile uint32_t fence;
+	volatile uint64_t ttbr0;
 
 	volatile struct msm_gpu_submit_stats stats[MSM_GPU_SUBMIT_STATS_COUNT];
 };
-- 
2.7.4



* [PATCH v2 12/15] drm/msm: Add support to create target specific address spaces
@ 2019-05-21 16:14 ` Jordan Crouse
  -1 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:14 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	Sean Paul, linux-kernel, dri-devel, Rob Clark, David Airlie,
	Daniel Vetter

Add support to create a GPU target specific address space for
a context. Targets that support per-instance pagetables will
return a new address space set up for the instance if possible;
otherwise the global device pagetable is used.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/msm_drv.c | 25 ++++++++++++++++++++++---
 drivers/gpu/drm/msm/msm_gpu.h |  1 +
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 4c51063..dd3eb30 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -609,6 +609,14 @@ static void load_gpu(struct drm_device *dev)
 	mutex_unlock(&init_lock);
 }
 
+static struct msm_gem_address_space *context_address_space(struct msm_gpu *gpu)
+{
+	if (!gpu->funcs->new_address_space)
+		return gpu->aspace;
+
+	return gpu->funcs->new_address_space(gpu);
+}
+
 static int context_init(struct drm_device *dev, struct drm_file *file)
 {
 	struct msm_drm_private *priv = dev->dev_private;
@@ -618,9 +626,16 @@ static int context_init(struct drm_device *dev, struct drm_file *file)
 	if (!ctx)
 		return -ENOMEM;
 
+	ctx->aspace = context_address_space(priv->gpu);
+	if (IS_ERR(ctx->aspace)) {
+		int ret = PTR_ERR(ctx->aspace);
+
+		kfree(ctx);
+		return ret;
+	}
+
 	msm_submitqueue_init(dev, ctx);
 
-	ctx->aspace = priv->gpu->aspace;
 	file->driver_priv = ctx;
 
 	return 0;
@@ -636,8 +651,12 @@ static int msm_open(struct drm_device *dev, struct drm_file *file)
 	return context_init(dev, file);
 }
 
-static void context_close(struct msm_file_private *ctx)
+static void context_close(struct msm_drm_private *priv,
+		struct msm_file_private *ctx)
 {
+	if (ctx->aspace != priv->gpu->aspace)
+		msm_gem_address_space_put(ctx->aspace);
+
 	msm_submitqueue_close(ctx);
 	kfree(ctx);
 }
@@ -652,7 +671,7 @@ static void msm_postclose(struct drm_device *dev, struct drm_file *file)
 		priv->lastctx = NULL;
 	mutex_unlock(&dev->struct_mutex);
 
-	context_close(ctx);
+	context_close(priv, ctx);
 }
 
 static irqreturn_t msm_irq(int irq, void *arg)
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index d4bf051..588d7ba 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -77,6 +77,7 @@ struct msm_gpu_funcs {
 	void (*gpu_set_freq)(struct msm_gpu *gpu, unsigned long freq);
 	struct msm_gem_address_space *(*create_address_space)
 		(struct msm_gpu *gpu);
+	struct msm_gem_address_space *(*new_address_space)(struct msm_gpu *gpu);
 };
 
 struct msm_gpu {
-- 
2.7.4



* [PATCH v2 13/15] drm/msm: Add support for IOMMU auxiliary domains
@ 2019-05-21 16:14   ` Jordan Crouse
  -1 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:14 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	Sean Paul, linux-kernel, dri-devel, Rob Clark, David Airlie,
	Daniel Vetter

Add support for creating an auxiliary domain from the IOMMU device to
implement per-instance pagetables. Also add a helper function to
return the pagetable base address (ttbr) and asid to the caller so
that the GPU target code can set up the pagetable switch.
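
The two values are meant to be combined by the caller into the TTBR0
image programmed into the GPU, with the asid in bits 63:48. Roughly
(this mirrors the a6xx usage in the next patch):

	u64 ttbr;
	u32 asid;

	if (msm_iommu_get_ptinfo(aspace->mmu, &ttbr, &asid))
		ttbr |= (u64) asid << 48;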

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/msm_iommu.c | 97 +++++++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/msm/msm_mmu.h   |  4 ++
 2 files changed, 101 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 1926329..adf9f18 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -21,9 +21,21 @@
 struct msm_iommu {
 	struct msm_mmu base;
 	struct iommu_domain *domain;
+	u64 ttbr;
+	u32 asid;
 };
 #define to_msm_iommu(x) container_of(x, struct msm_iommu, base)
 
+/*
+ * The asid is currently unused for arm-smmu-v2 since all the pagetable
+ * switching does a TLBIALL, but still assign a somewhat unique number
+ * per instance to leave open the possibility of being smarter about it.
+ *
+ * The accepted range is 32 to 255 (starting at 32 leaves a cushion for
+ * the asids assigned to the real context banks in the arm-smmu driver).
+ */
+static int msm_iommu_asid = 32;
+
 static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
 		unsigned long iova, int flags, void *arg)
 {
@@ -34,6 +46,47 @@ static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
 	return 0;
 }
 
+static int msm_iommu_aux_attach(struct msm_mmu *mmu, const char * const *names,
+			    int cnt)
+{
+	struct msm_iommu *iommu = to_msm_iommu(mmu);
+	int ret;
+
+	/* Attach the aux device */
+	ret = iommu_aux_attach_device(iommu->domain, mmu->dev);
+	if (ret)
+		return ret;
+
+	/* Get the base address of the pagetable */
+	ret = iommu_domain_get_attr(iommu->domain, DOMAIN_ATTR_PTBASE,
+		&iommu->ttbr);
+	if (ret)
+		return ret;
+
+	/*
+	 * Assign an asid for the instance even though the code doesn't
+	 * currently support per-asid TLB invalidation. There isn't any
+	 * protection on this so two instances could in theory end up with the
+	 * same ASID but that would have very minor performance implications if
+	 * per-ASID TLB invalidation were to be enabled in the future
+	 */
+	iommu->asid = msm_iommu_asid++;
+
+	if (msm_iommu_asid > 0xff)
+		msm_iommu_asid = 32;
+
+	return 0;
+}
+
+static void msm_iommu_aux_detach(struct msm_mmu *mmu, const char * const *names,
+			     int cnt)
+{
+	struct msm_iommu *iommu = to_msm_iommu(mmu);
+
+	iommu->ttbr = 0;
+	iommu->asid = 0;
+}
+
 static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names,
 			    int cnt)
 {
@@ -86,6 +139,50 @@ static const struct msm_mmu_funcs funcs = {
 		.destroy = msm_iommu_destroy,
 };
 
+static const struct msm_mmu_funcs aux_funcs = {
+		.attach = msm_iommu_aux_attach,
+		.detach = msm_iommu_aux_detach,
+		.map = msm_iommu_map,
+		.unmap = msm_iommu_unmap,
+		.destroy = msm_iommu_destroy,
+};
+
+bool msm_iommu_get_ptinfo(struct msm_mmu *mmu, u64 *ttbr, u32 *asid)
+{
+	struct msm_iommu *iommu = to_msm_iommu(mmu);
+
+	if (!iommu->ttbr)
+		return false;
+
+	if (ttbr)
+		*ttbr = iommu->ttbr;
+	if (asid)
+		*asid = iommu->asid;
+
+	return true;
+}
+
+
+struct msm_mmu *msm_iommu_new_instance(struct device *dev)
+{
+	struct msm_iommu *iommu;
+
+	iommu = kzalloc(sizeof(*iommu), GFP_KERNEL);
+	if (!iommu)
+		return ERR_PTR(-ENOMEM);
+
+	/* Create a new domain that will be attached as an aux domain */
+	iommu->domain = iommu_domain_alloc(&platform_bus_type);
+	if (!iommu->domain) {
+		kfree(iommu);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	msm_mmu_init(&iommu->base, dev, &aux_funcs);
+
+	return &iommu->base;
+}
+
 struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain)
 {
 	struct msm_iommu *iommu;
diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
index d21b266..f430903 100644
--- a/drivers/gpu/drm/msm/msm_mmu.h
+++ b/drivers/gpu/drm/msm/msm_mmu.h
@@ -46,6 +46,10 @@ static inline void msm_mmu_init(struct msm_mmu *mmu, struct device *dev,
 struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain);
 struct msm_mmu *msm_gpummu_new(struct device *dev, struct msm_gpu *gpu);
 
+struct msm_mmu *msm_iommu_new_instance(struct device *dev);
+
+bool msm_iommu_get_ptinfo(struct msm_mmu *mmu, u64 *ttbr, u32 *asid);
+
 static inline void msm_mmu_set_fault_handler(struct msm_mmu *mmu, void *arg,
 		int (*handler)(void *arg, unsigned long iova, int flags))
 {
-- 
2.7.4



* [PATCH v2 14/15] drm/msm/a6xx: Support per-instance pagetables
@ 2019-05-21 16:14   ` Jordan Crouse
  0 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:14 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	Sean Paul, Thomas Zimmermann, Sharat Masetty, dri-devel,
	linux-kernel, Rob Clark, David Airlie, Mamta Shukla,
	Daniel Vetter

Add support for per-instance pagetables for a6xx targets: handle
split pagetables, create a new instance if the needed IOMMU
support exists, and insert the necessary PM4 commands to trigger
a pagetable switch at the beginning of a user command.
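
For reference, the resulting address layout on targets where both
split pagetables and aux domains are available (ranges taken from the
diff below):

	buffers            pagetable   va range
	global (kernel)    TTBR1       0xfffffff100000000 - 0xfffffff1ffffffff
	per-instance       TTBR0       0x0000000100000000 - 0x00000001ffffffff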

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 123 ++++++++++++++++++++++++++++++++--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |   1 +
 2 files changed, 120 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index f22385c..fd875de 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -12,6 +12,62 @@
 
 #define GPU_PAS_ID 13
 
+static void a6xx_set_pagetable(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
+	struct msm_file_private *ctx)
+{
+	u64 ttbr;
+	u32 asid;
+
+	if (!msm_iommu_get_ptinfo(ctx->aspace->mmu, &ttbr, &asid))
+		return;
+
+	ttbr = ttbr | ((u64) asid) << 48;
+
+	/* Turn off protected mode */
+	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+	OUT_RING(ring, 0);
+
+	/* Turn on APRIV mode to access critical regions */
+	OUT_PKT4(ring, REG_A6XX_CP_MISC_CNTL, 1);
+	OUT_RING(ring, 1);
+
+	/* Make sure the ME is synchronized before starting the update */
+	OUT_PKT7(ring, CP_WAIT_FOR_ME, 0);
+
+	/* Execute the table update */
+	OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4);
+	OUT_RING(ring, lower_32_bits(ttbr));
+	OUT_RING(ring, upper_32_bits(ttbr));
+	/* CONTEXTIDR is currently unused */
+	OUT_RING(ring, 0);
+	/* CONTEXTBANK is currently unused */
+	OUT_RING(ring, 0);
+
+	/*
+	 * Write the new TTBR0 to the preemption records - this will be used to
+	 * reload the pagetable if the current ring gets preempted out.
+	 */
+	OUT_PKT7(ring, CP_MEM_WRITE, 4);
+	OUT_RING(ring, lower_32_bits(rbmemptr(ring, ttbr0)));
+	OUT_RING(ring, upper_32_bits(rbmemptr(ring, ttbr0)));
+	OUT_RING(ring, lower_32_bits(ttbr));
+	OUT_RING(ring, upper_32_bits(ttbr));
+
+	/* Invalidate the draw state so we start off fresh */
+	OUT_PKT7(ring, CP_SET_DRAW_STATE, 3);
+	OUT_RING(ring, 0x40000);
+	OUT_RING(ring, 1);
+	OUT_RING(ring, 0);
+
+	/* Turn off APRIV */
+	OUT_PKT4(ring, REG_A6XX_CP_MISC_CNTL, 1);
+	OUT_RING(ring, 0);
+
+	/* Turn protected mode back on */
+	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+	OUT_RING(ring, 1);
+}
+
 static inline bool _a6xx_check_idle(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -89,6 +145,8 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 	struct msm_ringbuffer *ring = submit->ring;
 	unsigned int i;
 
+	a6xx_set_pagetable(gpu, ring, ctx);
+
 	get_stats_counter(ring, REG_A6XX_RBBM_PERFCTR_CP_0_LO,
 		rbmemptr_stats(ring, index, cpcycles_start));
 
@@ -810,21 +868,77 @@ static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
 	return (unsigned long)busy_time;
 }
 
+static struct msm_gem_address_space *a6xx_new_address_space(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	struct msm_gem_address_space *aspace;
+	int ret;
+
+	/* Return the default pagetable if per instance tables don't work */
+	if (!a6xx_gpu->per_instance_tables)
+		return gpu->aspace;
+
+	aspace = msm_gem_address_space_create_instance(&gpu->pdev->dev, "gpu",
+		0x100000000ULL, 0x1ffffffffULL);
+	if (IS_ERR(aspace))
+		return aspace;
+
+	ret = aspace->mmu->funcs->attach(aspace->mmu, NULL, 0);
+	if (ret) {
+		/* -ENODEV means that aux domains aren't supported */
+		if (ret == -ENODEV)
+			return gpu->aspace;
+
+		return ERR_PTR(ret);
+	}
+
+	return aspace;
+}
+
 static struct msm_gem_address_space *
 a6xx_create_address_space(struct msm_gpu *gpu)
 {
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	struct device *dev = &gpu->pdev->dev;
 	struct msm_gem_address_space *aspace;
 	struct iommu_domain *iommu;
-	int ret;
+	int ret, val = 1;
+
+	a6xx_gpu->per_instance_tables = false;
 
 	iommu = iommu_domain_alloc(&platform_bus_type);
 	if (!iommu)
 		return NULL;
 
-	iommu->geometry.aperture_start = 0x100000000ULL;
-	iommu->geometry.aperture_end = 0x1ffffffffULL;
+	/* Try to enable split pagetables */
+	if (iommu_domain_set_attr(iommu, DOMAIN_ATTR_SPLIT_TABLES, &val)) {
+		/*
+		 * If split pagetables aren't available we won't be able to do
+		 * per-instance pagetables, so set up the global va space at our
+		 * usual location
+		 */
+		iommu->geometry.aperture_start = 0x100000000ULL;
+		iommu->geometry.aperture_end = 0x1ffffffffULL;
+	} else {
+		/*
+		 * If split pagetables are available then we might be able to do
+		 * per-instance pagetables. Put the default va-space in TTBR1 to
+		 * prepare.
+		 */
+		iommu->geometry.aperture_start = 0xfffffff100000000ULL;
+		iommu->geometry.aperture_end = 0xfffffff1ffffffffULL;
+
+		/*
+		 * If both split pagetables and aux domains are supported we can
+		 * do per_instance pagetables
+		 */
+		a6xx_gpu->per_instance_tables =
+			iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX);
+	}
 
-	aspace = msm_gem_address_space_create(&gpu->pdev->dev, iommu, "gpu");
+	aspace = msm_gem_address_space_create(dev, iommu, "gpu");
 	if (IS_ERR(aspace)) {
 		iommu_domain_free(iommu);
 		DRM_DEV_ERROR(gpu->dev->dev, "failed to init mmu: %ld\n",
@@ -865,6 +979,7 @@ static const struct adreno_gpu_funcs funcs = {
 		.gpu_state_put = a6xx_gpu_state_put,
 #endif
 		.create_address_space = a6xx_create_address_space,
+		.new_address_space = a6xx_new_address_space,
 	},
 	.get_timestamp = a6xx_get_timestamp,
 };
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index b46279e..54d71f4 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -21,6 +21,7 @@ struct a6xx_gpu {
 	struct msm_ringbuffer *cur_ring;
 
 	struct a6xx_gmu gmu;
+	bool per_instance_tables;
 };
 
 #define to_a6xx_gpu(x) container_of(x, struct a6xx_gpu, base)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 45+ messages in thread


* [PATCH v2 15/15] drm/msm/a5xx: Support per-instance pagetables
@ 2019-05-21 16:14   ` Jordan Crouse
  0 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 16:14 UTC (permalink / raw)
  To: freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	Sean Paul, Wen Yang, Thomas Zimmermann, Sharat Masetty,
	dri-devel, linux-kernel, Rob Clark, David Airlie, Mamta Shukla,
	Daniel Vetter

Add support for per-instance pagetables for 5XX targets. Create a support
buffer for preemption to hold the SMMU pagetable information for a
preempted ring, enable TTBR1 to support split pagetables and add the
necessary PM4 commands to trigger a pagetable switch at the beginning
of a user command.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c     | 120 +++++++++++++++++++++++++++++-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.h     |  19 +++++
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c |  70 +++++++++++++----
 3 files changed, 192 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index c243334..21b8e5c 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -111,6 +111,59 @@ static void a5xx_submit_in_rb(struct msm_gpu *gpu, struct msm_gem_submit *submit
 	msm_gpu_retire(gpu);
 }
 
+static void a5xx_set_pagetable(struct msm_gpu *gpu, struct msm_ringbuffer *ring,
+	struct msm_file_private *ctx)
+{
+	u64 ttbr;
+	u32 asid;
+
+	if (!msm_iommu_get_ptinfo(ctx->aspace->mmu, &ttbr, &asid))
+		return;
+
+	ttbr = ttbr | ((u64) asid) << 48;
+
+	/* Turn off protected mode */
+	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+	OUT_RING(ring, 0);
+
+	/* Turn on APRIV mode to access critical regions */
+	OUT_PKT4(ring, REG_A5XX_CP_CNTL, 1);
+	OUT_RING(ring, 1);
+
+	/* Make sure the ME is synchronized before starting the update */
+	OUT_PKT7(ring, CP_WAIT_FOR_ME, 0);
+
+	/* Execute the table update */
+	OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 3);
+	OUT_RING(ring, lower_32_bits(ttbr));
+	OUT_RING(ring, upper_32_bits(ttbr));
+	OUT_RING(ring, 0);
+
+	/*
+	 * Write the new TTBR0 to the preemption records - this will be used to
+	 * reload the pagetable if the current ring gets preempted out.
+	 */
+	OUT_PKT7(ring, CP_MEM_WRITE, 4);
+	OUT_RING(ring, lower_32_bits(rbmemptr(ring, ttbr0)));
+	OUT_RING(ring, upper_32_bits(rbmemptr(ring, ttbr0)));
+	OUT_RING(ring, lower_32_bits(ttbr));
+	OUT_RING(ring, upper_32_bits(ttbr));
+
+	/* Invalidate the draw state so we start off fresh */
+	OUT_PKT7(ring, CP_SET_DRAW_STATE, 3);
+	OUT_RING(ring, 0x40000);
+	OUT_RING(ring, 1);
+	OUT_RING(ring, 0);
+
+	/* Turn off APRIV */
+	OUT_PKT4(ring, REG_A5XX_CP_CNTL, 1);
+	OUT_RING(ring, 0);
+
+	/* Re-enable protected mode */
+	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+	OUT_RING(ring, 1);
+}
+
 static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 	struct msm_file_private *ctx)
 {
@@ -126,6 +179,8 @@ static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 		return;
 	}
 
+	a5xx_set_pagetable(gpu, ring, ctx);
+
 	OUT_PKT7(ring, CP_PREEMPT_ENABLE_GLOBAL, 1);
 	OUT_RING(ring, 0x02);
 
@@ -1349,21 +1404,77 @@ static unsigned long a5xx_gpu_busy(struct msm_gpu *gpu)
 	return (unsigned long)busy_time;
 }
 
+static struct msm_gem_address_space *a5xx_new_address_space(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
+	struct msm_gem_address_space *aspace;
+	int ret;
+
+	/* Return the default pagetable if per-instance tables don't work */
+	if (!a5xx_gpu->per_instance_tables)
+		return gpu->aspace;
+
+	aspace = msm_gem_address_space_create_instance(&gpu->pdev->dev,
+		"gpu", 0x100000000ULL, 0x1ffffffffULL);
+	if (IS_ERR(aspace))
+		return aspace;
+
+	ret = aspace->mmu->funcs->attach(aspace->mmu, NULL, 0);
+	if (ret) {
+		/* -ENODEV means that aux domains aren't supported */
+		if (ret == -ENODEV)
+			return gpu->aspace;
+
+		return ERR_PTR(ret);
+	}
+
+	return aspace;
+}
+
 static struct msm_gem_address_space *
 a5xx_create_address_space(struct msm_gpu *gpu)
 {
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
+	struct device *dev = &gpu->pdev->dev;
 	struct msm_gem_address_space *aspace;
 	struct iommu_domain *iommu;
-	int ret;
+	int ret, val = 1;
+
+	a5xx_gpu->per_instance_tables = false;
 
 	iommu = iommu_domain_alloc(&platform_bus_type);
 	if (!iommu)
 		return NULL;
 
-	iommu->geometry.aperture_start = 0x100000000ULL;
-	iommu->geometry.aperture_end = 0x1ffffffffULL;
+	/* Try to enable split pagetables */
+	if (iommu_domain_set_attr(iommu, DOMAIN_ATTR_SPLIT_TABLES, &val)) {
+		/*
+		 * If split pagetables aren't available we won't be able to do
+		 * per-instance pagetables, so set up the global va space at our
+		 * usual location
+		 */
+		iommu->geometry.aperture_start = 0x100000000ULL;
+		iommu->geometry.aperture_end = 0x1ffffffffULL;
+	} else {
+		/*
+		 * If split pagetables are available then we might be able to do
+		 * per-instance pagetables. Put the default va-space in TTBR1 to
+		 * prepare
+		 */
+		iommu->geometry.aperture_start = 0xfffffff100000000ULL;
+		iommu->geometry.aperture_end = 0xfffffff1ffffffffULL;
+
+		/*
+		 * If both split pagetables and aux domains are supported we can
+		 * do per-instance pagetables
+		 */
+		a5xx_gpu->per_instance_tables =
+			iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX);
+	}
 
-	aspace = msm_gem_address_space_create(&gpu->pdev->dev, iommu, "gpu");
+	aspace = msm_gem_address_space_create(dev, iommu, "gpu");
 	if (IS_ERR(aspace)) {
 		iommu_domain_free(iommu);
 		DRM_DEV_ERROR(gpu->dev->dev, "failed to init mmu: %ld\n",
@@ -1403,6 +1514,7 @@ static const struct adreno_gpu_funcs funcs = {
 		.gpu_state_get = a5xx_gpu_state_get,
 		.gpu_state_put = a5xx_gpu_state_put,
 		.create_address_space = a5xx_create_address_space,
+		.new_address_space = a5xx_new_address_space,
 	},
 	.get_timestamp = a5xx_get_timestamp,
 };
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
index 7d71860..82ceb9b 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
@@ -45,6 +45,11 @@ struct a5xx_gpu {
 
 	atomic_t preempt_state;
 	struct timer_list preempt_timer;
+	struct a5xx_smmu_info *smmu_info;
+	struct drm_gem_object *smmu_info_bo;
+	uint64_t smmu_info_iova;
+
+	bool per_instance_tables;
 };
 
 #define to_a5xx_gpu(x) container_of(x, struct a5xx_gpu, base)
@@ -132,6 +137,20 @@ struct a5xx_preempt_record {
  */
 #define A5XX_PREEMPT_COUNTER_SIZE (16 * 4)
 
+/*
+ * This is a global structure that the preemption code uses to switch in the
+ * pagetable for the preempted process - the code switches in whatever we
+ * write here when preempting to a new ring.
+ */
+struct a5xx_smmu_info {
+	uint32_t  magic;
+	uint32_t  _pad4;
+	uint64_t  ttbr0;
+	uint32_t  asid;
+	uint32_t  contextidr;
+};
+
+#define A5XX_SMMU_INFO_MAGIC 0x3618CDA3UL
 
 int a5xx_power_init(struct msm_gpu *gpu);
 void a5xx_gpmu_ucode_init(struct msm_gpu *gpu);
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
index 3d62310..1050409 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
@@ -12,6 +12,7 @@
  */
 
 #include "msm_gem.h"
+#include "msm_mmu.h"
 #include "a5xx_gpu.h"
 
 /*
@@ -145,6 +146,15 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
 	a5xx_gpu->preempt[ring->id]->wptr = get_wptr(ring);
 	spin_unlock_irqrestore(&ring->lock, flags);
 
+	/* Do a read barrier to make sure we have the updated pagetable info */
+	rmb();
+
+	/* Set the SMMU info for the preemption */
+	if (a5xx_gpu->smmu_info) {
+		a5xx_gpu->smmu_info->ttbr0 = ring->memptrs->ttbr0;
+		a5xx_gpu->smmu_info->contextidr = 0;
+	}
+
 	/* Set the address of the incoming preemption record */
 	gpu_write64(gpu, REG_A5XX_CP_CONTEXT_SWITCH_RESTORE_ADDR_LO,
 		REG_A5XX_CP_CONTEXT_SWITCH_RESTORE_ADDR_HI,
@@ -221,9 +231,10 @@ void a5xx_preempt_hw_init(struct msm_gpu *gpu)
 		a5xx_gpu->preempt[i]->rbase = gpu->rb[i]->iova;
 	}
 
-	/* Write a 0 to signal that we aren't switching pagetables */
+	/* Tell the CP where to find the smmu_info buffer */
 	gpu_write64(gpu, REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_LO,
-		REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_HI, 0);
+		REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_HI,
+		a5xx_gpu->smmu_info_iova);
 
 	/* Reset the preemption state */
 	set_preempt_state(a5xx_gpu, PREEMPT_NONE);
@@ -271,6 +282,34 @@ void a5xx_preempt_fini(struct msm_gpu *gpu)
 
 	for (i = 0; i < gpu->nr_rings; i++)
 		msm_gem_kernel_put(a5xx_gpu->preempt_bo[i], gpu->aspace, true);
+
+	msm_gem_kernel_put(a5xx_gpu->smmu_info_bo, gpu->aspace, true);
+}
+
+static int a5xx_smmu_info_init(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
+	struct a5xx_smmu_info *ptr;
+	struct drm_gem_object *bo;
+	u64 iova;
+
+	if (!a5xx_gpu->per_instance_tables)
+		return 0;
+
+	ptr = msm_gem_kernel_new(gpu->dev, sizeof(struct a5xx_smmu_info),
+		MSM_BO_UNCACHED, gpu->aspace, &bo, &iova);
+
+	if (IS_ERR(ptr))
+		return PTR_ERR(ptr);
+
+	ptr->magic = A5XX_SMMU_INFO_MAGIC;
+
+	a5xx_gpu->smmu_info_bo = bo;
+	a5xx_gpu->smmu_info_iova = iova;
+	a5xx_gpu->smmu_info = ptr;
+
+	return 0;
 }
 
 void a5xx_preempt_init(struct msm_gpu *gpu)
@@ -284,17 +323,22 @@ void a5xx_preempt_init(struct msm_gpu *gpu)
 		return;
 
 	for (i = 0; i < gpu->nr_rings; i++) {
-		if (preempt_init_ring(a5xx_gpu, gpu->rb[i])) {
-			/*
-			 * On any failure our adventure is over. Clean up and
-			 * set nr_rings to 1 to force preemption off
-			 */
-			a5xx_preempt_fini(gpu);
-			gpu->nr_rings = 1;
-
-			return;
-		}
+		if (preempt_init_ring(a5xx_gpu, gpu->rb[i]))
+			goto fail;
 	}
 
-	timer_setup(&a5xx_gpu->preempt_timer, a5xx_preempt_timer, 0);
+	if (a5xx_smmu_info_init(gpu))
+		goto fail;
+
+	/* timer_setup() takes flags, not a data pointer, as its third arg */
+	timer_setup(&a5xx_gpu->preempt_timer, a5xx_preempt_timer, 0);
+
+	return;
+fail:
+	/*
+	 * On any failure our adventure is over. Clean up and
+	 * set nr_rings to 1 to force preemption off
+	 */
+	a5xx_preempt_fini(gpu);
+	gpu->nr_rings = 1;
 }
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 45+ messages in thread


* Re: [PATCH v2 01/15] iommu/arm-smmu: Allow IOMMU enabled devices to skip DMA domains
  2019-05-21 16:13   ` Jordan Crouse
@ 2019-05-21 17:43     ` Robin Murphy
  -1 siblings, 0 replies; 45+ messages in thread
From: Robin Murphy @ 2019-05-21 17:43 UTC (permalink / raw)
  To: Jordan Crouse, freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	linux-kernel, iommu, Will Deacon, Joerg Roedel, linux-arm-kernel

On 21/05/2019 17:13, Jordan Crouse wrote:
> Allow IOMMU enabled devices specified on an opt-in list to create a
> default identity domain for a new IOMMU group and bypass the DMA
> domain created by the IOMMU core. This allows the group to be properly
> set up but otherwise skips touching the hardware until the client
> device attaches an unmanaged domain of its own.

All the cool kids are using iommu_request_dm_for_dev() to force an 
identity domain for particular devices, won't that suffice for this case 
too? There is definite scope for improvement in this area, so I'd really 
like to keep things as consistent as possible to make that easier in future.
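
For reference, a rough sketch of that approach from the client side - 
the wrapper name and probe-path placement here are my own invention, 
the core call is the existing one:

	#include <linux/iommu.h>

	static int gpu_request_identity_domain(struct device *dev)
	{
		/*
		 * Ask the IOMMU core to swap the device's default DMA
		 * domain for an identity (passthrough) domain; the group
		 * then stays in bypass until the driver attaches an
		 * unmanaged domain of its own.
		 */
		return iommu_request_dm_for_dev(dev);
	}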

Robin.

> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
> 
>   drivers/iommu/arm-smmu.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>   drivers/iommu/iommu.c    | 29 +++++++++++++++++++++++------
>   include/linux/iommu.h    |  3 +++
>   3 files changed, 68 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 5e54cc0..a795ada 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1235,6 +1235,35 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
>   	return 0;
>   }
>   
> +struct arm_smmu_client_match_data {
> +	bool use_identity_domain;
> +};
> +
> +static const struct arm_smmu_client_match_data qcom_adreno = {
> +	.use_identity_domain = true,
> +};
> +
> +static const struct arm_smmu_client_match_data qcom_mdss = {
> +	.use_identity_domain = true,
> +};
> +
> +static const struct of_device_id arm_smmu_client_of_match[] = {
> +	{ .compatible = "qcom,adreno", .data = &qcom_adreno },
> +	{ .compatible = "qcom,mdp4", .data = &qcom_mdss },
> +	{ .compatible = "qcom,mdss", .data = &qcom_mdss },
> +	{ .compatible = "qcom,sdm845-mdss", .data = &qcom_mdss },
> +	{},
> +};
> +
> +static const struct arm_smmu_client_match_data *
> +arm_smmu_client_data(struct device *dev)
> +{
> +	const struct of_device_id *match =
> +		of_match_device(arm_smmu_client_of_match, dev);
> +
> +	return match ? match->data : NULL;
> +}
> +
>   static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>   {
>   	int ret;
> @@ -1552,6 +1581,7 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
>   {
>   	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>   	struct arm_smmu_device *smmu = fwspec_smmu(fwspec);
> +	const struct arm_smmu_client_match_data *client;
>   	struct iommu_group *group = NULL;
>   	int i, idx;
>   
> @@ -1573,6 +1603,18 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
>   	else
>   		group = generic_device_group(dev);
>   
> +	client = arm_smmu_client_data(dev);
> +
> +	/*
> +	 * If the client chooses to bypass the dma domain, create an identity
> +	 * domain as a default placeholder. This will give the device a
> +	 * default domain but skip DMA operations and not consume a context
> +	 * bank
> +	 */
> +	if (client && client->use_identity_domain)
> +		iommu_group_set_default_domain(group, dev,
> +			IOMMU_DOMAIN_IDENTITY);
> +
>   	return group;
>   }
>   
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 67ee662..af3e1ed 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1062,6 +1062,24 @@ struct iommu_group *fsl_mc_device_group(struct device *dev)
>   	return group;
>   }
>   
> +struct iommu_domain *iommu_group_set_default_domain(struct iommu_group *group,
> +		struct device *dev, unsigned int type)
> +{
> +	struct iommu_domain *dom;
> +
> +	dom = __iommu_domain_alloc(dev->bus, type);
> +	if (!dom)
> +		return NULL;
> +
> +	/* FIXME: Error if the default domain is already set? */
> +	group->default_domain = dom;
> +	if (!group->domain)
> +		group->domain = dom;
> +
> +	return dom;
> +}
> +EXPORT_SYMBOL_GPL(iommu_group_set_default_domain);
> +
>   /**
>    * iommu_group_get_for_dev - Find or create the IOMMU group for a device
>    * @dev: target device
> @@ -1099,9 +1117,12 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
>   	if (!group->default_domain) {
>   		struct iommu_domain *dom;
>   
> -		dom = __iommu_domain_alloc(dev->bus, iommu_def_domain_type);
> +		dom = iommu_group_set_default_domain(group, dev,
> +			iommu_def_domain_type);
> +
>   		if (!dom && iommu_def_domain_type != IOMMU_DOMAIN_DMA) {
> -			dom = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_DMA);
> +			dom = iommu_group_set_default_domain(group, dev,
> +				IOMMU_DOMAIN_DMA);
>   			if (dom) {
>   				dev_warn(dev,
>   					 "failed to allocate default IOMMU domain of type %u; falling back to IOMMU_DOMAIN_DMA",
> @@ -1109,10 +1130,6 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
>   			}
>   		}
>   
> -		group->default_domain = dom;
> -		if (!group->domain)
> -			group->domain = dom;
> -
>   		if (dom && !iommu_dma_strict) {
>   			int attr = 1;
>   			iommu_domain_set_attr(dom,
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index a815cf6..4ef8bd5 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -394,6 +394,9 @@ extern int iommu_group_id(struct iommu_group *group);
>   extern struct iommu_group *iommu_group_get_for_dev(struct device *dev);
>   extern struct iommu_domain *iommu_group_default_domain(struct iommu_group *);
>   
> +struct iommu_domain *iommu_group_set_default_domain(struct iommu_group *group,
> +		struct device *dev, unsigned int type);
> +
>   extern int iommu_domain_get_attr(struct iommu_domain *domain, enum iommu_attr,
>   				 void *data);
>   extern int iommu_domain_set_attr(struct iommu_domain *domain, enum iommu_attr,
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread


* Re: [PATCH v2 03/15] iommu/arm-smmu: Add split pagetable support for arm-smmu-v2
  2019-05-21 16:13   ` Jordan Crouse
@ 2019-05-21 18:18     ` Robin Murphy
  -1 siblings, 0 replies; 45+ messages in thread
From: Robin Murphy @ 2019-05-21 18:18 UTC (permalink / raw)
  To: Jordan Crouse, freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, hoegsberg, dianders,
	linux-kernel, iommu, Will Deacon, Joerg Roedel, linux-arm-kernel

On 21/05/2019 17:13, Jordan Crouse wrote:
> Add support for a split pagetable (TTBR0/TTBR1) scheme for arm-smmu-v2.
> If split pagetables are enabled, create a pagetable for TTBR1 and set
> up the sign extension bit so that all IOVAs with that bit set are mapped
> and translated from the TTBR1 pagetable.
> 
> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
> 
>   drivers/iommu/arm-smmu-regs.h  |  19 +++++
>   drivers/iommu/arm-smmu.c       | 179 ++++++++++++++++++++++++++++++++++++++---
>   drivers/iommu/io-pgtable-arm.c |   3 +-
>   3 files changed, 186 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
> index e9132a9..23f27c2 100644
> --- a/drivers/iommu/arm-smmu-regs.h
> +++ b/drivers/iommu/arm-smmu-regs.h
> @@ -195,7 +195,26 @@ enum arm_smmu_s2cr_privcfg {
>   #define RESUME_RETRY			(0 << 0)
>   #define RESUME_TERMINATE		(1 << 0)
>   
> +#define TTBCR_EPD1			(1 << 23)
> +#define TTBCR_T0SZ_SHIFT		0
> +#define TTBCR_T1SZ_SHIFT		16
> +#define TTBCR_IRGN1_SHIFT		24
> +#define TTBCR_ORGN1_SHIFT		26
> +#define TTBCR_RGN_WBWA			1
> +#define TTBCR_SH1_SHIFT			28
> +#define TTBCR_SH_IS			3
> +
> +#define TTBCR_TG1_16K			(1 << 30)
> +#define TTBCR_TG1_4K			(2 << 30)
> +#define TTBCR_TG1_64K			(3 << 30)
> +
>   #define TTBCR2_SEP_SHIFT		15
> +#define TTBCR2_SEP_31			(0x0 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_35			(0x1 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_39			(0x2 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_41			(0x3 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_43			(0x4 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_47			(0x5 << TTBCR2_SEP_SHIFT)
>   #define TTBCR2_SEP_UPSTREAM		(0x7 << TTBCR2_SEP_SHIFT)
>   #define TTBCR2_AS			(1 << 4)
>   
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index a795ada..e09c0e6 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -152,6 +152,7 @@ struct arm_smmu_cb {
>   	u32				tcr[2];
>   	u32				mair[2];
>   	struct arm_smmu_cfg		*cfg;
> +	unsigned long			split_table_mask;
>   };
>   
>   struct arm_smmu_master_cfg {
> @@ -253,13 +254,14 @@ enum arm_smmu_domain_stage {
>   
>   struct arm_smmu_domain {
>   	struct arm_smmu_device		*smmu;
> -	struct io_pgtable_ops		*pgtbl_ops;
> +	struct io_pgtable_ops		*pgtbl_ops[2];

This seems a bit off - surely the primary domain and aux domain only 
ever need one set of tables each, but either way there's definitely 
unnecessary redundancy in having four sets of io_pgtable_ops between them.
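
Conceptually I'd expect one set of ops per live table, i.e. something 
like the following (type and field names invented for illustration):

	/* Sketch: each domain owns only its own TTBR0 table; the
	 * global TTBR1 table is allocated once and shared.
	 */
	struct arm_smmu_domain_tables {
		struct io_pgtable_ops	*ttbr0_ops;	/* per-domain */
		struct io_pgtable_ops	*ttbr1_ops;	/* global */
	};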

>   	const struct iommu_gather_ops	*tlb_ops;
>   	struct arm_smmu_cfg		cfg;
>   	enum arm_smmu_domain_stage	stage;
>   	bool				non_strict;
>   	struct mutex			init_mutex; /* Protects smmu pointer */
>   	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
> +	u32 attributes;
>   	struct iommu_domain		domain;
>   };
>   
> @@ -621,6 +623,85 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
>   	return IRQ_HANDLED;
>   }
>   
> +/* Adjust the context bank settings to support TTBR1 */
> +static void arm_smmu_init_ttbr1(struct arm_smmu_domain *smmu_domain,
> +		struct io_pgtable_cfg *pgtbl_cfg)
> +{
> +	struct arm_smmu_device *smmu = smmu_domain->smmu;
> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
> +	int pgsize = 1 << __ffs(pgtbl_cfg->pgsize_bitmap);
> +
> +	/* Enable speculative walks through the TTBR1 */
> +	cb->tcr[0] &= ~TTBCR_EPD1;
> +
> +	cb->tcr[0] |= TTBCR_SH_IS << TTBCR_SH1_SHIFT;
> +	cb->tcr[0] |= TTBCR_RGN_WBWA << TTBCR_IRGN1_SHIFT;
> +	cb->tcr[0] |= TTBCR_RGN_WBWA << TTBCR_ORGN1_SHIFT;
> +
> +	switch (pgsize) {
> +	case SZ_4K:
> +		cb->tcr[0] |= TTBCR_TG1_4K;
> +		break;
> +	case SZ_16K:
> +		cb->tcr[0] |= TTBCR_TG1_16K;
> +		break;
> +	case SZ_64K:
> +		cb->tcr[0] |= TTBCR_TG1_64K;
> +		break;
> +	}
> +
> +	/*
> +	 * Outside of the special 49 bit UBS case that has a dedicated sign
> +	 * extension bit, setting the SEP for any other va_size will force us to
> +	 * shrink the size of the T0/T1 regions by one bit to accommodate the
> +	 * SEP
> +	 */
> +	if (smmu->va_size != 48) {
> +		/* Replace the T0 size */
> +		cb->tcr[0] &= ~(0x3f << TTBCR_T0SZ_SHIFT);
> +		cb->tcr[0] |= (64ULL - smmu->va_size - 1) << TTBCR_T0SZ_SHIFT;
> +		/* Set the T1 size */
> +		cb->tcr[0] |= (64ULL - smmu->va_size - 1) << TTBCR_T1SZ_SHIFT;
> +	} else {
> +		/* Set the T1 size to the full available UBS */
> +		cb->tcr[0] |= (64ULL - smmu->va_size) << TTBCR_T1SZ_SHIFT;
> +	}
> +
> +	/* Clear the existing SEP configuration */
> +	cb->tcr[1] &= ~TTBCR2_SEP_UPSTREAM;
> +
> +	/* Set up the sign extend bit */
> +	switch (smmu->va_size) {
> +	case 32:
> +		cb->tcr[1] |= TTBCR2_SEP_31;
> +		cb->split_table_mask = (1UL << 31);
> +		break;
> +	case 36:
> +		cb->tcr[1] |= TTBCR2_SEP_35;
> +		cb->split_table_mask = (1UL << 35);
> +		break;
> +	case 40:
> +		cb->tcr[1] |= TTBCR2_SEP_39;
> +		cb->split_table_mask = (1UL << 39);
> +		break;
> +	case 42:
> +		cb->tcr[1] |= TTBCR2_SEP_41;
> +		cb->split_table_mask = (1UL << 41);
> +		break;
> +	case 44:
> +		cb->tcr[1] |= TTBCR2_SEP_43;
> +		cb->split_table_mask = (1UL << 43);
> +		break;
> +	case 48:
> +		cb->tcr[1] |= TTBCR2_SEP_UPSTREAM;
> +		cb->split_table_mask = (1UL << 48);
> +	}
> +
> +	cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];

Assigning a "TTBR0" to a "TTBR1" is the point at which it becomes clear 
that we need to take a step back and reconsider. I think there was 
originally a half-formed idea that pagetables might go around in pairs, 
but things really aren't working out that way in practice, so it's 
almost certainly time to rework the io_pgtable_alloc() interface. We 
probably want to make "TTBR1" an up-front option for the appropriate 
formats, such that either way they return a single TTBR value plus a TCR 
with the appropriate half configured (hopefully in such a way that the 
caller can simply allocate one of each and merge the two TCRs together, 
so maybe responsibility for EPD* needs to move). That way we can also 
make *better* use of the IOVA sanity-checking in io-pgtable-arm, rather 
than just removing it (especially since this will open up a whole new 
class of "unmapping a TTBR0 address from the TTBR1 domain" type bugs).
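
To sketch the sort of interface I mean (the quirk flag and the exact 
merging are invented for illustration, the rest follows the existing 
io-pgtable API):

	struct io_pgtable_cfg ttbr1_cfg = pgtbl_cfg;

	/* Hypothetical: ask up front for a TTBR1-flavoured table */
	ttbr1_cfg.quirks |= IO_PGTABLE_QUIRK_ARM_TTBR1;

	pgtbl_ops[0] = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
	pgtbl_ops[1] = alloc_io_pgtable_ops(fmt, &ttbr1_cfg, smmu_domain);

	/*
	 * Each allocation hands back a single TTBR plus a TCR with only
	 * its own half (EPD* included) configured, so the caller merges
	 * the two TCRs and never assigns a "TTBR0" to a "TTBR1":
	 */
	cb->ttbr[0] = pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0];
	cb->ttbr[1] = ttbr1_cfg.arm_lpae_s1_cfg.ttbr[0];
	cb->tcr[0]  = pgtbl_cfg.arm_lpae_s1_cfg.tcr |
		      ttbr1_cfg.arm_lpae_s1_cfg.tcr;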

Robin.

> +	cb->ttbr[1] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
> +}
> +
>   static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
>   				       struct io_pgtable_cfg *pgtbl_cfg)
>   {
> @@ -763,11 +844,13 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   {
>   	int irq, start, ret = 0;
>   	unsigned long ias, oas;
> -	struct io_pgtable_ops *pgtbl_ops;
> +	struct io_pgtable_ops *pgtbl_ops[2] = { NULL, NULL };
>   	struct io_pgtable_cfg pgtbl_cfg;
>   	enum io_pgtable_fmt fmt;
>   	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>   	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +	bool split_tables =
> +		(smmu_domain->attributes & (1 << DOMAIN_ATTR_SPLIT_TABLES));
>   
>   	mutex_lock(&smmu_domain->init_mutex);
>   	if (smmu_domain->smmu)
> @@ -797,8 +880,15 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   	 *
>   	 * Note that you can't actually request stage-2 mappings.
>   	 */
> -	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
> +	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) {
>   		smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
> +
> +		/* Only allow split pagetables on stage 1 tables */
> +		if (split_tables) {
> +			ret = -EINVAL;
> +			goto out_unlock;
> +		}
> +	}
>   	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2))
>   		smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>   
> @@ -817,6 +907,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   	    (smmu->features & ARM_SMMU_FEAT_FMT_AARCH32_S) &&
>   	    (smmu_domain->stage == ARM_SMMU_DOMAIN_S1))
>   		cfg->fmt = ARM_SMMU_CTX_FMT_AARCH32_S;
> +
>   	if ((IS_ENABLED(CONFIG_64BIT) || cfg->fmt == ARM_SMMU_CTX_FMT_NONE) &&
>   	    (smmu->features & (ARM_SMMU_FEAT_FMT_AARCH64_64K |
>   			       ARM_SMMU_FEAT_FMT_AARCH64_16K |
> @@ -828,6 +919,12 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   		goto out_unlock;
>   	}
>   
> +	/* For now, only allow split tables for AARCH64 formats */
> +	if (split_tables && cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) {
> +		ret = -EINVAL;
> +		goto out_unlock;
> +	}
> +
>   	switch (smmu_domain->stage) {
>   	case ARM_SMMU_DOMAIN_S1:
>   		cfg->cbar = CBAR_TYPE_S1_TRANS_S2_BYPASS;
> @@ -906,8 +1003,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
>   
>   	smmu_domain->smmu = smmu;
> -	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> -	if (!pgtbl_ops) {
> +	pgtbl_ops[0] = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> +	if (!pgtbl_ops[0]) {
>   		ret = -ENOMEM;
>   		goto out_clear_smmu;
>   	}
> @@ -919,6 +1016,20 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   
>   	/* Initialise the context bank with our page table cfg */
>   	arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
> +
> +	if (split_tables) {
> +		/* It is safe to reuse pgtbl_cfg here */
> +		pgtbl_ops[1] = alloc_io_pgtable_ops(fmt, &pgtbl_cfg,
> +			smmu_domain);
> +		if (!pgtbl_ops[1]) {
> +			free_io_pgtable_ops(pgtbl_ops[0]);
> +			ret = -ENOMEM;
> +			goto out_clear_smmu;
> +		}
> +
> +		arm_smmu_init_ttbr1(smmu_domain, &pgtbl_cfg);
> +	}
> +
>   	arm_smmu_write_context_bank(smmu, cfg->cbndx);
>   
>   	/*
> @@ -937,7 +1048,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   	mutex_unlock(&smmu_domain->init_mutex);
>   
>   	/* Publish page table ops for map/unmap */
> -	smmu_domain->pgtbl_ops = pgtbl_ops;
> +	smmu_domain->pgtbl_ops[0] = pgtbl_ops[0];
> +	smmu_domain->pgtbl_ops[1] = pgtbl_ops[1];
> +
>   	return 0;
>   
>   out_clear_smmu:
> @@ -973,7 +1086,9 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
>   		devm_free_irq(smmu->dev, irq, domain);
>   	}
>   
> -	free_io_pgtable_ops(smmu_domain->pgtbl_ops);
> +	free_io_pgtable_ops(smmu_domain->pgtbl_ops[0]);
> +	free_io_pgtable_ops(smmu_domain->pgtbl_ops[1]);
> +
>   	__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
>   
>   	arm_smmu_rpm_put(smmu);
> @@ -1317,10 +1432,37 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>   	return ret;
>   }
>   
> +static struct io_pgtable_ops *
> +arm_smmu_get_pgtbl_ops(struct iommu_domain *domain, unsigned long iova)
> +{
> +	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
> +
> +	if (iova & cb->split_table_mask)
> +		return smmu_domain->pgtbl_ops[1];
> +
> +	return smmu_domain->pgtbl_ops[0];
> +}
> +
> +/*
> + * If split pagetables are enabled adjust the iova so that it
> + * matches the T0SZ/T1SZ that has been programmed
> + */
> +unsigned long arm_smmu_adjust_iova(struct iommu_domain *domain,
> +		unsigned long iova)
> +{
> +	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
> +
> +	return cb->split_table_mask ? iova & (cb->split_table_mask - 1) : iova;
> +}
> +
>   static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
>   			phys_addr_t paddr, size_t size, int prot)
>   {
> -	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
> +	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
>   	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
>   	int ret;
>   
> @@ -1328,7 +1470,8 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
>   		return -ENODEV;
>   
>   	arm_smmu_rpm_get(smmu);
> -	ret = ops->map(ops, iova, paddr, size, prot);
> +	ret = ops->map(ops, arm_smmu_adjust_iova(domain, iova),
> +		paddr, size, prot);
>   	arm_smmu_rpm_put(smmu);
>   
>   	return ret;
> @@ -1337,7 +1480,7 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
>   static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
>   			     size_t size)
>   {
> -	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
> +	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
>   	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
>   	size_t ret;
>   
> @@ -1345,7 +1488,7 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
>   		return 0;
>   
>   	arm_smmu_rpm_get(smmu);
> -	ret = ops->unmap(ops, iova, size);
> +	ret = ops->unmap(ops, arm_smmu_adjust_iova(domain, iova), size);
>   	arm_smmu_rpm_put(smmu);
>   
>   	return ret;
> @@ -1381,7 +1524,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
>   	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>   	struct arm_smmu_device *smmu = smmu_domain->smmu;
>   	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> -	struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
> +	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
>   	struct device *dev = smmu->dev;
>   	void __iomem *cb_base;
>   	u32 tmp;
> @@ -1429,7 +1572,7 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
>   					dma_addr_t iova)
>   {
>   	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> -	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
> +	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
>   
>   	if (domain->type == IOMMU_DOMAIN_IDENTITY)
>   		return iova;
> @@ -1629,6 +1772,11 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>   		case DOMAIN_ATTR_NESTING:
>   			*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
>   			return 0;
> +		case DOMAIN_ATTR_SPLIT_TABLES:
> +			*((int *)data) =
> +				!!(smmu_domain->attributes &
> +				   (1 << DOMAIN_ATTR_SPLIT_TABLES));
> +			return 0;
>   		default:
>   			return -ENODEV;
>   		}
> @@ -1669,6 +1817,11 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>   			else
>   				smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>   			break;
> +		case DOMAIN_ATTR_SPLIT_TABLES:
> +			if (*((int *)data))
> +				smmu_domain->attributes |=
> +					(1 << DOMAIN_ATTR_SPLIT_TABLES);
> +			break;
>   		default:
>   			ret = -ENODEV;
>   		}
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 4e21efb..71ecb08 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -490,8 +490,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
>   	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
>   		return 0;
>   
> -	if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias) ||
> -		    paddr >= (1ULL << data->iop.cfg.oas)))
> +	if (WARN_ON(paddr >= (1ULL << data->iop.cfg.oas)))
>   		return -ERANGE;
>   
>   	prot = arm_lpae_prot_to_pte(data, iommu_prot);
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 03/15] iommu/arm-smmu: Add split pagetable support for arm-smmu-v2
@ 2019-05-21 18:18     ` Robin Murphy
  0 siblings, 0 replies; 45+ messages in thread
From: Robin Murphy @ 2019-05-21 18:18 UTC (permalink / raw)
  To: Jordan Crouse, freedreno
  Cc: jean-philippe.brucker, linux-arm-msm, Will Deacon, dianders,
	linux-kernel, iommu, hoegsberg, linux-arm-kernel

On 21/05/2019 17:13, Jordan Crouse wrote:
> Add support for a split pagetable (TTBR0/TTBR1) scheme for arm-smmu-v2.
> If split pagetables are enabled, create a pagetable for TTBR1 and set
> up the sign extension bit so that all IOVAs with that bit set are mapped
> and translated from the TTBR1 pagetable.
> 
> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
> 
>   drivers/iommu/arm-smmu-regs.h  |  19 +++++
>   drivers/iommu/arm-smmu.c       | 179 ++++++++++++++++++++++++++++++++++++++---
>   drivers/iommu/io-pgtable-arm.c |   3 +-
>   3 files changed, 186 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
> index e9132a9..23f27c2 100644
> --- a/drivers/iommu/arm-smmu-regs.h
> +++ b/drivers/iommu/arm-smmu-regs.h
> @@ -195,7 +195,26 @@ enum arm_smmu_s2cr_privcfg {
>   #define RESUME_RETRY			(0 << 0)
>   #define RESUME_TERMINATE		(1 << 0)
>   
> +#define TTBCR_EPD1			(1 << 23)
> +#define TTBCR_T0SZ_SHIFT		0
> +#define TTBCR_T1SZ_SHIFT		16
> +#define TTBCR_IRGN1_SHIFT		24
> +#define TTBCR_ORGN1_SHIFT		26
> +#define TTBCR_RGN_WBWA			1
> +#define TTBCR_SH1_SHIFT			28
> +#define TTBCR_SH_IS			3
> +
> +#define TTBCR_TG1_16K			(1 << 30)
> +#define TTBCR_TG1_4K			(2 << 30)
> +#define TTBCR_TG1_64K			(3 << 30)
> +
>   #define TTBCR2_SEP_SHIFT		15
> +#define TTBCR2_SEP_31			(0x0 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_35			(0x1 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_39			(0x2 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_41			(0x3 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_43			(0x4 << TTBCR2_SEP_SHIFT)
> +#define TTBCR2_SEP_47			(0x5 << TTBCR2_SEP_SHIFT)
>   #define TTBCR2_SEP_UPSTREAM		(0x7 << TTBCR2_SEP_SHIFT)
>   #define TTBCR2_AS			(1 << 4)
>   
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index a795ada..e09c0e6 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -152,6 +152,7 @@ struct arm_smmu_cb {
>   	u32				tcr[2];
>   	u32				mair[2];
>   	struct arm_smmu_cfg		*cfg;
> +	unsigned long			split_table_mask;
>   };
>   
>   struct arm_smmu_master_cfg {
> @@ -253,13 +254,14 @@ enum arm_smmu_domain_stage {
>   
>   struct arm_smmu_domain {
>   	struct arm_smmu_device		*smmu;
> -	struct io_pgtable_ops		*pgtbl_ops;
> +	struct io_pgtable_ops		*pgtbl_ops[2];

This seems a bit off - surely the primary domain and aux domain only 
ever need one set of tables each, but either way there's definitely 
unnecessary redundancy in having four sets of io_pgtable_ops between them.
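
To make that concrete, a minimal sketch of the single-table-per-domain
shape (the field layout here is illustrative, not existing code):

	struct arm_smmu_domain {
		struct arm_smmu_device	*smmu;
		/* exactly one pagetable per domain, primary or aux */
		struct io_pgtable_ops	*pgtbl_ops;
		/* ... */
	};

With that, a primary/aux pair carries two pagetables in total, and the
per-IOVA ops selection in arm_smmu_get_pgtbl_ops() below reduces to
picking the right domain rather than indexing an array inside one.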

>   	const struct iommu_gather_ops	*tlb_ops;
>   	struct arm_smmu_cfg		cfg;
>   	enum arm_smmu_domain_stage	stage;
>   	bool				non_strict;
>   	struct mutex			init_mutex; /* Protects smmu pointer */
>   	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
> +	u32 attributes;
>   	struct iommu_domain		domain;
>   };
>   
> @@ -621,6 +623,85 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
>   	return IRQ_HANDLED;
>   }
>   
> +/* Adjust the context bank settings to support TTBR1 */
> +static void arm_smmu_init_ttbr1(struct arm_smmu_domain *smmu_domain,
> +		struct io_pgtable_cfg *pgtbl_cfg)
> +{
> +	struct arm_smmu_device *smmu = smmu_domain->smmu;
> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
> +	int pgsize = 1 << __ffs(pgtbl_cfg->pgsize_bitmap);
> +
> +	/* Enable speculative walks through the TTBR1 */
> +	cb->tcr[0] &= ~TTBCR_EPD1;
> +
> +	cb->tcr[0] |= TTBCR_SH_IS << TTBCR_SH1_SHIFT;
> +	cb->tcr[0] |= TTBCR_RGN_WBWA << TTBCR_IRGN1_SHIFT;
> +	cb->tcr[0] |= TTBCR_RGN_WBWA << TTBCR_ORGN1_SHIFT;
> +
> +	switch (pgsize) {
> +	case SZ_4K:
> +		cb->tcr[0] |= TTBCR_TG1_4K;
> +		break;
> +	case SZ_16K:
> +		cb->tcr[0] |= TTBCR_TG1_16K;
> +		break;
> +	case SZ_64K:
> +		cb->tcr[0] |= TTBCR_TG1_64K;
> +		break;
> +	}
> +
> +	/*
> +	 * Outside of the special 49 bit UBS case that has a dedicated sign
> +	 * extension bit, setting the SEP for any other va_size will force us to
> +	 * shrink the size of the T0/T1 regions by one bit to accommodate the
> +	 * SEP
> +	 */
> +	if (smmu->va_size != 48) {
> +		/* Replace the T0 size */
> +		cb->tcr[0] &= ~(0x3f << TTBCR_T0SZ_SHIFT);
> +		cb->tcr[0] |= (64ULL - smmu->va_size - 1) << TTBCR_T0SZ_SHIFT;
> +		/* Set the T1 size */
> +		cb->tcr[0] |= (64ULL - smmu->va_size - 1) << TTBCR_T1SZ_SHIFT;
> +	} else {
> +		/* Set the T1 size to the full available UBS */
> +		cb->tcr[0] |= (64ULL - smmu->va_size) << TTBCR_T1SZ_SHIFT;
> +	}
> +
> +	/* Clear the existing SEP configuration */
> +	cb->tcr[1] &= ~TTBCR2_SEP_UPSTREAM;
> +
> +	/* Set up the sign extend bit */
> +	switch (smmu->va_size) {
> +	case 32:
> +		cb->tcr[1] |= TTBCR2_SEP_31;
> +		cb->split_table_mask = (1UL << 31);
> +		break;
> +	case 36:
> +		cb->tcr[1] |= TTBCR2_SEP_35;
> +		cb->split_table_mask = (1UL << 35);
> +		break;
> +	case 40:
> +		cb->tcr[1] |= TTBCR2_SEP_39;
> +		cb->split_table_mask = (1UL << 39);
> +		break;
> +	case 42:
> +		cb->tcr[1] |= TTBCR2_SEP_41;
> +		cb->split_table_mask = (1UL << 41);
> +		break;
> +	case 44:
> +		cb->tcr[1] |= TTBCR2_SEP_43;
> +		cb->split_table_mask = (1UL << 43);
> +		break;
> +	case 48:
> +		cb->tcr[1] |= TTBCR2_SEP_UPSTREAM;
> +		cb->split_table_mask = (1UL << 48);
> +	}
> +
> +	cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];

Assigning a "TTBR0" to a "TTBR1" is the point at which it becomes clear 
that we need to take a step back and reconsider. I think there was 
originally a half-formed idea that pagetables might go around in pairs, 
but things really aren't working out that way in practice, so it's 
almost certainly time to rework the io_pgtable_alloc() interface. We 
probably want to make "TTBR1" an up-front option for the appropriate 
formats, such that either way they return a single TTBR value plus a TCR 
with the appropriate half configured (hopefully in such a way that the 
caller can simply allocate one of each and merge the two TCRs together, 
so maybe responsibility for EPD* needs to move). That way we can also 
make *better* use of the IOVA sanity-checking in io-pgtable-arm, rather 
than just removing it (especially since this will open up a whole new 
class of "unmapping a TTBR0 address from the TTBR1 domain" type bugs).

Robin.

> +	cb->ttbr[1] |= (u64)cfg->asid << TTBRn_ASID_SHIFT;
> +}
> +
>   static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
>   				       struct io_pgtable_cfg *pgtbl_cfg)
>   {
> @@ -763,11 +844,13 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   {
>   	int irq, start, ret = 0;
>   	unsigned long ias, oas;
> -	struct io_pgtable_ops *pgtbl_ops;
> +	struct io_pgtable_ops *pgtbl_ops[2] = { NULL, NULL };
>   	struct io_pgtable_cfg pgtbl_cfg;
>   	enum io_pgtable_fmt fmt;
>   	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>   	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +	bool split_tables =
> +		(smmu_domain->attributes & (1 << DOMAIN_ATTR_SPLIT_TABLES));
>   
>   	mutex_lock(&smmu_domain->init_mutex);
>   	if (smmu_domain->smmu)
> @@ -797,8 +880,15 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   	 *
>   	 * Note that you can't actually request stage-2 mappings.
>   	 */
> -	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1))
> +	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S1)) {
>   		smmu_domain->stage = ARM_SMMU_DOMAIN_S2;
> +
> +		/* Only allow split pagetables on stage 1 tables */
> +		if (split_tables) {
> +			ret = -EINVAL;
> +			goto out_unlock;
> +		}
> +	}
>   	if (!(smmu->features & ARM_SMMU_FEAT_TRANS_S2))
>   		smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>   
> @@ -817,6 +907,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   	    (smmu->features & ARM_SMMU_FEAT_FMT_AARCH32_S) &&
>   	    (smmu_domain->stage == ARM_SMMU_DOMAIN_S1))
>   		cfg->fmt = ARM_SMMU_CTX_FMT_AARCH32_S;
> +
>   	if ((IS_ENABLED(CONFIG_64BIT) || cfg->fmt == ARM_SMMU_CTX_FMT_NONE) &&
>   	    (smmu->features & (ARM_SMMU_FEAT_FMT_AARCH64_64K |
>   			       ARM_SMMU_FEAT_FMT_AARCH64_16K |
> @@ -828,6 +919,12 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   		goto out_unlock;
>   	}
>   
> +	/* For now, only allow split tables for AARCH64 formats */
> +	if (split_tables && cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) {
> +		ret = -EINVAL;
> +		goto out_unlock;
> +	}
> +
>   	switch (smmu_domain->stage) {
>   	case ARM_SMMU_DOMAIN_S1:
>   		cfg->cbar = CBAR_TYPE_S1_TRANS_S2_BYPASS;
> @@ -906,8 +1003,8 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   		pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
>   
>   	smmu_domain->smmu = smmu;
> -	pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> -	if (!pgtbl_ops) {
> +	pgtbl_ops[0] = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
> +	if (!pgtbl_ops[0]) {
>   		ret = -ENOMEM;
>   		goto out_clear_smmu;
>   	}
> @@ -919,6 +1016,20 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   
>   	/* Initialise the context bank with our page table cfg */
>   	arm_smmu_init_context_bank(smmu_domain, &pgtbl_cfg);
> +
> +	if (split_tables) {
> +		/* It is safe to reuse pgtbl_cfg here */
> +		pgtbl_ops[1] = alloc_io_pgtable_ops(fmt, &pgtbl_cfg,
> +			smmu_domain);
> +		if (!pgtbl_ops[1]) {
> +			free_io_pgtable_ops(pgtbl_ops[0]);
> +			ret = -ENOMEM;
> +			goto out_clear_smmu;
> +		}
> +
> +		arm_smmu_init_ttbr1(smmu_domain, &pgtbl_cfg);
> +	}
> +
>   	arm_smmu_write_context_bank(smmu, cfg->cbndx);
>   
>   	/*
> @@ -937,7 +1048,9 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain,
>   	mutex_unlock(&smmu_domain->init_mutex);
>   
>   	/* Publish page table ops for map/unmap */
> -	smmu_domain->pgtbl_ops = pgtbl_ops;
> +	smmu_domain->pgtbl_ops[0] = pgtbl_ops[0];
> +	smmu_domain->pgtbl_ops[1] = pgtbl_ops[1];
> +
>   	return 0;
>   
>   out_clear_smmu:
> @@ -973,7 +1086,9 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain)
>   		devm_free_irq(smmu->dev, irq, domain);
>   	}
>   
> -	free_io_pgtable_ops(smmu_domain->pgtbl_ops);
> +	free_io_pgtable_ops(smmu_domain->pgtbl_ops[0]);
> +	free_io_pgtable_ops(smmu_domain->pgtbl_ops[1]);
> +
>   	__arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
>   
>   	arm_smmu_rpm_put(smmu);
> @@ -1317,10 +1432,37 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
>   	return ret;
>   }
>   
> +static struct io_pgtable_ops *
> +arm_smmu_get_pgtbl_ops(struct iommu_domain *domain, unsigned long iova)
> +{
> +	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
> +
> +	if (iova & cb->split_table_mask)
> +		return smmu_domain->pgtbl_ops[1];
> +
> +	return smmu_domain->pgtbl_ops[0];
> +}
> +
> +/*
> + * If split pagetables are enabled adjust the iova so that it
> + * matches the T0SZ/T1SZ that has been programmed
> + */
> +unsigned long arm_smmu_adjust_iova(struct iommu_domain *domain,
> +		unsigned long iova)
> +{
> +	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> +	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> +	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
> +
> +	return cb->split_table_mask ? iova & (cb->split_table_mask - 1) : iova;
> +}
> +
>   static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
>   			phys_addr_t paddr, size_t size, int prot)
>   {
> -	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
> +	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
>   	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
>   	int ret;
>   
> @@ -1328,7 +1470,8 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
>   		return -ENODEV;
>   
>   	arm_smmu_rpm_get(smmu);
> -	ret = ops->map(ops, iova, paddr, size, prot);
> +	ret = ops->map(ops, arm_smmu_adjust_iova(domain, iova),
> +		paddr, size, prot);
>   	arm_smmu_rpm_put(smmu);
>   
>   	return ret;
> @@ -1337,7 +1480,7 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
>   static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
>   			     size_t size)
>   {
> -	struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
> +	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
>   	struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
>   	size_t ret;
>   
> @@ -1345,7 +1488,7 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
>   		return 0;
>   
>   	arm_smmu_rpm_get(smmu);
> -	ret = ops->unmap(ops, iova, size);
> +	ret = ops->unmap(ops, arm_smmu_adjust_iova(domain, iova), size);
>   	arm_smmu_rpm_put(smmu);
>   
>   	return ret;
> @@ -1381,7 +1524,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
>   	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>   	struct arm_smmu_device *smmu = smmu_domain->smmu;
>   	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> -	struct io_pgtable_ops *ops= smmu_domain->pgtbl_ops;
> +	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
>   	struct device *dev = smmu->dev;
>   	void __iomem *cb_base;
>   	u32 tmp;
> @@ -1429,7 +1572,7 @@ static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
>   					dma_addr_t iova)
>   {
>   	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> -	struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
> +	struct io_pgtable_ops *ops = arm_smmu_get_pgtbl_ops(domain, iova);
>   
>   	if (domain->type == IOMMU_DOMAIN_IDENTITY)
>   		return iova;
> @@ -1629,6 +1772,11 @@ static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
>   		case DOMAIN_ATTR_NESTING:
>   			*(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED);
>   			return 0;
> +		case DOMAIN_ATTR_SPLIT_TABLES:
> +			*((int *)data) =
> +				!!(smmu_domain->attributes &
> +				   (1 << DOMAIN_ATTR_SPLIT_TABLES));
> +			return 0;
>   		default:
>   			return -ENODEV;
>   		}
> @@ -1669,6 +1817,11 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
>   			else
>   				smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
>   			break;
> +		case DOMAIN_ATTR_SPLIT_TABLES:
> +			if (*((int *)data))
> +				smmu_domain->attributes |=
> +					(1 << DOMAIN_ATTR_SPLIT_TABLES);
> +			break;
>   		default:
>   			ret = -ENODEV;
>   		}
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 4e21efb..71ecb08 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -490,8 +490,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
>   	if (!(iommu_prot & (IOMMU_READ | IOMMU_WRITE)))
>   		return 0;
>   
> -	if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias) ||
> -		    paddr >= (1ULL << data->iop.cfg.oas)))
> +	if (WARN_ON(paddr >= (1ULL << data->iop.cfg.oas)))
>   		return -ERANGE;
>   
>   	prot = arm_lpae_prot_to_pte(data, iommu_prot);
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 01/15] iommu/arm-smmu: Allow IOMMU enabled devices to skip DMA domains
  2019-05-21 17:43     ` Robin Murphy
  (?)
@ 2019-05-21 19:07       ` Jordan Crouse
  -1 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-21 19:07 UTC (permalink / raw)
  To: Robin Murphy
  Cc: freedreno, jean-philippe.brucker, linux-arm-msm, hoegsberg,
	dianders, linux-kernel, iommu, Will Deacon, Joerg Roedel,
	linux-arm-kernel

On Tue, May 21, 2019 at 06:43:34PM +0100, Robin Murphy wrote:
> On 21/05/2019 17:13, Jordan Crouse wrote:
> >Allow IOMMU enabled devices specified on an opt-in list to create a
> >default identity domain for a new IOMMU group and bypass the DMA
> >domain created by the IOMMU core. This allows the group to be properly
> >set up but otherwise skips touching the hardware until the client
> >device attaches an unmanaged domain of its own.
> 
> All the cool kids are using iommu_request_dm_for_dev() to force an identity
> domain for particular devices, won't that suffice for this case too? There
> is definite scope for improvement in this area, so I'd really like to keep
> things as consistent as possible to make that easier in future.

I initially rejected iommu_request_dm_for_dev() since it still allowed the DMA
domain to consume the context bank, but now that I look at it again, as long as
the domain free returns the context bank to the pool it might work. Let me give
it a shot and see if it does what we need.
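
For reference, the call I'll be trying is roughly this (where it lands
in the GPU probe path is still to be determined):

	/* ask the core to replace the default DMA domain with identity */
	ret = iommu_request_dm_for_dev(&pdev->dev);
	if (ret)
		dev_warn(&pdev->dev,
			 "identity domain request failed: %d\n", ret);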

Jordan

> >Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> >---
> >
> >  drivers/iommu/arm-smmu.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> >  drivers/iommu/iommu.c    | 29 +++++++++++++++++++++++------
> >  include/linux/iommu.h    |  3 +++
> >  3 files changed, 68 insertions(+), 6 deletions(-)
> >
> >diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> >index 5e54cc0..a795ada 100644
> >--- a/drivers/iommu/arm-smmu.c
> >+++ b/drivers/iommu/arm-smmu.c
> >@@ -1235,6 +1235,35 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain,
> >  	return 0;
> >  }
> >+struct arm_smmu_client_match_data {
> >+	bool use_identity_domain;
> >+};
> >+
> >+static const struct arm_smmu_client_match_data qcom_adreno = {
> >+	.use_identity_domain = true,
> >+};
> >+
> >+static const struct arm_smmu_client_match_data qcom_mdss = {
> >+	.use_identity_domain = true,
> >+};
> >+
> >+static const struct of_device_id arm_smmu_client_of_match[] = {
> >+	{ .compatible = "qcom,adreno", .data = &qcom_adreno },
> >+	{ .compatible = "qcom,mdp4", .data = &qcom_mdss },
> >+	{ .compatible = "qcom,mdss", .data = &qcom_mdss },
> >+	{ .compatible = "qcom,sdm845-mdss", .data = &qcom_mdss },
> >+	{},
> >+};
> >+
> >+static const struct arm_smmu_client_match_data *
> >+arm_smmu_client_data(struct device *dev)
> >+{
> >+	const struct of_device_id *match =
> >+		of_match_device(arm_smmu_client_of_match, dev);
> >+
> >+	return match ? match->data : NULL;
> >+}
> >+
> >  static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> >  {
> >  	int ret;
> >@@ -1552,6 +1581,7 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
> >  {
> >  	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> >  	struct arm_smmu_device *smmu = fwspec_smmu(fwspec);
> >+	const struct arm_smmu_client_match_data *client;
> >  	struct iommu_group *group = NULL;
> >  	int i, idx;
> >@@ -1573,6 +1603,18 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
> >  	else
> >  		group = generic_device_group(dev);
> >+	client = arm_smmu_client_data(dev);
> >+
> >+	/*
> >+	 * If the client chooses to bypass the DMA domain, create an identity
> >+	 * domain as a default placeholder. This will give the device a
> >+	 * default domain but skip DMA operations and not consume a context
> >+	 * bank
> >+	 */
> >+	if (client && client->use_identity_domain)
> >+		iommu_group_set_default_domain(group, dev,
> >+			IOMMU_DOMAIN_IDENTITY);
> >+
> >  	return group;
> >  }
> >diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> >index 67ee662..af3e1ed 100644
> >--- a/drivers/iommu/iommu.c
> >+++ b/drivers/iommu/iommu.c
> >@@ -1062,6 +1062,24 @@ struct iommu_group *fsl_mc_device_group(struct device *dev)
> >  	return group;
> >  }
> >+struct iommu_domain *iommu_group_set_default_domain(struct iommu_group *group,
> >+		struct device *dev, unsigned int type)
> >+{
> >+	struct iommu_domain *dom;
> >+
> >+	dom = __iommu_domain_alloc(dev->bus, type);
> >+	if (!dom)
> >+		return NULL;
> >+
> >+	/* FIXME: Error if the default domain is already set? */
> >+	group->default_domain = dom;
> >+	if (!group->domain)
> >+		group->domain = dom;
> >+
> >+	return dom;
> >+}
> >+EXPORT_SYMBOL_GPL(iommu_group_set_default_domain);
> >+
> >  /**
> >   * iommu_group_get_for_dev - Find or create the IOMMU group for a device
> >   * @dev: target device
> >@@ -1099,9 +1117,12 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
> >  	if (!group->default_domain) {
> >  		struct iommu_domain *dom;
> >-		dom = __iommu_domain_alloc(dev->bus, iommu_def_domain_type);
> >+		dom = iommu_group_set_default_domain(group, dev,
> >+			iommu_def_domain_type);
> >+
> >  		if (!dom && iommu_def_domain_type != IOMMU_DOMAIN_DMA) {
> >-			dom = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_DMA);
> >+			dom = iommu_group_set_default_domain(group, dev,
> >+				IOMMU_DOMAIN_DMA);
> >  			if (dom) {
> >  				dev_warn(dev,
> >  					 "failed to allocate default IOMMU domain of type %u; falling back to IOMMU_DOMAIN_DMA",
> >@@ -1109,10 +1130,6 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
> >  			}
> >  		}
> >-		group->default_domain = dom;
> >-		if (!group->domain)
> >-			group->domain = dom;
> >-
> >  		if (dom && !iommu_dma_strict) {
> >  			int attr = 1;
> >  			iommu_domain_set_attr(dom,
> >diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> >index a815cf6..4ef8bd5 100644
> >--- a/include/linux/iommu.h
> >+++ b/include/linux/iommu.h
> >@@ -394,6 +394,9 @@ extern int iommu_group_id(struct iommu_group *group);
> >  extern struct iommu_group *iommu_group_get_for_dev(struct device *dev);
> >  extern struct iommu_domain *iommu_group_default_domain(struct iommu_group *);
> >+struct iommu_domain *iommu_group_set_default_domain(struct iommu_group *group,
> >+		struct device *dev, unsigned int type);
> >+
> >  extern int iommu_domain_get_attr(struct iommu_domain *domain, enum iommu_attr,
> >  				 void *data);
> >  extern int iommu_domain_set_attr(struct iommu_domain *domain, enum iommu_attr,
> >

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 03/15] iommu/arm-smmu: Add split pagetable support for arm-smmu-v2
  2019-05-21 18:18     ` Robin Murphy
@ 2019-05-23 20:00       ` Jordan Crouse
  -1 siblings, 0 replies; 45+ messages in thread
From: Jordan Crouse @ 2019-05-23 20:00 UTC (permalink / raw)
  To: Robin Murphy
  Cc: freedreno, jean-philippe.brucker, linux-arm-msm, hoegsberg,
	dianders, linux-kernel, iommu, Will Deacon, Joerg Roedel,
	linux-arm-kernel

On Tue, May 21, 2019 at 07:18:32PM +0100, Robin Murphy wrote:
> On 21/05/2019 17:13, Jordan Crouse wrote:
> >Add support for a split pagetable (TTBR0/TTBR1) scheme for arm-smmu-v2.
> >If split pagetables are enabled, create a pagetable for TTBR1 and set
> >up the sign extension bit so that all IOVAs with that bit set are mapped
> >and translated from the TTBR1 pagetable.
> >
> >Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> >---
> >
> >  drivers/iommu/arm-smmu-regs.h  |  19 +++++
> >  drivers/iommu/arm-smmu.c       | 179 ++++++++++++++++++++++++++++++++++++++---
> >  drivers/iommu/io-pgtable-arm.c |   3 +-
> >  3 files changed, 186 insertions(+), 15 deletions(-)
> >
> >diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
> >index e9132a9..23f27c2 100644
> >--- a/drivers/iommu/arm-smmu-regs.h
> >+++ b/drivers/iommu/arm-smmu-regs.h
> >@@ -195,7 +195,26 @@ enum arm_smmu_s2cr_privcfg {
> >  #define RESUME_RETRY			(0 << 0)
> >  #define RESUME_TERMINATE		(1 << 0)
> >+#define TTBCR_EPD1			(1 << 23)
> >+#define TTBCR_T0SZ_SHIFT		0
> >+#define TTBCR_T1SZ_SHIFT		16
> >+#define TTBCR_IRGN1_SHIFT		24
> >+#define TTBCR_ORGN1_SHIFT		26
> >+#define TTBCR_RGN_WBWA			1
> >+#define TTBCR_SH1_SHIFT			28
> >+#define TTBCR_SH_IS			3
> >+
> >+#define TTBCR_TG1_16K			(1 << 30)
> >+#define TTBCR_TG1_4K			(2 << 30)
> >+#define TTBCR_TG1_64K			(3 << 30)
> >+
> >  #define TTBCR2_SEP_SHIFT		15
> >+#define TTBCR2_SEP_31			(0x0 << TTBCR2_SEP_SHIFT)
> >+#define TTBCR2_SEP_35			(0x1 << TTBCR2_SEP_SHIFT)
> >+#define TTBCR2_SEP_39			(0x2 << TTBCR2_SEP_SHIFT)
> >+#define TTBCR2_SEP_41			(0x3 << TTBCR2_SEP_SHIFT)
> >+#define TTBCR2_SEP_43			(0x4 << TTBCR2_SEP_SHIFT)
> >+#define TTBCR2_SEP_47			(0x5 << TTBCR2_SEP_SHIFT)
> >  #define TTBCR2_SEP_UPSTREAM		(0x7 << TTBCR2_SEP_SHIFT)
> >  #define TTBCR2_AS			(1 << 4)
> >diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> >index a795ada..e09c0e6 100644
> >--- a/drivers/iommu/arm-smmu.c
> >+++ b/drivers/iommu/arm-smmu.c
> >@@ -152,6 +152,7 @@ struct arm_smmu_cb {
> >  	u32				tcr[2];
> >  	u32				mair[2];
> >  	struct arm_smmu_cfg		*cfg;
> >+	unsigned long			split_table_mask;
> >  };
> >  struct arm_smmu_master_cfg {
> >@@ -253,13 +254,14 @@ enum arm_smmu_domain_stage {
> >  struct arm_smmu_domain {
> >  	struct arm_smmu_device		*smmu;
> >-	struct io_pgtable_ops		*pgtbl_ops;
> >+	struct io_pgtable_ops		*pgtbl_ops[2];
> 
> This seems a bit off - surely the primary domain and aux domain only ever
> need one set of tables each, but either way there's definitely unnecessary
> redundancy in having four sets of io_pgtable_ops between them.
> 
> >  	const struct iommu_gather_ops	*tlb_ops;
> >  	struct arm_smmu_cfg		cfg;
> >  	enum arm_smmu_domain_stage	stage;
> >  	bool				non_strict;
> >  	struct mutex			init_mutex; /* Protects smmu pointer */
> >  	spinlock_t			cb_lock; /* Serialises ATS1* ops and TLB syncs */
> >+	u32 attributes;
> >  	struct iommu_domain		domain;
> >  };
> >@@ -621,6 +623,85 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
> >  	return IRQ_HANDLED;
> >  }
> >+/* Adjust the context bank settings to support TTBR1 */
> >+static void arm_smmu_init_ttbr1(struct arm_smmu_domain *smmu_domain,
> >+		struct io_pgtable_cfg *pgtbl_cfg)
> >+{
> >+	struct arm_smmu_device *smmu = smmu_domain->smmu;
> >+	struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
> >+	struct arm_smmu_cb *cb = &smmu_domain->smmu->cbs[cfg->cbndx];
> >+	int pgsize = 1 << __ffs(pgtbl_cfg->pgsize_bitmap);
> >+
> >+	/* Enable speculative walks through the TTBR1 */
> >+	cb->tcr[0] &= ~TTBCR_EPD1;
> >+
> >+	cb->tcr[0] |= TTBCR_SH_IS << TTBCR_SH1_SHIFT;
> >+	cb->tcr[0] |= TTBCR_RGN_WBWA << TTBCR_IRGN1_SHIFT;
> >+	cb->tcr[0] |= TTBCR_RGN_WBWA << TTBCR_ORGN1_SHIFT;
> >+
> >+	switch (pgsize) {
> >+	case SZ_4K:
> >+		cb->tcr[0] |= TTBCR_TG1_4K;
> >+		break;
> >+	case SZ_16K:
> >+		cb->tcr[0] |= TTBCR_TG1_16K;
> >+		break;
> >+	case SZ_64K:
> >+		cb->tcr[0] |= TTBCR_TG1_64K;
> >+		break;
> >+	}
> >+
> >+	/*
> >+	 * Outside of the special 49 bit UBS case that has a dedicated sign
> >+	 * extension bit, setting the SEP for any other va_size will force us to
> >+	 * shrink the size of the T0/T1 regions by one bit to accommodate the
> >+	 * SEP
> >+	 */
> >+	if (smmu->va_size != 48) {
> >+		/* Replace the T0 size */
> >+		cb->tcr[0] &= ~(0x3f << TTBCR_T0SZ_SHIFT);
> >+		cb->tcr[0] |= (64ULL - smmu->va_size - 1) << TTBCR_T0SZ_SHIFT;
> >+		/* Set the T1 size */
> >+		cb->tcr[0] |= (64ULL - smmu->va_size - 1) << TTBCR_T1SZ_SHIFT;
> >+	} else {
> >+		/* Set the T1 size to the full available UBS */
> >+		cb->tcr[0] |= (64ULL - smmu->va_size) << TTBCR_T1SZ_SHIFT;
> >+	}
> >+
> >+	/* Clear the existing SEP configuration */
> >+	cb->tcr[1] &= ~TTBCR2_SEP_UPSTREAM;
> >+
> >+	/* Set up the sign extend bit */
> >+	switch (smmu->va_size) {
> >+	case 32:
> >+		cb->tcr[1] |= TTBCR2_SEP_31;
> >+		cb->split_table_mask = (1UL << 31);
> >+		break;
> >+	case 36:
> >+		cb->tcr[1] |= TTBCR2_SEP_35;
> >+		cb->split_table_mask = (1UL << 35);
> >+		break;
> >+	case 40:
> >+		cb->tcr[1] |= TTBCR2_SEP_39;
> >+		cb->split_table_mask = (1UL << 39);
> >+		break;
> >+	case 42:
> >+		cb->tcr[1] |= TTBCR2_SEP_41;
> >+		cb->split_table_mask = (1UL << 41);
> >+		break;
> >+	case 44:
> >+		cb->tcr[1] |= TTBCR2_SEP_43;
> >+		cb->split_table_mask = (1UL << 43);
> >+		break;
> >+	case 48:
> >+		cb->tcr[1] |= TTBCR2_SEP_UPSTREAM;
> >+		cb->split_table_mask = (1UL << 48);
> >+	}
> >+
> >+	cb->ttbr[1] = pgtbl_cfg->arm_lpae_s1_cfg.ttbr[0];
> 
> Assigning a "TTBR0" to a "TTBR1" is the point at which it becomes clear that
> we need to take a step back and reconsider. I think there was originally a
> half-formed idea that pagetables might go around in pairs, but things really
> aren't working out that way in practice, so it's almost certainly time to
> rework the io_pgtable_alloc() interface. We probably want to make "TTBR1" an
> up-front option for the appropriate formats, such that either way they
> return a single TTBR value plus a TCR with the appropriate half configured
> (hopefully in such a way that the caller can simply allocate one of each and
> merge the two TCRs together, so maybe responsibility for EPD* needs to
> move). That way we can also make *better* use of the IOVA sanity-checking in
> io-pgtable-arm, rather than just removing it (especially since this will
> open up a whole new class of "unmapping a TTBR0 address from the TTBR1
> domain" type bugs).

I'm kind of having trouble wrapping my brain around what an API like this would
look like, so please bear with me.

The current patch does three things in the arm-smmu driver: it creates a
secondary pagetable in the same manner as the first one (with the same
parameters), updates the context bank registers, and decides at map/unmap
time which pagetable ops pointer to use.
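
Concretely, the map/unmap decision in the current patch boils down to
something like the following (paraphrased, the helper name is made up):

static struct io_pgtable_ops *
arm_smmu_get_pgtbl_ops(struct arm_smmu_domain *smmu_domain,
		unsigned long iova)
{
	struct arm_smmu_cb *cb =
		&smmu_domain->smmu->cbs[smmu_domain->cfg.cbndx];

	/* IOVAs with the sign extension bit set belong to TTBR1 */
	if (iova & cb->split_table_mask)
		return smmu_domain->pgtbl_ops[1];

	return smmu_domain->pgtbl_ops[0];
}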

If I understand you correctly, you are saying that we should move the
register-specific details into the io_pgtable code and get rid of the
function quoted above. It also sounds like you may want to use separate
pagetable ops for mapping ttbr0 and ttbr1 to allow for better sanity
checking.

I'm still not quite sure how the pagetable allocation would work in this
case. The biggest downside is that we may need to adjust T0SZ in a split
table situation to account for the SEP, and both T0SZ and T1SZ live in
tcr[0]. If we stick with individual allocation functions, struct
io_pgtable_cfg would have to be an accumulator of sorts, passed first for
TTBR0 and then again for TTBR1, which may modify the value of tcr[0] (or we
would OR the two values together and hope they don't conflict).

To me this kind of defeats the purpose of calling the allocator twice. It
feels cleaner to call the allocator once with the advance knowledge that we
are going to use TTBR1 and then return both pagetable addresses in .ttbr[0]
and .ttbr[1] as above. We could make a new format type that incorporates
TTBR1 and then we wouldn't have to change struct io_pgtable_cfg or the API
call.
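
As a strawman, with ARM_64_LPAE_SPLIT_S1 being a made-up name for the
hypothetical new format, it might look like:

	struct io_pgtable_ops *ops;

	/* One allocation that knows up front it is building a split table */
	ops = alloc_io_pgtable_ops(ARM_64_LPAE_SPLIT_S1, &pgtbl_cfg,
			smmu_domain);
	if (!ops)
		return -ENOMEM;

	/* Both pagetable addresses come back through the same cfg */
	cb->ttbr[0] = pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0];
	cb->ttbr[1] = pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1];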

This would also have the advantage of moving the mask check out of the
arm-smmu map/unmap functions and into special TTBR1 ops in the io-pgtable
code that could pick the right pagetable to write to and do the appropriate
sanity checks.
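
To illustrate (again hypothetical, with the split mask moved into the
io-pgtable private data and all names made up), the TTBR1 flavor of the
ops could reject TTBR0 addresses before handing off to the normal map
path:

static int arm_lpae_split_map(struct io_pgtable_ops *ops,
		unsigned long iova, phys_addr_t paddr, size_t size,
		int iommu_prot)
{
	struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);

	/* This IOVA is in the TTBR0 half, refuse to map it here */
	if (WARN_ON(!(iova & data->split_table_mask)))
		return -ERANGE;

	/* Assumes the generic ias bound check learns about the split */
	return arm_lpae_map(ops, iova, paddr, size, iommu_prot);
}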

I'm kind of shooting from the hip here, so feel free to let me know if I'm
being silly. I really want to get this moving forward, so any reasonable
ideas will be welcome.

> Robin.

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


Thread overview:
2019-05-21 16:13 [PATCH v2 00/15] drm/msm: Per-instance pagetable support Jordan Crouse
2019-05-21 16:13 ` [PATCH v2 01/15] iommu/arm-smmu: Allow IOMMU enabled devices to skip DMA domains Jordan Crouse
2019-05-21 17:43   ` Robin Murphy
2019-05-21 19:07     ` Jordan Crouse
2019-05-21 16:13 ` [PATCH v2 02/15] iommu: Add DOMAIN_ATTR_SPLIT_TABLES Jordan Crouse
2019-05-21 16:13 ` [PATCH v2 03/15] iommu/arm-smmu: Add split pagetable support for arm-smmu-v2 Jordan Crouse
2019-05-21 18:18   ` Robin Murphy
2019-05-23 20:00     ` Jordan Crouse
2019-05-21 16:13 ` [PATCH v2 04/15] iommu: Add DOMAIN_ATTR_PTBASE Jordan Crouse
2019-05-21 16:13 ` [PATCH v2 05/15] iommu/arm-smmu: Add auxiliary domain support for arm-smmuv2 Jordan Crouse
2019-05-21 16:13 ` [PATCH v2 06/15] drm/msm/adreno: Enable 64 bit mode by default on a5xx and a6xx targets Jordan Crouse
2019-05-21 16:13 ` [PATCH v2 07/15] drm/msm: Print all 64 bits of the faulting IOMMU address Jordan Crouse
2019-05-21 16:13 ` [PATCH v2 08/15] drm/msm: Pass the MMU domain index in struct msm_file_private Jordan Crouse
2019-05-21 16:13 ` [PATCH v2 09/15] drm/msm/gpu: Move address space setup to the GPU targets Jordan Crouse
2019-05-21 16:13 ` [PATCH v2 10/15] drm/msm: Add a helper function for a per-instance address space Jordan Crouse
2019-05-21 16:13 ` [PATCH v2 11/15] drm/msm/gpu: Add ttbr0 to the memptrs Jordan Crouse
2019-05-21 16:14 ` [PATCH v2 12/15] drm/msm: Add support to create target specific address spaces Jordan Crouse
2019-05-21 16:14 ` [PATCH v2 13/15] drm/msm: Add support for IOMMU auxiliary domains Jordan Crouse
2019-05-21 16:14 ` [PATCH v2 14/15] drm/msm/a6xx: Support per-instance pagetables Jordan Crouse
2019-05-21 16:14 ` [PATCH v2 15/15] drm/msm/a5xx: Support per-instance pagetables Jordan Crouse