* [PATCH v8 0/7] iommu/arm-smmu: Enable split pagetable support
@ 2020-06-11 22:21 Jordan Crouse
2020-06-11 22:21 ` [PATCH v8 6/7] drm/msm: Set the global virtual address range from the IOMMU domain Jordan Crouse
0 siblings, 1 reply; 3+ messages in thread
From: Jordan Crouse @ 2020-06-11 22:21 UTC (permalink / raw)
To: linux-arm-msm
Cc: Sean Paul, linux-arm-kernel, Will Deacon, Jeffrey Hugo,
David Airlie, Robin Murphy, Joerg Roedel, linux-kernel,
Douglas Anderson, Rob Herring, Bjorn Andersson, iommu,
Andy Gross, dri-devel, Thomas Gleixner, freedreno, devicetree,
Brian Masney
Another iteration of the split-pagetable support for arm-smmu and the Adreno GPU
SMMU. After email discussions [1] we opted to make a arm-smmu implementation for
specifically for the Adreno GPU and use that to enable split pagetable support
and later other implementation specific bits that we need.
On the hardware side this is very close to the same code from before [2] only
the TTBR1 quirk is turned on by the implementation and not a domain attribute.
In drm/msm we use the returned size of the aperture as a clue to let us know
which virtual address space we should use for global memory objects.
There are two open items that you should be aware of. First, in the
implementation specific code we have to check the compatible string of the
device so that we only enable TTBR1 for the GPU (SID 0) and not the GMU (SID 4).
I went back and forth trying to decide if I wanted to use the compatbile string
or the SID as the filter and settled on the compatible string but I could be
talked out of it.
The other open item is that in drm/msm the hardware only uses 49 bits of the
address space but arm-smmu expects the address to be sign extended all the way
to 64 bits. This isn't a problem normally unless you look at the hardware
registers that contain a IOVA and then the upper bits will be zero. I opted to
restrict the internal drm/msm IOVA range to only 49 bits and then sign extend
right before calling iommu_map / iommu_unmap. This is a bit wonky but I thought
that matching the hardware would be less confusing when debugging a hang.
v8: Pass the attached device in the smmu_domain to the implementation
specific functions
[1] https://lists.linuxfoundation.org/pipermail/iommu/2020-May/044537.html
[2] https://patchwork.kernel.org/patch/11482591/
Jordan Crouse (7):
iommu/arm-smmu: Pass io-pgtable config to implementation specific
function
iommu/arm-smmu: Add support for split pagetables
dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU
iommu/arm-smmu: Add a pointer to the attached device to smmu_domain
iommu/arm-smmu: Add implementation for the adreno GPU SMMU
drm/msm: Set the global virtual address range from the IOMMU domain
arm: dts: qcom: sm845: Set the compatible string for the GPU SMMU
.../devicetree/bindings/iommu/arm,smmu.yaml | 4 ++
arch/arm64/boot/dts/qcom/sdm845.dtsi | 2 +-
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 13 +++++-
drivers/gpu/drm/msm/msm_iommu.c | 7 +++
drivers/iommu/arm-smmu-impl.c | 6 ++-
drivers/iommu/arm-smmu-qcom.c | 45 ++++++++++++++++++-
drivers/iommu/arm-smmu.c | 33 +++++++++-----
drivers/iommu/arm-smmu.h | 30 ++++++++++---
8 files changed, 117 insertions(+), 23 deletions(-)
--
2.17.1
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH v8 6/7] drm/msm: Set the global virtual address range from the IOMMU domain
2020-06-11 22:21 [PATCH v8 0/7] iommu/arm-smmu: Enable split pagetable support Jordan Crouse
@ 2020-06-11 22:21 ` Jordan Crouse
2020-06-12 4:49 ` kernel test robot
0 siblings, 1 reply; 3+ messages in thread
From: Jordan Crouse @ 2020-06-11 22:21 UTC (permalink / raw)
To: linux-arm-msm
Cc: Jeffrey Hugo, David Airlie, Sean Paul, Douglas Anderson,
dri-devel, Bjorn Andersson, iommu, Thomas Gleixner, freedreno,
linux-kernel, Brian Masney
Use the aperture settings from the IOMMU domain to set up the virtual
address range for the GPU. This allows us to transparently deal with
IOMMU side features (like split pagetables).
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 13 +++++++++++--
drivers/gpu/drm/msm/msm_iommu.c | 7 +++++++
2 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 89673c7ed473..3e717c1ebb7f 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -192,9 +192,18 @@ adreno_iommu_create_address_space(struct msm_gpu *gpu,
struct iommu_domain *iommu = iommu_domain_alloc(&platform_bus_type);
struct msm_mmu *mmu = msm_iommu_new(&pdev->dev, iommu);
struct msm_gem_address_space *aspace;
+ u64 start, size;
- aspace = msm_gem_address_space_create(mmu, "gpu", SZ_16M,
- 0xfffffff);
+ /*
+ * Use the aperture start or SZ_16M, whichever is greater. This will
+ * ensure that we align with the allocated pagetable range while still
+ * allowing room in the lower 32 bits for GMEM and whatnot
+ */
+ start = max_t(u64, SZ_16M, iommu->geometry.aperture_start);
+ size = iommu->geometry.aperture_end - start + 1;
+
+ aspace = msm_gem_address_space_create(mmu, "gpu",
+ start & GENMASK(48, 0), size);
if (IS_ERR(aspace) && !IS_ERR(mmu))
mmu->funcs->destroy(mmu);
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 3a381a9674c9..bbe129867590 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -36,6 +36,10 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
struct msm_iommu *iommu = to_msm_iommu(mmu);
size_t ret;
+ /* The arm-smmu driver expects the addresses to be sign extended */
+ if (iova & BIT(48))
+ iova |= GENMASK(63, 49);
+
ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, prot);
WARN_ON(!ret);
@@ -46,6 +50,9 @@ static int msm_iommu_unmap(struct msm_mmu *mmu, uint64_t iova, size_t len)
{
struct msm_iommu *iommu = to_msm_iommu(mmu);
+ if (iova & BIT(48))
+ iova |= GENMASK(63, 49);
+
iommu_unmap(iommu->domain, iova, len);
return 0;
--
2.17.1
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH v8 6/7] drm/msm: Set the global virtual address range from the IOMMU domain
2020-06-11 22:21 ` [PATCH v8 6/7] drm/msm: Set the global virtual address range from the IOMMU domain Jordan Crouse
@ 2020-06-12 4:49 ` kernel test robot
0 siblings, 0 replies; 3+ messages in thread
From: kernel test robot @ 2020-06-12 4:49 UTC (permalink / raw)
To: Jordan Crouse, linux-arm-msm
Cc: kbuild-all, Jeffrey Hugo, David Airlie, freedreno,
Douglas Anderson, dri-devel, Bjorn Andersson, iommu,
Thomas Gleixner, Sean Paul
[-- Attachment #1: Type: text/plain, Size: 12326 bytes --]
Hi Jordan,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on linus/master]
[also build test WARNING on next-20200611]
[cannot apply to iommu/next robh/for-next arm/for-next keystone/next rockchip/for-next arm64/for-next/core shawnguo/for-next soc/for-next v5.7]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Jordan-Crouse/iommu-arm-smmu-Enable-split-pagetable-support/20200612-094718
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git b961f8dc8976c091180839f4483d67b7c2ca2578
config: arm-defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>, old ones prefixed by <<):
In file included from include/linux/bitops.h:5,
from include/linux/kernel.h:12,
from include/linux/ascii85.h:11,
from drivers/gpu/drm/msm/adreno/adreno_gpu.c:9:
drivers/gpu/drm/msm/adreno/adreno_gpu.c: In function 'adreno_iommu_create_address_space':
include/linux/bits.h:37:11: warning: right shift count is negative [-Wshift-count-negative]
37 | (~UL(0) >> (BITS_PER_LONG - 1 - (h))))
| ^~
include/linux/bits.h:39:31: note: in expansion of macro '__GENMASK'
39 | (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
| ^~~~~~~~~
>> drivers/gpu/drm/msm/adreno/adreno_gpu.c:206:11: note: in expansion of macro 'GENMASK'
206 | start & GENMASK(48, 0), size);
| ^~~~~~~
--
In file included from include/linux/bits.h:6,
from include/linux/bitops.h:5,
from include/linux/kernel.h:12,
from drivers/gpu/drm/msm/msm_drv.h:11,
from drivers/gpu/drm/msm/msm_iommu.c:7:
drivers/gpu/drm/msm/msm_iommu.c: In function 'msm_iommu_map':
>> include/vdso/bits.h:7:26: warning: left shift count >= width of type [-Wshift-count-overflow]
7 | #define BIT(nr) (UL(1) << (nr))
| ^~
>> drivers/gpu/drm/msm/msm_iommu.c:40:13: note: in expansion of macro 'BIT'
40 | if (iova & BIT(48))
| ^~~
In file included from include/linux/bitops.h:5,
from include/linux/kernel.h:12,
from drivers/gpu/drm/msm/msm_drv.h:11,
from drivers/gpu/drm/msm/msm_iommu.c:7:
>> include/linux/bits.h:36:22: warning: left shift count >= width of type [-Wshift-count-overflow]
36 | (((~UL(0)) - (UL(1) << (l)) + 1) & | ^~
include/linux/bits.h:39:31: note: in expansion of macro '__GENMASK'
39 | (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
| ^~~~~~~~~
>> drivers/gpu/drm/msm/msm_iommu.c:41:11: note: in expansion of macro 'GENMASK'
41 | iova |= GENMASK(63, 49);
| ^~~~~~~
include/linux/bits.h:37:11: warning: right shift count is negative [-Wshift-count-negative]
37 | (~UL(0) >> (BITS_PER_LONG - 1 - (h))))
| ^~
include/linux/bits.h:39:31: note: in expansion of macro '__GENMASK'
39 | (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
| ^~~~~~~~~
>> drivers/gpu/drm/msm/msm_iommu.c:41:11: note: in expansion of macro 'GENMASK'
41 | iova |= GENMASK(63, 49);
| ^~~~~~~
In file included from include/linux/bits.h:6,
from include/linux/bitops.h:5,
from include/linux/kernel.h:12,
from drivers/gpu/drm/msm/msm_drv.h:11,
from drivers/gpu/drm/msm/msm_iommu.c:7:
drivers/gpu/drm/msm/msm_iommu.c: In function 'msm_iommu_unmap':
>> include/vdso/bits.h:7:26: warning: left shift count >= width of type [-Wshift-count-overflow]
7 | #define BIT(nr) (UL(1) << (nr))
| ^~
drivers/gpu/drm/msm/msm_iommu.c:53:13: note: in expansion of macro 'BIT'
53 | if (iova & BIT(48))
| ^~~
In file included from include/linux/bitops.h:5,
from include/linux/kernel.h:12,
from drivers/gpu/drm/msm/msm_drv.h:11,
from drivers/gpu/drm/msm/msm_iommu.c:7:
>> include/linux/bits.h:36:22: warning: left shift count >= width of type [-Wshift-count-overflow]
36 | (((~UL(0)) - (UL(1) << (l)) + 1) & | ^~
include/linux/bits.h:39:31: note: in expansion of macro '__GENMASK'
39 | (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
| ^~~~~~~~~
drivers/gpu/drm/msm/msm_iommu.c:54:11: note: in expansion of macro 'GENMASK'
54 | iova |= GENMASK(63, 49);
| ^~~~~~~
include/linux/bits.h:37:11: warning: right shift count is negative [-Wshift-count-negative]
37 | (~UL(0) >> (BITS_PER_LONG - 1 - (h))))
| ^~
include/linux/bits.h:39:31: note: in expansion of macro '__GENMASK'
39 | (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
| ^~~~~~~~~
drivers/gpu/drm/msm/msm_iommu.c:54:11: note: in expansion of macro 'GENMASK'
54 | iova |= GENMASK(63, 49);
| ^~~~~~~
vim +/GENMASK +206 drivers/gpu/drm/msm/adreno/adreno_gpu.c
> 9 #include <linux/ascii85.h>
10 #include <linux/interconnect.h>
11 #include <linux/qcom_scm.h>
12 #include <linux/kernel.h>
13 #include <linux/of_address.h>
14 #include <linux/pm_opp.h>
15 #include <linux/slab.h>
16 #include <linux/soc/qcom/mdt_loader.h>
17 #include <soc/qcom/ocmem.h>
18 #include "adreno_gpu.h"
19 #include "msm_gem.h"
20 #include "msm_mmu.h"
21
22 static bool zap_available = true;
23
24 static int zap_shader_load_mdt(struct msm_gpu *gpu, const char *fwname,
25 u32 pasid)
26 {
27 struct device *dev = &gpu->pdev->dev;
28 const struct firmware *fw;
29 const char *signed_fwname = NULL;
30 struct device_node *np, *mem_np;
31 struct resource r;
32 phys_addr_t mem_phys;
33 ssize_t mem_size;
34 void *mem_region = NULL;
35 int ret;
36
37 if (!IS_ENABLED(CONFIG_ARCH_QCOM)) {
38 zap_available = false;
39 return -EINVAL;
40 }
41
42 np = of_get_child_by_name(dev->of_node, "zap-shader");
43 if (!np) {
44 zap_available = false;
45 return -ENODEV;
46 }
47
48 mem_np = of_parse_phandle(np, "memory-region", 0);
49 of_node_put(np);
50 if (!mem_np) {
51 zap_available = false;
52 return -EINVAL;
53 }
54
55 ret = of_address_to_resource(mem_np, 0, &r);
56 of_node_put(mem_np);
57 if (ret)
58 return ret;
59
60 mem_phys = r.start;
61
62 /*
63 * Check for a firmware-name property. This is the new scheme
64 * to handle firmware that may be signed with device specific
65 * keys, allowing us to have a different zap fw path for different
66 * devices.
67 *
68 * If the firmware-name property is found, we bypass the
69 * adreno_request_fw() mechanism, because we don't need to handle
70 * the /lib/firmware/qcom/... vs /lib/firmware/... case.
71 *
72 * If the firmware-name property is not found, for backwards
73 * compatibility we fall back to the fwname from the gpulist
74 * table.
75 */
76 of_property_read_string_index(np, "firmware-name", 0, &signed_fwname);
77 if (signed_fwname) {
78 fwname = signed_fwname;
79 ret = request_firmware_direct(&fw, fwname, gpu->dev->dev);
80 if (ret)
81 fw = ERR_PTR(ret);
82 } else if (fwname) {
83 /* Request the MDT file from the default location: */
84 fw = adreno_request_fw(to_adreno_gpu(gpu), fwname);
85 } else {
86 /*
87 * For new targets, we require the firmware-name property,
88 * if a zap-shader is required, rather than falling back
89 * to a firmware name specified in gpulist.
90 *
91 * Because the firmware is signed with a (potentially)
92 * device specific key, having the name come from gpulist
93 * was a bad idea, and is only provided for backwards
94 * compatibility for older targets.
95 */
96 return -ENODEV;
97 }
98
99 if (IS_ERR(fw)) {
100 DRM_DEV_ERROR(dev, "Unable to load %s\n", fwname);
101 return PTR_ERR(fw);
102 }
103
104 /* Figure out how much memory we need */
105 mem_size = qcom_mdt_get_size(fw);
106 if (mem_size < 0) {
107 ret = mem_size;
108 goto out;
109 }
110
111 if (mem_size > resource_size(&r)) {
112 DRM_DEV_ERROR(dev,
113 "memory region is too small to load the MDT\n");
114 ret = -E2BIG;
115 goto out;
116 }
117
118 /* Allocate memory for the firmware image */
119 mem_region = memremap(mem_phys, mem_size, MEMREMAP_WC);
120 if (!mem_region) {
121 ret = -ENOMEM;
122 goto out;
123 }
124
125 /*
126 * Load the rest of the MDT
127 *
128 * Note that we could be dealing with two different paths, since
129 * with upstream linux-firmware it would be in a qcom/ subdir..
130 * adreno_request_fw() handles this, but qcom_mdt_load() does
131 * not. But since we've already gotten through adreno_request_fw()
132 * we know which of the two cases it is:
133 */
134 if (signed_fwname || (to_adreno_gpu(gpu)->fwloc == FW_LOCATION_LEGACY)) {
135 ret = qcom_mdt_load(dev, fw, fwname, pasid,
136 mem_region, mem_phys, mem_size, NULL);
137 } else {
138 char *newname;
139
140 newname = kasprintf(GFP_KERNEL, "qcom/%s", fwname);
141
142 ret = qcom_mdt_load(dev, fw, newname, pasid,
143 mem_region, mem_phys, mem_size, NULL);
144 kfree(newname);
145 }
146 if (ret)
147 goto out;
148
149 /* Send the image to the secure world */
150 ret = qcom_scm_pas_auth_and_reset(pasid);
151
152 /*
153 * If the scm call returns -EOPNOTSUPP we assume that this target
154 * doesn't need/support the zap shader so quietly fail
155 */
156 if (ret == -EOPNOTSUPP)
157 zap_available = false;
158 else if (ret)
159 DRM_DEV_ERROR(dev, "Unable to authorize the image\n");
160
161 out:
162 if (mem_region)
163 memunmap(mem_region);
164
165 release_firmware(fw);
166
167 return ret;
168 }
169
170 int adreno_zap_shader_load(struct msm_gpu *gpu, u32 pasid)
171 {
172 struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
173 struct platform_device *pdev = gpu->pdev;
174
175 /* Short cut if we determine the zap shader isn't available/needed */
176 if (!zap_available)
177 return -ENODEV;
178
179 /* We need SCM to be able to load the firmware */
180 if (!qcom_scm_is_available()) {
181 DRM_DEV_ERROR(&pdev->dev, "SCM is not available\n");
182 return -EPROBE_DEFER;
183 }
184
185 return zap_shader_load_mdt(gpu, adreno_gpu->info->zapfw, pasid);
186 }
187
188 struct msm_gem_address_space *
189 adreno_iommu_create_address_space(struct msm_gpu *gpu,
190 struct platform_device *pdev)
191 {
192 struct iommu_domain *iommu = iommu_domain_alloc(&platform_bus_type);
193 struct msm_mmu *mmu = msm_iommu_new(&pdev->dev, iommu);
194 struct msm_gem_address_space *aspace;
195 u64 start, size;
196
197 /*
198 * Use the aperture start or SZ_16M, whichever is greater. This will
199 * ensure that we align with the allocated pagetable range while still
200 * allowing room in the lower 32 bits for GMEM and whatnot
201 */
202 start = max_t(u64, SZ_16M, iommu->geometry.aperture_start);
203 size = iommu->geometry.aperture_end - start + 1;
204
205 aspace = msm_gem_address_space_create(mmu, "gpu",
> 206 start & GENMASK(48, 0), size);
207
208 if (IS_ERR(aspace) && !IS_ERR(mmu))
209 mmu->funcs->destroy(mmu);
210
211 return aspace;
212 }
213
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 52430 bytes --]
[-- Attachment #3: Type: text/plain, Size: 160 bytes --]
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2020-06-12 4:49 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-11 22:21 [PATCH v8 0/7] iommu/arm-smmu: Enable split pagetable support Jordan Crouse
2020-06-11 22:21 ` [PATCH v8 6/7] drm/msm: Set the global virtual address range from the IOMMU domain Jordan Crouse
2020-06-12 4:49 ` kernel test robot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).