linux-arm-msm.vger.kernel.org archive mirror
* [PATCH 0/6] drm/msm: Support a750 "software fuse" for raytracing
@ 2024-04-25 13:43 Connor Abbott
  2024-04-25 13:43 ` [PATCH 1/6] arm64: dts: qcom: sm8650: Fix GPU cx_mem size Connor Abbott
                   ` (6 more replies)
  0 siblings, 7 replies; 23+ messages in thread
From: Connor Abbott @ 2024-04-25 13:43 UTC (permalink / raw)
  To: Rob Clark, Abhinav Kumar, Dmitry Baryshkov, Sean Paul,
	Marijn Suijten, linux-arm-msm, freedreno
  Cc: Connor Abbott

On a750, Qualcomm decided to gate support for certain features behind a
"software fuse." This consists of a register in the cx_mem zone, which
is normally only writeable by the TrustZone firmware.  On bootup it is
0, and we must call an SCM method to initialize it. Then we communicate
its value to userspace. This implements all of this, copying the SCM
call from the downstream kernel and kgsl.
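
As a rough illustration of what "communicating its value" involves, the
sketch below decodes a raw fuse value into feature flags. The bit positions
follow the CX_MISC_SW_FUSE_VALUE bitfields added in patch 3; the struct and
function names are hypothetical and are not part of the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions follow the CX_MISC_SW_FUSE_VALUE bitfields from patch 3. */
#define SW_FUSE_FASTBLEND  (UINT32_C(1) << 0)
#define SW_FUSE_LPAC       (UINT32_C(1) << 1)
#define SW_FUSE_RAYTRACING (UINT32_C(1) << 2)

/* Hypothetical container for illustration; not a driver structure. */
struct sw_fuse_features {
	bool fastblend;
	bool lpac;
	bool raytracing;
};

/* Decode a raw fuse value, e.g. as read back after the SCM call. */
static struct sw_fuse_features sw_fuse_decode(uint32_t fuse_val)
{
	struct sw_fuse_features f = {
		.fastblend  = (fuse_val & SW_FUSE_FASTBLEND) != 0,
		.lpac       = (fuse_val & SW_FUSE_LPAC) != 0,
		.raytracing = (fuse_val & SW_FUSE_RAYTRACING) != 0,
	};

	return f;
}
```

A value of 0, which is what the register holds at bootup, decodes to all
features off.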

So far the only optional feature we use is ray tracing (i.e. the
"ray_intersection" instruction) in a pending Mesa MR [1], so that's what
we expose to userspace. There's one extra patch to write some missing
registers, which depends on the register XML bump but is otherwise
unrelated; I included it here to make things easier on myself.

The drm/msm part of this series depends on [2] to avoid conflicts.

[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28447
[2] https://lore.kernel.org/all/20240324095222.ldnwumjkxk6uymmc@hu-akhilpo-hyd.qualcomm.com/T/

Connor Abbott (6):
  arm64: dts: qcom: sm8650: Fix GPU cx_mem size
  firmware: qcom_scm: Add gpu_init_regs call
  drm/msm: Update a6xx registers
  drm/msm/a7xx: Initialize a750 "software fuse"
  drm/msm: Add MSM_PARAM_RAYTRACING uapi
  drm/msm/a7xx: Add missing register writes from downstream

 arch/arm64/boot/dts/qcom/sm8650.dtsi          |  2 +-
 drivers/firmware/qcom/qcom_scm.c              | 14 +++
 drivers/firmware/qcom/qcom_scm.h              |  3 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c         | 97 ++++++++++++++++++-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c       |  3 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h       |  2 +
 drivers/gpu/drm/msm/registers/adreno/a6xx.xml | 28 +++++-
 include/linux/firmware/qcom/qcom_scm.h        | 23 +++++
 include/uapi/drm/msm_drm.h                    |  1 +
 9 files changed, 168 insertions(+), 5 deletions(-)

-- 
2.31.1



* [PATCH 1/6] arm64: dts: qcom: sm8650: Fix GPU cx_mem size
  2024-04-25 13:43 [PATCH 0/6] drm/msm: Support a750 "software fuse" for raytracing Connor Abbott
@ 2024-04-25 13:43 ` Connor Abbott
  2024-04-25 13:43 ` [PATCH 3/6] drm/msm: Update a6xx registers Connor Abbott
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 23+ messages in thread
From: Connor Abbott @ 2024-04-25 13:43 UTC (permalink / raw)
  To: Bjorn Andersson, Konrad Dybcio, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Neil Armstrong, linux-arm-msm, devicetree
  Cc: Connor Abbott

The cx_mem region is twice as large as on previous GPUs. We can't access
the new SW_FUSE_VALUE register without this.
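
The arithmetic behind this can be sketched as follows. It assumes that the
offsets in the register XML count 32-bit words (the driver's accessors
shift the offset left by 2 before adding it to the mapping), so the 0x0400
offset of CX_MISC_SW_FUSE_VALUE lands at byte 0x1000. The helper names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Dword offset of CX_MISC_SW_FUSE_VALUE from the register XML in patch 3. */
#define CX_MISC_SW_FUSE_VALUE_DWORD 0x0400u

/* Assumption: XML offsets count 32-bit words, so bytes = dwords * 4. */
static uint32_t reg_byte_offset(uint32_t dword_offset)
{
	return dword_offset << 2;
}

/* True if a 4-byte register at dword_offset lies inside region_size bytes. */
static bool reg_fits(uint32_t dword_offset, uint32_t region_size)
{
	uint32_t start = reg_byte_offset(dword_offset);

	return start < region_size && region_size - start >= 4;
}
```

With the old 0x1000 window, reg_fits(CX_MISC_SW_FUSE_VALUE_DWORD, 0x1000)
is false, because the register starts exactly at the end of the mapping;
with 0x2000 it is true.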

Fixes: db33633b05c0 ("arm64: dts: qcom: sm8650: add GPU nodes")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
---
 arch/arm64/boot/dts/qcom/sm8650.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8650.dtsi b/arch/arm64/boot/dts/qcom/sm8650.dtsi
index 658ad2b41c5a..78b8944eaab2 100644
--- a/arch/arm64/boot/dts/qcom/sm8650.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8650.dtsi
@@ -2607,7 +2607,7 @@ tcsr: clock-controller@1fc0000 {
 		gpu: gpu@3d00000 {
 			compatible = "qcom,adreno-43051401", "qcom,adreno";
 			reg = <0x0 0x03d00000 0x0 0x40000>,
-			      <0x0 0x03d9e000 0x0 0x1000>,
+			      <0x0 0x03d9e000 0x0 0x2000>,
 			      <0x0 0x03d61000 0x0 0x800>;
 			reg-names = "kgsl_3d0_reg_memory",
 				    "cx_mem",
-- 
2.31.1



* [PATCH 3/6] drm/msm: Update a6xx registers
  2024-04-25 13:43 [PATCH 0/6] drm/msm: Support a750 "software fuse" for raytracing Connor Abbott
  2024-04-25 13:43 ` [PATCH 1/6] arm64: dts: qcom: sm8650: Fix GPU cx_mem size Connor Abbott
@ 2024-04-25 13:43 ` Connor Abbott
  2024-04-25 13:43 ` [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse" Connor Abbott
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 23+ messages in thread
From: Connor Abbott @ 2024-04-25 13:43 UTC (permalink / raw)
  To: Rob Clark, Abhinav Kumar, Dmitry Baryshkov, Sean Paul,
	Marijn Suijten, linux-arm-msm, freedreno
  Cc: Connor Abbott

Update to mesa commit ff155f46a33 ("freedreno/a7xx: Register updates
from kgsl").

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
---
 drivers/gpu/drm/msm/registers/adreno/a6xx.xml | 28 +++++++++++++++++--
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/registers/adreno/a6xx.xml b/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
index 78524aaab9d4..43fe90c12679 100644
--- a/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
+++ b/drivers/gpu/drm/msm/registers/adreno/a6xx.xml
@@ -1227,6 +1227,7 @@ to upconvert to 32b float internally?
 		<bitfield name="DEBBUS_INTR_0" pos="26" type="boolean"/>
 		<bitfield name="DEBBUS_INTR_1" pos="27" type="boolean"/>
 		<bitfield name="TSBWRITEERROR" pos="28" type="boolean" variants="A7XX-"/>
+		<bitfield name="SWFUSEVIOLATION" pos="29" type="boolean" variants="A7XX-"/>
 		<bitfield name="ISDB_CPU_IRQ" pos="30" type="boolean"/>
 		<bitfield name="ISDB_UNDER_DEBUG" pos="31" type="boolean"/>
 	</bitset>
@@ -1503,6 +1504,9 @@ to upconvert to 32b float internally?
 	<reg32 offset="0x0287" name="RBBM_CLOCK_MODE_BV_VFD" variants="A7XX-"/>
 	<reg32 offset="0x0288" name="RBBM_CLOCK_MODE_BV_GPC" variants="A7XX-"/>
 
+	<reg32 offset="0x02c0" name="RBBM_SW_FUSE_INT_STATUS" variants="A7XX-"/>
+	<reg32 offset="0x02c1" name="RBBM_SW_FUSE_INT_MASK" variants="A7XX-"/>
+
 	<array offset="0x0400" name="RBBM_PERFCTR_CP" stride="2" length="14" variants="A6XX"/>
 	<array offset="0x041c" name="RBBM_PERFCTR_RBBM" stride="2" length="4" variants="A6XX"/>
 	<array offset="0x0424" name="RBBM_PERFCTR_PC" stride="2" length="8" variants="A6XX"/>
@@ -2842,7 +2846,11 @@ to upconvert to 32b float internally?
 		</reg32>
 	</array>
 	<!-- 0x891b-0x8926 invalid -->
-	<reg64 offset="0x8927" name="RB_SAMPLE_COUNT_ADDR" type="waddress" align="16" usage="cmd" variants="A6XX"/>
+	<doc>
+		RB_SAMPLE_COUNT_ADDR register is used up to (and including) a730. After that
+		the address is specified through CP_EVENT_WRITE7::WRITE_SAMPLE_COUNT.
+	</doc>
+	<reg64 offset="0x8927" name="RB_SAMPLE_COUNT_ADDR" type="waddress" align="16" usage="cmd"/>
 	<!-- 0x8929-0x89ff invalid -->
 
 	<!-- TODO: there are some registers in the 0x8a00-0x8bff range -->
@@ -2950,7 +2958,7 @@ to upconvert to 32b float internally?
 	<!-- 0x8e1d-0x8e1f invalid -->
 	<!-- 0x8e20-0x8e25 more perfcntr sel? -->
 	<!-- 0x8e26-0x8e27 invalid -->
-	<reg32 offset="0x8e28" name="RB_UNKNOWN_8E28" low="0" high="10"/>
+	<reg32 offset="0x8e28" name="RB_CMP_DBG_ECO_CNTL"/>
 	<!-- 0x8e29-0x8e2b invalid -->
 	<array offset="0x8e2c" name="RB_PERFCTR_CMP_SEL" stride="1" length="4"/>
 	<array offset="0x8e30" name="RB_PERFCTR_UFC_SEL" stride="1" length="6" variants="A7XX-"/>
@@ -3306,6 +3314,15 @@ to upconvert to 32b float internally?
 		<bitfield name="DISCARD" pos="2" type="boolean"/>
 	</reg32>
 
+	<!-- Both are a750+.
+	     Probably needed to correctly overlap execution of several draws.
+	-->
+	<reg32 offset="0x9885" name="PC_TESS_PARAM_SIZE" variants="A7XX-" usage="cmd"/>
+	<!-- Blob adds a bit more space {0x10, 0x20, 0x30, 0x40} bytes, but the meaning of
+	     this additional space is not known.
+	-->
+	<reg32 offset="0x9886" name="PC_TESS_FACTOR_SIZE" variants="A7XX-" usage="cmd"/>
+
 	<!-- 0x9982-0x9aff invalid -->
 
 	<reg32 offset="0x9b00" name="PC_PRIMITIVE_CNTL_0" type="a6xx_primitive_cntl_0" usage="rp_blit"/>
@@ -4293,7 +4310,7 @@ to upconvert to 32b float internally?
 	<!-- always 0x100000 or 0x1000000? -->
 	<reg32 offset="0xb600" name="TPL1_DBG_ECO_CNTL" low="0" high="25" usage="cmd"/>
 	<reg32 offset="0xb601" name="TPL1_ADDR_MODE_CNTL" type="a5xx_address_mode"/>
-	<reg32 offset="0xb602" name="TPL1_UNKNOWN_B602" low="0" high="7" type="uint" usage="cmd"/>
+	<reg32 offset="0xb602" name="TPL1_DBG_ECO_CNTL1" usage="cmd"/>
 	<reg32 offset="0xb604" name="TPL1_NC_MODE_CNTL">
 		<bitfield name="MODE" pos="0" type="boolean"/>
 		<bitfield name="LOWER_BIT" low="1" high="2" type="uint"/>
@@ -4965,6 +4982,11 @@ to upconvert to 32b float internally?
 	<reg32 offset="0x0001" name="SYSTEM_CACHE_CNTL_0"/>
 	<reg32 offset="0x0002" name="SYSTEM_CACHE_CNTL_1"/>
 	<reg32 offset="0x0039" name="CX_MISC_TCM_RET_CNTL" variants="A7XX-"/>
+	<reg32 offset="0x0400" name="CX_MISC_SW_FUSE_VALUE" variants="A7XX-">
+		<bitfield pos="0" name="FASTBLEND" type="boolean"/>
+		<bitfield pos="1" name="LPAC" type="boolean"/>
+		<bitfield pos="2" name="RAYTRACING" type="boolean"/>
+	</reg32>
 </domain>
 
 </database>
-- 
2.31.1



* [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-25 13:43 [PATCH 0/6] drm/msm: Support a750 "software fuse" for raytracing Connor Abbott
  2024-04-25 13:43 ` [PATCH 1/6] arm64: dts: qcom: sm8650: Fix GPU cx_mem size Connor Abbott
  2024-04-25 13:43 ` [PATCH 3/6] drm/msm: Update a6xx registers Connor Abbott
@ 2024-04-25 13:43 ` Connor Abbott
  2024-04-25 23:02   ` Dmitry Baryshkov
  2024-04-25 13:43 ` [PATCH 5/6] drm/msm: Add MSM_PARAM_RAYTRACING uapi Connor Abbott
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Connor Abbott @ 2024-04-25 13:43 UTC (permalink / raw)
  To: Rob Clark, Abhinav Kumar, Dmitry Baryshkov, Sean Paul,
	Marijn Suijten, linux-arm-msm, freedreno
  Cc: Connor Abbott

On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
initialize cx_mem. Copy this from downstream (minus BCL which we
currently don't support). On a750, this includes a new "fuse" register
which can be used by qcom_scm to fuse off certain features like
raytracing in software. The fuse defaults to off, and is initialized by
calling the method. Afterwards we have to read it to find out which
features were enabled.
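
The resulting decision can be modelled as a small pure function. This is
only a sketch of the logic in a7xx_cx_mem_init() below, not driver code:
the enum and function name are hypothetical, it assumes has_ray_tracing
starts out false for GPUs the function does not touch, and it ignores the
case where the SCM call itself fails.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical GPU identifiers, for illustration only. */
enum gpu_model { GPU_A730, GPU_A740, GPU_A750 };

/* RAYTRACING bit of CX_MISC_SW_FUSE_VALUE, as defined in patch 3. */
#define SW_FUSE_RAYTRACING (UINT32_C(1) << 2)

/*
 * Model of how has_ray_tracing ends up being set.
 * fuse_after_scm is the fuse value read back after the SCM call; it is
 * only consulted on a750 when SCM is available.
 */
static bool model_has_ray_tracing(enum gpu_model gpu, bool scm_available,
				  uint32_t fuse_after_scm)
{
	if (gpu == GPU_A740)
		return true;	/* always enabled on a740 */

	if (gpu != GPU_A750)
		return false;	/* assumed default for other a7xx parts */

	if (!scm_available)
		return true;	/* the driver writes all fuse bits itself */

	return (fuse_after_scm & SW_FUSE_RAYTRACING) != 0;
}
```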

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
 2 files changed, 90 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index cf0b1de1c071..fb2722574ae5 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -10,6 +10,7 @@
 
 #include <linux/bitfield.h>
 #include <linux/devfreq.h>
+#include <linux/firmware/qcom/qcom_scm.h>
 #include <linux/pm_domain.h>
 #include <linux/soc/qcom/llcc-qcom.h>
 
@@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
 		       A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
 		       A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
 		       A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
-		       A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
+		       A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
+		       A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
 
 #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
 			 A6XX_CP_APRIV_CNTL_RBFETCH | \
@@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
 	kthread_queue_work(gpu->worker, &gpu->recover_work);
 }
 
+static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
+{
+	u32 status;
+
+	status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
+	gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
+
+	dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
+
+	/* Ignore FASTBLEND violations, because the HW will silently fall back
+	 * to legacy blending.
+	 */
+	if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
+		      A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
+		del_timer(&gpu->hangcheck_timer);
+
+		kthread_queue_work(gpu->worker, &gpu->recover_work);
+	}
+}
+
 static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
 {
 	struct msm_drm_private *priv = gpu->dev->dev_private;
@@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
 	if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
 		dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
 
+	if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
+		a7xx_sw_fuse_violation_irq(gpu);
+
 	if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
 		msm_gpu_retire(gpu);
 
@@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
 		a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
 }
 
+static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
+{
+	struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
+	struct msm_gpu *gpu = &adreno_gpu->base;
+	u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
+	u32 fuse_val;
+	int ret;
+
+	if (adreno_is_a740(adreno_gpu)) {
+		/* Raytracing is always enabled on a740 */
+		adreno_gpu->has_ray_tracing = true;
+	}
+
+	if (!qcom_scm_is_available()) {
+		/* Assume that if qcom scm isn't available, that whatever
+		 * replacement allows writing the fuse register ourselves.
+		 * Users of alternative firmware need to make sure this
+		 * register is writeable or indicate that it's not somehow.
+		 * Print a warning because if you mess this up you're about to
+		 * crash horribly.
+		 */
+		if (adreno_is_a750(adreno_gpu)) {
+			dev_warn_once(gpu->dev->dev,
+				"SCM is not available, poking fuse register\n");
+			a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
+				A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
+				A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
+				A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
+			adreno_gpu->has_ray_tracing = true;
+		}
+
+		return 0;
+	}
+
+	if (adreno_is_a750(adreno_gpu))
+		gpu_req |= QCOM_SCM_GPU_TSENSE_EN_REQ;
+
+	ret = qcom_scm_gpu_init_regs(gpu_req);
+	if (ret)
+		return ret;
+
+	/* On a750 raytracing may be disabled by the firmware, find out whether
+	 * that's the case. The scm call above sets the fuse register.
+	 */
+	if (adreno_is_a750(adreno_gpu)) {
+		fuse_val = a6xx_llc_read(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE);
+		adreno_gpu->has_ray_tracing =
+			!!(fuse_val & A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING);
+	}
+
+	return 0;
+}
+
+
 #define GBIF_CLIENT_HALT_MASK		BIT(0)
 #define GBIF_ARB_HALT_MASK		BIT(1)
 #define VBIF_XIN_HALT_CTRL0_MASK	GENMASK(3, 0)
@@ -3094,6 +3173,14 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
 		return ERR_PTR(ret);
 	}
 
+	if (adreno_is_a7xx(adreno_gpu)) {
+		ret = a7xx_cx_mem_init(a6xx_gpu);
+		if (ret) {
+			a6xx_destroy(&(a6xx_gpu->base.base));
+			return ERR_PTR(ret);
+		}
+	}
+
 	if (gpu->aspace)
 		msm_mmu_set_fault_handler(gpu->aspace->mmu, gpu,
 				a6xx_fault_handler);
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 77526892eb8c..4180f3149dd8 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -182,6 +182,8 @@ struct adreno_gpu {
 	 */
 	const unsigned int *reg_offsets;
 	bool gmu_is_wrapper;
+
+	bool has_ray_tracing;
 };
 #define to_adreno_gpu(x) container_of(x, struct adreno_gpu, base)
 
-- 
2.31.1



* [PATCH 5/6] drm/msm: Add MSM_PARAM_RAYTRACING uapi
  2024-04-25 13:43 [PATCH 0/6] drm/msm: Support a750 "software fuse" for raytracing Connor Abbott
                   ` (2 preceding siblings ...)
  2024-04-25 13:43 ` [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse" Connor Abbott
@ 2024-04-25 13:43 ` Connor Abbott
  2024-04-25 23:03   ` Dmitry Baryshkov
  2024-04-25 13:43 ` [PATCH 6/6] drm/msm/a7xx: Add missing register writes from downstream Connor Abbott
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: Connor Abbott @ 2024-04-25 13:43 UTC (permalink / raw)
  To: Rob Clark, Abhinav Kumar, Dmitry Baryshkov, Sean Paul,
	Marijn Suijten, linux-arm-msm, freedreno
  Cc: Connor Abbott

Expose the value of the software fuse to userspace.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 3 +++
 include/uapi/drm/msm_drm.h              | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 074fb498706f..99ad651857b2 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -376,6 +376,9 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
 	case MSM_PARAM_HIGHEST_BANK_BIT:
 		*value = adreno_gpu->ubwc_config.highest_bank_bit;
 		return 0;
+	case MSM_PARAM_RAYTRACING:
+		*value = adreno_gpu->has_ray_tracing;
+		return 0;
 	default:
 		DBG("%s: invalid param: %u", gpu->name, param);
 		return -EINVAL;
diff --git a/include/uapi/drm/msm_drm.h b/include/uapi/drm/msm_drm.h
index d8a6b3472760..3fca72f73861 100644
--- a/include/uapi/drm/msm_drm.h
+++ b/include/uapi/drm/msm_drm.h
@@ -87,6 +87,7 @@ struct drm_msm_timespec {
 #define MSM_PARAM_VA_START   0x0e  /* RO: start of valid GPU iova range */
 #define MSM_PARAM_VA_SIZE    0x0f  /* RO: size of valid GPU iova range (bytes) */
 #define MSM_PARAM_HIGHEST_BANK_BIT 0x10 /* RO */
+#define MSM_PARAM_RAYTRACING 0x11 /* RO */
 
 /* For backwards compat.  The original support for preemption was based on
  * a single ring per priority level so # of priority levels equals the #
-- 
2.31.1



* [PATCH 6/6] drm/msm/a7xx: Add missing register writes from downstream
  2024-04-25 13:43 [PATCH 0/6] drm/msm: Support a750 "software fuse" for raytracing Connor Abbott
                   ` (3 preceding siblings ...)
  2024-04-25 13:43 ` [PATCH 5/6] drm/msm: Add MSM_PARAM_RAYTRACING uapi Connor Abbott
@ 2024-04-25 13:43 ` Connor Abbott
  2024-04-25 15:03 ` [PATCH 0/6] drm/msm: Support a750 "software fuse" for raytracing Dmitry Baryshkov
  2024-04-25 15:13 ` [PATCH 2/6] firmware: qcom_scm: Add gpu_init_regs call Connor Abbott
  6 siblings, 0 replies; 23+ messages in thread
From: Connor Abbott @ 2024-04-25 13:43 UTC (permalink / raw)
  To: Rob Clark, Abhinav Kumar, Dmitry Baryshkov, Sean Paul,
	Marijn Suijten, linux-arm-msm, freedreno
  Cc: Connor Abbott

This isn't known to fix anything yet, but it's a good idea to add it.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index fb2722574ae5..e015f3b43bac 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1953,6 +1953,14 @@ static int hw_init(struct msm_gpu *gpu)
 				  BIT(6) | BIT(5) | BIT(3) | BIT(2) | BIT(1));
 	}
 
+	if (adreno_is_a750(adreno_gpu)) {
+		gpu_rmw(gpu, REG_A6XX_RB_CMP_DBG_ECO_CNTL, BIT(19), BIT(19));
+
+		gpu_write(gpu, REG_A6XX_TPL1_DBG_ECO_CNTL1, 0xc0700);
+	} else if (adreno_is_a7xx(adreno_gpu)) {
+		gpu_rmw(gpu, REG_A6XX_RB_CMP_DBG_ECO_CNTL, BIT(19), BIT(19));
+	}
+
 	/* Enable interrupts */
 	gpu_write(gpu, REG_A6XX_RBBM_INT_0_MASK,
 		  adreno_is_a7xx(adreno_gpu) ? A7XX_INT_MASK : A6XX_INT_MASK);
-- 
2.31.1



* Re: [PATCH 0/6] drm/msm: Support a750 "software fuse" for raytracing
  2024-04-25 13:43 [PATCH 0/6] drm/msm: Support a750 "software fuse" for raytracing Connor Abbott
                   ` (4 preceding siblings ...)
  2024-04-25 13:43 ` [PATCH 6/6] drm/msm/a7xx: Add missing register writes from downstream Connor Abbott
@ 2024-04-25 15:03 ` Dmitry Baryshkov
  2024-04-25 15:13 ` [PATCH 2/6] firmware: qcom_scm: Add gpu_init_regs call Connor Abbott
  6 siblings, 0 replies; 23+ messages in thread
From: Dmitry Baryshkov @ 2024-04-25 15:03 UTC (permalink / raw)
  To: Connor Abbott
  Cc: Rob Clark, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
>
> On a750, Qualcomm decided to gate support for certain features behind a
> "software fuse." This consists of a register in the cx_mem zone, which
> is normally only writeable by the TrustZone firmware.  On bootup it is
> 0, and we must call an SCM method to initialize it. Then we communicate
> its value to userspace. This implements all of this, copying the SCM
> call from the downstream kernel and kgsl.
>
> So far the only optional feature we use is ray tracing (i.e. the
> "ray_intersection" instruction) in a pending Mesa MR [1], so that's what
> we expose to userspace. There's one extra patch to write some missing
> registers, which depends on the register XML bump but is otherwise
> unrelated, I just included it to make things easier on myself.
>
> The drm/msm part of this series depends on [2] to avoid conflicts.
>
> [1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28447
> [2] https://lore.kernel.org/all/20240324095222.ldnwumjkxk6uymmc@hu-akhilpo-hyd.qualcomm.com/T/
>
> Connor Abbott (6):
>   arm64: dts: qcom: sm8650: Fix GPU cx_mem size
>   firmware: qcom_scm: Add gpu_init_regs call

I don't see patch 2 at all. Granted that patches 1 and 3-6 have
different cc lists, might it be that it went to some blackhole?

>   drm/msm: Update a6xx registers
>   drm/msm/a7xx: Initialize a750 "software fuse"
>   drm/msm: Add MSM_PARAM_RAYTRACING uapi
>   drm/msm/a7xx: Add missing register writes from downstream



-- 
With best wishes
Dmitry


* [PATCH 2/6] firmware: qcom_scm: Add gpu_init_regs call
  2024-04-25 13:43 [PATCH 0/6] drm/msm: Support a750 "software fuse" for raytracing Connor Abbott
                   ` (5 preceding siblings ...)
  2024-04-25 15:03 ` [PATCH 0/6] drm/msm: Support a750 "software fuse" for raytracing Dmitry Baryshkov
@ 2024-04-25 15:13 ` Connor Abbott
  6 siblings, 0 replies; 23+ messages in thread
From: Connor Abbott @ 2024-04-25 15:13 UTC (permalink / raw)
  To: Bjorn Andersson, Konrad Dybcio, linux-arm-msm; +Cc: Connor Abbott

This will be used by drm/msm.
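
For reference, here is a sketch of how a caller might compose the request
mask passed to qcom_scm_gpu_init_regs(). The bit values mirror the
QCOM_SCM_GPU_*_REQ definitions added below, with shortened names; the
helper is hypothetical and only models what patch 4 does for a750 versus
other a7xx GPUs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Request bits, mirroring the QCOM_SCM_GPU_*_REQ definitions in this patch. */
#define GPU_ALWAYS_EN_REQ (UINT32_C(1) << 0)
#define GPU_BCL_EN_REQ    (UINT32_C(1) << 1)
#define GPU_CLX_EN_REQ    (UINT32_C(1) << 2)
#define GPU_TSENSE_EN_REQ (UINT32_C(1) << 3)

/* Hypothetical helper: the always-on request, plus tsense on a750. */
static uint32_t build_gpu_req(bool is_a750)
{
	uint32_t req = GPU_ALWAYS_EN_REQ;

	if (is_a750)
		req |= GPU_TSENSE_EN_REQ;

	return req;
}
```

BCL and CLX are left out here because the series does not request them.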

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
---
 drivers/firmware/qcom/qcom_scm.c       | 14 ++++++++++++++
 drivers/firmware/qcom/qcom_scm.h       |  3 +++
 include/linux/firmware/qcom/qcom_scm.h | 23 +++++++++++++++++++++++
 3 files changed, 40 insertions(+)

diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
index 06e46267161b..f8623ad0987c 100644
--- a/drivers/firmware/qcom/qcom_scm.c
+++ b/drivers/firmware/qcom/qcom_scm.c
@@ -1394,6 +1394,20 @@ int qcom_scm_lmh_dcvsh(u32 payload_fn, u32 payload_reg, u32 payload_val,
 }
 EXPORT_SYMBOL_GPL(qcom_scm_lmh_dcvsh);
 
+int qcom_scm_gpu_init_regs(u32 gpu_req)
+{
+	struct qcom_scm_desc desc = {
+		.svc = QCOM_SCM_SVC_GPU,
+		.cmd = QCOM_SCM_SVC_GPU_INIT_REGS,
+		.arginfo = QCOM_SCM_ARGS(1),
+		.args[0] = gpu_req,
+		.owner = ARM_SMCCC_OWNER_SIP,
+	};
+
+	return qcom_scm_call(__scm->dev, &desc, NULL);
+}
+EXPORT_SYMBOL_GPL(qcom_scm_gpu_init_regs);
+
 static int qcom_scm_find_dload_address(struct device *dev, u64 *addr)
 {
 	struct device_node *tcsr;
diff --git a/drivers/firmware/qcom/qcom_scm.h b/drivers/firmware/qcom/qcom_scm.h
index 4532907e8489..484e030bcac9 100644
--- a/drivers/firmware/qcom/qcom_scm.h
+++ b/drivers/firmware/qcom/qcom_scm.h
@@ -138,6 +138,9 @@ int scm_legacy_call(struct device *dev, const struct qcom_scm_desc *desc,
 #define QCOM_SCM_WAITQ_RESUME			0x02
 #define QCOM_SCM_WAITQ_GET_WQ_CTX		0x03
 
+#define QCOM_SCM_SVC_GPU			0x28
+#define QCOM_SCM_SVC_GPU_INIT_REGS		0x01
+
 /* common error codes */
 #define QCOM_SCM_V2_EBUSY	-12
 #define QCOM_SCM_ENOMEM		-5
diff --git a/include/linux/firmware/qcom/qcom_scm.h b/include/linux/firmware/qcom/qcom_scm.h
index aaa19f93ac43..2c444c98682e 100644
--- a/include/linux/firmware/qcom/qcom_scm.h
+++ b/include/linux/firmware/qcom/qcom_scm.h
@@ -115,6 +115,29 @@ int qcom_scm_lmh_dcvsh(u32 payload_fn, u32 payload_reg, u32 payload_val,
 int qcom_scm_lmh_profile_change(u32 profile_id);
 bool qcom_scm_lmh_dcvsh_available(void);
 
+/**
+ * Request TZ to program set of access controlled registers necessary
+ * irrespective of any features
+ */
+#define QCOM_SCM_GPU_ALWAYS_EN_REQ BIT(0)
+/**
+ * Request TZ to program BCL id to access controlled register when BCL is
+ * enabled
+ */
+#define QCOM_SCM_GPU_BCL_EN_REQ BIT(1)
+/**
+ * Request TZ to program set of access controlled register for CLX feature
+ * when enabled
+ */
+#define QCOM_SCM_GPU_CLX_EN_REQ BIT(2)
+/**
+ * Request TZ to program tsense ids to access controlled registers for reading
+ * gpu temperature sensors
+ */
+#define QCOM_SCM_GPU_TSENSE_EN_REQ BIT(3)
+
+int qcom_scm_gpu_init_regs(u32 gpu_req);
+
 #ifdef CONFIG_QCOM_QSEECOM
 
 int qcom_scm_qseecom_app_get_id(const char *app_name, u32 *app_id);
-- 
2.31.1



* Re: [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-25 13:43 ` [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse" Connor Abbott
@ 2024-04-25 23:02   ` Dmitry Baryshkov
  2024-04-26 12:35     ` Connor Abbott
  2024-04-26 12:45     ` Rob Clark
  0 siblings, 2 replies; 23+ messages in thread
From: Dmitry Baryshkov @ 2024-04-25 23:02 UTC (permalink / raw)
  To: Connor Abbott
  Cc: Rob Clark, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
>
> On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> initialize cx_mem. Copy this from downstream (minus BCL which we
> currently don't support). On a750, this includes a new "fuse" register
> which can be used by qcom_scm to fuse off certain features like
> raytracing in software. The fuse is default off, and is initialized by
> calling the method. Afterwards we have to read it to find out which
> features were enabled.
>
> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
>  2 files changed, 90 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index cf0b1de1c071..fb2722574ae5 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -10,6 +10,7 @@
>
>  #include <linux/bitfield.h>
>  #include <linux/devfreq.h>
> +#include <linux/firmware/qcom/qcom_scm.h>
>  #include <linux/pm_domain.h>
>  #include <linux/soc/qcom/llcc-qcom.h>
>
> @@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
>                        A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
>                        A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
>                        A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
> -                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
> +                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
> +                      A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
>
>  #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
>                          A6XX_CP_APRIV_CNTL_RBFETCH | \
> @@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
>         kthread_queue_work(gpu->worker, &gpu->recover_work);
>  }
>
> +static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
> +{
> +       u32 status;
> +
> +       status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
> +       gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
> +
> +       dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
> +
> +       /* Ignore FASTBLEND violations, because the HW will silently fall back
> +        * to legacy blending.
> +        */
> +       if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> +                     A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
> +               del_timer(&gpu->hangcheck_timer);
> +
> +               kthread_queue_work(gpu->worker, &gpu->recover_work);
> +       }
> +}
> +
>  static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
>  {
>         struct msm_drm_private *priv = gpu->dev->dev_private;
> @@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
>         if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
>                 dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
>
> +       if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> +               a7xx_sw_fuse_violation_irq(gpu);
> +
>         if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
>                 msm_gpu_retire(gpu);
>
> @@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
>                 a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
>  }
>
> +static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
> +{
> +       struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> +       struct msm_gpu *gpu = &adreno_gpu->base;
> +       u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
> +       u32 fuse_val;
> +       int ret;
> +
> +       if (adreno_is_a740(adreno_gpu)) {
> +               /* Raytracing is always enabled on a740 */
> +               adreno_gpu->has_ray_tracing = true;
> +       }
> +
> +       if (!qcom_scm_is_available()) {
> +               /* Assume that if qcom scm isn't available, that whatever
> +                * replacement allows writing the fuse register ourselves.
> +                * Users of alternative firmware need to make sure this
> +                * register is writeable or indicate that it's not somehow.
> +                * Print a warning because if you mess this up you're about to
> +                * crash horribly.
> +                */
> +               if (adreno_is_a750(adreno_gpu)) {
> +                       dev_warn_once(gpu->dev->dev,
> +                               "SCM is not available, poking fuse register\n");
> +                       a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
> +                               A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> +                               A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
> +                               A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
> +                       adreno_gpu->has_ray_tracing = true;
> +               }
> +
> +               return 0;
> +       }
> +
> +       if (adreno_is_a750(adreno_gpu))

Most of the function is under the if (adreno_is_a750) conditions. Can
we invert the logic and add a single block of if(adreno_is_a750) and
then place all the code underneath?

> +               gpu_req |= QCOM_SCM_GPU_TSENSE_EN_REQ;
> +
> +       ret = qcom_scm_gpu_init_regs(gpu_req);
> +       if (ret)
> +               return ret;
> +
> +       /* On a750 raytracing may be disabled by the firmware, find out whether
> +        * that's the case. The scm call above sets the fuse register.
> +        */
> +       if (adreno_is_a750(adreno_gpu)) {
> +               fuse_val = a6xx_llc_read(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE);

This register isn't accessible with the current sm8650.dtsi. Since DT
and driver are going through different trees, please add safety guards
here, so that the driver doesn't crash if used with older dtsi
(not to mention that dts is considered to be an ABI and newer kernels
are supposed not to break with older DT files).

> +               adreno_gpu->has_ray_tracing =
> +                       !!(fuse_val & A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING);
> +       }
> +
> +       return 0;
> +}
> +
> +
>  #define GBIF_CLIENT_HALT_MASK          BIT(0)
>  #define GBIF_ARB_HALT_MASK             BIT(1)
>  #define VBIF_XIN_HALT_CTRL0_MASK       GENMASK(3, 0)
> @@ -3094,6 +3173,14 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
>                 return ERR_PTR(ret);
>         }
>
> +       if (adreno_is_a7xx(adreno_gpu)) {
> +               ret = a7xx_cx_mem_init(a6xx_gpu);
> +               if (ret) {
> +                       a6xx_destroy(&(a6xx_gpu->base.base));
> +                       return ERR_PTR(ret);
> +               }
> +       }
> +
>         if (gpu->aspace)
>                 msm_mmu_set_fault_handler(gpu->aspace->mmu, gpu,
>                                 a6xx_fault_handler);
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> index 77526892eb8c..4180f3149dd8 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> @@ -182,6 +182,8 @@ struct adreno_gpu {
>          */
>         const unsigned int *reg_offsets;
>         bool gmu_is_wrapper;
> +
> +       bool has_ray_tracing;
>  };
>  #define to_adreno_gpu(x) container_of(x, struct adreno_gpu, base)
>
> --
> 2.31.1
>


-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 5/6] drm/msm: Add MSM_PARAM_RAYTRACING uapi
  2024-04-25 13:43 ` [PATCH 5/6] drm/msm: Add MSM_PARAM_RAYTRACING uapi Connor Abbott
@ 2024-04-25 23:03   ` Dmitry Baryshkov
  0 siblings, 0 replies; 23+ messages in thread
From: Dmitry Baryshkov @ 2024-04-25 23:03 UTC (permalink / raw)
  To: Connor Abbott
  Cc: Rob Clark, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
>
> Expose the value of the software fuse to userspace.
>
> Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> ---
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c | 3 +++
>  include/uapi/drm/msm_drm.h              | 1 +
>  2 files changed, 4 insertions(+)

Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-25 23:02   ` Dmitry Baryshkov
@ 2024-04-26 12:35     ` Connor Abbott
  2024-04-26 12:54       ` Connor Abbott
  2024-04-26 13:31       ` Dmitry Baryshkov
  2024-04-26 12:45     ` Rob Clark
  1 sibling, 2 replies; 23+ messages in thread
From: Connor Abbott @ 2024-04-26 12:35 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Rob Clark, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Fri, Apr 26, 2024 at 12:02 AM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
>
> On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
> >
> > On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> > initialize cx_mem. Copy this from downstream (minus BCL which we
> > currently don't support). On a750, this includes a new "fuse" register
> > which can be used by qcom_scm to fuse off certain features like
> > raytracing in software. The fuse is default off, and is initialized by
> > calling the method. Afterwards we have to read it to find out which
> > features were enabled.
> >
> > Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> > ---
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
> >  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
> >  2 files changed, 90 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > index cf0b1de1c071..fb2722574ae5 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > @@ -10,6 +10,7 @@
> >
> >  #include <linux/bitfield.h>
> >  #include <linux/devfreq.h>
> > +#include <linux/firmware/qcom/qcom_scm.h>
> >  #include <linux/pm_domain.h>
> >  #include <linux/soc/qcom/llcc-qcom.h>
> >
> > @@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
> >                        A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
> >                        A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
> >                        A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
> > -                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
> > +                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
> > +                      A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> >
> >  #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
> >                          A6XX_CP_APRIV_CNTL_RBFETCH | \
> > @@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
> >         kthread_queue_work(gpu->worker, &gpu->recover_work);
> >  }
> >
> > +static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
> > +{
> > +       u32 status;
> > +
> > +       status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
> > +       gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
> > +
> > +       dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
> > +
> > +       /* Ignore FASTBLEND violations, because the HW will silently fall back
> > +        * to legacy blending.
> > +        */
> > +       if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > +                     A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
> > +               del_timer(&gpu->hangcheck_timer);
> > +
> > +               kthread_queue_work(gpu->worker, &gpu->recover_work);
> > +       }
> > +}
> > +
> >  static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> >  {
> >         struct msm_drm_private *priv = gpu->dev->dev_private;
> > @@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> >         if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
> >                 dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
> >
> > +       if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > +               a7xx_sw_fuse_violation_irq(gpu);
> > +
> >         if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
> >                 msm_gpu_retire(gpu);
> >
> > @@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
> >                 a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
> >  }
> >
> > +static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
> > +{
> > +       struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > +       struct msm_gpu *gpu = &adreno_gpu->base;
> > +       u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
> > +       u32 fuse_val;
> > +       int ret;
> > +
> > +       if (adreno_is_a740(adreno_gpu)) {
> > +               /* Raytracing is always enabled on a740 */
> > +               adreno_gpu->has_ray_tracing = true;
> > +       }
> > +
> > +       if (!qcom_scm_is_available()) {
> > +               /* Assume that if qcom scm isn't available, that whatever
> > +                * replacement allows writing the fuse register ourselves.
> > +                * Users of alternative firmware need to make sure this
> > +                * register is writeable or indicate that it's not somehow.
> > +                * Print a warning because if you mess this up you're about to
> > +                * crash horribly.
> > +                */
> > +               if (adreno_is_a750(adreno_gpu)) {
> > +                       dev_warn_once(gpu->dev->dev,
> > +                               "SCM is not available, poking fuse register\n");
> > +                       a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
> > +                               A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > +                               A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
> > +                               A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
> > +                       adreno_gpu->has_ray_tracing = true;
> > +               }
> > +
> > +               return 0;
> > +       }
> > +
> > +       if (adreno_is_a750(adreno_gpu))
>
> Most of the function is under the if (adreno_is_a750) conditions. Can
> we invert the logic and add a single block of if(adreno_is_a750) and
> then place all the code underneath?

You mean to duplicate the qcom_scm_is_available check and qcom_scm_

>
> > +               gpu_req |= QCOM_SCM_GPU_TSENSE_EN_REQ;
> > +
> > +       ret = qcom_scm_gpu_init_regs(gpu_req);
> > +       if (ret)
> > +               return ret;
> > +
> > +       /* On a750 raytracing may be disabled by the firmware, find out whether
> > +        * that's the case. The scm call above sets the fuse register.
> > +        */
> > +       if (adreno_is_a750(adreno_gpu)) {
> > +               fuse_val = a6xx_llc_read(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE);
>
> This register isn't accessible with the current sm8650.dtsi. Since DT
> and driver are going through different trees, please add safety guards
> here, so that the driver doesn't crash if used with older dtsi

I don't see how this is an issue. msm-next is currently based on 6.9,
which doesn't have the GPU defined in sm8650.dtsi. AFAIK patches 1 and
2 will have to go through the linux-arm-msm tree, which will have to
be merged into msm-next before this patch lands there, so there will
never be any breakage.

> (not to mention that dts is considered to be an ABI and newer kernels
> are supposed not to break with older DT files).

That policy only applies to released kernels, so that's irrelevant here.

>
> > +               adreno_gpu->has_ray_tracing =
> > +                       !!(fuse_val & A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +
> >  #define GBIF_CLIENT_HALT_MASK          BIT(0)
> >  #define GBIF_ARB_HALT_MASK             BIT(1)
> >  #define VBIF_XIN_HALT_CTRL0_MASK       GENMASK(3, 0)
> > @@ -3094,6 +3173,14 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
> >                 return ERR_PTR(ret);
> >         }
> >
> > +       if (adreno_is_a7xx(adreno_gpu)) {
> > +               ret = a7xx_cx_mem_init(a6xx_gpu);
> > +               if (ret) {
> > +                       a6xx_destroy(&(a6xx_gpu->base.base));
> > +                       return ERR_PTR(ret);
> > +               }
> > +       }
> > +
> >         if (gpu->aspace)
> >                 msm_mmu_set_fault_handler(gpu->aspace->mmu, gpu,
> >                                 a6xx_fault_handler);
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > index 77526892eb8c..4180f3149dd8 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > @@ -182,6 +182,8 @@ struct adreno_gpu {
> >          */
> >         const unsigned int *reg_offsets;
> >         bool gmu_is_wrapper;
> > +
> > +       bool has_ray_tracing;
> >  };
> >  #define to_adreno_gpu(x) container_of(x, struct adreno_gpu, base)
> >
> > --
> > 2.31.1
> >
>
>
> --
> With best wishes
> Dmitry

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-25 23:02   ` Dmitry Baryshkov
  2024-04-26 12:35     ` Connor Abbott
@ 2024-04-26 12:45     ` Rob Clark
  2024-04-26 13:28       ` Dmitry Baryshkov
  1 sibling, 1 reply; 23+ messages in thread
From: Rob Clark @ 2024-04-26 12:45 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Connor Abbott, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Thu, Apr 25, 2024 at 4:02 PM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
>
> On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
> >
> > On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> > initialize cx_mem. Copy this from downstream (minus BCL which we
> > currently don't support). On a750, this includes a new "fuse" register
> > which can be used by qcom_scm to fuse off certain features like
> > raytracing in software. The fuse is default off, and is initialized by
> > calling the method. Afterwards we have to read it to find out which
> > features were enabled.
> >
> > Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> > ---
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
> >  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
> >  2 files changed, 90 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > index cf0b1de1c071..fb2722574ae5 100644
> > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > @@ -10,6 +10,7 @@
> >
> >  #include <linux/bitfield.h>
> >  #include <linux/devfreq.h>
> > +#include <linux/firmware/qcom/qcom_scm.h>
> >  #include <linux/pm_domain.h>
> >  #include <linux/soc/qcom/llcc-qcom.h>
> >
> > @@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
> >                        A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
> >                        A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
> >                        A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
> > -                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
> > +                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
> > +                      A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> >
> >  #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
> >                          A6XX_CP_APRIV_CNTL_RBFETCH | \
> > @@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
> >         kthread_queue_work(gpu->worker, &gpu->recover_work);
> >  }
> >
> > +static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
> > +{
> > +       u32 status;
> > +
> > +       status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
> > +       gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
> > +
> > +       dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
> > +
> > +       /* Ignore FASTBLEND violations, because the HW will silently fall back
> > +        * to legacy blending.
> > +        */
> > +       if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > +                     A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
> > +               del_timer(&gpu->hangcheck_timer);
> > +
> > +               kthread_queue_work(gpu->worker, &gpu->recover_work);
> > +       }
> > +}
> > +
> >  static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> >  {
> >         struct msm_drm_private *priv = gpu->dev->dev_private;
> > @@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> >         if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
> >                 dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
> >
> > +       if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > +               a7xx_sw_fuse_violation_irq(gpu);
> > +
> >         if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
> >                 msm_gpu_retire(gpu);
> >
> > @@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
> >                 a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
> >  }
> >
> > +static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
> > +{
> > +       struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > +       struct msm_gpu *gpu = &adreno_gpu->base;
> > +       u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
> > +       u32 fuse_val;
> > +       int ret;
> > +
> > +       if (adreno_is_a740(adreno_gpu)) {
> > +               /* Raytracing is always enabled on a740 */
> > +               adreno_gpu->has_ray_tracing = true;
> > +       }
> > +
> > +       if (!qcom_scm_is_available()) {
> > +               /* Assume that if qcom scm isn't available, that whatever
> > +                * replacement allows writing the fuse register ourselves.
> > +                * Users of alternative firmware need to make sure this
> > +                * register is writeable or indicate that it's not somehow.
> > +                * Print a warning because if you mess this up you're about to
> > +                * crash horribly.
> > +                */
> > +               if (adreno_is_a750(adreno_gpu)) {
> > +                       dev_warn_once(gpu->dev->dev,
> > +                               "SCM is not available, poking fuse register\n");
> > +                       a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
> > +                               A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > +                               A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
> > +                               A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
> > +                       adreno_gpu->has_ray_tracing = true;
> > +               }
> > +
> > +               return 0;
> > +       }
> > +
> > +       if (adreno_is_a750(adreno_gpu))
>
> Most of the function is under the if (adreno_is_a750) conditions. Can
> we invert the logic and add a single block of if(adreno_is_a750) and
> then place all the code underneath?
>
> > +               gpu_req |= QCOM_SCM_GPU_TSENSE_EN_REQ;
> > +
> > +       ret = qcom_scm_gpu_init_regs(gpu_req);
> > +       if (ret)
> > +               return ret;
> > +
> > +       /* On a750 raytracing may be disabled by the firmware, find out whether
> > +        * that's the case. The scm call above sets the fuse register.
> > +        */
> > +       if (adreno_is_a750(adreno_gpu)) {
> > +               fuse_val = a6xx_llc_read(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE);
>
> This register isn't accessible with the current sm8650.dtsi. Since DT
> and driver are going through different trees, please add safety guards
> here, so that the driver doesn't crash if used with older dtsi
> (not to mention that dts is considered to be an ABI and newer kernels
> are supposed not to break with older DT files).

I'd be happy if older kernels consistently worked with newer dtb; the
other direction is too much to ask.  If necessary we can ask for an ack
to land the dts fix through msm-next somehow, but since the GPU is a newly
enabled device landing in the same merge window I think that is not
necessary.

BR,
-R

> > +               adreno_gpu->has_ray_tracing =
> > +                       !!(fuse_val & A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +
> >  #define GBIF_CLIENT_HALT_MASK          BIT(0)
> >  #define GBIF_ARB_HALT_MASK             BIT(1)
> >  #define VBIF_XIN_HALT_CTRL0_MASK       GENMASK(3, 0)
> > @@ -3094,6 +3173,14 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
> >                 return ERR_PTR(ret);
> >         }
> >
> > +       if (adreno_is_a7xx(adreno_gpu)) {
> > +               ret = a7xx_cx_mem_init(a6xx_gpu);
> > +               if (ret) {
> > +                       a6xx_destroy(&(a6xx_gpu->base.base));
> > +                       return ERR_PTR(ret);
> > +               }
> > +       }
> > +
> >         if (gpu->aspace)
> >                 msm_mmu_set_fault_handler(gpu->aspace->mmu, gpu,
> >                                 a6xx_fault_handler);
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > index 77526892eb8c..4180f3149dd8 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
> > @@ -182,6 +182,8 @@ struct adreno_gpu {
> >          */
> >         const unsigned int *reg_offsets;
> >         bool gmu_is_wrapper;
> > +
> > +       bool has_ray_tracing;
> >  };
> >  #define to_adreno_gpu(x) container_of(x, struct adreno_gpu, base)
> >
> > --
> > 2.31.1
> >
>
>
> --
> With best wishes
> Dmitry

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-26 12:35     ` Connor Abbott
@ 2024-04-26 12:54       ` Connor Abbott
  2024-04-26 13:37         ` Dmitry Baryshkov
  2024-04-26 13:31       ` Dmitry Baryshkov
  1 sibling, 1 reply; 23+ messages in thread
From: Connor Abbott @ 2024-04-26 12:54 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Rob Clark, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Fri, Apr 26, 2024 at 1:35 PM Connor Abbott <cwabbott0@gmail.com> wrote:
>
> On Fri, Apr 26, 2024 at 12:02 AM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
> >
> > On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
> > >
> > > On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> > > initialize cx_mem. Copy this from downstream (minus BCL which we
> > > currently don't support). On a750, this includes a new "fuse" register
> > > which can be used by qcom_scm to fuse off certain features like
> > > raytracing in software. The fuse is default off, and is initialized by
> > > calling the method. Afterwards we have to read it to find out which
> > > features were enabled.
> > >
> > > Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> > > ---
> > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
> > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
> > >  2 files changed, 90 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > index cf0b1de1c071..fb2722574ae5 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > @@ -10,6 +10,7 @@
> > >
> > >  #include <linux/bitfield.h>
> > >  #include <linux/devfreq.h>
> > > +#include <linux/firmware/qcom/qcom_scm.h>
> > >  #include <linux/pm_domain.h>
> > >  #include <linux/soc/qcom/llcc-qcom.h>
> > >
> > > @@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
> > >                        A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
> > >                        A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
> > >                        A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
> > > -                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
> > > +                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
> > > +                      A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > >
> > >  #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
> > >                          A6XX_CP_APRIV_CNTL_RBFETCH | \
> > > @@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
> > >         kthread_queue_work(gpu->worker, &gpu->recover_work);
> > >  }
> > >
> > > +static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
> > > +{
> > > +       u32 status;
> > > +
> > > +       status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
> > > +       gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
> > > +
> > > +       dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
> > > +
> > > +       /* Ignore FASTBLEND violations, because the HW will silently fall back
> > > +        * to legacy blending.
> > > +        */
> > > +       if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > +                     A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
> > > +               del_timer(&gpu->hangcheck_timer);
> > > +
> > > +               kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > +       }
> > > +}
> > > +
> > >  static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > >  {
> > >         struct msm_drm_private *priv = gpu->dev->dev_private;
> > > @@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > >         if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
> > >                 dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
> > >
> > > +       if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > +               a7xx_sw_fuse_violation_irq(gpu);
> > > +
> > >         if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
> > >                 msm_gpu_retire(gpu);
> > >
> > > @@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
> > >                 a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
> > >  }
> > >
> > > +static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
> > > +{
> > > +       struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > > +       struct msm_gpu *gpu = &adreno_gpu->base;
> > > +       u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
> > > +       u32 fuse_val;
> > > +       int ret;
> > > +
> > > +       if (adreno_is_a740(adreno_gpu)) {
> > > +               /* Raytracing is always enabled on a740 */
> > > +               adreno_gpu->has_ray_tracing = true;
> > > +       }
> > > +
> > > +       if (!qcom_scm_is_available()) {
> > > +               /* Assume that if qcom scm isn't available, that whatever
> > > +                * replacement allows writing the fuse register ourselves.
> > > +                * Users of alternative firmware need to make sure this
> > > +                * register is writeable or indicate that it's not somehow.
> > > +                * Print a warning because if you mess this up you're about to
> > > +                * crash horribly.
> > > +                */
> > > +               if (adreno_is_a750(adreno_gpu)) {
> > > +                       dev_warn_once(gpu->dev->dev,
> > > +                               "SCM is not available, poking fuse register\n");
> > > +                       a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
> > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
> > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
> > > +                       adreno_gpu->has_ray_tracing = true;
> > > +               }
> > > +
> > > +               return 0;
> > > +       }
> > > +
> > > +       if (adreno_is_a750(adreno_gpu))
> >
> > Most of the function is under the if (adreno_is_a750) conditions. Can
> > we invert the logic and add a single block of if(adreno_is_a750) and
> > then place all the code underneath?
>
> You mean to duplicate the qcom_scm_is_available check and qcom_scm_
>

Sorry, didn't finish this thought. I meant to ask if you wanted to
duplicate qcom_scm_is_available check and qcom_scm_gpu_init_regs
between a750+ and everything else.

Connor
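
To make the trade-off concrete, here is a rough user-space model of what the
split variant might look like, with the SCM availability check and the SCM
call repeated in both branches. It is a sketch under assumptions, not the
driver code: `struct fake_gpu`, `fake_scm_gpu_init_regs()`,
`cx_mem_init_split()` and the flag values are all hypothetical stand-ins for
the real adreno/qcom_scm symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder request flags and fuse bit; the values are illustrative only. */
#define GPU_ALWAYS_EN_REQ	(1u << 0)
#define GPU_TSENSE_EN_REQ	(1u << 1)
#define FUSE_RAYTRACING		(1u << 0)

/* Hypothetical model of the state the real function consults. */
struct fake_gpu {
	bool is_a740;
	bool is_a750;
	bool scm_available;		/* models qcom_scm_is_available() */
	int scm_ret;			/* what the modelled SCM call returns */
	uint32_t fuse_reg;		/* models the SW fuse register */
	uint32_t fuse_after_scm;	/* value firmware would leave in the fuse */
	uint32_t last_req;		/* last request passed to the SCM call */
	bool has_ray_tracing;
};

/* Models qcom_scm_gpu_init_regs(): records the request, sets the fuse. */
static int fake_scm_gpu_init_regs(struct fake_gpu *gpu, uint32_t req)
{
	gpu->last_req = req;
	if (gpu->scm_ret == 0)
		gpu->fuse_reg = gpu->fuse_after_scm;
	return gpu->scm_ret;
}

static int cx_mem_init_split(struct fake_gpu *gpu)
{
	int ret;

	if (gpu->is_a750) {
		if (!gpu->scm_available) {
			/*
			 * No SCM: write the fuse directly (the patch also
			 * sets the FASTBLEND and LPAC bits).
			 */
			gpu->fuse_reg = FUSE_RAYTRACING;
			gpu->has_ray_tracing = true;
			return 0;
		}

		ret = fake_scm_gpu_init_regs(gpu, GPU_ALWAYS_EN_REQ |
						  GPU_TSENSE_EN_REQ);
		if (ret)
			return ret;

		/* Firmware decided; read the result back. */
		gpu->has_ray_tracing = !!(gpu->fuse_reg & FUSE_RAYTRACING);
		return 0;
	}

	/* Everything else: the availability check and SCM call repeat here. */
	if (gpu->is_a740)
		gpu->has_ray_tracing = true;

	if (!gpu->scm_available)
		return 0;

	return fake_scm_gpu_init_regs(gpu, GPU_ALWAYS_EN_REQ);
}
```

The a750 branch reads top to bottom without re-testing the GPU model, at the
cost of a second availability check and a second SCM call site for the
remaining a7xx parts.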

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-26 12:45     ` Rob Clark
@ 2024-04-26 13:28       ` Dmitry Baryshkov
  0 siblings, 0 replies; 23+ messages in thread
From: Dmitry Baryshkov @ 2024-04-26 13:28 UTC (permalink / raw)
  To: Rob Clark
  Cc: Connor Abbott, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Fri, 26 Apr 2024 at 15:46, Rob Clark <robdclark@gmail.com> wrote:
>
> On Thu, Apr 25, 2024 at 4:02 PM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
> >
> > On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
> > >
> > > On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> > > initialize cx_mem. Copy this from downstream (minus BCL which we
> > > currently don't support). On a750, this includes a new "fuse" register
> > > which can be used by qcom_scm to fuse off certain features like
> > > raytracing in software. The fuse is default off, and is initialized by
> > > calling the method. Afterwards we have to read it to find out which
> > > features were enabled.
> > >
> > > Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> > > ---
> > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
> > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
> > >  2 files changed, 90 insertions(+), 1 deletion(-)
> > >

[...]

> > > +               gpu_req |= QCOM_SCM_GPU_TSENSE_EN_REQ;
> > > +
> > > +       ret = qcom_scm_gpu_init_regs(gpu_req);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       /* On a750 raytracing may be disabled by the firmware, find out whether
> > > +        * that's the case. The scm call above sets the fuse register.
> > > +        */
> > > +       if (adreno_is_a750(adreno_gpu)) {
> > > +               fuse_val = a6xx_llc_read(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE);
> >
> > This register isn't accessible with the current sm8650.dtsi. Since DT
> > and driver are going through different trees, please add safety guards
> > here, so that the driver doesn't crash if used with older dtsi
> > (not to mention that dts is considered to be an ABI and newer kernels
> > are supposed not to break with older DT files).
>
> I'd be happy if older kernels consistently worked with newer dtb, the
> other direction is too much to ask.

Well, we guarantee that newer kernels work with older dts.

>  If necessary we can ask for ack
> to land the dts fix thru msm-next somehow, but since the gpu is newly
> enabled device landing in the same merge window I think that is not
> necessary.

This might work too.

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-26 12:35     ` Connor Abbott
  2024-04-26 12:54       ` Connor Abbott
@ 2024-04-26 13:31       ` Dmitry Baryshkov
  2024-04-26 14:05         ` Connor Abbott
  1 sibling, 1 reply; 23+ messages in thread
From: Dmitry Baryshkov @ 2024-04-26 13:31 UTC (permalink / raw)
  To: Connor Abbott
  Cc: Rob Clark, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Fri, 26 Apr 2024 at 15:35, Connor Abbott <cwabbott0@gmail.com> wrote:
>
> On Fri, Apr 26, 2024 at 12:02 AM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
> >
> > On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
> > >
> > > On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> > > initialize cx_mem. Copy this from downstream (minus BCL which we
> > > currently don't support). On a750, this includes a new "fuse" register
> > > which can be used by qcom_scm to fuse off certain features like
> > > raytracing in software. The fuse is default off, and is initialized by
> > > calling the method. Afterwards we have to read it to find out which
> > > features were enabled.
> > >
> > > Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> > > ---
> > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
> > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
> > >  2 files changed, 90 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > index cf0b1de1c071..fb2722574ae5 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > @@ -10,6 +10,7 @@
> > >
> > >  #include <linux/bitfield.h>
> > >  #include <linux/devfreq.h>
> > > +#include <linux/firmware/qcom/qcom_scm.h>
> > >  #include <linux/pm_domain.h>
> > >  #include <linux/soc/qcom/llcc-qcom.h>
> > >
> > > @@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
> > >                        A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
> > >                        A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
> > >                        A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
> > > -                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
> > > +                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
> > > +                      A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > >
> > >  #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
> > >                          A6XX_CP_APRIV_CNTL_RBFETCH | \
> > > @@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
> > >         kthread_queue_work(gpu->worker, &gpu->recover_work);
> > >  }
> > >
> > > +static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
> > > +{
> > > +       u32 status;
> > > +
> > > +       status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
> > > +       gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
> > > +
> > > +       dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
> > > +
> > > +       /* Ignore FASTBLEND violations, because the HW will silently fall back
> > > +        * to legacy blending.
> > > +        */
> > > +       if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > +                     A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
> > > +               del_timer(&gpu->hangcheck_timer);
> > > +
> > > +               kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > +       }
> > > +}
> > > +
> > >  static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > >  {
> > >         struct msm_drm_private *priv = gpu->dev->dev_private;
> > > @@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > >         if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
> > >                 dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
> > >
> > > +       if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > +               a7xx_sw_fuse_violation_irq(gpu);
> > > +
> > >         if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
> > >                 msm_gpu_retire(gpu);
> > >
> > > @@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
> > >                 a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
> > >  }
> > >
> > > +static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
> > > +{
> > > +       struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > > +       struct msm_gpu *gpu = &adreno_gpu->base;
> > > +       u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
> > > +       u32 fuse_val;
> > > +       int ret;
> > > +
> > > +       if (adreno_is_a740(adreno_gpu)) {
> > > +               /* Raytracing is always enabled on a740 */
> > > +               adreno_gpu->has_ray_tracing = true;
> > > +       }
> > > +
> > > +       if (!qcom_scm_is_available()) {
> > > +               /* Assume that if qcom scm isn't available, that whatever
> > > +                * replacement allows writing the fuse register ourselves.
> > > +                * Users of alternative firmware need to make sure this
> > > +                * register is writeable or indicate that it's not somehow.
> > > +                * Print a warning because if you mess this up you're about to
> > > +                * crash horribly.
> > > +                */
> > > +               if (adreno_is_a750(adreno_gpu)) {
> > > +                       dev_warn_once(gpu->dev->dev,
> > > +                               "SCM is not available, poking fuse register\n");
> > > +                       a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
> > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
> > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
> > > +                       adreno_gpu->has_ray_tracing = true;
> > > +               }
> > > +
> > > +               return 0;
> > > +       }
> > > +
> > > +       if (adreno_is_a750(adreno_gpu))
> >
> > Most of the function is under the if (adreno_is_a750) conditions. Can
> > we invert the logic and add a single block of if(adreno_is_a750) and
> > then place all the code underneath?
>
> You mean to duplicate the qcom_scm_is_available check and qcom_scm_
>
> >
> > > +               gpu_req |= QCOM_SCM_GPU_TSENSE_EN_REQ;
> > > +
> > > +       ret = qcom_scm_gpu_init_regs(gpu_req);
> > > +       if (ret)
> > > +               return ret;
> > > +
> > > +       /* On a750 raytracing may be disabled by the firmware, find out whether
> > > +        * that's the case. The scm call above sets the fuse register.
> > > +        */
> > > +       if (adreno_is_a750(adreno_gpu)) {
> > > +               fuse_val = a6xx_llc_read(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE);
> >
> > This register isn't accessible with the current sm8650.dtsi. Since DT
> > and driver are going through different trees, please add safety guards
> > here, so that the driver doesn't crash if used with older dtsi
>
> I don't see how this is an issue. msm-next is currently based on 6.9,
> which doesn't have the GPU defined in sm8650.dtsi. AFAIK patches 1 and
> 2 will have to go through the linux-arm-msm tree, which will have to
> be merged into msm-next before this patch lands there, so there will
> never be any breakage.

linux-arm-msm isn't going to be merged into msm-next. If we do not ask
for an ack so that the fix can go through msm-next, the two trees will
pick up these patches in parallel.

Another option is to get the dtsi fix into 6.9 and delay the raytracing
until 6.10-rc (which doesn't make a lot of sense from my POV).

>
> > (not to mention that dts is considered to be an ABI and newer kernels
> > are supposed not to break with older DT files).
>
> That policy only applies to released kernels, so that's irrelevant here.

It applies to all kernels, the reason being pretty simple: git-bisect
should not be broken.

-- 
With best wishes
Dmitry

* Re: [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-26 12:54       ` Connor Abbott
@ 2024-04-26 13:37         ` Dmitry Baryshkov
  0 siblings, 0 replies; 23+ messages in thread
From: Dmitry Baryshkov @ 2024-04-26 13:37 UTC (permalink / raw)
  To: Connor Abbott
  Cc: Rob Clark, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Fri, 26 Apr 2024 at 15:54, Connor Abbott <cwabbott0@gmail.com> wrote:
>
> On Fri, Apr 26, 2024 at 1:35 PM Connor Abbott <cwabbott0@gmail.com> wrote:
> >
> > On Fri, Apr 26, 2024 at 12:02 AM Dmitry Baryshkov
> > <dmitry.baryshkov@linaro.org> wrote:
> > >
> > > On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > >
> > > > On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> > > > initialize cx_mem. Copy this from downstream (minus BCL which we
> > > > currently don't support). On a750, this includes a new "fuse" register
> > > > which can be used by qcom_scm to fuse off certain features like
> > > > raytracing in software. The fuse is default off, and is initialized by
> > > > calling the method. Afterwards we have to read it to find out which
> > > > features were enabled.
> > > >
> > > > Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> > > > ---
> > > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
> > > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
> > > >  2 files changed, 90 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > index cf0b1de1c071..fb2722574ae5 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > @@ -10,6 +10,7 @@
> > > >
> > > >  #include <linux/bitfield.h>
> > > >  #include <linux/devfreq.h>
> > > > +#include <linux/firmware/qcom/qcom_scm.h>
> > > >  #include <linux/pm_domain.h>
> > > >  #include <linux/soc/qcom/llcc-qcom.h>
> > > >
> > > > @@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
> > > >                        A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
> > > >                        A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
> > > >                        A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
> > > > -                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
> > > > +                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
> > > > +                      A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > >
> > > >  #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
> > > >                          A6XX_CP_APRIV_CNTL_RBFETCH | \
> > > > @@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
> > > >         kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > >  }
> > > >
> > > > +static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
> > > > +{
> > > > +       u32 status;
> > > > +
> > > > +       status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
> > > > +       gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
> > > > +
> > > > +       dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
> > > > +
> > > > +       /* Ignore FASTBLEND violations, because the HW will silently fall back
> > > > +        * to legacy blending.
> > > > +        */
> > > > +       if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > +                     A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
> > > > +               del_timer(&gpu->hangcheck_timer);
> > > > +
> > > > +               kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > +       }
> > > > +}
> > > > +
> > > >  static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > >  {
> > > >         struct msm_drm_private *priv = gpu->dev->dev_private;
> > > > @@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > >         if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
> > > >                 dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
> > > >
> > > > +       if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > +               a7xx_sw_fuse_violation_irq(gpu);
> > > > +
> > > >         if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
> > > >                 msm_gpu_retire(gpu);
> > > >
> > > > @@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
> > > >                 a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
> > > >  }
> > > >
> > > > +static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
> > > > +{
> > > > +       struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > > > +       struct msm_gpu *gpu = &adreno_gpu->base;
> > > > +       u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
> > > > +       u32 fuse_val;
> > > > +       int ret;
> > > > +
> > > > +       if (adreno_is_a740(adreno_gpu)) {
> > > > +               /* Raytracing is always enabled on a740 */
> > > > +               adreno_gpu->has_ray_tracing = true;
> > > > +       }
> > > > +
> > > > +       if (!qcom_scm_is_available()) {
> > > > +               /* Assume that if qcom scm isn't available, that whatever
> > > > +                * replacement allows writing the fuse register ourselves.
> > > > +                * Users of alternative firmware need to make sure this
> > > > +                * register is writeable or indicate that it's not somehow.
> > > > +                * Print a warning because if you mess this up you're about to
> > > > +                * crash horribly.
> > > > +                */
> > > > +               if (adreno_is_a750(adreno_gpu)) {
> > > > +                       dev_warn_once(gpu->dev->dev,
> > > > +                               "SCM is not available, poking fuse register\n");
> > > > +                       a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
> > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
> > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
> > > > +                       adreno_gpu->has_ray_tracing = true;
> > > > +               }
> > > > +
> > > > +               return 0;
> > > > +       }
> > > > +
> > > > +       if (adreno_is_a750(adreno_gpu))
> > >
> > > Most of the function is under the if (adreno_is_a750) conditions. Can
> > > we invert the logic and add a single block of if(adreno_is_a750) and
> > > then place all the code underneath?
> >
> > You mean to duplicate the qcom_scm_is_available check and qcom_scm_
> >
>
> Sorry, didn't finish this thought. I meant to ask if you wanted to
> duplicate qcom_scm_is_available check and qcom_scm_gpu_init_regs
> between a750+ and everything else.

I don't see the !qcom_scm_is_available() check being useful anywhere
else, at least for now.

So it becomes:

       if (adreno_is_a740(adreno_gpu)) {
               /* Raytracing is always enabled on a740 */
               adreno_gpu->has_ray_tracing = true;
               // FIXME: Do we need this at all on a740?
               qcom_scm_gpu_init_regs(gpu_req);
       } else if (adreno_is_a750(adreno_gpu)) {
               if (!qcom_scm_is_available()) {
                       dev_warn_once();
                       adreno_gpu->has_ray_tracing = true;
                       return 0;
               }
               gpu_req |= ...;
               qcom_scm_gpu_init_regs(gpu_req);
               fuse_val ....;
       }
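
The control flow outlined above can be modelled outside the kernel as a
rough sketch. Everything in it is a hypothetical stand-in: struct
fake_gpu, fake_scm_gpu_init_regs(), the REQ_* and FUSE_* bit values and
the enum are made up for illustration and are not the driver's
definitions. It also makes two assumptions that the outline leaves
open: the a740 branch skips the SCM call when SCM is unavailable, and
the a750 branch without SCM sets all three fuse bits itself, as the
original patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names exist in the driver. */
enum gpu_model { GPU_OTHER, GPU_A740, GPU_A750 };

#define REQ_ALWAYS_EN   (1u << 0)  /* models QCOM_SCM_GPU_ALWAYS_EN_REQ */
#define REQ_TSENSE_EN   (1u << 1)  /* models QCOM_SCM_GPU_TSENSE_EN_REQ */
#define FUSE_FASTBLEND  (1u << 0)  /* bit positions are illustrative only */
#define FUSE_LPAC       (1u << 1)
#define FUSE_RAYTRACING (1u << 2)

struct fake_gpu {
	enum gpu_model model;
	bool scm_available;     /* models qcom_scm_is_available() */
	uint32_t fw_fuse_value; /* what the firmware would program on a750 */
	uint32_t fuse_reg;      /* models the fuse register, 0 at boot */
	uint32_t last_req;      /* last request passed to the SCM call */
	bool has_ray_tracing;
};

/* Models qcom_scm_gpu_init_regs(): on a750 the firmware sets the fuse. */
static int fake_scm_gpu_init_regs(struct fake_gpu *gpu, uint32_t req)
{
	gpu->last_req = req;
	if (gpu->model == GPU_A750)
		gpu->fuse_reg = gpu->fw_fuse_value;
	return 0;
}

static int cx_mem_init(struct fake_gpu *gpu)
{
	uint32_t req = REQ_ALWAYS_EN;
	int ret;

	assert(gpu);

	if (gpu->model == GPU_A740) {
		/* Raytracing is always enabled on a740 */
		gpu->has_ray_tracing = true;
		if (!gpu->scm_available)
			return 0;
		return fake_scm_gpu_init_regs(gpu, req);
	}

	/* Following the outline, other GPUs need no fuse handling. */
	if (gpu->model != GPU_A750)
		return 0;

	if (!gpu->scm_available) {
		/* No firmware to ask: set every fuse bit ourselves. */
		gpu->fuse_reg = FUSE_RAYTRACING | FUSE_FASTBLEND | FUSE_LPAC;
		gpu->has_ray_tracing = true;
		return 0;
	}

	req |= REQ_TSENSE_EN;
	ret = fake_scm_gpu_init_regs(gpu, req);
	if (ret)
		return ret;

	/* The firmware may have left raytracing fused off. */
	gpu->has_ray_tracing = !!(gpu->fuse_reg & FUSE_RAYTRACING);
	return 0;
}
```

With this shape each GPU generation has a single branch, at the cost of
calling the SCM helper from two places.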

-- 
With best wishes
Dmitry

* Re: [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-26 13:31       ` Dmitry Baryshkov
@ 2024-04-26 14:05         ` Connor Abbott
  2024-04-26 14:53           ` Dmitry Baryshkov
  0 siblings, 1 reply; 23+ messages in thread
From: Connor Abbott @ 2024-04-26 14:05 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Rob Clark, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Fri, Apr 26, 2024 at 2:31 PM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
>
> On Fri, 26 Apr 2024 at 15:35, Connor Abbott <cwabbott0@gmail.com> wrote:
> >
> > On Fri, Apr 26, 2024 at 12:02 AM Dmitry Baryshkov
> > <dmitry.baryshkov@linaro.org> wrote:
> > >
> > > On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > >
> > > > On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> > > > initialize cx_mem. Copy this from downstream (minus BCL which we
> > > > currently don't support). On a750, this includes a new "fuse" register
> > > > which can be used by qcom_scm to fuse off certain features like
> > > > raytracing in software. The fuse is default off, and is initialized by
> > > > calling the method. Afterwards we have to read it to find out which
> > > > features were enabled.
> > > >
> > > > Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> > > > ---
> > > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
> > > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
> > > >  2 files changed, 90 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > index cf0b1de1c071..fb2722574ae5 100644
> > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > @@ -10,6 +10,7 @@
> > > >
> > > >  #include <linux/bitfield.h>
> > > >  #include <linux/devfreq.h>
> > > > +#include <linux/firmware/qcom/qcom_scm.h>
> > > >  #include <linux/pm_domain.h>
> > > >  #include <linux/soc/qcom/llcc-qcom.h>
> > > >
> > > > @@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
> > > >                        A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
> > > >                        A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
> > > >                        A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
> > > > -                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
> > > > +                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
> > > > +                      A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > >
> > > >  #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
> > > >                          A6XX_CP_APRIV_CNTL_RBFETCH | \
> > > > @@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
> > > >         kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > >  }
> > > >
> > > > +static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
> > > > +{
> > > > +       u32 status;
> > > > +
> > > > +       status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
> > > > +       gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
> > > > +
> > > > +       dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
> > > > +
> > > > +       /* Ignore FASTBLEND violations, because the HW will silently fall back
> > > > +        * to legacy blending.
> > > > +        */
> > > > +       if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > +                     A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
> > > > +               del_timer(&gpu->hangcheck_timer);
> > > > +
> > > > +               kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > +       }
> > > > +}
> > > > +
> > > >  static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > >  {
> > > >         struct msm_drm_private *priv = gpu->dev->dev_private;
> > > > @@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > >         if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
> > > >                 dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
> > > >
> > > > +       if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > +               a7xx_sw_fuse_violation_irq(gpu);
> > > > +
> > > >         if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
> > > >                 msm_gpu_retire(gpu);
> > > >
> > > > @@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
> > > >                 a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
> > > >  }
> > > >
> > > > +static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
> > > > +{
> > > > +       struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > > > +       struct msm_gpu *gpu = &adreno_gpu->base;
> > > > +       u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
> > > > +       u32 fuse_val;
> > > > +       int ret;
> > > > +
> > > > +       if (adreno_is_a740(adreno_gpu)) {
> > > > +               /* Raytracing is always enabled on a740 */
> > > > +               adreno_gpu->has_ray_tracing = true;
> > > > +       }
> > > > +
> > > > +       if (!qcom_scm_is_available()) {
> > > > +               /* Assume that if qcom scm isn't available, that whatever
> > > > +                * replacement allows writing the fuse register ourselves.
> > > > +                * Users of alternative firmware need to make sure this
> > > > +                * register is writeable or indicate that it's not somehow.
> > > > +                * Print a warning because if you mess this up you're about to
> > > > +                * crash horribly.
> > > > +                */
> > > > +               if (adreno_is_a750(adreno_gpu)) {
> > > > +                       dev_warn_once(gpu->dev->dev,
> > > > +                               "SCM is not available, poking fuse register\n");
> > > > +                       a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
> > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
> > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
> > > > +                       adreno_gpu->has_ray_tracing = true;
> > > > +               }
> > > > +
> > > > +               return 0;
> > > > +       }
> > > > +
> > > > +       if (adreno_is_a750(adreno_gpu))
> > >
> > > Most of the function is under the if (adreno_is_a750) conditions. Can
> > > we invert the logic and add a single block of if(adreno_is_a750) and
> > > then place all the code underneath?
> >
> > You mean to duplicate the qcom_scm_is_available check and qcom_scm_
> >
> > >
> > > > +               gpu_req |= QCOM_SCM_GPU_TSENSE_EN_REQ;
> > > > +
> > > > +       ret = qcom_scm_gpu_init_regs(gpu_req);
> > > > +       if (ret)
> > > > +               return ret;
> > > > +
> > > > +       /* On a750 raytracing may be disabled by the firmware, find out whether
> > > > +        * that's the case. The scm call above sets the fuse register.
> > > > +        */
> > > > +       if (adreno_is_a750(adreno_gpu)) {
> > > > +               fuse_val = a6xx_llc_read(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE);
> > >
> > > This register isn't accessible with the current sm8650.dtsi. Since DT
> > > and driver are going through different trees, please add safety guards
> > > here, so that the driver doesn't crash if used with older dtsi
> >
> > I don't see how this is an issue. msm-next is currently based on 6.9,
> > which doesn't have the GPU defined in sm8650.dtsi. AFAIK patches 1 and
> > 2 will have to go through the linux-arm-msm tree, which will have to
> > be merged into msm-next before this patch lands there, so there will
> > never be any breakage.
>
> linux-arm-msm isn't going to be merged into msm-next. If we do not ask
> for ack for the fix to go through msm-next, they will get these
> patches in parallel.

I'm not familiar with how complicated cross-tree changes like this get
merged, but why would we merge these in parallel given that this patch
depends on the previous patch that introduces
qcom_scm_gpu_init_regs(), and that would (I assume?) normally go
through the same tree as patch 1? Even if patch 1 gets merged in
parallel in linux-arm-msm, in what scenario would we have a broken
boot? You won't have a devicetree with a working sm8650 GPU and
drm/msm with raytracing until linux-arm-msm is merged into msm-next at
which point patch 1 will have landed somehow.

>
> Another option is to get dtsi fix into 6.9 and delay the raytracing
> until 6.10-rc which doesn't make a lot of sense from my POV).
>
> >
> > > (not to mention that dts is considered to be an ABI and newer kernels
> > > are supposed not to break with older DT files).
> >
> > That policy only applies to released kernels, so that's irrelevant here.
>
> It applies to all kernels, the reason being pretty simple: git-bisect
> should not be broken.

As I wrote above, this is not an issue. The point I was making is that
mixing and matching dtbs from one unmerged subsystem tree and a
kernel from another isn't supported AFAIK, and that's the only
scenario where this could break.

Connor

* Re: [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-26 14:05         ` Connor Abbott
@ 2024-04-26 14:53           ` Dmitry Baryshkov
  2024-04-26 15:08             ` Connor Abbott
  0 siblings, 1 reply; 23+ messages in thread
From: Dmitry Baryshkov @ 2024-04-26 14:53 UTC (permalink / raw)
  To: Connor Abbott
  Cc: Rob Clark, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Fri, 26 Apr 2024 at 17:05, Connor Abbott <cwabbott0@gmail.com> wrote:
>
> On Fri, Apr 26, 2024 at 2:31 PM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
> >
> > On Fri, 26 Apr 2024 at 15:35, Connor Abbott <cwabbott0@gmail.com> wrote:
> > >
> > > On Fri, Apr 26, 2024 at 12:02 AM Dmitry Baryshkov
> > > <dmitry.baryshkov@linaro.org> wrote:
> > > >
> > > > On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > > >
> > > > > On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> > > > > initialize cx_mem. Copy this from downstream (minus BCL which we
> > > > > currently don't support). On a750, this includes a new "fuse" register
> > > > > which can be used by qcom_scm to fuse off certain features like
> > > > > raytracing in software. The fuse is default off, and is initialized by
> > > > > calling the method. Afterwards we have to read it to find out which
> > > > > features were enabled.
> > > > >
> > > > > Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> > > > > ---
> > > > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
> > > > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
> > > > >  2 files changed, 90 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > index cf0b1de1c071..fb2722574ae5 100644
> > > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > @@ -10,6 +10,7 @@
> > > > >
> > > > >  #include <linux/bitfield.h>
> > > > >  #include <linux/devfreq.h>
> > > > > +#include <linux/firmware/qcom/qcom_scm.h>
> > > > >  #include <linux/pm_domain.h>
> > > > >  #include <linux/soc/qcom/llcc-qcom.h>
> > > > >
> > > > > @@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
> > > > >                        A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
> > > > >                        A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
> > > > >                        A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
> > > > > -                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
> > > > > +                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
> > > > > +                      A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > >
> > > > >  #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
> > > > >                          A6XX_CP_APRIV_CNTL_RBFETCH | \
> > > > > @@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
> > > > >         kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > >  }
> > > > >
> > > > > +static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
> > > > > +{
> > > > > +       u32 status;
> > > > > +
> > > > > +       status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
> > > > > +       gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
> > > > > +
> > > > > +       dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
> > > > > +
> > > > > +       /* Ignore FASTBLEND violations, because the HW will silently fall back
> > > > > +        * to legacy blending.
> > > > > +        */
> > > > > +       if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > > +                     A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
> > > > > +               del_timer(&gpu->hangcheck_timer);
> > > > > +
> > > > > +               kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > > +       }
> > > > > +}
> > > > > +
> > > > >  static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > > >  {
> > > > >         struct msm_drm_private *priv = gpu->dev->dev_private;
> > > > > @@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > > >         if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
> > > > >                 dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
> > > > >
> > > > > +       if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > > +               a7xx_sw_fuse_violation_irq(gpu);
> > > > > +
> > > > >         if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
> > > > >                 msm_gpu_retire(gpu);
> > > > >
> > > > > @@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
> > > > >                 a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
> > > > >  }
> > > > >
> > > > > +static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
> > > > > +{
> > > > > +       struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > > > > +       struct msm_gpu *gpu = &adreno_gpu->base;
> > > > > +       u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
> > > > > +       u32 fuse_val;
> > > > > +       int ret;
> > > > > +
> > > > > +       if (adreno_is_a740(adreno_gpu)) {
> > > > > +               /* Raytracing is always enabled on a740 */
> > > > > +               adreno_gpu->has_ray_tracing = true;
> > > > > +       }
> > > > > +
> > > > > +       if (!qcom_scm_is_available()) {
> > > > > +               /* Assume that if qcom scm isn't available, that whatever
> > > > > +                * replacement allows writing the fuse register ourselves.
> > > > > +                * Users of alternative firmware need to make sure this
> > > > > +                * register is writeable or indicate that it's not somehow.
> > > > > +                * Print a warning because if you mess this up you're about to
> > > > > +                * crash horribly.
> > > > > +                */
> > > > > +               if (adreno_is_a750(adreno_gpu)) {
> > > > > +                       dev_warn_once(gpu->dev->dev,
> > > > > +                               "SCM is not available, poking fuse register\n");
> > > > > +                       a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
> > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
> > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
> > > > > +                       adreno_gpu->has_ray_tracing = true;
> > > > > +               }
> > > > > +
> > > > > +               return 0;
> > > > > +       }
> > > > > +
> > > > > +       if (adreno_is_a750(adreno_gpu))
> > > >
> > > > Most of the function is under the if (adreno_is_a750) conditions. Can
> > > > we invert the logic and add a single block of if(adreno_is_a750) and
> > > > then place all the code underneath?
> > >
> > > You mean to duplicate the qcom_scm_is_available check and qcom_scm_
> > >
> > > >
> > > > > +               gpu_req |= QCOM_SCM_GPU_TSENSE_EN_REQ;
> > > > > +
> > > > > +       ret = qcom_scm_gpu_init_regs(gpu_req);
> > > > > +       if (ret)
> > > > > +               return ret;
> > > > > +
> > > > > +       /* On a750 raytracing may be disabled by the firmware, find out whether
> > > > > +        * that's the case. The scm call above sets the fuse register.
> > > > > +        */
> > > > > +       if (adreno_is_a750(adreno_gpu)) {
> > > > > +               fuse_val = a6xx_llc_read(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE);
> > > >
> > > > This register isn't accessible with the current sm8650.dtsi. Since DT
> > > > and driver are going through different trees, please add safety guards
> > > > here, so that the driver doesn't crash if used with older dtsi
> > >
> > > I don't see how this is an issue. msm-next is currently based on 6.9,
> > > which doesn't have the GPU defined in sm8650.dtsi. AFAIK patches 1 and
> > > 2 will have to go through the linux-arm-msm tree, which will have to
> > > be merged into msm-next before this patch lands there, so there will
> > > never be any breakage.
> >
> > linux-arm-msm isn't going to be merged into msm-next. If we do not ask
> > for ack for the fix to go through msm-next, they will get these
> > patches in parallel.
>
> I'm not familiar with how complicated cross-tree changes like this get
> merged, but why would we merge these in parallel given that this patch
> depends on the previous patch that introduces
> qcom_scm_gpu_init_regs(), and that would (I assume?) normally go
> through the same tree as patch 1? Even if patch 1 gets merged in
> parallel in linux-arm-msm, in what scenario would we have a broken
> boot? You won't have a devicetree with a working sm8650 GPU and
> drm/msm with raytracing until linux-arm-msm is merged into msm-next at
> which point patch 1 will have landed somehow.

arch/arm64/boot/dts/qcom and drivers/firmware/qcom are two separate trees.
So yes, this needs a lot of coordination.

>
> >
> > Another option is to get dtsi fix into 6.9 and delay the raytracing
> > until 6.10-rc which doesn't make a lot of sense from my POV).
> >
> > >
> > > > (not to mention that dts is considered to be an ABI and newer kernels
> > > > are supposed not to break with older DT files).
> > >
> > > That policy only applies to released kernels, so that's irrelevant here.
> >
> > It applies to all kernels, the reason being pretty simple: git-bisect
> > should not be broken.
>
> As I wrote above, this is not an issue. The point I was making is that
> mixing and matching dtb's from one unmerged subsystem tree and a
> kernel from another isn't supported AFAIK, and that's the only
> scenario where this could break.

And it can happen if somebody running a bisect lands on a commit in the
branch that has these patches but not the dtsi bits.
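
A guard against that bisect scenario could be as small as a bounds
check before the fuse read. The following is only a userspace sketch:
struct reg_window and both helpers are hypothetical names, the window
is plain memory rather than an ioremapped region, and it assumes
register offsets are counted in 32-bit words. A real driver would use
its own MMIO accessors and decide what to do on failure, for example
treating raytracing as unavailable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a mapped register window. */
struct reg_window {
	const uint32_t *base; /* start of the mapping, NULL if absent */
	size_t size;          /* length in bytes, as given by the DT "reg" entry */
};

/* True if the 32-bit register at word offset 'reg' lies inside the window. */
static bool reg_in_window(const struct reg_window *w, uint32_t reg)
{
	assert(w);
	if (!w->base || w->size < sizeof(uint32_t))
		return false;
	return (size_t)reg <= (w->size - sizeof(uint32_t)) / sizeof(uint32_t);
}

/*
 * Read the fuse register only when the window covers it. Returns false,
 * leaving *val untouched, when the mapping is too small (older DT).
 */
static bool read_fuse_guarded(const struct reg_window *w, uint32_t reg,
			      uint32_t *val)
{
	if (!reg_in_window(w, reg))
		return false;
	*val = w->base[reg];
	return true;
}
```

A caller would check the return value and skip the fuse read when the
older, smaller cx_mem mapping is in use, so the kernel still boots with
the old dtsi.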


-- 
With best wishes
Dmitry

* Re: [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-26 14:53           ` Dmitry Baryshkov
@ 2024-04-26 15:08             ` Connor Abbott
  2024-04-26 15:24               ` Dmitry Baryshkov
  0 siblings, 1 reply; 23+ messages in thread
From: Connor Abbott @ 2024-04-26 15:08 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Rob Clark, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Fri, Apr 26, 2024 at 3:53 PM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
>
> On Fri, 26 Apr 2024 at 17:05, Connor Abbott <cwabbott0@gmail.com> wrote:
> >
> > On Fri, Apr 26, 2024 at 2:31 PM Dmitry Baryshkov
> > <dmitry.baryshkov@linaro.org> wrote:
> > >
> > > On Fri, 26 Apr 2024 at 15:35, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > >
> > > > On Fri, Apr 26, 2024 at 12:02 AM Dmitry Baryshkov
> > > > <dmitry.baryshkov@linaro.org> wrote:
> > > > >
> > > > > On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > > > >
> > > > > > On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> > > > > > initialize cx_mem. Copy this from downstream (minus BCL which we
> > > > > > currently don't support). On a750, this includes a new "fuse" register
> > > > > > which can be used by qcom_scm to fuse off certain features like
> > > > > > raytracing in software. The fuse is default off, and is initialized by
> > > > > > calling the method. Afterwards we have to read it to find out which
> > > > > > features were enabled.
> > > > > >
> > > > > > Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> > > > > > ---
> > > > > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
> > > > > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
> > > > > >  2 files changed, 90 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > index cf0b1de1c071..fb2722574ae5 100644
> > > > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > @@ -10,6 +10,7 @@
> > > > > >
> > > > > >  #include <linux/bitfield.h>
> > > > > >  #include <linux/devfreq.h>
> > > > > > +#include <linux/firmware/qcom/qcom_scm.h>
> > > > > >  #include <linux/pm_domain.h>
> > > > > >  #include <linux/soc/qcom/llcc-qcom.h>
> > > > > >
> > > > > > @@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
> > > > > >                        A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
> > > > > >                        A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
> > > > > >                        A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
> > > > > > -                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
> > > > > > +                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
> > > > > > +                      A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > > >
> > > > > >  #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
> > > > > >                          A6XX_CP_APRIV_CNTL_RBFETCH | \
> > > > > > @@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
> > > > > >         kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > > >  }
> > > > > >
> > > > > > +static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
> > > > > > +{
> > > > > > +       u32 status;
> > > > > > +
> > > > > > +       status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
> > > > > > +       gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
> > > > > > +
> > > > > > +       dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
> > > > > > +
> > > > > > +       /* Ignore FASTBLEND violations, because the HW will silently fall back
> > > > > > +        * to legacy blending.
> > > > > > +        */
> > > > > > +       if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > > > +                     A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
> > > > > > +               del_timer(&gpu->hangcheck_timer);
> > > > > > +
> > > > > > +               kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > > > +       }
> > > > > > +}
> > > > > > +
> > > > > >  static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > > > >  {
> > > > > >         struct msm_drm_private *priv = gpu->dev->dev_private;
> > > > > > @@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > > > >         if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
> > > > > >                 dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
> > > > > >
> > > > > > +       if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > > > +               a7xx_sw_fuse_violation_irq(gpu);
> > > > > > +
> > > > > >         if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
> > > > > >                 msm_gpu_retire(gpu);
> > > > > >
> > > > > > @@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
> > > > > >                 a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
> > > > > >  }
> > > > > >
> > > > > > +static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
> > > > > > +{
> > > > > > +       struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > > > > > +       struct msm_gpu *gpu = &adreno_gpu->base;
> > > > > > +       u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
> > > > > > +       u32 fuse_val;
> > > > > > +       int ret;
> > > > > > +
> > > > > > +       if (adreno_is_a740(adreno_gpu)) {
> > > > > > +               /* Raytracing is always enabled on a740 */
> > > > > > +               adreno_gpu->has_ray_tracing = true;
> > > > > > +       }
> > > > > > +
> > > > > > +       if (!qcom_scm_is_available()) {
> > > > > > +               /* Assume that if qcom scm isn't available, that whatever
> > > > > > +                * replacement allows writing the fuse register ourselves.
> > > > > > +                * Users of alternative firmware need to make sure this
> > > > > > +                * register is writeable or indicate that it's not somehow.
> > > > > > +                * Print a warning because if you mess this up you're about to
> > > > > > +                * crash horribly.
> > > > > > +                */
> > > > > > +               if (adreno_is_a750(adreno_gpu)) {
> > > > > > +                       dev_warn_once(gpu->dev->dev,
> > > > > > +                               "SCM is not available, poking fuse register\n");
> > > > > > +                       a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
> > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
> > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
> > > > > > +                       adreno_gpu->has_ray_tracing = true;
> > > > > > +               }
> > > > > > +
> > > > > > +               return 0;
> > > > > > +       }
> > > > > > +
> > > > > > +       if (adreno_is_a750(adreno_gpu))
> > > > >
> > > > > Most of the function is under the if (adreno_is_a750) conditions. Can
> > > > > we invert the logic and add a single block of if(adreno_is_a750) and
> > > > > then place all the code underneath?
> > > >
> > > > You mean to duplicate the qcom_scm_is_available check and qcom_scm_
> > > >
> > > > >
> > > > > > +               gpu_req |= QCOM_SCM_GPU_TSENSE_EN_REQ;
> > > > > > +
> > > > > > +       ret = qcom_scm_gpu_init_regs(gpu_req);
> > > > > > +       if (ret)
> > > > > > +               return ret;
> > > > > > +
> > > > > > +       /* On a750 raytracing may be disabled by the firmware, find out whether
> > > > > > +        * that's the case. The scm call above sets the fuse register.
> > > > > > +        */
> > > > > > +       if (adreno_is_a750(adreno_gpu)) {
> > > > > > +               fuse_val = a6xx_llc_read(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE);
> > > > >
> > > > > This register isn't accessible with the current sm8650.dtsi. Since DT
> > > > > and driver are going through different trees, please add safety guards
> > > > > here, so that the driver doesn't crash if used with older dtsi
> > > >
> > > > I don't see how this is an issue. msm-next is currently based on 6.9,
> > > > which doesn't have the GPU defined in sm8650.dtsi. AFAIK patches 1 and
> > > > 2 will have to go through the linux-arm-msm tree, which will have to
> > > > be merged into msm-next before this patch lands there, so there will
> > > > never be any breakage.
> > >
> > > linux-arm-msm isn't going to be merged into msm-next. If we do not ask
> > > for ack for the fix to go through msm-next, they will get these
> > > patches in parallel.
> >
> > I'm not familiar with how complicated cross-tree changes like this get
> > merged, but why would we merge these in parallel given that this patch
> > depends on the previous patch that introduces
> > qcom_scm_gpu_init_regs(), and that would (I assume?) normally go
> > through the same tree as patch 1? Even if patch 1 gets merged in
> > parallel in linux-arm-msm, in what scenario would we have a broken
> > boot? You won't have a devicetree with a working sm8650 GPU and
> > drm/msm with raytracing until linux-arm-msm is merged into msm-next at
> > which point patch 1 will have landed somehow.
>
> arch/arm64/boot/dts/qcom and drivers/firmware/qcom are two separate trees.
> So yes, this needs a lot of coordination.



>
> >
> > >
> > > Another option is to get the dtsi fix into 6.9 and delay the raytracing
> > > until 6.10-rc, which doesn't make a lot of sense from my POV.
> > >
> > > >
> > > > > (not to mention that dts is considered to be an ABI and newer kernels
> > > > > are supposed not to break with older DT files).
> > > >
> > > > That policy only applies to released kernels, so that's irrelevant here.
> > >
> > > It applies to all kernels, the reason being pretty simple: git-bisect
> > > should not be broken.
> >
> > As I wrote above, this is not an issue. The point I was making is that
> > mixing and matching dtb's from one unmerged subsystem tree and a
> > kernel from another isn't supported AFAIK, and that's the only
> > scenario where this could break.
>
> And it can happen if somebody running a bisect ends up in the branch
> with these patches in, but with the dtsi bits not being picked up.

That wouldn't be possible unless we merged the "bad" commit
introducing the GPU node to sm8650.dtsi into msm-next but not the fix.
So yeah, it's going to require a lot of careful cooperation but it
should be possible to avoid that happening.

Connor


* Re: [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-26 15:08             ` Connor Abbott
@ 2024-04-26 15:24               ` Dmitry Baryshkov
  2024-04-26 15:36                 ` Connor Abbott
  2024-04-26 16:02                 ` Rob Clark
  0 siblings, 2 replies; 23+ messages in thread
From: Dmitry Baryshkov @ 2024-04-26 15:24 UTC (permalink / raw)
  To: Connor Abbott
  Cc: Rob Clark, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Fri, 26 Apr 2024 at 18:08, Connor Abbott <cwabbott0@gmail.com> wrote:
>
> On Fri, Apr 26, 2024 at 3:53 PM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
> >
> > On Fri, 26 Apr 2024 at 17:05, Connor Abbott <cwabbott0@gmail.com> wrote:
> > >
> > > On Fri, Apr 26, 2024 at 2:31 PM Dmitry Baryshkov
> > > <dmitry.baryshkov@linaro.org> wrote:
> > > >
> > > > On Fri, 26 Apr 2024 at 15:35, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > > >
> > > > > On Fri, Apr 26, 2024 at 12:02 AM Dmitry Baryshkov
> > > > > <dmitry.baryshkov@linaro.org> wrote:
> > > > > >
> > > > > > On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > > > > >
> > > > > > > On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> > > > > > > initialize cx_mem. Copy this from downstream (minus BCL which we
> > > > > > > currently don't support). On a750, this includes a new "fuse" register
> > > > > > > which can be used by qcom_scm to fuse off certain features like
> > > > > > > raytracing in software. The fuse is default off, and is initialized by
> > > > > > > calling the method. Afterwards we have to read it to find out which
> > > > > > > features were enabled.
> > > > > > >
> > > > > > > Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
> > > > > > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
> > > > > > >  2 files changed, 90 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > index cf0b1de1c071..fb2722574ae5 100644
> > > > > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > @@ -10,6 +10,7 @@
> > > > > > >
> > > > > > >  #include <linux/bitfield.h>
> > > > > > >  #include <linux/devfreq.h>
> > > > > > > +#include <linux/firmware/qcom/qcom_scm.h>
> > > > > > >  #include <linux/pm_domain.h>
> > > > > > >  #include <linux/soc/qcom/llcc-qcom.h>
> > > > > > >
> > > > > > > @@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
> > > > > > >                        A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
> > > > > > >                        A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
> > > > > > >                        A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
> > > > > > > -                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
> > > > > > > +                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
> > > > > > > +                      A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > > > >
> > > > > > >  #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
> > > > > > >                          A6XX_CP_APRIV_CNTL_RBFETCH | \
> > > > > > > @@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
> > > > > > >         kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > > > >  }
> > > > > > >
> > > > > > > +static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
> > > > > > > +{
> > > > > > > +       u32 status;
> > > > > > > +
> > > > > > > +       status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
> > > > > > > +       gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
> > > > > > > +
> > > > > > > +       dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
> > > > > > > +
> > > > > > > +       /* Ignore FASTBLEND violations, because the HW will silently fall back
> > > > > > > +        * to legacy blending.
> > > > > > > +        */
> > > > > > > +       if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > > > > +                     A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
> > > > > > > +               del_timer(&gpu->hangcheck_timer);
> > > > > > > +
> > > > > > > +               kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > > > > +       }
> > > > > > > +}
> > > > > > > +
> > > > > > >  static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > > > > >  {
> > > > > > >         struct msm_drm_private *priv = gpu->dev->dev_private;
> > > > > > > @@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > > > > >         if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
> > > > > > >                 dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
> > > > > > >
> > > > > > > +       if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > > > > +               a7xx_sw_fuse_violation_irq(gpu);
> > > > > > > +
> > > > > > >         if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
> > > > > > >                 msm_gpu_retire(gpu);
> > > > > > >
> > > > > > > @@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
> > > > > > >                 a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
> > > > > > >  }
> > > > > > >
> > > > > > > +static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
> > > > > > > +{
> > > > > > > +       struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > > > > > > +       struct msm_gpu *gpu = &adreno_gpu->base;
> > > > > > > +       u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
> > > > > > > +       u32 fuse_val;
> > > > > > > +       int ret;
> > > > > > > +
> > > > > > > +       if (adreno_is_a740(adreno_gpu)) {
> > > > > > > +               /* Raytracing is always enabled on a740 */
> > > > > > > +               adreno_gpu->has_ray_tracing = true;
> > > > > > > +       }
> > > > > > > +
> > > > > > > +       if (!qcom_scm_is_available()) {
> > > > > > > +               /* Assume that if qcom scm isn't available, that whatever
> > > > > > > +                * replacement allows writing the fuse register ourselves.
> > > > > > > +                * Users of alternative firmware need to make sure this
> > > > > > > +                * register is writeable or indicate that it's not somehow.
> > > > > > > +                * Print a warning because if you mess this up you're about to
> > > > > > > +                * crash horribly.
> > > > > > > +                */
> > > > > > > +               if (adreno_is_a750(adreno_gpu)) {
> > > > > > > +                       dev_warn_once(gpu->dev->dev,
> > > > > > > +                               "SCM is not available, poking fuse register\n");
> > > > > > > +                       a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
> > > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
> > > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
> > > > > > > +                       adreno_gpu->has_ray_tracing = true;
> > > > > > > +               }
> > > > > > > +
> > > > > > > +               return 0;
> > > > > > > +       }
> > > > > > > +
> > > > > > > +       if (adreno_is_a750(adreno_gpu))
> > > > > >
> > > > > > Most of the function is under the if (adreno_is_a750) conditions. Can
> > > > > > we invert the logic and add a single block of if(adreno_is_a750) and
> > > > > > then place all the code underneath?
> > > > >
> > > > > You mean to duplicate the qcom_scm_is_available check and qcom_scm_
> > > > >
> > > > > >
> > > > > > > +               gpu_req |= QCOM_SCM_GPU_TSENSE_EN_REQ;
> > > > > > > +
> > > > > > > +       ret = qcom_scm_gpu_init_regs(gpu_req);
> > > > > > > +       if (ret)
> > > > > > > +               return ret;
> > > > > > > +
> > > > > > > +       /* On a750 raytracing may be disabled by the firmware, find out whether
> > > > > > > +        * that's the case. The scm call above sets the fuse register.
> > > > > > > +        */
> > > > > > > +       if (adreno_is_a750(adreno_gpu)) {
> > > > > > > +               fuse_val = a6xx_llc_read(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE);
> > > > > >
> > > > > > This register isn't accessible with the current sm8650.dtsi. Since DT
> > > > > > and driver are going through different trees, please add safety guards
> > > > > > here, so that the driver doesn't crash if used with older dtsi
> > > > >
> > > > > I don't see how this is an issue. msm-next is currently based on 6.9,
> > > > > which doesn't have the GPU defined in sm8650.dtsi. AFAIK patches 1 and
> > > > > 2 will have to go through the linux-arm-msm tree, which will have to
> > > > > be merged into msm-next before this patch lands there, so there will
> > > > > never be any breakage.
> > > >
> > > > linux-arm-msm isn't going to be merged into msm-next. If we do not ask
> > > > for ack for the fix to go through msm-next, they will get these
> > > > patches in parallel.
> > >
> > > I'm not familiar with how complicated cross-tree changes like this get
> > > merged, but why would we merge these in parallel given that this patch
> > > depends on the previous patch that introduces
> > > qcom_scm_gpu_init_regs(), and that would (I assume?) normally go
> > > through the same tree as patch 1? Even if patch 1 gets merged in
> > > parallel in linux-arm-msm, in what scenario would we have a broken
> > > boot? You won't have a devicetree with a working sm8650 GPU and
> > > drm/msm with raytracing until linux-arm-msm is merged into msm-next at
> > > which point patch 1 will have landed somehow.
> >
> > arch/arm64/boot/dts/qcom and drivers/firmware/qcom are two separate trees.
> > So yes, this needs a lot of coordination.
>
>
>
> >
> > >
> > > >
> > > > Another option is to get the dtsi fix into 6.9 and delay the raytracing
> > > > until 6.10-rc, which doesn't make a lot of sense from my POV.
> > > >
> > > > >
> > > > > > (not to mention that dts is considered to be an ABI and newer kernels
> > > > > > are supposed not to break with older DT files).
> > > > >
> > > > > That policy only applies to released kernels, so that's irrelevant here.
> > > >
> > > > It applies to all kernels, the reason being pretty simple: git-bisect
> > > > should not be broken.
> > >
> > > As I wrote above, this is not an issue. The point I was making is that
> > > mixing and matching dtb's from one unmerged subsystem tree and a
> > > kernel from another isn't supported AFAIK, and that's the only
> > > scenario where this could break.
> >
> > And it can happen if somebody running a bisect ends up in the branch
> > with these patches in, but with the dtsi bits not being picked up.
>
> That wouldn't be possible unless we merged the "bad" commit
> introducing the GPU node to sm8650.dtsi into msm-next but not the fix.
> So yeah, it's going to require a lot of careful cooperation but it
> should be possible to avoid that happening.

Well, the GPU node is already in linux-next.

Anyway. Please. Don't break compat with old DTS. That is a rule of thumb.
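
A minimal sketch of the kind of safety guard requested earlier in the
thread, assuming the driver can learn the size of the cx_mem mapping from
the devicetree "reg" entry. The struct, the helper names, the register
offset and the fuse bit below are hypothetical placeholders for
illustration, not the real drm/msm definitions. With an older, too-small
mapping the register is never touched and the feature is reported as off:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an MMIO region described by a DT "reg" entry. */
struct cx_mem_region {
	const volatile uint32_t *base;	/* start of the mapping */
	size_t size;			/* mapping size in bytes */
};

/* Placeholder values, NOT the real register offset or fuse bit. */
#define HYPOTHETICAL_SW_FUSE_OFFSET	0x400u
#define HYPOTHETICAL_FUSE_RAYTRACING	(1u << 0)

/* True only if a 32-bit read at byte offset 'off' stays inside the mapping. */
static bool cx_mem_reg_accessible(const struct cx_mem_region *r, size_t off)
{
	assert(r != NULL);

	return r->base != NULL &&
	       off % sizeof(uint32_t) == 0 &&
	       r->size >= sizeof(uint32_t) &&
	       off <= r->size - sizeof(uint32_t);
}

/*
 * Report raytracing as unavailable when the mapping is too small
 * (e.g. an older devicetree) instead of reading out of bounds.
 */
static bool detect_ray_tracing(const struct cx_mem_region *r)
{
	if (!cx_mem_reg_accessible(r, HYPOTHETICAL_SW_FUSE_OFFSET))
		return false;

	return (r->base[HYPOTHETICAL_SW_FUSE_OFFSET / sizeof(uint32_t)] &
		HYPOTHETICAL_FUSE_RAYTRACING) != 0;
}
```

The design choice here is to degrade to "feature not available" rather
than fail the probe, so a kernel with the new driver code still boots
with an older DT.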


-- 
With best wishes
Dmitry


* Re: [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-26 15:24               ` Dmitry Baryshkov
@ 2024-04-26 15:36                 ` Connor Abbott
  2024-04-26 15:38                   ` Dmitry Baryshkov
  2024-04-26 16:02                 ` Rob Clark
  1 sibling, 1 reply; 23+ messages in thread
From: Connor Abbott @ 2024-04-26 15:36 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Rob Clark, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Fri, Apr 26, 2024 at 4:24 PM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
>
> On Fri, 26 Apr 2024 at 18:08, Connor Abbott <cwabbott0@gmail.com> wrote:
> >
> > On Fri, Apr 26, 2024 at 3:53 PM Dmitry Baryshkov
> > <dmitry.baryshkov@linaro.org> wrote:
> > >
> > > On Fri, 26 Apr 2024 at 17:05, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > >
> > > > On Fri, Apr 26, 2024 at 2:31 PM Dmitry Baryshkov
> > > > <dmitry.baryshkov@linaro.org> wrote:
> > > > >
> > > > > On Fri, 26 Apr 2024 at 15:35, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Apr 26, 2024 at 12:02 AM Dmitry Baryshkov
> > > > > > <dmitry.baryshkov@linaro.org> wrote:
> > > > > > >
> > > > > > > On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> > > > > > > > initialize cx_mem. Copy this from downstream (minus BCL which we
> > > > > > > > currently don't support). On a750, this includes a new "fuse" register
> > > > > > > > which can be used by qcom_scm to fuse off certain features like
> > > > > > > > raytracing in software. The fuse is default off, and is initialized by
> > > > > > > > calling the method. Afterwards we have to read it to find out which
> > > > > > > > features were enabled.
> > > > > > > >
> > > > > > > > Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> > > > > > > > ---
> > > > > > > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
> > > > > > > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
> > > > > > > >  2 files changed, 90 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > > index cf0b1de1c071..fb2722574ae5 100644
> > > > > > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > > @@ -10,6 +10,7 @@
> > > > > > > >
> > > > > > > >  #include <linux/bitfield.h>
> > > > > > > >  #include <linux/devfreq.h>
> > > > > > > > +#include <linux/firmware/qcom/qcom_scm.h>
> > > > > > > >  #include <linux/pm_domain.h>
> > > > > > > >  #include <linux/soc/qcom/llcc-qcom.h>
> > > > > > > >
> > > > > > > > @@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
> > > > > > > >                        A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
> > > > > > > >                        A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
> > > > > > > >                        A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
> > > > > > > > -                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
> > > > > > > > +                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
> > > > > > > > +                      A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > > > > >
> > > > > > > >  #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
> > > > > > > >                          A6XX_CP_APRIV_CNTL_RBFETCH | \
> > > > > > > > @@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
> > > > > > > >         kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > > > > >  }
> > > > > > > >
> > > > > > > > +static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
> > > > > > > > +{
> > > > > > > > +       u32 status;
> > > > > > > > +
> > > > > > > > +       status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
> > > > > > > > +       gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
> > > > > > > > +
> > > > > > > > +       dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
> > > > > > > > +
> > > > > > > > +       /* Ignore FASTBLEND violations, because the HW will silently fall back
> > > > > > > > +        * to legacy blending.
> > > > > > > > +        */
> > > > > > > > +       if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > > > > > +                     A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
> > > > > > > > +               del_timer(&gpu->hangcheck_timer);
> > > > > > > > +
> > > > > > > > +               kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > > > > > +       }
> > > > > > > > +}
> > > > > > > > +
> > > > > > > >  static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > > > > > >  {
> > > > > > > >         struct msm_drm_private *priv = gpu->dev->dev_private;
> > > > > > > > @@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > > > > > >         if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
> > > > > > > >                 dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
> > > > > > > >
> > > > > > > > +       if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > > > > > +               a7xx_sw_fuse_violation_irq(gpu);
> > > > > > > > +
> > > > > > > >         if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
> > > > > > > >                 msm_gpu_retire(gpu);
> > > > > > > >
> > > > > > > > @@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
> > > > > > > >                 a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
> > > > > > > >  }
> > > > > > > >
> > > > > > > > +static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
> > > > > > > > +{
> > > > > > > > +       struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > > > > > > > +       struct msm_gpu *gpu = &adreno_gpu->base;
> > > > > > > > +       u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
> > > > > > > > +       u32 fuse_val;
> > > > > > > > +       int ret;
> > > > > > > > +
> > > > > > > > +       if (adreno_is_a740(adreno_gpu)) {
> > > > > > > > +               /* Raytracing is always enabled on a740 */
> > > > > > > > +               adreno_gpu->has_ray_tracing = true;
> > > > > > > > +       }
> > > > > > > > +
> > > > > > > > +       if (!qcom_scm_is_available()) {
> > > > > > > > +               /* Assume that if qcom scm isn't available, that whatever
> > > > > > > > +                * replacement allows writing the fuse register ourselves.
> > > > > > > > +                * Users of alternative firmware need to make sure this
> > > > > > > > +                * register is writeable or indicate that it's not somehow.
> > > > > > > > +                * Print a warning because if you mess this up you're about to
> > > > > > > > +                * crash horribly.
> > > > > > > > +                */
> > > > > > > > +               if (adreno_is_a750(adreno_gpu)) {
> > > > > > > > +                       dev_warn_once(gpu->dev->dev,
> > > > > > > > +                               "SCM is not available, poking fuse register\n");
> > > > > > > > +                       a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
> > > > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
> > > > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
> > > > > > > > +                       adreno_gpu->has_ray_tracing = true;
> > > > > > > > +               }
> > > > > > > > +
> > > > > > > > +               return 0;
> > > > > > > > +       }
> > > > > > > > +
> > > > > > > > +       if (adreno_is_a750(adreno_gpu))
> > > > > > >
> > > > > > > Most of the function is under the if (adreno_is_a750) conditions. Can
> > > > > > > we invert the logic and add a single block of if(adreno_is_a750) and
> > > > > > > then place all the code underneath?
> > > > > >
> > > > > > You mean to duplicate the qcom_scm_is_available check and qcom_scm_
> > > > > >
> > > > > > >
> > > > > > > > +               gpu_req |= QCOM_SCM_GPU_TSENSE_EN_REQ;
> > > > > > > > +
> > > > > > > > +       ret = qcom_scm_gpu_init_regs(gpu_req);
> > > > > > > > +       if (ret)
> > > > > > > > +               return ret;
> > > > > > > > +
> > > > > > > > +       /* On a750 raytracing may be disabled by the firmware, find out whether
> > > > > > > > +        * that's the case. The scm call above sets the fuse register.
> > > > > > > > +        */
> > > > > > > > +       if (adreno_is_a750(adreno_gpu)) {
> > > > > > > > +               fuse_val = a6xx_llc_read(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE);
> > > > > > >
> > > > > > > This register isn't accessible with the current sm8650.dtsi. Since DT
> > > > > > > and driver are going through different trees, please add safety guards
> > > > > > > here, so that the driver doesn't crash if used with older dtsi
> > > > > >
> > > > > > I don't see how this is an issue. msm-next is currently based on 6.9,
> > > > > > which doesn't have the GPU defined in sm8650.dtsi. AFAIK patches 1 and
> > > > > > 2 will have to go through the linux-arm-msm tree, which will have to
> > > > > > be merged into msm-next before this patch lands there, so there will
> > > > > > never be any breakage.
> > > > >
> > > > > linux-arm-msm isn't going to be merged into msm-next. If we do not ask
> > > > > for ack for the fix to go through msm-next, they will get these
> > > > > patches in parallel.
> > > >
> > > > I'm not familiar with how complicated cross-tree changes like this get
> > > > merged, but why would we merge these in parallel given that this patch
> > > > depends on the previous patch that introduces
> > > > qcom_scm_gpu_init_regs(), and that would (I assume?) normally go
> > > > through the same tree as patch 1? Even if patch 1 gets merged in
> > > > parallel in linux-arm-msm, in what scenario would we have a broken
> > > > boot? You won't have a devicetree with a working sm8650 GPU and
> > > > drm/msm with raytracing until linux-arm-msm is merged into msm-next at
> > > > which point patch 1 will have landed somehow.
> > >
> > > arch/arm64/boot/dts/qcom and drivers/firmware/qcom are two separate trees.
> > > So yes, this needs a lot of coordination.
> >
> >
> >
> > >
> > > >
> > > > >
> > > > > Another option is to get the dtsi fix into 6.9 and delay the raytracing
> > > > > until 6.10-rc, which doesn't make a lot of sense from my POV.
> > > > >
> > > > > >
> > > > > > > (not to mention that dts is considered to be an ABI and newer kernels
> > > > > > > are supposed not to break with older DT files).
> > > > > >
> > > > > > That policy only applies to released kernels, so that's irrelevant here.
> > > > >
> > > > > It applies to all kernels, the reason being pretty simple: git-bisect
> > > > > should not be broken.
> > > >
> > > > As I wrote above, this is not an issue. The point I was making is that
> > > > mixing and matching dtb's from one unmerged subsystem tree and a
> > > > kernel from another isn't supported AFAIK, and that's the only
> > > > scenario where this could break.
> > >
> > > And it can happen if somebody running a bisect ends up in the branch
> > > with these patches in, but with the dtsi bits not being picked up.
> >
> > That wouldn't be possible unless we merged the "bad" commit
> > introducing the GPU node to sm8650.dtsi into msm-next but not the fix.
> > So yeah, it's going to require a lot of careful cooperation but it
> > should be possible to avoid that happening.
>
> Well, the GPU node is already there in linux-next.

And? As long as the devicetree fix lands first, linux-next will never be broken.

> Anyway. Please. Don't break compat with old DTS. That is a rule of thumb.

It's exactly that, a rule of thumb. This is obviously a bit of an
exceptional case, and you haven't articulated any reason why we should
follow it in this case when there's an obvious reason not to.

Connor


* Re: [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-26 15:36                 ` Connor Abbott
@ 2024-04-26 15:38                   ` Dmitry Baryshkov
  0 siblings, 0 replies; 23+ messages in thread
From: Dmitry Baryshkov @ 2024-04-26 15:38 UTC (permalink / raw)
  To: Connor Abbott
  Cc: Rob Clark, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno

On Fri, 26 Apr 2024 at 18:36, Connor Abbott <cwabbott0@gmail.com> wrote:
>
> On Fri, Apr 26, 2024 at 4:24 PM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
> >
> > On Fri, 26 Apr 2024 at 18:08, Connor Abbott <cwabbott0@gmail.com> wrote:
> > >
> > > On Fri, Apr 26, 2024 at 3:53 PM Dmitry Baryshkov
> > > <dmitry.baryshkov@linaro.org> wrote:
> > > >
> > > > On Fri, 26 Apr 2024 at 17:05, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > > >
> > > > > On Fri, Apr 26, 2024 at 2:31 PM Dmitry Baryshkov
> > > > > <dmitry.baryshkov@linaro.org> wrote:
> > > > > >
> > > > > > On Fri, 26 Apr 2024 at 15:35, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > > > > >
> > > > > > > On Fri, Apr 26, 2024 at 12:02 AM Dmitry Baryshkov
> > > > > > > <dmitry.baryshkov@linaro.org> wrote:
> > > > > > > >
> > > > > > > > On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> > > > > > > > > initialize cx_mem. Copy this from downstream (minus BCL which we
> > > > > > > > > currently don't support). On a750, this includes a new "fuse" register
> > > > > > > > > which can be used by qcom_scm to fuse off certain features like
> > > > > > > > > raytracing in software. The fuse is default off, and is initialized by
> > > > > > > > > calling the method. Afterwards we have to read it to find out which
> > > > > > > > > features were enabled.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> > > > > > > > > ---
> > > > > > > > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
> > > > > > > > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
> > > > > > > > >  2 files changed, 90 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > > > index cf0b1de1c071..fb2722574ae5 100644
> > > > > > > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > > > @@ -10,6 +10,7 @@
> > > > > > > > >
> > > > > > > > >  #include <linux/bitfield.h>
> > > > > > > > >  #include <linux/devfreq.h>
> > > > > > > > > +#include <linux/firmware/qcom/qcom_scm.h>
> > > > > > > > >  #include <linux/pm_domain.h>
> > > > > > > > >  #include <linux/soc/qcom/llcc-qcom.h>
> > > > > > > > >
> > > > > > > > > @@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
> > > > > > > > >                        A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
> > > > > > > > >                        A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
> > > > > > > > >                        A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
> > > > > > > > > -                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
> > > > > > > > > +                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
> > > > > > > > > +                      A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > > > > > >
> > > > > > > > >  #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
> > > > > > > > >                          A6XX_CP_APRIV_CNTL_RBFETCH | \
> > > > > > > > > @@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
> > > > > > > > >         kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > +static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
> > > > > > > > > +{
> > > > > > > > > +       u32 status;
> > > > > > > > > +
> > > > > > > > > +       status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
> > > > > > > > > +       gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
> > > > > > > > > +
> > > > > > > > > +       dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
> > > > > > > > > +
> > > > > > > > > +       /* Ignore FASTBLEND violations, because the HW will silently fall back
> > > > > > > > > +        * to legacy blending.
> > > > > > > > > +        */
> > > > > > > > > +       if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > > > > > > +                     A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
> > > > > > > > > +               del_timer(&gpu->hangcheck_timer);
> > > > > > > > > +
> > > > > > > > > +               kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > > > > > > +       }
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > >  static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > > > > > > >  {
> > > > > > > > >         struct msm_drm_private *priv = gpu->dev->dev_private;
> > > > > > > > > @@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > > > > > > >         if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
> > > > > > > > >                 dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
> > > > > > > > >
> > > > > > > > > +       if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > > > > > > +               a7xx_sw_fuse_violation_irq(gpu);
> > > > > > > > > +
> > > > > > > > >         if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
> > > > > > > > >                 msm_gpu_retire(gpu);
> > > > > > > > >
> > > > > > > > > @@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
> > > > > > > > >                 a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > +static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
> > > > > > > > > +{
> > > > > > > > > +       struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > > > > > > > > +       struct msm_gpu *gpu = &adreno_gpu->base;
> > > > > > > > > +       u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
> > > > > > > > > +       u32 fuse_val;
> > > > > > > > > +       int ret;
> > > > > > > > > +
> > > > > > > > > +       if (adreno_is_a740(adreno_gpu)) {
> > > > > > > > > +               /* Raytracing is always enabled on a740 */
> > > > > > > > > +               adreno_gpu->has_ray_tracing = true;
> > > > > > > > > +       }
> > > > > > > > > +
> > > > > > > > > +       if (!qcom_scm_is_available()) {
> > > > > > > > > +               /* Assume that if qcom scm isn't available, that whatever
> > > > > > > > > +                * replacement allows writing the fuse register ourselves.
> > > > > > > > > +                * Users of alternative firmware need to make sure this
> > > > > > > > > +                * register is writeable or indicate that it's not somehow.
> > > > > > > > > +                * Print a warning because if you mess this up you're about to
> > > > > > > > > +                * crash horribly.
> > > > > > > > > +                */
> > > > > > > > > +               if (adreno_is_a750(adreno_gpu)) {
> > > > > > > > > +                       dev_warn_once(gpu->dev->dev,
> > > > > > > > > +                               "SCM is not available, poking fuse register\n");
> > > > > > > > > +                       a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
> > > > > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
> > > > > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
> > > > > > > > > +                       adreno_gpu->has_ray_tracing = true;
> > > > > > > > > +               }
> > > > > > > > > +
> > > > > > > > > +               return 0;
> > > > > > > > > +       }
> > > > > > > > > +
> > > > > > > > > +       if (adreno_is_a750(adreno_gpu))
> > > > > > > >
> > > > > > > > Most of the function is under the if (adreno_is_a750) conditions. Can
> > > > > > > > we invert the logic and add a single block of if(adreno_is_a750) and
> > > > > > > > then place all the code underneath?
> > > > > > >
> > > > > > > You mean to duplicate the qcom_scm_is_available check and qcom_scm_
> > > > > > >
> > > > > > > >
> > > > > > > > > +               gpu_req |= QCOM_SCM_GPU_TSENSE_EN_REQ;
> > > > > > > > > +
> > > > > > > > > +       ret = qcom_scm_gpu_init_regs(gpu_req);
> > > > > > > > > +       if (ret)
> > > > > > > > > +               return ret;
> > > > > > > > > +
> > > > > > > > > +       /* On a750 raytracing may be disabled by the firmware, find out whether
> > > > > > > > > +        * that's the case. The scm call above sets the fuse register.
> > > > > > > > > +        */
> > > > > > > > > +       if (adreno_is_a750(adreno_gpu)) {
> > > > > > > > > +               fuse_val = a6xx_llc_read(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE);
> > > > > > > >
> > > > > > > > This register isn't accessible with the current sm8650.dtsi. Since DT
> > > > > > > > and driver are going through different trees, please add safety guards
> > > > > > > > here, so that the driver doesn't crash if used with older dtsi
> > > > > > >
> > > > > > > I don't see how this is an issue. msm-next is currently based on 6.9,
> > > > > > > which doesn't have the GPU defined in sm8650.dtsi. AFAIK patches 1 and
> > > > > > > 2 will have to go through the linux-arm-msm tree, which will have to
> > > > > > > be merged into msm-next before this patch lands there, so there will
> > > > > > > never be any breakage.
> > > > > >
> > > > > > linux-arm-msm isn't going to be merged into msm-next. If we do not ask
> > > > > > for ack for the fix to go through msm-next, they will get these
> > > > > > patches in parallel.
> > > > >
> > > > > I'm not familiar with how complicated cross-tree changes like this get
> > > > > merged, but why would we merge these in parallel given that this patch
> > > > > depends on the previous patch that introduces
> > > > > qcom_scm_gpu_init_regs(), and that would (I assume?) normally go
> > > > > through the same tree as patch 1? Even if patch 1 gets merged in
> > > > > parallel in linux-arm-msm, in what scenario would we have a broken
> > > > > boot? You won't have a devicetree with a working sm8650 GPU and
> > > > > drm/msm with raytracing until linux-arm-msm is merged into msm-next at
> > > > > which point patch 1 will have landed somehow.
> > > >
> > > > > arch/arm64/boot/dts/qcom and drivers/firmware/qcom are two separate trees.
> > > > So yes, this needs a lot of coordination.
> > >
> > >
> > >
> > > >
> > > > >
> > > > > >
> > > > > > > Another option is to get the dtsi fix into 6.9 and delay the raytracing
> > > > > > > until 6.10-rc (which doesn't make a lot of sense from my POV).
> > > > > >
> > > > > > >
> > > > > > > > (not to mention that dts is considered to be an ABI and newer kernels
> > > > > > > > are supposed not to break with older DT files).
> > > > > > >
> > > > > > > That policy only applies to released kernels, so that's irrelevant here.
> > > > > >
> > > > > > It applies to all kernels, the reason being pretty simple: git-bisect
> > > > > > should not be broken.
> > > > >
> > > > > As I wrote above, this is not an issue. The point I was making is that
> > > > > mixing and matching dtb's from one unmerged subsystem tree and a
> > > > > kernel from another isn't supported AFAIK, and that's the only
> > > > > scenario where this could break.
> > > >
> > > > And it can happen if somebody running a bisect ends up in the branch
> > > > with these patches in, but with the dtsi bits not being picked up.
> > >
> > > That wouldn't be possible unless we merged the "bad" commit
> > > introducing the GPU node to sm8650.dtsi into msm-next but not the fix.
> > > So yeah, it's going to require a lot of careful cooperation but it
> > > should be possible to avoid that happening.
> >
> > Well, the GPU node is already there in linux-next.
>
> And? As long as the devicetree fix lands first, linux-next will never be broken.

So we need to land the dtsi fix for 6.10 and delay the drm/msm changes
until 6.11. If that's fine with you and Bjorn, I'm OK with that.
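
A minimal sketch of the kind of safety guard asked for earlier in this
thread, written as standalone C rather than driver code. It assumes the
cx_mem helpers take dword register offsets (so the byte offset is
reg << 2) and that the driver knows the byte size of the mapped cx_mem
region. The struct, the function name and the numeric offset are
hypothetical and only for illustration; they are not taken from the
patch, the register XML or the dtsi.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the cx_mem region described by the DT. */
struct cx_mem_region {
	size_t size;	/* byte length of the mapped "cx_mem" reg entry */
};

/*
 * Placeholder dword offset for the fuse value register. The real value
 * comes from the generated register headers; this one is an assumption.
 */
#define EXAMPLE_SW_FUSE_VALUE_REG 0x400u

/*
 * Return true if the 32-bit register at dword offset `reg` lies entirely
 * inside the mapped region, i.e. it is safe to read with this DT.
 */
static inline bool cx_mem_reg_accessible(const struct cx_mem_region *r,
					 uint32_t reg)
{
	size_t byte_off = (size_t)reg << 2;

	/* Guard the subtraction so a tiny or empty region cannot underflow. */
	return r->size >= sizeof(uint32_t) &&
	       byte_off <= r->size - sizeof(uint32_t);
}
```

With a check like this, the init path could skip the fuse read and leave
has_ray_tracing false when the region from an older dtsi is too small,
instead of touching an unmapped register.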

>
> > Anyway. Please. Don't break compat with old DTS. That is a rule of thumb.
>
> It's exactly that, a rule of thumb. This is obviously a bit of an
> exceptional case, and you haven't articulated any reason why we should
> follow it in this case when there's an obvious reason not to.



-- 
With best wishes
Dmitry


* Re: [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
  2024-04-26 15:24               ` Dmitry Baryshkov
  2024-04-26 15:36                 ` Connor Abbott
@ 2024-04-26 16:02                 ` Rob Clark
  1 sibling, 0 replies; 23+ messages in thread
From: Rob Clark @ 2024-04-26 16:02 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Connor Abbott, Abhinav Kumar, Sean Paul, Marijn Suijten,
	linux-arm-msm, freedreno, Bjorn Andersson

On Fri, Apr 26, 2024 at 8:24 AM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
>
> On Fri, 26 Apr 2024 at 18:08, Connor Abbott <cwabbott0@gmail.com> wrote:
> >
> > On Fri, Apr 26, 2024 at 3:53 PM Dmitry Baryshkov
> > <dmitry.baryshkov@linaro.org> wrote:
> > >
> > > On Fri, 26 Apr 2024 at 17:05, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > >
> > > > On Fri, Apr 26, 2024 at 2:31 PM Dmitry Baryshkov
> > > > <dmitry.baryshkov@linaro.org> wrote:
> > > > >
> > > > > On Fri, 26 Apr 2024 at 15:35, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Apr 26, 2024 at 12:02 AM Dmitry Baryshkov
> > > > > > <dmitry.baryshkov@linaro.org> wrote:
> > > > > > >
> > > > > > > On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> > > > > > > > initialize cx_mem. Copy this from downstream (minus BCL which we
> > > > > > > > currently don't support). On a750, this includes a new "fuse" register
> > > > > > > > which can be used by qcom_scm to fuse off certain features like
> > > > > > > > raytracing in software. The fuse is default off, and is initialized by
> > > > > > > > calling the method. Afterwards we have to read it to find out which
> > > > > > > > features were enabled.
> > > > > > > >
> > > > > > > > Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
> > > > > > > > ---
> > > > > > > >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 89 ++++++++++++++++++++++++-
> > > > > > > >  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +
> > > > > > > >  2 files changed, 90 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > > index cf0b1de1c071..fb2722574ae5 100644
> > > > > > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > > @@ -10,6 +10,7 @@
> > > > > > > >
> > > > > > > >  #include <linux/bitfield.h>
> > > > > > > >  #include <linux/devfreq.h>
> > > > > > > > +#include <linux/firmware/qcom/qcom_scm.h>
> > > > > > > >  #include <linux/pm_domain.h>
> > > > > > > >  #include <linux/soc/qcom/llcc-qcom.h>
> > > > > > > >
> > > > > > > > @@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
> > > > > > > >                        A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
> > > > > > > >                        A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
> > > > > > > >                        A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
> > > > > > > > -                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
> > > > > > > > +                      A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
> > > > > > > > +                      A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > > > > >
> > > > > > > >  #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
> > > > > > > >                          A6XX_CP_APRIV_CNTL_RBFETCH | \
> > > > > > > > @@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
> > > > > > > >         kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > > > > >  }
> > > > > > > >
> > > > > > > > +static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
> > > > > > > > +{
> > > > > > > > +       u32 status;
> > > > > > > > +
> > > > > > > > +       status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
> > > > > > > > +       gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
> > > > > > > > +
> > > > > > > > +       dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
> > > > > > > > +
> > > > > > > > +       /* Ignore FASTBLEND violations, because the HW will silently fall back
> > > > > > > > +        * to legacy blending.
> > > > > > > > +        */
> > > > > > > > +       if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > > > > > +                     A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
> > > > > > > > +               del_timer(&gpu->hangcheck_timer);
> > > > > > > > +
> > > > > > > > +               kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > > > > > +       }
> > > > > > > > +}
> > > > > > > > +
> > > > > > > >  static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > > > > > >  {
> > > > > > > >         struct msm_drm_private *priv = gpu->dev->dev_private;
> > > > > > > > @@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > > > > > >         if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
> > > > > > > >                 dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
> > > > > > > >
> > > > > > > > +       if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > > > > > +               a7xx_sw_fuse_violation_irq(gpu);
> > > > > > > > +
> > > > > > > >         if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
> > > > > > > >                 msm_gpu_retire(gpu);
> > > > > > > >
> > > > > > > > @@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
> > > > > > > >                 a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
> > > > > > > >  }
> > > > > > > >
> > > > > > > > +static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
> > > > > > > > +{
> > > > > > > > +       struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > > > > > > > +       struct msm_gpu *gpu = &adreno_gpu->base;
> > > > > > > > +       u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
> > > > > > > > +       u32 fuse_val;
> > > > > > > > +       int ret;
> > > > > > > > +
> > > > > > > > +       if (adreno_is_a740(adreno_gpu)) {
> > > > > > > > +               /* Raytracing is always enabled on a740 */
> > > > > > > > +               adreno_gpu->has_ray_tracing = true;
> > > > > > > > +       }
> > > > > > > > +
> > > > > > > > +       if (!qcom_scm_is_available()) {
> > > > > > > > +               /* Assume that if qcom scm isn't available, that whatever
> > > > > > > > +                * replacement allows writing the fuse register ourselves.
> > > > > > > > +                * Users of alternative firmware need to make sure this
> > > > > > > > +                * register is writeable or indicate that it's not somehow.
> > > > > > > > +                * Print a warning because if you mess this up you're about to
> > > > > > > > +                * crash horribly.
> > > > > > > > +                */
> > > > > > > > +               if (adreno_is_a750(adreno_gpu)) {
> > > > > > > > +                       dev_warn_once(gpu->dev->dev,
> > > > > > > > +                               "SCM is not available, poking fuse register\n");
> > > > > > > > +                       a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
> > > > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
> > > > > > > > +                               A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
> > > > > > > > +                       adreno_gpu->has_ray_tracing = true;
> > > > > > > > +               }
> > > > > > > > +
> > > > > > > > +               return 0;
> > > > > > > > +       }
> > > > > > > > +
> > > > > > > > +       if (adreno_is_a750(adreno_gpu))
> > > > > > >
> > > > > > > Most of the function is under the if (adreno_is_a750) conditions. Can
> > > > > > > we invert the logic and add a single block of if(adreno_is_a750) and
> > > > > > > then place all the code underneath?
> > > > > >
> > > > > > You mean to duplicate the qcom_scm_is_available check and qcom_scm_
> > > > > >
> > > > > > >
> > > > > > > > +               gpu_req |= QCOM_SCM_GPU_TSENSE_EN_REQ;
> > > > > > > > +
> > > > > > > > +       ret = qcom_scm_gpu_init_regs(gpu_req);
> > > > > > > > +       if (ret)
> > > > > > > > +               return ret;
> > > > > > > > +
> > > > > > > > +       /* On a750 raytracing may be disabled by the firmware, find out whether
> > > > > > > > +        * that's the case. The scm call above sets the fuse register.
> > > > > > > > +        */
> > > > > > > > +       if (adreno_is_a750(adreno_gpu)) {
> > > > > > > > +               fuse_val = a6xx_llc_read(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE);
> > > > > > >
> > > > > > > This register isn't accessible with the current sm8650.dtsi. Since DT
> > > > > > > and driver are going through different trees, please add safety guards
> > > > > > > here, so that the driver doesn't crash if used with older dtsi
> > > > > >
> > > > > > I don't see how this is an issue. msm-next is currently based on 6.9,
> > > > > > which doesn't have the GPU defined in sm8650.dtsi. AFAIK patches 1 and
> > > > > > 2 will have to go through the linux-arm-msm tree, which will have to
> > > > > > be merged into msm-next before this patch lands there, so there will
> > > > > > never be any breakage.
> > > > >
> > > > > linux-arm-msm isn't going to be merged into msm-next. If we do not ask
> > > > > for ack for the fix to go through msm-next, they will get these
> > > > > patches in parallel.
> > > >
> > > > I'm not familiar with how complicated cross-tree changes like this get
> > > > merged, but why would we merge these in parallel given that this patch
> > > > depends on the previous patch that introduces
> > > > qcom_scm_gpu_init_regs(), and that would (I assume?) normally go
> > > > through the same tree as patch 1? Even if patch 1 gets merged in
> > > > parallel in linux-arm-msm, in what scenario would we have a broken
> > > > boot? You won't have a devicetree with a working sm8650 GPU and
> > > > drm/msm with raytracing until linux-arm-msm is merged into msm-next at
> > > > which point patch 1 will have landed somehow.
> > >
> > > > arch/arm64/boot/dts/qcom and drivers/firmware/qcom are two separate trees.
> > > So yes, this needs a lot of coordination.
> >
> >
> >
> > >
> > > >
> > > > >
> > > > > > Another option is to get the dtsi fix into 6.9 and delay the raytracing
> > > > > > until 6.10-rc (which doesn't make a lot of sense from my POV).
> > > > >
> > > > > >
> > > > > > > (not to mention that dts is considered to be an ABI and newer kernels
> > > > > > > are supposed not to break with older DT files).
> > > > > >
> > > > > > That policy only applies to released kernels, so that's irrelevant here.
> > > > >
> > > > > It applies to all kernels, the reason being pretty simple: git-bisect
> > > > > should not be broken.
> > > >
> > > > As I wrote above, this is not an issue. The point I was making is that
> > > > mixing and matching dtb's from one unmerged subsystem tree and a
> > > > kernel from another isn't supported AFAIK, and that's the only
> > > > scenario where this could break.
> > >
> > > And it can happen if somebody running a bisect ends up in the branch
> > > with these patches in, but with the dtsi bits not being picked up.
> >
> > That wouldn't be possible unless we merged the "bad" commit
> > introducing the GPU node to sm8650.dtsi into msm-next but not the fix.
> > So yeah, it's going to require a lot of careful cooperation but it
> > should be possible to avoid that happening.
>
> Well, the GPU node is already there in linux-next.
>
> Anyway. Please. Don't break compat with old DTS. That is a rule of thumb.
>

+Bjorn, since that is who we need to coordinate with, on two points

1) The fix for the sm8650.dtsi GPU node. The GPU node is in linux-next but
not yet (AFAICT) in any pull request, so we just ask Bjorn to land the
GPU node fix from this series before sending his DT pull request. Problem
solved: either drm-next gets pulled first, in which case the DT node
doesn't exist yet, or the DT is pulled first and already carries the
fix.

2) The SCM dependency. It looks like these SCM patches are in flight:

[1/4] firmware: qcom: scm: Remove log reporting memory allocation failure
      commit: 3de990f7895906a7a18d2dff63e3e525acaa4ecc
[2/4] firmware: scm: Remove redundant scm argument from qcom_scm_waitq_wakeup()
      commit: 000636d91d605f6209a635a29d0487af5b12b237
[3/4] firmware: qcom: scm: Rework dload mode availability check
      commit: 398a4c58f3f29ac3ff4d777dc91fe40a07bbca8c
[4/4] firmware: qcom: scm: Fix __scm and waitq completion variable initialization
      commit: 2e4955167ec5c04534cebea9e8273a907e7a75e1

[1/1] firmware: qcom: scm: Modify only the download bits in TCSR register
      commit: b9718298e028f9edbe0fcdf48c02a1c355409410

Those don't look like they should conflict with [2/6] firmware:
qcom_scm: Add gpu_init_regs call, so maybe we could get an a-b
(Acked-by) for landing that patch via msm-next.

BR,
-R


end of thread, other threads:[~2024-04-26 16:02 UTC | newest]

Thread overview: 23+ messages
2024-04-25 13:43 [PATCH 0/6] drm/msm: Support a750 "software fuse" for raytracing Connor Abbott
2024-04-25 13:43 ` [PATCH 1/6] arm64: dts: qcom: sm8650: Fix GPU cx_mem size Connor Abbott
2024-04-25 13:43 ` [PATCH 3/6] drm/msm: Update a6xx registers Connor Abbott
2024-04-25 13:43 ` [PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse" Connor Abbott
2024-04-25 23:02   ` Dmitry Baryshkov
2024-04-26 12:35     ` Connor Abbott
2024-04-26 12:54       ` Connor Abbott
2024-04-26 13:37         ` Dmitry Baryshkov
2024-04-26 13:31       ` Dmitry Baryshkov
2024-04-26 14:05         ` Connor Abbott
2024-04-26 14:53           ` Dmitry Baryshkov
2024-04-26 15:08             ` Connor Abbott
2024-04-26 15:24               ` Dmitry Baryshkov
2024-04-26 15:36                 ` Connor Abbott
2024-04-26 15:38                   ` Dmitry Baryshkov
2024-04-26 16:02                 ` Rob Clark
2024-04-26 12:45     ` Rob Clark
2024-04-26 13:28       ` Dmitry Baryshkov
2024-04-25 13:43 ` [PATCH 5/6] drm/msm: Add MSM_PARAM_RAYTRACING uapi Connor Abbott
2024-04-25 23:03   ` Dmitry Baryshkov
2024-04-25 13:43 ` [PATCH 6/6] drm/msm/a7xx: Add missing register writes from downstream Connor Abbott
2024-04-25 15:03 ` [PATCH 0/6] drm/msm: Support a750 "software fuse" for raytracing Dmitry Baryshkov
2024-04-25 15:13 ` [PATCH 2/6] firmware: qcom_scm: Add gpu_init_regs call Connor Abbott
