* [PATCH 0/4] drm/msm/adreno: add support for a730
@ 2022-03-27 20:25 ` Jonathan Marek
  0 siblings, 0 replies; 12+ messages in thread
From: Jonathan Marek @ 2022-03-27 20:25 UTC (permalink / raw)
  To: freedreno
  Cc: Abhinav Kumar, Akhil P Oommen, AngeloGioacchino Del Regno,
	Bjorn Andersson, Christian König, Dan Carpenter,
	Daniel Vetter, David Airlie, Dmitry Baryshkov, Douglas Anderson,
	open list:DRM DRIVER FOR MSM ADRENO GPU, Emma Anholt,
	Jordan Crouse, open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list, Rob Clark, Sean Paul, Viresh Kumar, Vladimir Lypak,
	Yangtao Li

Based on a6xx_gpu.c, stripped down and updated for a7xx using the downstream
driver as a reference. This implements the minimum needed to submit commands
to the GPU and use it for userspace driver development. Notably, it does not
implement GMU support (which means the clock driver needs to supply the GPU
core clock and turn on the GX rail, tasks normally offloaded to the GMU).

Register updates: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15602

Jonathan Marek (4):
  drm/msm/adreno: move a6xx CP_PROTECT macros to common code
  drm/msm/adreno: use a single register offset for
    gpu_read64/gpu_write64
  drm/msm/adreno: update headers
  drm/msm/adreno: add support for a730

 drivers/gpu/drm/msm/Makefile                |   1 +
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c       |   3 +-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c       |  27 +-
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c   |   4 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c       |  25 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h       |  17 -
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |   3 +-
 drivers/gpu/drm/msm/adreno/a7xx.xml.h       | 666 +++++++++++++++++
 drivers/gpu/drm/msm/adreno/a7xx_gpu.c       | 777 ++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a7xx_gpu.h       |  26 +
 drivers/gpu/drm/msm/adreno/adreno_device.c  |  12 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h     |   9 +-
 drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h |  45 +-
 drivers/gpu/drm/msm/msm_gpu.h               |  12 +-
 drivers/gpu/drm/msm/msm_ringbuffer.h        |   1 +
 15 files changed, 1550 insertions(+), 78 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/adreno/a7xx.xml.h
 create mode 100644 drivers/gpu/drm/msm/adreno/a7xx_gpu.c
 create mode 100644 drivers/gpu/drm/msm/adreno/a7xx_gpu.h

-- 
2.26.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/4] drm/msm/adreno: move a6xx CP_PROTECT macros to common code
  2022-03-27 20:25 ` Jonathan Marek
@ 2022-03-27 20:25   ` Jonathan Marek
  -1 siblings, 0 replies; 12+ messages in thread
From: Jonathan Marek @ 2022-03-27 20:25 UTC (permalink / raw)
  To: freedreno
  Cc: Rob Clark, Sean Paul, Abhinav Kumar, David Airlie, Daniel Vetter,
	Akhil P Oommen, Yangtao Li, Dmitry Osipenko, Bjorn Andersson,
	Emma Anholt, Vladimir Lypak,
	open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list

These will be used by a7xx, so move them to common code. The A6XX_ prefix is
kept because the generic ADRENO_ prefix is already in use.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h   | 17 -----------------
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  6 ++++++
 2 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 86e0a7c3fe6df..d117c1589f2af 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -36,23 +36,6 @@ struct a6xx_gpu {
 
 #define to_a6xx_gpu(x) container_of(x, struct a6xx_gpu, base)
 
-/*
- * Given a register and a count, return a value to program into
- * REG_CP_PROTECT_REG(n) - this will block both reads and writes for _len
- * registers starting at _reg.
- */
-#define A6XX_PROTECT_NORDWR(_reg, _len) \
-	((1 << 31) | \
-	(((_len) & 0x3FFF) << 18) | ((_reg) & 0x3FFFF))
-
-/*
- * Same as above, but allow reads over the range. For areas of mixed use (such
- * as performance counters) this allows us to protect a much larger range with a
- * single register
- */
-#define A6XX_PROTECT_RDONLY(_reg, _len) \
-	((((_len) & 0x3FFF) << 18) | ((_reg) & 0x3FFFF))
-
 static inline bool a6xx_has_gbif(struct adreno_gpu *gpu)
 {
 	if(adreno_is_a630(gpu))
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 0490c5fbb7803..55c5433a4ea18 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -416,6 +416,10 @@ static inline uint32_t get_wptr(struct msm_ringbuffer *ring)
 	((1 << 30) | (1 << 29) | \
 	((ilog2((_len)) & 0x1F) << 24) | (((_reg) << 2) & 0xFFFFF))
 
+#define A6XX_PROTECT_NORDWR(_reg, _len) \
+	((1 << 31) | \
+	(((_len) & 0x3FFF) << 18) | ((_reg) & 0x3FFFF))
+
 /*
  * Same as above, but allow reads over the range. For areas of mixed use (such
  * as performance counters) this allows us to protect a much larger range with a
@@ -425,6 +429,8 @@ static inline uint32_t get_wptr(struct msm_ringbuffer *ring)
 	((1 << 29) \
 	((ilog2((_len)) & 0x1F) << 24) | (((_reg) << 2) & 0xFFFFF))
 
+#define A6XX_PROTECT_RDONLY(_reg, _len) \
+	((((_len) & 0x3FFF) << 18) | ((_reg) & 0x3FFFF))
 
 #define gpu_poll_timeout(gpu, addr, val, cond, interval, timeout) \
 	readl_poll_timeout((gpu)->mmio + ((addr) << 2), val, cond, \
-- 
2.26.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/4] drm/msm/adreno: use a single register offset for gpu_read64/gpu_write64
  2022-03-27 20:25 ` Jonathan Marek
@ 2022-03-27 20:25   ` Jonathan Marek
  -1 siblings, 0 replies; 12+ messages in thread
From: Jonathan Marek @ 2022-03-27 20:25 UTC (permalink / raw)
  To: freedreno
  Cc: Rob Clark, Sean Paul, Abhinav Kumar, David Airlie, Daniel Vetter,
	Dan Carpenter, Akhil P Oommen, Jordan Crouse, Vladimir Lypak,
	Yangtao Li, Christian König, Dmitry Baryshkov,
	Douglas Anderson, open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list

The high half of a 64-bit register is always at a +1 offset, so make these
helpers more convenient by dropping the unnecessary high-register argument.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
---
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c       |  3 +--
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c       | 27 ++++++++-------------
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c   |  4 +--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c       | 25 ++++++-------------
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  3 +--
 drivers/gpu/drm/msm/msm_gpu.h               | 12 ++++-----
 6 files changed, 27 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index 0c6b2a6d0b4c9..da5e18bd74a45 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -606,8 +606,7 @@ static int a4xx_pm_suspend(struct msm_gpu *gpu) {
 
 static int a4xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
 {
-	*value = gpu_read64(gpu, REG_A4XX_RBBM_PERFCTR_CP_0_LO,
-		REG_A4XX_RBBM_PERFCTR_CP_0_HI);
+	*value = gpu_read64(gpu, REG_A4XX_RBBM_PERFCTR_CP_0_LO);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 407f50a15faa4..1916cb759cd5c 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -605,11 +605,9 @@ static int a5xx_ucode_init(struct msm_gpu *gpu)
 		a5xx_ucode_check_version(a5xx_gpu, a5xx_gpu->pfp_bo);
 	}
 
-	gpu_write64(gpu, REG_A5XX_CP_ME_INSTR_BASE_LO,
-		REG_A5XX_CP_ME_INSTR_BASE_HI, a5xx_gpu->pm4_iova);
+	gpu_write64(gpu, REG_A5XX_CP_ME_INSTR_BASE_LO, a5xx_gpu->pm4_iova);
 
-	gpu_write64(gpu, REG_A5XX_CP_PFP_INSTR_BASE_LO,
-		REG_A5XX_CP_PFP_INSTR_BASE_HI, a5xx_gpu->pfp_iova);
+	gpu_write64(gpu, REG_A5XX_CP_PFP_INSTR_BASE_LO, a5xx_gpu->pfp_iova);
 
 	return 0;
 }
@@ -868,8 +866,7 @@ static int a5xx_hw_init(struct msm_gpu *gpu)
 	 * memory rendering at this point in time and we don't want to block off
 	 * part of the virtual memory space.
 	 */
-	gpu_write64(gpu, REG_A5XX_RBBM_SECVID_TSB_TRUSTED_BASE_LO,
-		REG_A5XX_RBBM_SECVID_TSB_TRUSTED_BASE_HI, 0x00000000);
+	gpu_write64(gpu, REG_A5XX_RBBM_SECVID_TSB_TRUSTED_BASE_LO, 0x00000000);
 	gpu_write(gpu, REG_A5XX_RBBM_SECVID_TSB_TRUSTED_SIZE, 0x00000000);
 
 	/* Put the GPU into 64 bit by default */
@@ -908,8 +905,7 @@ static int a5xx_hw_init(struct msm_gpu *gpu)
 		return ret;
 
 	/* Set the ringbuffer address */
-	gpu_write64(gpu, REG_A5XX_CP_RB_BASE, REG_A5XX_CP_RB_BASE_HI,
-		gpu->rb[0]->iova);
+	gpu_write64(gpu, REG_A5XX_CP_RB_BASE, gpu->rb[0]->iova);
 
 	/*
 	 * If the microcode supports the WHERE_AM_I opcode then we can use that
@@ -936,7 +932,7 @@ static int a5xx_hw_init(struct msm_gpu *gpu)
 		}
 
 		gpu_write64(gpu, REG_A5XX_CP_RB_RPTR_ADDR,
-			REG_A5XX_CP_RB_RPTR_ADDR_HI, shadowptr(a5xx_gpu, gpu->rb[0]));
+			shadowptr(a5xx_gpu, gpu->rb[0]));
 	} else if (gpu->nr_rings > 1) {
 		/* Disable preemption if WHERE_AM_I isn't available */
 		a5xx_preempt_fini(gpu);
@@ -1239,9 +1235,9 @@ static void a5xx_fault_detect_irq(struct msm_gpu *gpu)
 		gpu_read(gpu, REG_A5XX_RBBM_STATUS),
 		gpu_read(gpu, REG_A5XX_CP_RB_RPTR),
 		gpu_read(gpu, REG_A5XX_CP_RB_WPTR),
-		gpu_read64(gpu, REG_A5XX_CP_IB1_BASE, REG_A5XX_CP_IB1_BASE_HI),
+		gpu_read64(gpu, REG_A5XX_CP_IB1_BASE),
 		gpu_read(gpu, REG_A5XX_CP_IB1_BUFSZ),
-		gpu_read64(gpu, REG_A5XX_CP_IB2_BASE, REG_A5XX_CP_IB2_BASE_HI),
+		gpu_read64(gpu, REG_A5XX_CP_IB2_BASE),
 		gpu_read(gpu, REG_A5XX_CP_IB2_BUFSZ));
 
 	/* Turn off the hangcheck timer to keep it from bothering us */
@@ -1427,8 +1423,7 @@ static int a5xx_pm_suspend(struct msm_gpu *gpu)
 
 static int a5xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
 {
-	*value = gpu_read64(gpu, REG_A5XX_RBBM_ALWAYSON_COUNTER_LO,
-		REG_A5XX_RBBM_ALWAYSON_COUNTER_HI);
+	*value = gpu_read64(gpu, REG_A5XX_RBBM_ALWAYSON_COUNTER_LO);
 
 	return 0;
 }
@@ -1465,8 +1460,7 @@ static int a5xx_crashdumper_run(struct msm_gpu *gpu,
 	if (IS_ERR_OR_NULL(dumper->ptr))
 		return -EINVAL;
 
-	gpu_write64(gpu, REG_A5XX_CP_CRASH_SCRIPT_BASE_LO,
-		REG_A5XX_CP_CRASH_SCRIPT_BASE_HI, dumper->iova);
+	gpu_write64(gpu, REG_A5XX_CP_CRASH_SCRIPT_BASE_LO, dumper->iova);
 
 	gpu_write(gpu, REG_A5XX_CP_CRASH_DUMP_CNTL, 1);
 
@@ -1670,8 +1664,7 @@ static unsigned long a5xx_gpu_busy(struct msm_gpu *gpu)
 	if (pm_runtime_get_if_in_use(&gpu->pdev->dev) == 0)
 		return 0;
 
-	busy_cycles = gpu_read64(gpu, REG_A5XX_RBBM_PERFCTR_RBBM_0_LO,
-			REG_A5XX_RBBM_PERFCTR_RBBM_0_HI);
+	busy_cycles = gpu_read64(gpu, REG_A5XX_RBBM_PERFCTR_RBBM_0_LO);
 
 	busy_time = busy_cycles - gpu->devfreq.busy_cycles;
 	do_div(busy_time, clk_get_rate(gpu->core_clk) / 1000000);
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
index 8abc9a2b114a2..7658e89844b46 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
@@ -137,7 +137,6 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
 
 	/* Set the address of the incoming preemption record */
 	gpu_write64(gpu, REG_A5XX_CP_CONTEXT_SWITCH_RESTORE_ADDR_LO,
-		REG_A5XX_CP_CONTEXT_SWITCH_RESTORE_ADDR_HI,
 		a5xx_gpu->preempt_iova[ring->id]);
 
 	a5xx_gpu->next_ring = ring;
@@ -211,8 +210,7 @@ void a5xx_preempt_hw_init(struct msm_gpu *gpu)
 	}
 
 	/* Write a 0 to signal that we aren't switching pagetables */
-	gpu_write64(gpu, REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_LO,
-		REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_HI, 0);
+	gpu_write64(gpu, REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_LO, 0);
 
 	/* Reset the preemption state */
 	set_preempt_state(a5xx_gpu, PREEMPT_NONE);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 83c31b2ad865b..a624cb2df233b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -246,8 +246,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
 	OUT_RING(ring, submit->seqno);
 
 	trace_msm_gpu_submit_flush(submit,
-		gpu_read64(gpu, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO,
-			REG_A6XX_CP_ALWAYS_ON_COUNTER_HI));
+		gpu_read64(gpu, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO));
 
 	a6xx_flush(gpu, ring);
 }
@@ -878,8 +877,7 @@ static int a6xx_ucode_init(struct msm_gpu *gpu)
 		}
 	}
 
-	gpu_write64(gpu, REG_A6XX_CP_SQE_INSTR_BASE,
-		REG_A6XX_CP_SQE_INSTR_BASE+1, a6xx_gpu->sqe_iova);
+	gpu_write64(gpu, REG_A6XX_CP_SQE_INSTR_BASE, a6xx_gpu->sqe_iova);
 
 	return 0;
 }
@@ -926,8 +924,7 @@ static int hw_init(struct msm_gpu *gpu)
 	 * memory rendering at this point in time and we don't want to block off
 	 * part of the virtual memory space.
 	 */
-	gpu_write64(gpu, REG_A6XX_RBBM_SECVID_TSB_TRUSTED_BASE_LO,
-		REG_A6XX_RBBM_SECVID_TSB_TRUSTED_BASE_HI, 0x00000000);
+	gpu_write64(gpu, REG_A6XX_RBBM_SECVID_TSB_TRUSTED_BASE_LO, 0x00000000);
 	gpu_write(gpu, REG_A6XX_RBBM_SECVID_TSB_TRUSTED_SIZE, 0x00000000);
 
 	/* Turn on 64 bit addressing for all blocks */
@@ -976,11 +973,8 @@ static int hw_init(struct msm_gpu *gpu)
 
 	if (!adreno_is_a650_family(adreno_gpu)) {
 		/* Set the GMEM VA range [0x100000:0x100000 + gpu->gmem - 1] */
-		gpu_write64(gpu, REG_A6XX_UCHE_GMEM_RANGE_MIN_LO,
-			REG_A6XX_UCHE_GMEM_RANGE_MIN_HI, 0x00100000);
-
+		gpu_write64(gpu, REG_A6XX_UCHE_GMEM_RANGE_MIN_LO, 0x00100000);
 		gpu_write64(gpu, REG_A6XX_UCHE_GMEM_RANGE_MAX_LO,
-			REG_A6XX_UCHE_GMEM_RANGE_MAX_HI,
 			0x00100000 + adreno_gpu->gmem - 1);
 	}
 
@@ -1072,8 +1066,7 @@ static int hw_init(struct msm_gpu *gpu)
 		goto out;
 
 	/* Set the ringbuffer address */
-	gpu_write64(gpu, REG_A6XX_CP_RB_BASE, REG_A6XX_CP_RB_BASE_HI,
-		gpu->rb[0]->iova);
+	gpu_write64(gpu, REG_A6XX_CP_RB_BASE, gpu->rb[0]->iova);
 
 	/* Targets that support extended APRIV can use the RPTR shadow from
 	 * hardware but all the other ones need to disable the feature. Targets
@@ -1105,7 +1098,6 @@ static int hw_init(struct msm_gpu *gpu)
 		}
 
 		gpu_write64(gpu, REG_A6XX_CP_RB_RPTR_ADDR_LO,
-			REG_A6XX_CP_RB_RPTR_ADDR_HI,
 			shadowptr(a6xx_gpu, gpu->rb[0]));
 	}
 
@@ -1394,9 +1386,9 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
 		gpu_read(gpu, REG_A6XX_RBBM_STATUS),
 		gpu_read(gpu, REG_A6XX_CP_RB_RPTR),
 		gpu_read(gpu, REG_A6XX_CP_RB_WPTR),
-		gpu_read64(gpu, REG_A6XX_CP_IB1_BASE, REG_A6XX_CP_IB1_BASE_HI),
+		gpu_read64(gpu, REG_A6XX_CP_IB1_BASE),
 		gpu_read(gpu, REG_A6XX_CP_IB1_REM_SIZE),
-		gpu_read64(gpu, REG_A6XX_CP_IB2_BASE, REG_A6XX_CP_IB2_BASE_HI),
+		gpu_read64(gpu, REG_A6XX_CP_IB2_BASE),
 		gpu_read(gpu, REG_A6XX_CP_IB2_REM_SIZE));
 
 	/* Turn off the hangcheck timer to keep it from bothering us */
@@ -1607,8 +1599,7 @@ static int a6xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
 	/* Force the GPU power on so we can read this register */
 	a6xx_gmu_set_oob(&a6xx_gpu->gmu, GMU_OOB_PERFCOUNTER_SET);
 
-	*value = gpu_read64(gpu, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO,
-			    REG_A6XX_CP_ALWAYS_ON_COUNTER_HI);
+	*value = gpu_read64(gpu, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO);
 
 	a6xx_gmu_clear_oob(&a6xx_gpu->gmu, GMU_OOB_PERFCOUNTER_SET);
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index 55f443328d8e7..c61b233aff09b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -147,8 +147,7 @@ static int a6xx_crashdumper_run(struct msm_gpu *gpu,
 	/* Make sure all pending memory writes are posted */
 	wmb();
 
-	gpu_write64(gpu, REG_A6XX_CP_CRASH_SCRIPT_BASE_LO,
-		REG_A6XX_CP_CRASH_SCRIPT_BASE_HI, dumper->iova);
+	gpu_write64(gpu, REG_A6XX_CP_CRASH_SCRIPT_BASE_LO, dumper->iova);
 
 	gpu_write(gpu, REG_A6XX_CP_CRASH_DUMP_CNTL, 1);
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 02419f2ca2bc5..f7fca687d45de 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -503,7 +503,7 @@ static inline void gpu_rmw(struct msm_gpu *gpu, u32 reg, u32 mask, u32 or)
 	msm_rmw(gpu->mmio + (reg << 2), mask, or);
 }
 
-static inline u64 gpu_read64(struct msm_gpu *gpu, u32 lo, u32 hi)
+static inline u64 gpu_read64(struct msm_gpu *gpu, u32 reg)
 {
 	u64 val;
 
@@ -521,17 +521,17 @@ static inline u64 gpu_read64(struct msm_gpu *gpu, u32 lo, u32 hi)
 	 * when the lo is read, so make sure to read the lo first to trigger
 	 * that
 	 */
-	val = (u64) msm_readl(gpu->mmio + (lo << 2));
-	val |= ((u64) msm_readl(gpu->mmio + (hi << 2)) << 32);
+	val = (u64) msm_readl(gpu->mmio + (reg << 2));
+	val |= ((u64) msm_readl(gpu->mmio + ((reg + 1) << 2)) << 32);
 
 	return val;
 }
 
-static inline void gpu_write64(struct msm_gpu *gpu, u32 lo, u32 hi, u64 val)
+static inline void gpu_write64(struct msm_gpu *gpu, u32 reg, u64 val)
 {
 	/* Why not a writeq here? Read the screed above */
-	msm_writel(lower_32_bits(val), gpu->mmio + (lo << 2));
-	msm_writel(upper_32_bits(val), gpu->mmio + (hi << 2));
+	msm_writel(lower_32_bits(val), gpu->mmio + (reg << 2));
+	msm_writel(upper_32_bits(val), gpu->mmio + ((reg + 1) << 2));
 }
 
 int msm_gpu_pm_suspend(struct msm_gpu *gpu);
-- 
2.26.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

+	msm_writel(upper_32_bits(val), gpu->mmio + ((reg + 1) << 2));
 }
 
 int msm_gpu_pm_suspend(struct msm_gpu *gpu);
-- 
2.26.1



* [PATCH 3/4] drm/msm/adreno: update headers
  2022-03-27 20:25 ` Jonathan Marek
@ 2022-03-27 20:25   ` Jonathan Marek
  -1 siblings, 0 replies; 12+ messages in thread
From: Jonathan Marek @ 2022-03-27 20:25 UTC (permalink / raw)
  To: freedreno
  Cc: Rob Clark, Sean Paul, Abhinav Kumar, David Airlie, Daniel Vetter,
	open list, open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU

Add a7xx register and PM4 packet definitions for the kernel driver.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
---
 drivers/gpu/drm/msm/adreno/a7xx.xml.h       | 666 ++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h |  63 +-
 2 files changed, 716 insertions(+), 13 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/adreno/a7xx.xml.h

diff --git a/drivers/gpu/drm/msm/adreno/a7xx.xml.h b/drivers/gpu/drm/msm/adreno/a7xx.xml.h
new file mode 100644
index 0000000000000..45ef4289ac52b
--- /dev/null
+++ b/drivers/gpu/drm/msm/adreno/a7xx.xml.h
@@ -0,0 +1,666 @@
+#ifndef A7XX_XML
+#define A7XX_XML
+
+/* Autogenerated file, DO NOT EDIT manually!
+
+This file was generated by the rules-ng-ng headergen tool in this git repository:
+http://github.com/freedreno/envytools/
+git clone https://github.com/freedreno/envytools.git
+
+The rules-ng-ng source files this header was generated from are:
+- freedreno/registers/adreno.xml                     (    627 bytes, from 2022-03-27 15:04:47)
+- freedreno/registers/freedreno_copyright.xml        (   1572 bytes, from 2020-11-18 00:17:12)
+- freedreno/registers/adreno/a2xx.xml                (  90810 bytes, from 2021-08-06 17:44:41)
+- freedreno/registers/adreno/adreno_common.xml       (  14631 bytes, from 2022-03-27 14:52:08)
+- freedreno/registers/adreno/adreno_pm4.xml          (  70334 bytes, from 2022-03-27 20:01:26)
+- freedreno/registers/adreno/a3xx.xml                (  84231 bytes, from 2021-08-27 13:03:56)
+- freedreno/registers/adreno/a4xx.xml                ( 113474 bytes, from 2022-03-22 19:23:46)
+- freedreno/registers/adreno/a5xx.xml                ( 149512 bytes, from 2022-03-21 16:05:18)
+- freedreno/registers/adreno/a6xx.xml                ( 184954 bytes, from 2022-03-22 19:23:46)
+- freedreno/registers/adreno/a6xx_gmu.xml            (  11331 bytes, from 2021-08-06 17:44:41)
+- freedreno/registers/adreno/a7xx.xml                (  20004 bytes, from 2022-03-27 20:01:42)
+- freedreno/registers/adreno/ocmem.xml               (   1773 bytes, from 2020-11-18 00:17:12)
+- freedreno/registers/adreno/adreno_control_regs.xml (   6038 bytes, from 2022-03-22 19:23:46)
+- freedreno/registers/adreno/adreno_pipe_regs.xml    (   2924 bytes, from 2022-03-22 19:23:46)
+
+Copyright (C) 2013-2022 by the following authors:
+- Rob Clark <robdclark@gmail.com> (robclark)
+- Ilia Mirkin <imirkin@alum.mit.edu> (imirkin)
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice (including the
+next paragraph) shall be included in all copies or substantial
+portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+*/
+
+
+enum a7xx_event {
+	CCU_INVALIDATE_DEPTH = 24,
+	CCU_INVALIDATE_COLOR = 25,
+	CCU_RESOLVE_CLEAN = 26,
+	CCU_FLUSH_DEPTH = 28,
+	CCU_FLUSH_COLOR = 29,
+	CCU_RESOLVE = 30,
+	CCU_END_RESOLVE_GROUP = 31,
+	CCU_CLEAN_DEPTH = 32,
+	CCU_CLEAN_COLOR = 33,
+	CACHE_RESET = 48,
+	CACHE_CLEAN = 49,
+	CACHE_FLUSH7 = 50,
+	CACHE_INVALIDATE7 = 51,
+};
+
+#define REG_A7XX_RBBM_GBIF_CLIENT_QOS_CNTL			0x00000011
+
+#define REG_A7XX_RBBM_GBIF_HALT					0x00000016
+
+#define REG_A7XX_RBBM_GBIF_HALT_ACK				0x00000017
+
+#define REG_A7XX_RBBM_INTERFACE_HANG_INT_CNTL			0x0000001f
+
+#define REG_A7XX_RBBM_INT_CLEAR_CMD				0x00000037
+
+#define REG_A7XX_RBBM_INT_0_MASK				0x00000038
+#define A7XX_RBBM_INT_0_MASK_GPUIDLE				0x00000001
+#define A7XX_RBBM_INT_0_MASK_AHBERROR				0x00000002
+#define A7XX_RBBM_INT_0_MASK_CPIPCINT0				0x00000010
+#define A7XX_RBBM_INT_0_MASK_CPIPCINT1				0x00000020
+#define A7XX_RBBM_INT_0_MASK_ATBASYNCFIFOOVERFLOW		0x00000040
+#define A7XX_RBBM_INT_0_MASK_GPCERROR				0x00000080
+#define A7XX_RBBM_INT_0_MASK_SWINTERRUPT			0x00000100
+#define A7XX_RBBM_INT_0_MASK_HWERROR				0x00000200
+#define A7XX_RBBM_INT_0_MASK_CCU_CLEAN_DEPTH_TS			0x00000400
+#define A7XX_RBBM_INT_0_MASK_CCU_CLEAN_COLOR_TS			0x00000800
+#define A7XX_RBBM_INT_0_MASK_CCU_RESOLVE_CLEAN_TS		0x00001000
+#define A7XX_RBBM_INT_0_MASK_PM4CPINTERRUPT			0x00008000
+#define A7XX_RBBM_INT_0_MASK_PM4CPINTERRUPTLPAC			0x00010000
+#define A7XX_RBBM_INT_0_MASK_RB_DONE_TS				0x00020000
+#define A7XX_RBBM_INT_0_MASK_CACHE_CLEAN_TS			0x00100000
+#define A7XX_RBBM_INT_0_MASK_CACHE_CLEAN_TS_LPAC		0x00200000
+#define A7XX_RBBM_INT_0_MASK_ATBBUSOVERFLOW			0x00400000
+#define A7XX_RBBM_INT_0_MASK_HANGDETECTINTERRUPT		0x00800000
+#define A7XX_RBBM_INT_0_MASK_OUTOFBOUNDACCESS			0x01000000
+#define A7XX_RBBM_INT_0_MASK_UCHETRAPINTERRUPT			0x02000000
+#define A7XX_RBBM_INT_0_MASK_DEBUGBUSINTERRUPT0			0x04000000
+#define A7XX_RBBM_INT_0_MASK_DEBUGBUSINTERRUPT1			0x08000000
+#define A7XX_RBBM_INT_0_MASK_TSBWRITEERROR			0x10000000
+#define A7XX_RBBM_INT_0_MASK_ISDBCPUIRQ				0x40000000
+#define A7XX_RBBM_INT_0_MASK_ISDBUNDERDEBUG			0x80000000
+
+#define REG_A7XX_RBBM_INT_2_MASK				0x0000003a
+
+#define REG_A7XX_RBBM_SP_HYST_CNT				0x00000042
+
+#define REG_A7XX_RBBM_SW_RESET_CMD				0x00000043
+
+#define REG_A7XX_RBBM_RAC_THRESHOLD_CNT				0x00000044
+
+#define REG_A7XX_RBBM_CLOCK_CNTL				0x000000ae
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_SP0				0x000000b0
+
+#define REG_A7XX_RBBM_CLOCK_CNTL2_SP0				0x000000b4
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_SP0				0x000000b8
+
+#define REG_A7XX_RBBM_CLOCK_HYST_SP0				0x000000bc
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_TP0				0x000000c0
+
+#define REG_A7XX_RBBM_CLOCK_CNTL2_TP0				0x000000c4
+
+#define REG_A7XX_RBBM_CLOCK_CNTL3_TP0				0x000000c8
+
+#define REG_A7XX_RBBM_CLOCK_CNTL4_TP0				0x000000cc
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_TP0				0x000000d0
+
+#define REG_A7XX_RBBM_CLOCK_DELAY2_TP0				0x000000d4
+
+#define REG_A7XX_RBBM_CLOCK_DELAY3_TP0				0x000000d8
+
+#define REG_A7XX_RBBM_CLOCK_DELAY4_TP0				0x000000dc
+
+#define REG_A7XX_RBBM_CLOCK_HYST_TP0				0x000000e0
+
+#define REG_A7XX_RBBM_CLOCK_HYST2_TP0				0x000000e4
+
+#define REG_A7XX_RBBM_CLOCK_HYST3_TP0				0x000000e8
+
+#define REG_A7XX_RBBM_CLOCK_HYST4_TP0				0x000000ec
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_RB0				0x000000f0
+
+#define REG_A7XX_RBBM_CLOCK_CNTL2_RB0				0x000000f4
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_CCU0				0x000000f8
+
+#define REG_A7XX_RBBM_CLOCK_HYST_RB_CCU0			0x00000100
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_RAC				0x00000104
+
+#define REG_A7XX_RBBM_CLOCK_CNTL2_RAC				0x00000105
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_RAC				0x00000106
+
+#define REG_A7XX_RBBM_CLOCK_HYST_RAC				0x00000107
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM			0x00000108
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM			0x00000109
+
+#define REG_A7XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM			0x0000010a
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_UCHE				0x0000010b
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_UCHE				0x0000010f
+
+#define REG_A7XX_RBBM_CLOCK_HYST_UCHE				0x00000110
+
+#define REG_A7XX_RBBM_CLOCK_MODE_VFD				0x00000111
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_VFD				0x00000112
+
+#define REG_A7XX_RBBM_CLOCK_HYST_VFD				0x00000113
+
+#define REG_A7XX_RBBM_CLOCK_MODE_GPC				0x00000114
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_GPC				0x00000115
+
+#define REG_A7XX_RBBM_CLOCK_HYST_GPC				0x00000116
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_HLSQ_2			0x00000117
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_GMU_GX				0x00000118
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_GMU_GX			0x00000119
+
+#define REG_A7XX_RBBM_CLOCK_HYST_GMU_GX				0x0000011a
+
+#define REG_A7XX_RBBM_CLOCK_MODE_HLSQ				0x0000011b
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_HLSQ				0x0000011c
+
+#define REG_A7XX_RBBM_CLOCK_HYST_HLSQ				0x0000011d
+
+#define REG_A7XX_RBBM_INT_0_STATUS				0x00000201
+
+#define REG_A7XX_RBBM_STATUS					0x00000210
+#define A7XX_RBBM_STATUS_CPAHBBUSYCXMASTER			0x00000001
+#define A7XX_RBBM_STATUS_CPAHBBUSYCPMASTER			0x00000002
+#define A7XX_RBBM_STATUS_CPBUSY					0x00000004
+#define A7XX_RBBM_STATUS_GFXDBGCBUSY				0x00000008
+#define A7XX_RBBM_STATUS_VBIFGXFPARTBUSY			0x00000010
+#define A7XX_RBBM_STATUS_TSEBUSY				0x00000020
+#define A7XX_RBBM_STATUS_RASBUSY				0x00000040
+#define A7XX_RBBM_STATUS_RBBUSY					0x00000080
+#define A7XX_RBBM_STATUS_CCUBUSY				0x00000100
+#define A7XX_RBBM_STATUS_A2DBUSY				0x00000200
+#define A7XX_RBBM_STATUS_LRZBUSY				0x00000400
+#define A7XX_RBBM_STATUS_COMDCOMBUSY				0x00000800
+#define A7XX_RBBM_STATUS_PCDCALLBUSY				0x00001000
+#define A7XX_RBBM_STATUS_PCVSDBUSY				0x00002000
+#define A7XX_RBBM_STATUS_TESSBUSY				0x00004000
+#define A7XX_RBBM_STATUS_VFDBUSY				0x00008000
+#define A7XX_RBBM_STATUS_VPCBUSY				0x00010000
+#define A7XX_RBBM_STATUS_UCHEBUSY				0x00020000
+#define A7XX_RBBM_STATUS_SPBUSY					0x00040000
+#define A7XX_RBBM_STATUS_TPL1BUSY				0x00080000
+#define A7XX_RBBM_STATUS_VSCBUSY				0x00100000
+#define A7XX_RBBM_STATUS_HLSQBUSY				0x00200000
+#define A7XX_RBBM_STATUS_GPUBUSYIGNAHBCP			0x00400000
+#define A7XX_RBBM_STATUS_GPUBUSYIGNAHB				0x00800000
+
+#define REG_A7XX_RBBM_STATUS3					0x00000213
+
+#define REG_A7XX_RBBM_CLOCK_MODE_CP				0x00000260
+
+#define REG_A7XX_RBBM_CLOCK_MODE_BV_LRZ				0x00000284
+
+#define REG_A7XX_RBBM_CLOCK_MODE_BV_GRAS			0x00000285
+
+#define REG_A7XX_RBBM_CLOCK_MODE2_GRAS				0x00000286
+
+#define REG_A7XX_RBBM_CLOCK_MODE_BV_VFD				0x00000287
+
+#define REG_A7XX_RBBM_CLOCK_MODE_BV_GPC				0x00000288
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_CP(uint32_t i0) { return 0x00000300 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_RBBM(uint32_t i0) { return 0x0000031c + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_PC(uint32_t i0) { return 0x00000324 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_VFD(uint32_t i0) { return 0x00000334 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_HLSQ(uint32_t i0) { return 0x00000344 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_VPC(uint32_t i0) { return 0x00000350 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_CCU(uint32_t i0) { return 0x0000035c + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_TSE(uint32_t i0) { return 0x00000366 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_RAS(uint32_t i0) { return 0x0000036e + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_UCHE(uint32_t i0) { return 0x00000376 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_TP(uint32_t i0) { return 0x0000038e + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_SP(uint32_t i0) { return 0x000003a6 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_RB(uint32_t i0) { return 0x000003d6 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_VSC(uint32_t i0) { return 0x000003e6 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_LRZ(uint32_t i0) { return 0x000003ea + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_CMP(uint32_t i0) { return 0x000003f2 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_UFC(uint32_t i0) { return 0x000003fa + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR2_HLSQ(uint32_t i0) { return 0x00000410 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR2_CP(uint32_t i0) { return 0x0000041c + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR2_SP(uint32_t i0) { return 0x0000042a + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR2_TP(uint32_t i0) { return 0x00000442 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR2_UFC(uint32_t i0) { return 0x0000044e + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_BV_PC(uint32_t i0) { return 0x00000460 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_BV_VFD(uint32_t i0) { return 0x00000470 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_BV_VPC(uint32_t i0) { return 0x00000480 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_BV_TSE(uint32_t i0) { return 0x0000048c + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_BV_RAS(uint32_t i0) { return 0x00000494 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_BV_LRZ(uint32_t i0) { return 0x0000049c + 0x2*i0; }
+
+#define REG_A7XX_RBBM_PERFCTR_CNTL				0x00000500
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_RBBM_SEL(uint32_t i0) { return 0x00000507 + 0x1*i0; }
+
+#define REG_A7XX_RBBM_PERFCTR_GPU_BUSY_MASKED			0x0000050b
+
+#define REG_A7XX_RBBM_ISDB_CNT					0x00000533
+
+#define REG_A7XX_RBBM_NC_MODE_CNTL				0x00000534
+
+#define REG_A7XX_RBBM_SNAPSHOT_STATUS				0x00000535
+
+#define REG_A7XX_CP_RB_BASE					0x00000800
+
+#define REG_A7XX_CP_RB_CNTL					0x00000802
+
+#define REG_A7XX_CP_RB_RPTR_ADDR				0x00000804
+
+#define REG_A7XX_CP_RB_RPTR					0x00000806
+
+#define REG_A7XX_CP_RB_WPTR					0x00000807
+
+#define REG_A7XX_CP_SQE_CNTL					0x00000808
+
+#define REG_A7XX_CP_CP2GMU_STATUS				0x00000812
+
+#define REG_A7XX_CP_HW_FAULT					0x00000821
+
+#define REG_A7XX_CP_INTERRUPT_STATUS				0x00000823
+#define A7XX_CP_INTERRUPT_STATUS_OPCODEERROR			0x00000001
+#define A7XX_CP_INTERRUPT_STATUS_UCODEERROR			0x00000002
+#define A7XX_CP_INTERRUPT_STATUS_CPHWFAULT			0x00000004
+#define A7XX_CP_INTERRUPT_STATUS_REGISTERPROTECTION		0x00000010
+#define A7XX_CP_INTERRUPT_STATUS_VSDPARITYERROR			0x00000040
+#define A7XX_CP_INTERRUPT_STATUS_ILLEGALINSTRUCTION		0x00000080
+#define A7XX_CP_INTERRUPT_STATUS_OPCODEERRORLPAC		0x00000100
+#define A7XX_CP_INTERRUPT_STATUS_UCODEERRORLPAC			0x00000200
+#define A7XX_CP_INTERRUPT_STATUS_CPHWFAULTLPAC			0x00000400
+#define A7XX_CP_INTERRUPT_STATUS_REGISTERPROTECTIONLPAC		0x00000800
+#define A7XX_CP_INTERRUPT_STATUS_ILLEGALINSTRUCTIONLPAC		0x00001000
+#define A7XX_CP_INTERRUPT_STATUS_OPCODEERRORBV			0x00002000
+#define A7XX_CP_INTERRUPT_STATUS_UCODEERRORBV			0x00004000
+#define A7XX_CP_INTERRUPT_STATUS_CPHWFAULTBV			0x00008000
+#define A7XX_CP_INTERRUPT_STATUS_REGISTERPROTECTIONBV		0x00010000
+#define A7XX_CP_INTERRUPT_STATUS_ILLEGALINSTRUCTIONBV		0x00020000
+
+#define REG_A7XX_CP_PROTECT_STATUS				0x00000824
+
+#define REG_A7XX_CP_STATUS_1					0x00000825
+
+#define REG_A7XX_CP_SQE_INSTR_BASE				0x00000830
+
+#define REG_A7XX_CP_MISC_CNTL					0x00000840
+
+#define REG_A7XX_CP_CHICKEN_DBG					0x00000841
+
+#define REG_A7XX_CP_DBG_ECO_CNTL				0x00000843
+
+#define REG_A7XX_CP_APRIV_CNTL					0x00000844
+
+#define REG_A7XX_CP_PROTECT_CNTL				0x0000084f
+
+static inline uint32_t REG_A7XX_CP_PROTECT_REG(uint32_t i0) { return 0x00000850 + 0x1*i0; }
+
+#define REG_A7XX_CP_CONTEXT_SWITCH_CNTL				0x000008a0
+
+#define REG_A7XX_CP_CONTEXT_SWITCH_SMMU_INFO			0x000008a1
+
+#define REG_A7XX_CP_CONTEXT_SWITCH_PRIV_NON_SECURE_RESTORE_ADDR	0x000008a3
+
+#define REG_A7XX_CP_CONTEXT_SWITCH_PRIV_SECURE_RESTORE_ADDR	0x000008a5
+
+#define REG_A7XX_CP_CONTEXT_SWITCH_NON_PRIV_RESTORE_ADDR	0x000008a7
+
+#define REG_A7XX_CP_CONTEXT_SWITCH_LEVEL_STATUS			0x000008ab
+
+static inline uint32_t REG_A7XX_CP_PERFCTR_CP_SEL(uint32_t i0) { return 0x000008d0 + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_CP_BV_PERFCTR_CP_SEL(uint32_t i0) { return 0x000008e0 + 0x1*i0; }
+
+#define REG_A7XX_CP_CRASH_SCRIPT_BASE				0x00000900
+
+#define REG_A7XX_CP_CRASH_DUMP_CNTL				0x00000902
+
+#define REG_A7XX_CP_CRASH_DUMP_STATUS				0x00000903
+
+#define REG_A7XX_CP_SQE_STAT_ADDR				0x00000908
+
+#define REG_A7XX_CP_SQE_STAT_DATA				0x00000909
+
+#define REG_A7XX_CP_DRAW_STATE_ADDR				0x0000090a
+
+#define REG_A7XX_CP_DRAW_STATE_DATA				0x0000090b
+
+#define REG_A7XX_CP_ROQ_DBG_ADDR				0x0000090c
+
+#define REG_A7XX_CP_ROQ_DBG_DATA				0x0000090d
+
+#define REG_A7XX_CP_MEM_POOL_DBG_ADDR				0x0000090e
+
+#define REG_A7XX_CP_MEM_POOL_DBG_DATA				0x0000090f
+
+#define REG_A7XX_CP_SQE_UCODE_DBG_ADDR				0x00000910
+
+#define REG_A7XX_CP_SQE_UCODE_DBG_DATA				0x00000911
+
+#define REG_A7XX_CP_IB1_BASE					0x00000928
+
+#define REG_A7XX_CP_IB1_REM_SIZE				0x0000092a
+
+#define REG_A7XX_CP_IB2_BASE					0x0000092b
+
+#define REG_A7XX_CP_IB2_REM_SIZE				0x0000092d
+
+#define REG_A7XX_CP_ALWAYS_ON_COUNTER				0x00000980
+
+#define REG_A7XX_CP_AHB_CNTL					0x0000098d
+
+#define REG_A7XX_CP_APERTURE_CNTL_HOST				0x00000a00
+
+#define REG_A7XX_CP_APERTURE_CNTL_CD				0x00000a03
+
+#define REG_A7XX_CP_BV_PROTECT_STATUS				0x00000a61
+
+#define REG_A7XX_CP_BV_HW_FAULT					0x00000a64
+
+#define REG_A7XX_CP_BV_DRAW_STATE_ADDR				0x00000a81
+
+#define REG_A7XX_CP_BV_DRAW_STATE_DATA				0x00000a82
+
+#define REG_A7XX_CP_BV_ROQ_DBG_ADDR				0x00000a83
+
+#define REG_A7XX_CP_BV_ROQ_DBG_DATA				0x00000a84
+
+#define REG_A7XX_CP_BV_SQE_UCODE_DBG_ADDR			0x00000a85
+
+#define REG_A7XX_CP_BV_SQE_UCODE_DBG_DATA			0x00000a86
+
+#define REG_A7XX_CP_BV_SQE_STAT_ADDR				0x00000a87
+
+#define REG_A7XX_CP_BV_SQE_STAT_DATA				0x00000a88
+
+#define REG_A7XX_CP_BV_MEM_POOL_DBG_ADDR			0x00000a96
+
+#define REG_A7XX_CP_BV_MEM_POOL_DBG_DATA			0x00000a97
+
+#define REG_A7XX_CP_BV_RB_RPTR_ADDR				0x00000a98
+
+#define REG_A7XX_CP_RESOURCE_TBL_DBG_ADDR			0x00000a9a
+
+#define REG_A7XX_CP_RESOURCE_TBL_DBG_DATA			0x00000a9b
+
+#define REG_A7XX_CP_BV_APRIV_CNTL				0x00000ad0
+
+#define REG_A7XX_CP_BV_CHICKEN_DBG				0x00000ada
+
+#define REG_A7XX_CP_LPAC_DRAW_STATE_ADDR			0x00000b0a
+
+#define REG_A7XX_CP_LPAC_DRAW_STATE_DATA			0x00000b0b
+
+#define REG_A7XX_CP_LPAC_ROQ_DBG_ADDR				0x00000b0c
+
+#define REG_A7XX_CP_SQE_AC_UCODE_DBG_ADDR			0x00000b27
+
+#define REG_A7XX_CP_SQE_AC_UCODE_DBG_DATA			0x00000b28
+
+#define REG_A7XX_CP_SQE_AC_STAT_ADDR				0x00000b29
+
+#define REG_A7XX_CP_SQE_AC_STAT_DATA				0x00000b2a
+
+#define REG_A7XX_CP_LPAC_APRIV_CNTL				0x00000b31
+
+#define REG_A7XX_CP_LPAC_ROQ_DBG_DATA				0x00000b35
+
+#define REG_A7XX_CP_LPAC_FIFO_DBG_DATA				0x00000b36
+
+#define REG_A7XX_CP_LPAC_FIFO_DBG_ADDR				0x00000b40
+
+static inline uint32_t REG_A7XX_VSC_PERFCTR_VSC_SEL(uint32_t i0) { return 0x00000cd8 + 0x1*i0; }
+
+#define REG_A7XX_UCHE_MODE_CNTL					0x00000e01
+
+#define REG_A7XX_UCHE_WRITE_THRU_BASE				0x00000e07
+
+#define REG_A7XX_UCHE_TRAP_BASE					0x00000e09
+
+#define REG_A7XX_UCHE_GMEM_RANGE_MIN				0x00000e0b
+
+#define REG_A7XX_UCHE_GMEM_RANGE_MAX				0x00000e0d
+
+#define REG_A7XX_UCHE_CACHE_WAYS				0x00000e17
+
+#define REG_A7XX_UCHE_CLIENT_PF					0x00000e19
+
+static inline uint32_t REG_A7XX_UCHE_PERFCTR_UCHE_SEL(uint32_t i0) { return 0x00000e1c + 0x1*i0; }
+
+#define REG_A7XX_UCHE_GBIF_GX_CONFIG				0x00000e3a
+
+#define REG_A7XX_UCHE_CMDQ_CONFIG				0x00000e3c
+
+#define REG_A7XX_PDC_GPU_ENABLE_PDC				0x00001140
+
+#define REG_A7XX_PDC_GPU_SEQ_START_ADDR				0x00001148
+
+#define REG_A7XX_VBIF_XIN_HALT_CTRL1				0x00003081
+
+#define REG_A7XX_VBIF_TEST_BUS_OUT_CTRL				0x00003084
+
+#define REG_A7XX_VBIF_TEST_BUS1_CTRL0				0x00003085
+
+#define REG_A7XX_VBIF_TEST_BUS1_CTRL1				0x00003086
+
+#define REG_A7XX_VBIF_TEST_BUS2_CTRL0				0x00003087
+
+#define REG_A7XX_VBIF_TEST_BUS2_CTRL1				0x00003088
+
+#define REG_A7XX_VBIF_TEST_BUS_OUT				0x0000308c
+
+#define REG_A7XX_VBIF_PERF_CNT_SEL0				0x000030d0
+
+#define REG_A7XX_VBIF_PERF_CNT_SEL1				0x000030d1
+
+#define REG_A7XX_VBIF_PERF_CNT_SEL2				0x000030d2
+
+#define REG_A7XX_VBIF_PERF_CNT_SEL3				0x000030d3
+
+#define REG_A7XX_VBIF_PERF_CNT_LOW0				0x000030d8
+
+#define REG_A7XX_VBIF_PERF_CNT_LOW1				0x000030d9
+
+#define REG_A7XX_VBIF_PERF_CNT_LOW2				0x000030da
+
+#define REG_A7XX_VBIF_PERF_CNT_LOW3				0x000030db
+
+#define REG_A7XX_VBIF_PERF_CNT_HIGH0				0x000030e0
+
+#define REG_A7XX_VBIF_PERF_CNT_HIGH1				0x000030e1
+
+#define REG_A7XX_VBIF_PERF_CNT_HIGH2				0x000030e2
+
+#define REG_A7XX_VBIF_PERF_CNT_HIGH3				0x000030e3
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_EN0				0x00003100
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_EN1				0x00003101
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_EN2				0x00003102
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_LOW0				0x00003110
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_LOW1				0x00003111
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_LOW2				0x00003112
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_HIGH0			0x00003118
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_HIGH1			0x00003119
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_HIGH2			0x0000311a
+
+#define REG_A7XX_GBIF_SCACHE_CNTL0				0x00003c01
+
+#define REG_A7XX_GBIF_SCACHE_CNTL1				0x00003c02
+
+#define REG_A7XX_GBIF_QSB_SIDE0					0x00003c03
+
+#define REG_A7XX_GBIF_QSB_SIDE1					0x00003c04
+
+#define REG_A7XX_GBIF_QSB_SIDE2					0x00003c05
+
+#define REG_A7XX_GBIF_QSB_SIDE3					0x00003c06
+
+#define REG_A7XX_GBIF_HALT					0x00003c45
+
+#define REG_A7XX_GBIF_HALT_ACK					0x00003c46
+
+#define REG_A7XX_GBIF_PERF_PWR_CNT_EN				0x00003cc0
+
+#define REG_A7XX_GBIF_PERF_PWR_CNT_CLR				0x00003cc1
+
+#define REG_A7XX_GBIF_PERF_CNT_SEL				0x00003cc2
+
+#define REG_A7XX_GBIF_PERF_PWR_CNT_SEL				0x00003cc3
+
+#define REG_A7XX_GBIF_PERF_CNT_LOW0				0x00003cc4
+
+#define REG_A7XX_GBIF_PERF_CNT_LOW1				0x00003cc5
+
+#define REG_A7XX_GBIF_PERF_CNT_LOW2				0x00003cc6
+
+#define REG_A7XX_GBIF_PERF_CNT_LOW3				0x00003cc7
+
+#define REG_A7XX_GBIF_PERF_CNT_HIGH0				0x00003cc8
+
+#define REG_A7XX_GBIF_PERF_CNT_HIGH1				0x00003cc9
+
+#define REG_A7XX_GBIF_PERF_CNT_HIGH2				0x00003cca
+
+#define REG_A7XX_GBIF_PERF_CNT_HIGH3				0x00003ccb
+
+#define REG_A7XX_GBIF_PWR_CNT_LOW0				0x00003ccc
+
+#define REG_A7XX_GBIF_PWR_CNT_LOW1				0x00003ccd
+
+#define REG_A7XX_GBIF_PWR_CNT_LOW2				0x00003cce
+
+#define REG_A7XX_GBIF_PWR_CNT_HIGH0				0x00003ccf
+
+#define REG_A7XX_GBIF_PWR_CNT_HIGH1				0x00003cd0
+
+#define REG_A7XX_GBIF_PWR_CNT_HIGH2				0x00003cd1
+
+#define REG_A7XX_GRAS_NC_MODE_CNTL				0x00008602
+
+static inline uint32_t REG_A7XX_GRAS_PERFCTR_TSE_SEL(uint32_t i0) { return 0x00008610 + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_GRAS_PERFCTR_RAS_SEL(uint32_t i0) { return 0x00008614 + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_GRAS_PERFCTR_LRZ_SEL(uint32_t i0) { return 0x00008618 + 0x1*i0; }
+
+#define REG_A7XX_RB_NC_MODE_CNTL				0x00008e08
+
+static inline uint32_t REG_A7XX_RB_PERFCTR_RB_SEL(uint32_t i0) { return 0x00008e10 + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_RB_PERFCTR_CCU_SEL(uint32_t i0) { return 0x00008e18 + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_RB_PERFCTR_CMP_SEL(uint32_t i0) { return 0x00008e2c + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_RB_PERFCTR_UFC_SEL(uint32_t i0) { return 0x00008e30 + 0x1*i0; }
+
+#define REG_A7XX_RB_RB_SUB_BLOCK_SEL_CNTL_HOST			0x00008e3b
+
+#define REG_A7XX_RB_RB_SUB_BLOCK_SEL_CNTL_CD			0x00008e3d
+
+#define REG_A7XX_RB_CONTEXT_SWITCH_GMEM_SAVE_RESTORE		0x00008e50
+
+static inline uint32_t REG_A7XX_VPC_PERFCTR_VPC_SEL(uint32_t i0) { return 0x0000960b + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_PC_PERFCTR_PC_SEL(uint32_t i0) { return 0x00009e42 + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_VFD_PERFCTR_VFD_SEL(uint32_t i0) { return 0x0000a610 + 0x1*i0; }
+
+#define REG_A7XX_SP_NC_MODE_CNTL				0x0000ae02
+
+static inline uint32_t REG_A7XX_SP_PERFCTR_HLSQ_SEL(uint32_t i0) { return 0x0000ae60 + 0x1*i0; }
+
+#define REG_A7XX_SP_READ_SEL					0x0000ae6d
+
+static inline uint32_t REG_A7XX_SP_PERFCTR_SP_SEL(uint32_t i0) { return 0x0000ae80 + 0x1*i0; }
+
+#define REG_A7XX_TPL1_NC_MODE_CNTL				0x0000b604
+
+static inline uint32_t REG_A7XX_TPL1_PERFCTR_TP_SEL(uint32_t i0) { return 0x0000b610 + 0x1*i0; }
+
+#define REG_A7XX_SP_AHB_READ_APERTURE				0x0000c000
+
+#define REG_A7XX_RBBM_SECVID_TRUST_CNTL				0x0000f400
+
+#define REG_A7XX_RBBM_SECVID_TSB_TRUSTED_BASE			0x0000f800
+
+#define REG_A7XX_RBBM_SECVID_TSB_TRUSTED_SIZE			0x0000f802
+
+#define REG_A7XX_RBBM_SECVID_TSB_CNTL				0x0000f803
+
+#define REG_A7XX_RBBM_SECVID_TSB_STATUS				0x0000fc00
+
+
+#endif /* A7XX_XML */
diff --git a/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h b/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h
index 7aecf920f9b90..9eeedd261b733 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h
@@ -8,19 +8,20 @@ This file was generated by the rules-ng-ng headergen tool in this git repository
 git clone https://github.com/freedreno/envytools.git
 
 The rules-ng-ng source files this header was generated from are:
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno.xml                     (    594 bytes, from 2021-01-30 18:25:22)
-- /home/robclark/tmp/mesa/src/freedreno/registers/freedreno_copyright.xml        (   1572 bytes, from 2020-12-31 19:26:32)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/a2xx.xml                (  90810 bytes, from 2021-06-21 15:24:24)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/adreno_common.xml       (  14609 bytes, from 2021-11-24 23:05:10)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/adreno_pm4.xml          (  69086 bytes, from 2022-03-03 16:41:33)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/a3xx.xml                (  84231 bytes, from 2021-11-24 23:05:10)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/a4xx.xml                ( 113358 bytes, from 2022-01-31 23:06:21)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/a5xx.xml                ( 149512 bytes, from 2022-01-31 23:06:21)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/a6xx.xml                ( 184954 bytes, from 2022-03-03 16:41:33)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/a6xx_gmu.xml            (  11331 bytes, from 2021-07-22 15:21:56)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/ocmem.xml               (   1773 bytes, from 2021-01-30 18:25:22)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/adreno_control_regs.xml (   6038 bytes, from 2021-07-22 15:21:56)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/adreno_pipe_regs.xml    (   2924 bytes, from 2021-07-22 15:21:56)
+- freedreno/registers/adreno.xml                     (    627 bytes, from 2022-03-27 15:04:47)
+- freedreno/registers/freedreno_copyright.xml        (   1572 bytes, from 2020-11-18 00:17:12)
+- freedreno/registers/adreno/a2xx.xml                (  90810 bytes, from 2021-08-06 17:44:41)
+- freedreno/registers/adreno/adreno_common.xml       (  14631 bytes, from 2022-03-27 14:52:08)
+- freedreno/registers/adreno/adreno_pm4.xml          (  70177 bytes, from 2022-03-27 20:02:31)
+- freedreno/registers/adreno/a3xx.xml                (  84231 bytes, from 2021-08-27 13:03:56)
+- freedreno/registers/adreno/a4xx.xml                ( 113474 bytes, from 2022-03-22 19:23:46)
+- freedreno/registers/adreno/a5xx.xml                ( 149512 bytes, from 2022-03-21 16:05:18)
+- freedreno/registers/adreno/a6xx.xml                ( 184954 bytes, from 2022-03-22 19:23:46)
+- freedreno/registers/adreno/a6xx_gmu.xml            (  11331 bytes, from 2021-08-06 17:44:41)
+- freedreno/registers/adreno/a7xx.xml                (  20004 bytes, from 2022-03-27 20:01:42)
+- freedreno/registers/adreno/ocmem.xml               (   1773 bytes, from 2020-11-18 00:17:12)
+- freedreno/registers/adreno/adreno_control_regs.xml (   6038 bytes, from 2022-03-22 19:23:46)
+- freedreno/registers/adreno/adreno_pipe_regs.xml    (   2924 bytes, from 2022-03-22 19:23:46)
 
 Copyright (C) 2013-2022 by the following authors:
 - Rob Clark <robdclark@gmail.com> (robclark)
@@ -301,6 +302,8 @@ enum adreno_pm4_type3_packets {
 	CP_REG_WRITE = 109,
 	CP_START_BIN = 80,
 	CP_END_BIN = 81,
+	CP_WAIT_TIMESTAMP = 20,
+	CP_THREAD_CONTROL = 23,
 };
 
 enum adreno_state_block {
@@ -482,6 +485,12 @@ enum reg_tracker {
 	UNK_EVENT_WRITE = 4,
 };
 
+enum cp_thread {
+	CP_SET_THREAD_BR = 1,
+	CP_SET_THREAD_BV = 2,
+	CP_SET_THREAD_BOTH = 3,
+};
+
 #define REG_CP_LOAD_STATE_0					0x00000000
 #define CP_LOAD_STATE_0_DST_OFF__MASK				0x0000ffff
 #define CP_LOAD_STATE_0_DST_OFF__SHIFT				0
@@ -2361,5 +2370,33 @@ static inline uint32_t CP_SMMU_TABLE_UPDATE_3_CONTEXTBANK(uint32_t val)
 
 #define REG_CP_START_BIN_BODY_DWORDS				0x00000004
 
+#define REG_CP_THREAD_CONTROL_0					0x00000000
+#define CP_THREAD_CONTROL_0_THREAD__MASK			0x00000003
+#define CP_THREAD_CONTROL_0_THREAD__SHIFT			0
+static inline uint32_t CP_THREAD_CONTROL_0_THREAD(enum cp_thread val)
+{
+	return ((val) << CP_THREAD_CONTROL_0_THREAD__SHIFT) & CP_THREAD_CONTROL_0_THREAD__MASK;
+}
+#define CP_THREAD_CONTROL_0_CONCURRENT_BIN_DISABLE		0x08000000
+#define CP_THREAD_CONTROL_0_SYNC_THREADS			0x80000000
+
+#define REG_CP_WAIT_TIMESTAMP_0					0x00000000
+#define CP_WAIT_TIMESTAMP_0_REF__MASK				0x00000003
+#define CP_WAIT_TIMESTAMP_0_REF__SHIFT				0
+static inline uint32_t CP_WAIT_TIMESTAMP_0_REF(uint32_t val)
+{
+	return ((val) << CP_WAIT_TIMESTAMP_0_REF__SHIFT) & CP_WAIT_TIMESTAMP_0_REF__MASK;
+}
+#define CP_WAIT_TIMESTAMP_0_MEMSPACE__MASK			0x00000010
+#define CP_WAIT_TIMESTAMP_0_MEMSPACE__SHIFT			4
+static inline uint32_t CP_WAIT_TIMESTAMP_0_MEMSPACE(uint32_t val)
+{
+	return ((val) << CP_WAIT_TIMESTAMP_0_MEMSPACE__SHIFT) & CP_WAIT_TIMESTAMP_0_MEMSPACE__MASK;
+}
+
+#define REG_CP_WAIT_TIMESTAMP_ADDR				0x00000001
+
+#define REG_CP_WAIT_TIMESTAMP_VALUE				0x00000003
+
 
 #endif /* ADRENO_PM4_XML */
-- 
2.26.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/4] drm/msm/adreno: update headers
@ 2022-03-27 20:25   ` Jonathan Marek
  0 siblings, 0 replies; 12+ messages in thread
From: Jonathan Marek @ 2022-03-27 20:25 UTC (permalink / raw)
  To: freedreno
  Cc: David Airlie, open list:DRM DRIVER FOR MSM ADRENO GPU,
	Abhinav Kumar, open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list, Sean Paul

Add the a7xx register and PM4 packet definitions needed by the kernel driver.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
---
 drivers/gpu/drm/msm/adreno/a7xx.xml.h       | 666 ++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h |  63 +-
 2 files changed, 716 insertions(+), 13 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/adreno/a7xx.xml.h

diff --git a/drivers/gpu/drm/msm/adreno/a7xx.xml.h b/drivers/gpu/drm/msm/adreno/a7xx.xml.h
new file mode 100644
index 0000000000000..45ef4289ac52b
--- /dev/null
+++ b/drivers/gpu/drm/msm/adreno/a7xx.xml.h
@@ -0,0 +1,666 @@
+#ifndef A7XX_XML
+#define A7XX_XML
+
+/* Autogenerated file, DO NOT EDIT manually!
+
+This file was generated by the rules-ng-ng headergen tool in this git repository:
+http://github.com/freedreno/envytools/
+git clone https://github.com/freedreno/envytools.git
+
+The rules-ng-ng source files this header was generated from are:
+- freedreno/registers/adreno.xml                     (    627 bytes, from 2022-03-27 15:04:47)
+- freedreno/registers/freedreno_copyright.xml        (   1572 bytes, from 2020-11-18 00:17:12)
+- freedreno/registers/adreno/a2xx.xml                (  90810 bytes, from 2021-08-06 17:44:41)
+- freedreno/registers/adreno/adreno_common.xml       (  14631 bytes, from 2022-03-27 14:52:08)
+- freedreno/registers/adreno/adreno_pm4.xml          (  70334 bytes, from 2022-03-27 20:01:26)
+- freedreno/registers/adreno/a3xx.xml                (  84231 bytes, from 2021-08-27 13:03:56)
+- freedreno/registers/adreno/a4xx.xml                ( 113474 bytes, from 2022-03-22 19:23:46)
+- freedreno/registers/adreno/a5xx.xml                ( 149512 bytes, from 2022-03-21 16:05:18)
+- freedreno/registers/adreno/a6xx.xml                ( 184954 bytes, from 2022-03-22 19:23:46)
+- freedreno/registers/adreno/a6xx_gmu.xml            (  11331 bytes, from 2021-08-06 17:44:41)
+- freedreno/registers/adreno/a7xx.xml                (  20004 bytes, from 2022-03-27 20:01:42)
+- freedreno/registers/adreno/ocmem.xml               (   1773 bytes, from 2020-11-18 00:17:12)
+- freedreno/registers/adreno/adreno_control_regs.xml (   6038 bytes, from 2022-03-22 19:23:46)
+- freedreno/registers/adreno/adreno_pipe_regs.xml    (   2924 bytes, from 2022-03-22 19:23:46)
+
+Copyright (C) 2013-2022 by the following authors:
+- Rob Clark <robdclark@gmail.com> (robclark)
+- Ilia Mirkin <imirkin@alum.mit.edu> (imirkin)
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice (including the
+next paragraph) shall be included in all copies or substantial
+portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+*/
+
+
+enum a7xx_event {
+	CCU_INVALIDATE_DEPTH = 24,
+	CCU_INVALIDATE_COLOR = 25,
+	CCU_RESOLVE_CLEAN = 26,
+	CCU_FLUSH_DEPTH = 28,
+	CCU_FLUSH_COLOR = 29,
+	CCU_RESOLVE = 30,
+	CCU_END_RESOLVE_GROUP = 31,
+	CCU_CLEAN_DEPTH = 32,
+	CCU_CLEAN_COLOR = 33,
+	CACHE_RESET = 48,
+	CACHE_CLEAN = 49,
+	CACHE_FLUSH7 = 50,
+	CACHE_INVALIDATE7 = 51,
+};
+
+#define REG_A7XX_RBBM_GBIF_CLIENT_QOS_CNTL			0x00000011
+
+#define REG_A7XX_RBBM_GBIF_HALT					0x00000016
+
+#define REG_A7XX_RBBM_GBIF_HALT_ACK				0x00000017
+
+#define REG_A7XX_RBBM_INTERFACE_HANG_INT_CNTL			0x0000001f
+
+#define REG_A7XX_RBBM_INT_CLEAR_CMD				0x00000037
+
+#define REG_A7XX_RBBM_INT_0_MASK				0x00000038
+#define A7XX_RBBM_INT_0_MASK_GPUIDLE				0x00000001
+#define A7XX_RBBM_INT_0_MASK_AHBERROR				0x00000002
+#define A7XX_RBBM_INT_0_MASK_CPIPCINT0				0x00000010
+#define A7XX_RBBM_INT_0_MASK_CPIPCINT1				0x00000020
+#define A7XX_RBBM_INT_0_MASK_ATBASYNCFIFOOVERFLOW		0x00000040
+#define A7XX_RBBM_INT_0_MASK_GPCERROR				0x00000080
+#define A7XX_RBBM_INT_0_MASK_SWINTERRUPT			0x00000100
+#define A7XX_RBBM_INT_0_MASK_HWERROR				0x00000200
+#define A7XX_RBBM_INT_0_MASK_CCU_CLEAN_DEPTH_TS			0x00000400
+#define A7XX_RBBM_INT_0_MASK_CCU_CLEAN_COLOR_TS			0x00000800
+#define A7XX_RBBM_INT_0_MASK_CCU_RESOLVE_CLEAN_TS		0x00001000
+#define A7XX_RBBM_INT_0_MASK_PM4CPINTERRUPT			0x00008000
+#define A7XX_RBBM_INT_0_MASK_PM4CPINTERRUPTLPAC			0x00010000
+#define A7XX_RBBM_INT_0_MASK_RB_DONE_TS				0x00020000
+#define A7XX_RBBM_INT_0_MASK_CACHE_CLEAN_TS			0x00100000
+#define A7XX_RBBM_INT_0_MASK_CACHE_CLEAN_TS_LPAC		0x00200000
+#define A7XX_RBBM_INT_0_MASK_ATBBUSOVERFLOW			0x00400000
+#define A7XX_RBBM_INT_0_MASK_HANGDETECTINTERRUPT		0x00800000
+#define A7XX_RBBM_INT_0_MASK_OUTOFBOUNDACCESS			0x01000000
+#define A7XX_RBBM_INT_0_MASK_UCHETRAPINTERRUPT			0x02000000
+#define A7XX_RBBM_INT_0_MASK_DEBUGBUSINTERRUPT0			0x04000000
+#define A7XX_RBBM_INT_0_MASK_DEBUGBUSINTERRUPT1			0x08000000
+#define A7XX_RBBM_INT_0_MASK_TSBWRITEERROR			0x10000000
+#define A7XX_RBBM_INT_0_MASK_ISDBCPUIRQ				0x40000000
+#define A7XX_RBBM_INT_0_MASK_ISDBUNDERDEBUG			0x80000000
+
+#define REG_A7XX_RBBM_INT_2_MASK				0x0000003a
+
+#define REG_A7XX_RBBM_SP_HYST_CNT				0x00000042
+
+#define REG_A7XX_RBBM_SW_RESET_CMD				0x00000043
+
+#define REG_A7XX_RBBM_RAC_THRESHOLD_CNT				0x00000044
+
+#define REG_A7XX_RBBM_CLOCK_CNTL				0x000000ae
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_SP0				0x000000b0
+
+#define REG_A7XX_RBBM_CLOCK_CNTL2_SP0				0x000000b4
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_SP0				0x000000b8
+
+#define REG_A7XX_RBBM_CLOCK_HYST_SP0				0x000000bc
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_TP0				0x000000c0
+
+#define REG_A7XX_RBBM_CLOCK_CNTL2_TP0				0x000000c4
+
+#define REG_A7XX_RBBM_CLOCK_CNTL3_TP0				0x000000c8
+
+#define REG_A7XX_RBBM_CLOCK_CNTL4_TP0				0x000000cc
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_TP0				0x000000d0
+
+#define REG_A7XX_RBBM_CLOCK_DELAY2_TP0				0x000000d4
+
+#define REG_A7XX_RBBM_CLOCK_DELAY3_TP0				0x000000d8
+
+#define REG_A7XX_RBBM_CLOCK_DELAY4_TP0				0x000000dc
+
+#define REG_A7XX_RBBM_CLOCK_HYST_TP0				0x000000e0
+
+#define REG_A7XX_RBBM_CLOCK_HYST2_TP0				0x000000e4
+
+#define REG_A7XX_RBBM_CLOCK_HYST3_TP0				0x000000e8
+
+#define REG_A7XX_RBBM_CLOCK_HYST4_TP0				0x000000ec
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_RB0				0x000000f0
+
+#define REG_A7XX_RBBM_CLOCK_CNTL2_RB0				0x000000f4
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_CCU0				0x000000f8
+
+#define REG_A7XX_RBBM_CLOCK_HYST_RB_CCU0			0x00000100
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_RAC				0x00000104
+
+#define REG_A7XX_RBBM_CLOCK_CNTL2_RAC				0x00000105
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_RAC				0x00000106
+
+#define REG_A7XX_RBBM_CLOCK_HYST_RAC				0x00000107
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM			0x00000108
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM			0x00000109
+
+#define REG_A7XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM			0x0000010a
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_UCHE				0x0000010b
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_UCHE				0x0000010f
+
+#define REG_A7XX_RBBM_CLOCK_HYST_UCHE				0x00000110
+
+#define REG_A7XX_RBBM_CLOCK_MODE_VFD				0x00000111
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_VFD				0x00000112
+
+#define REG_A7XX_RBBM_CLOCK_HYST_VFD				0x00000113
+
+#define REG_A7XX_RBBM_CLOCK_MODE_GPC				0x00000114
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_GPC				0x00000115
+
+#define REG_A7XX_RBBM_CLOCK_HYST_GPC				0x00000116
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_HLSQ_2			0x00000117
+
+#define REG_A7XX_RBBM_CLOCK_CNTL_GMU_GX				0x00000118
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_GMU_GX			0x00000119
+
+#define REG_A7XX_RBBM_CLOCK_HYST_GMU_GX				0x0000011a
+
+#define REG_A7XX_RBBM_CLOCK_MODE_HLSQ				0x0000011b
+
+#define REG_A7XX_RBBM_CLOCK_DELAY_HLSQ				0x0000011c
+
+#define REG_A7XX_RBBM_CLOCK_HYST_HLSQ				0x0000011d
+
+#define REG_A7XX_RBBM_INT_0_STATUS				0x00000201
+
+#define REG_A7XX_RBBM_STATUS					0x00000210
+#define A7XX_RBBM_STATUS_CPAHBBUSYCXMASTER			0x00000001
+#define A7XX_RBBM_STATUS_CPAHBBUSYCPMASTER			0x00000002
+#define A7XX_RBBM_STATUS_CPBUSY					0x00000004
+#define A7XX_RBBM_STATUS_GFXDBGCBUSY				0x00000008
+#define A7XX_RBBM_STATUS_VBIFGXFPARTBUSY			0x00000010
+#define A7XX_RBBM_STATUS_TSEBUSY				0x00000020
+#define A7XX_RBBM_STATUS_RASBUSY				0x00000040
+#define A7XX_RBBM_STATUS_RBBUSY					0x00000080
+#define A7XX_RBBM_STATUS_CCUBUSY				0x00000100
+#define A7XX_RBBM_STATUS_A2DBUSY				0x00000200
+#define A7XX_RBBM_STATUS_LRZBUSY				0x00000400
+#define A7XX_RBBM_STATUS_COMDCOMBUSY				0x00000800
+#define A7XX_RBBM_STATUS_PCDCALLBUSY				0x00001000
+#define A7XX_RBBM_STATUS_PCVSDBUSY				0x00002000
+#define A7XX_RBBM_STATUS_TESSBUSY				0x00004000
+#define A7XX_RBBM_STATUS_VFDBUSY				0x00008000
+#define A7XX_RBBM_STATUS_VPCBUSY				0x00010000
+#define A7XX_RBBM_STATUS_UCHEBUSY				0x00020000
+#define A7XX_RBBM_STATUS_SPBUSY					0x00040000
+#define A7XX_RBBM_STATUS_TPL1BUSY				0x00080000
+#define A7XX_RBBM_STATUS_VSCBUSY				0x00100000
+#define A7XX_RBBM_STATUS_HLSQBUSY				0x00200000
+#define A7XX_RBBM_STATUS_GPUBUSYIGNAHBCP			0x00400000
+#define A7XX_RBBM_STATUS_GPUBUSYIGNAHB				0x00800000
+
+#define REG_A7XX_RBBM_STATUS3					0x00000213
+
+#define REG_A7XX_RBBM_CLOCK_MODE_CP				0x00000260
+
+#define REG_A7XX_RBBM_CLOCK_MODE_BV_LRZ				0x00000284
+
+#define REG_A7XX_RBBM_CLOCK_MODE_BV_GRAS			0x00000285
+
+#define REG_A7XX_RBBM_CLOCK_MODE2_GRAS				0x00000286
+
+#define REG_A7XX_RBBM_CLOCK_MODE_BV_VFD				0x00000287
+
+#define REG_A7XX_RBBM_CLOCK_MODE_BV_GPC				0x00000288
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_CP(uint32_t i0) { return 0x00000300 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_RBBM(uint32_t i0) { return 0x0000031c + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_PC(uint32_t i0) { return 0x00000324 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_VFD(uint32_t i0) { return 0x00000334 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_HLSQ(uint32_t i0) { return 0x00000344 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_VPC(uint32_t i0) { return 0x00000350 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_CCU(uint32_t i0) { return 0x0000035c + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_TSE(uint32_t i0) { return 0x00000366 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_RAS(uint32_t i0) { return 0x0000036e + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_UCHE(uint32_t i0) { return 0x00000376 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_TP(uint32_t i0) { return 0x0000038e + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_SP(uint32_t i0) { return 0x000003a6 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_RB(uint32_t i0) { return 0x000003d6 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_VSC(uint32_t i0) { return 0x000003e6 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_LRZ(uint32_t i0) { return 0x000003ea + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_CMP(uint32_t i0) { return 0x000003f2 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_UFC(uint32_t i0) { return 0x000003fa + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR2_HLSQ(uint32_t i0) { return 0x00000410 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR2_CP(uint32_t i0) { return 0x0000041c + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR2_SP(uint32_t i0) { return 0x0000042a + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR2_TP(uint32_t i0) { return 0x00000442 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR2_UFC(uint32_t i0) { return 0x0000044e + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_BV_PC(uint32_t i0) { return 0x00000460 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_BV_VFD(uint32_t i0) { return 0x00000470 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_BV_VPC(uint32_t i0) { return 0x00000480 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_BV_TSE(uint32_t i0) { return 0x0000048c + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_BV_RAS(uint32_t i0) { return 0x00000494 + 0x2*i0; }
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_BV_LRZ(uint32_t i0) { return 0x0000049c + 0x2*i0; }
+
+#define REG_A7XX_RBBM_PERFCTR_CNTL				0x00000500
+
+static inline uint32_t REG_A7XX_RBBM_PERFCTR_RBBM_SEL(uint32_t i0) { return 0x00000507 + 0x1*i0; }
+
+#define REG_A7XX_RBBM_PERFCTR_GPU_BUSY_MASKED			0x0000050b
+
+#define REG_A7XX_RBBM_ISDB_CNT					0x00000533
+
+#define REG_A7XX_RBBM_NC_MODE_CNTL				0x00000534
+
+#define REG_A7XX_RBBM_SNAPSHOT_STATUS				0x00000535
+
+#define REG_A7XX_CP_RB_BASE					0x00000800
+
+#define REG_A7XX_CP_RB_CNTL					0x00000802
+
+#define REG_A7XX_CP_RB_RPTR_ADDR				0x00000804
+
+#define REG_A7XX_CP_RB_RPTR					0x00000806
+
+#define REG_A7XX_CP_RB_WPTR					0x00000807
+
+#define REG_A7XX_CP_SQE_CNTL					0x00000808
+
+#define REG_A7XX_CP_CP2GMU_STATUS				0x00000812
+
+#define REG_A7XX_CP_HW_FAULT					0x00000821
+
+#define REG_A7XX_CP_INTERRUPT_STATUS				0x00000823
+#define A7XX_CP_INTERRUPT_STATUS_OPCODEERROR			0x00000001
+#define A7XX_CP_INTERRUPT_STATUS_UCODEERROR			0x00000002
+#define A7XX_CP_INTERRUPT_STATUS_CPHWFAULT			0x00000004
+#define A7XX_CP_INTERRUPT_STATUS_REGISTERPROTECTION		0x00000010
+#define A7XX_CP_INTERRUPT_STATUS_VSDPARITYERROR			0x00000040
+#define A7XX_CP_INTERRUPT_STATUS_ILLEGALINSTRUCTION		0x00000080
+#define A7XX_CP_INTERRUPT_STATUS_OPCODEERRORLPAC		0x00000100
+#define A7XX_CP_INTERRUPT_STATUS_UCODEERRORLPAC			0x00000200
+#define A7XX_CP_INTERRUPT_STATUS_CPHWFAULTLPAC			0x00000400
+#define A7XX_CP_INTERRUPT_STATUS_REGISTERPROTECTIONLPAC		0x00000800
+#define A7XX_CP_INTERRUPT_STATUS_ILLEGALINSTRUCTIONLPAC		0x00001000
+#define A7XX_CP_INTERRUPT_STATUS_OPCODEERRORBV			0x00002000
+#define A7XX_CP_INTERRUPT_STATUS_UCODEERRORBV			0x00004000
+#define A7XX_CP_INTERRUPT_STATUS_CPHWFAULTBV			0x00008000
+#define A7XX_CP_INTERRUPT_STATUS_REGISTERPROTECTIONBV		0x00010000
+#define A7XX_CP_INTERRUPT_STATUS_ILLEGALINSTRUCTIONBV		0x00020000
+
+#define REG_A7XX_CP_PROTECT_STATUS				0x00000824
+
+#define REG_A7XX_CP_STATUS_1					0x00000825
+
+#define REG_A7XX_CP_SQE_INSTR_BASE				0x00000830
+
+#define REG_A7XX_CP_MISC_CNTL					0x00000840
+
+#define REG_A7XX_CP_CHICKEN_DBG					0x00000841
+
+#define REG_A7XX_CP_DBG_ECO_CNTL				0x00000843
+
+#define REG_A7XX_CP_APRIV_CNTL					0x00000844
+
+#define REG_A7XX_CP_PROTECT_CNTL				0x0000084f
+
+static inline uint32_t REG_A7XX_CP_PROTECT_REG(uint32_t i0) { return 0x00000850 + 0x1*i0; }
+
+#define REG_A7XX_CP_CONTEXT_SWITCH_CNTL				0x000008a0
+
+#define REG_A7XX_CP_CONTEXT_SWITCH_SMMU_INFO			0x000008a1
+
+#define REG_A7XX_CP_CONTEXT_SWITCH_PRIV_NON_SECURE_RESTORE_ADDR	0x000008a3
+
+#define REG_A7XX_CP_CONTEXT_SWITCH_PRIV_SECURE_RESTORE_ADDR	0x000008a5
+
+#define REG_A7XX_CP_CONTEXT_SWITCH_NON_PRIV_RESTORE_ADDR	0x000008a7
+
+#define REG_A7XX_CP_CONTEXT_SWITCH_LEVEL_STATUS			0x000008ab
+
+static inline uint32_t REG_A7XX_CP_PERFCTR_CP_SEL(uint32_t i0) { return 0x000008d0 + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_CP_BV_PERFCTR_CP_SEL(uint32_t i0) { return 0x000008e0 + 0x1*i0; }
+
+#define REG_A7XX_CP_CRASH_SCRIPT_BASE				0x00000900
+
+#define REG_A7XX_CP_CRASH_DUMP_CNTL				0x00000902
+
+#define REG_A7XX_CP_CRASH_DUMP_STATUS				0x00000903
+
+#define REG_A7XX_CP_SQE_STAT_ADDR				0x00000908
+
+#define REG_A7XX_CP_SQE_STAT_DATA				0x00000909
+
+#define REG_A7XX_CP_DRAW_STATE_ADDR				0x0000090a
+
+#define REG_A7XX_CP_DRAW_STATE_DATA				0x0000090b
+
+#define REG_A7XX_CP_ROQ_DBG_ADDR				0x0000090c
+
+#define REG_A7XX_CP_ROQ_DBG_DATA				0x0000090d
+
+#define REG_A7XX_CP_MEM_POOL_DBG_ADDR				0x0000090e
+
+#define REG_A7XX_CP_MEM_POOL_DBG_DATA				0x0000090f
+
+#define REG_A7XX_CP_SQE_UCODE_DBG_ADDR				0x00000910
+
+#define REG_A7XX_CP_SQE_UCODE_DBG_DATA				0x00000911
+
+#define REG_A7XX_CP_IB1_BASE					0x00000928
+
+#define REG_A7XX_CP_IB1_REM_SIZE				0x0000092a
+
+#define REG_A7XX_CP_IB2_BASE					0x0000092b
+
+#define REG_A7XX_CP_IB2_REM_SIZE				0x0000092d
+
+#define REG_A7XX_CP_ALWAYS_ON_COUNTER				0x00000980
+
+#define REG_A7XX_CP_AHB_CNTL					0x0000098d
+
+#define REG_A7XX_CP_APERTURE_CNTL_HOST				0x00000a00
+
+#define REG_A7XX_CP_APERTURE_CNTL_CD				0x00000a03
+
+#define REG_A7XX_CP_BV_PROTECT_STATUS				0x00000a61
+
+#define REG_A7XX_CP_BV_HW_FAULT					0x00000a64
+
+#define REG_A7XX_CP_BV_DRAW_STATE_ADDR				0x00000a81
+
+#define REG_A7XX_CP_BV_DRAW_STATE_DATA				0x00000a82
+
+#define REG_A7XX_CP_BV_ROQ_DBG_ADDR				0x00000a83
+
+#define REG_A7XX_CP_BV_ROQ_DBG_DATA				0x00000a84
+
+#define REG_A7XX_CP_BV_SQE_UCODE_DBG_ADDR			0x00000a85
+
+#define REG_A7XX_CP_BV_SQE_UCODE_DBG_DATA			0x00000a86
+
+#define REG_A7XX_CP_BV_SQE_STAT_ADDR				0x00000a87
+
+#define REG_A7XX_CP_BV_SQE_STAT_DATA				0x00000a88
+
+#define REG_A7XX_CP_BV_MEM_POOL_DBG_ADDR			0x00000a96
+
+#define REG_A7XX_CP_BV_MEM_POOL_DBG_DATA			0x00000a97
+
+#define REG_A7XX_CP_BV_RB_RPTR_ADDR				0x00000a98
+
+#define REG_A7XX_CP_RESOURCE_TBL_DBG_ADDR			0x00000a9a
+
+#define REG_A7XX_CP_RESOURCE_TBL_DBG_DATA			0x00000a9b
+
+#define REG_A7XX_CP_BV_APRIV_CNTL				0x00000ad0
+
+#define REG_A7XX_CP_BV_CHICKEN_DBG				0x00000ada
+
+#define REG_A7XX_CP_LPAC_DRAW_STATE_ADDR			0x00000b0a
+
+#define REG_A7XX_CP_LPAC_DRAW_STATE_DATA			0x00000b0b
+
+#define REG_A7XX_CP_LPAC_ROQ_DBG_ADDR				0x00000b0c
+
+#define REG_A7XX_CP_SQE_AC_UCODE_DBG_ADDR			0x00000b27
+
+#define REG_A7XX_CP_SQE_AC_UCODE_DBG_DATA			0x00000b28
+
+#define REG_A7XX_CP_SQE_AC_STAT_ADDR				0x00000b29
+
+#define REG_A7XX_CP_SQE_AC_STAT_DATA				0x00000b2a
+
+#define REG_A7XX_CP_LPAC_APRIV_CNTL				0x00000b31
+
+#define REG_A7XX_CP_LPAC_ROQ_DBG_DATA				0x00000b35
+
+#define REG_A7XX_CP_LPAC_FIFO_DBG_DATA				0x00000b36
+
+#define REG_A7XX_CP_LPAC_FIFO_DBG_ADDR				0x00000b40
+
+static inline uint32_t REG_A7XX_VSC_PERFCTR_VSC_SEL(uint32_t i0) { return 0x00000cd8 + 0x1*i0; }
+
+#define REG_A7XX_UCHE_MODE_CNTL					0x00000e01
+
+#define REG_A7XX_UCHE_WRITE_THRU_BASE				0x00000e07
+
+#define REG_A7XX_UCHE_TRAP_BASE					0x00000e09
+
+#define REG_A7XX_UCHE_GMEM_RANGE_MIN				0x00000e0b
+
+#define REG_A7XX_UCHE_GMEM_RANGE_MAX				0x00000e0d
+
+#define REG_A7XX_UCHE_CACHE_WAYS				0x00000e17
+
+#define REG_A7XX_UCHE_CLIENT_PF					0x00000e19
+
+static inline uint32_t REG_A7XX_UCHE_PERFCTR_UCHE_SEL(uint32_t i0) { return 0x00000e1c + 0x1*i0; }
+
+#define REG_A7XX_UCHE_GBIF_GX_CONFIG				0x00000e3a
+
+#define REG_A7XX_UCHE_CMDQ_CONFIG				0x00000e3c
+
+#define REG_A7XX_PDC_GPU_ENABLE_PDC				0x00001140
+
+#define REG_A7XX_PDC_GPU_SEQ_START_ADDR				0x00001148
+
+#define REG_A7XX_VBIF_XIN_HALT_CTRL1				0x00003081
+
+#define REG_A7XX_VBIF_TEST_BUS_OUT_CTRL				0x00003084
+
+#define REG_A7XX_VBIF_TEST_BUS1_CTRL0				0x00003085
+
+#define REG_A7XX_VBIF_TEST_BUS1_CTRL1				0x00003086
+
+#define REG_A7XX_VBIF_TEST_BUS2_CTRL0				0x00003087
+
+#define REG_A7XX_VBIF_TEST_BUS2_CTRL1				0x00003088
+
+#define REG_A7XX_VBIF_TEST_BUS_OUT				0x0000308c
+
+#define REG_A7XX_VBIF_PERF_CNT_SEL0				0x000030d0
+
+#define REG_A7XX_VBIF_PERF_CNT_SEL1				0x000030d1
+
+#define REG_A7XX_VBIF_PERF_CNT_SEL2				0x000030d2
+
+#define REG_A7XX_VBIF_PERF_CNT_SEL3				0x000030d3
+
+#define REG_A7XX_VBIF_PERF_CNT_LOW0				0x000030d8
+
+#define REG_A7XX_VBIF_PERF_CNT_LOW1				0x000030d9
+
+#define REG_A7XX_VBIF_PERF_CNT_LOW2				0x000030da
+
+#define REG_A7XX_VBIF_PERF_CNT_LOW3				0x000030db
+
+#define REG_A7XX_VBIF_PERF_CNT_HIGH0				0x000030e0
+
+#define REG_A7XX_VBIF_PERF_CNT_HIGH1				0x000030e1
+
+#define REG_A7XX_VBIF_PERF_CNT_HIGH2				0x000030e2
+
+#define REG_A7XX_VBIF_PERF_CNT_HIGH3				0x000030e3
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_EN0				0x00003100
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_EN1				0x00003101
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_EN2				0x00003102
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_LOW0				0x00003110
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_LOW1				0x00003111
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_LOW2				0x00003112
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_HIGH0			0x00003118
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_HIGH1			0x00003119
+
+#define REG_A7XX_VBIF_PERF_PWR_CNT_HIGH2			0x0000311a
+
+#define REG_A7XX_GBIF_SCACHE_CNTL0				0x00003c01
+
+#define REG_A7XX_GBIF_SCACHE_CNTL1				0x00003c02
+
+#define REG_A7XX_GBIF_QSB_SIDE0					0x00003c03
+
+#define REG_A7XX_GBIF_QSB_SIDE1					0x00003c04
+
+#define REG_A7XX_GBIF_QSB_SIDE2					0x00003c05
+
+#define REG_A7XX_GBIF_QSB_SIDE3					0x00003c06
+
+#define REG_A7XX_GBIF_HALT					0x00003c45
+
+#define REG_A7XX_GBIF_HALT_ACK					0x00003c46
+
+#define REG_A7XX_GBIF_PERF_PWR_CNT_EN				0x00003cc0
+
+#define REG_A7XX_GBIF_PERF_PWR_CNT_CLR				0x00003cc1
+
+#define REG_A7XX_GBIF_PERF_CNT_SEL				0x00003cc2
+
+#define REG_A7XX_GBIF_PERF_PWR_CNT_SEL				0x00003cc3
+
+#define REG_A7XX_GBIF_PERF_CNT_LOW0				0x00003cc4
+
+#define REG_A7XX_GBIF_PERF_CNT_LOW1				0x00003cc5
+
+#define REG_A7XX_GBIF_PERF_CNT_LOW2				0x00003cc6
+
+#define REG_A7XX_GBIF_PERF_CNT_LOW3				0x00003cc7
+
+#define REG_A7XX_GBIF_PERF_CNT_HIGH0				0x00003cc8
+
+#define REG_A7XX_GBIF_PERF_CNT_HIGH1				0x00003cc9
+
+#define REG_A7XX_GBIF_PERF_CNT_HIGH2				0x00003cca
+
+#define REG_A7XX_GBIF_PERF_CNT_HIGH3				0x00003ccb
+
+#define REG_A7XX_GBIF_PWR_CNT_LOW0				0x00003ccc
+
+#define REG_A7XX_GBIF_PWR_CNT_LOW1				0x00003ccd
+
+#define REG_A7XX_GBIF_PWR_CNT_LOW2				0x00003cce
+
+#define REG_A7XX_GBIF_PWR_CNT_HIGH0				0x00003ccf
+
+#define REG_A7XX_GBIF_PWR_CNT_HIGH1				0x00003cd0
+
+#define REG_A7XX_GBIF_PWR_CNT_HIGH2				0x00003cd1
+
+#define REG_A7XX_GRAS_NC_MODE_CNTL				0x00008602
+
+static inline uint32_t REG_A7XX_GRAS_PERFCTR_TSE_SEL(uint32_t i0) { return 0x00008610 + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_GRAS_PERFCTR_RAS_SEL(uint32_t i0) { return 0x00008614 + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_GRAS_PERFCTR_LRZ_SEL(uint32_t i0) { return 0x00008618 + 0x1*i0; }
+
+#define REG_A7XX_RB_NC_MODE_CNTL				0x00008e08
+
+static inline uint32_t REG_A7XX_RB_PERFCTR_RB_SEL(uint32_t i0) { return 0x00008e10 + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_RB_PERFCTR_CCU_SEL(uint32_t i0) { return 0x00008e18 + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_RB_PERFCTR_CMP_SEL(uint32_t i0) { return 0x00008e2c + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_RB_PERFCTR_UFC_SEL(uint32_t i0) { return 0x00008e30 + 0x1*i0; }
+
+#define REG_A7XX_RB_RB_SUB_BLOCK_SEL_CNTL_HOST			0x00008e3b
+
+#define REG_A7XX_RB_RB_SUB_BLOCK_SEL_CNTL_CD			0x00008e3d
+
+#define REG_A7XX_RB_CONTEXT_SWITCH_GMEM_SAVE_RESTORE		0x00008e50
+
+static inline uint32_t REG_A7XX_VPC_PERFCTR_VPC_SEL(uint32_t i0) { return 0x0000960b + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_PC_PERFCTR_PC_SEL(uint32_t i0) { return 0x00009e42 + 0x1*i0; }
+
+static inline uint32_t REG_A7XX_VFD_PERFCTR_VFD_SEL(uint32_t i0) { return 0x0000a610 + 0x1*i0; }
+
+#define REG_A7XX_SP_NC_MODE_CNTL				0x0000ae02
+
+static inline uint32_t REG_A7XX_SP_PERFCTR_HLSQ_SEL(uint32_t i0) { return 0x0000ae60 + 0x1*i0; }
+
+#define REG_A7XX_SP_READ_SEL					0x0000ae6d
+
+static inline uint32_t REG_A7XX_SP_PERFCTR_SP_SEL(uint32_t i0) { return 0x0000ae80 + 0x1*i0; }
+
+#define REG_A7XX_TPL1_NC_MODE_CNTL				0x0000b604
+
+static inline uint32_t REG_A7XX_TPL1_PERFCTR_TP_SEL(uint32_t i0) { return 0x0000b610 + 0x1*i0; }
+
+#define REG_A7XX_SP_AHB_READ_APERTURE				0x0000c000
+
+#define REG_A7XX_RBBM_SECVID_TRUST_CNTL				0x0000f400
+
+#define REG_A7XX_RBBM_SECVID_TSB_TRUSTED_BASE			0x0000f800
+
+#define REG_A7XX_RBBM_SECVID_TSB_TRUSTED_SIZE			0x0000f802
+
+#define REG_A7XX_RBBM_SECVID_TSB_CNTL				0x0000f803
+
+#define REG_A7XX_RBBM_SECVID_TSB_STATUS				0x0000fc00
+
+
+#endif /* A7XX_XML */
diff --git a/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h b/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h
index 7aecf920f9b90..9eeedd261b733 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h
@@ -8,19 +8,20 @@ This file was generated by the rules-ng-ng headergen tool in this git repository
 git clone https://github.com/freedreno/envytools.git
 
 The rules-ng-ng source files this header was generated from are:
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno.xml                     (    594 bytes, from 2021-01-30 18:25:22)
-- /home/robclark/tmp/mesa/src/freedreno/registers/freedreno_copyright.xml        (   1572 bytes, from 2020-12-31 19:26:32)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/a2xx.xml                (  90810 bytes, from 2021-06-21 15:24:24)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/adreno_common.xml       (  14609 bytes, from 2021-11-24 23:05:10)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/adreno_pm4.xml          (  69086 bytes, from 2022-03-03 16:41:33)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/a3xx.xml                (  84231 bytes, from 2021-11-24 23:05:10)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/a4xx.xml                ( 113358 bytes, from 2022-01-31 23:06:21)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/a5xx.xml                ( 149512 bytes, from 2022-01-31 23:06:21)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/a6xx.xml                ( 184954 bytes, from 2022-03-03 16:41:33)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/a6xx_gmu.xml            (  11331 bytes, from 2021-07-22 15:21:56)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/ocmem.xml               (   1773 bytes, from 2021-01-30 18:25:22)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/adreno_control_regs.xml (   6038 bytes, from 2021-07-22 15:21:56)
-- /home/robclark/tmp/mesa/src/freedreno/registers/adreno/adreno_pipe_regs.xml    (   2924 bytes, from 2021-07-22 15:21:56)
+- freedreno/registers/adreno.xml                     (    627 bytes, from 2022-03-27 15:04:47)
+- freedreno/registers/freedreno_copyright.xml        (   1572 bytes, from 2020-11-18 00:17:12)
+- freedreno/registers/adreno/a2xx.xml                (  90810 bytes, from 2021-08-06 17:44:41)
+- freedreno/registers/adreno/adreno_common.xml       (  14631 bytes, from 2022-03-27 14:52:08)
+- freedreno/registers/adreno/adreno_pm4.xml          (  70177 bytes, from 2022-03-27 20:02:31)
+- freedreno/registers/adreno/a3xx.xml                (  84231 bytes, from 2021-08-27 13:03:56)
+- freedreno/registers/adreno/a4xx.xml                ( 113474 bytes, from 2022-03-22 19:23:46)
+- freedreno/registers/adreno/a5xx.xml                ( 149512 bytes, from 2022-03-21 16:05:18)
+- freedreno/registers/adreno/a6xx.xml                ( 184954 bytes, from 2022-03-22 19:23:46)
+- freedreno/registers/adreno/a6xx_gmu.xml            (  11331 bytes, from 2021-08-06 17:44:41)
+- freedreno/registers/adreno/a7xx.xml                (  20004 bytes, from 2022-03-27 20:01:42)
+- freedreno/registers/adreno/ocmem.xml               (   1773 bytes, from 2020-11-18 00:17:12)
+- freedreno/registers/adreno/adreno_control_regs.xml (   6038 bytes, from 2022-03-22 19:23:46)
+- freedreno/registers/adreno/adreno_pipe_regs.xml    (   2924 bytes, from 2022-03-22 19:23:46)
 
 Copyright (C) 2013-2022 by the following authors:
 - Rob Clark <robdclark@gmail.com> (robclark)
@@ -301,6 +302,8 @@ enum adreno_pm4_type3_packets {
 	CP_REG_WRITE = 109,
 	CP_START_BIN = 80,
 	CP_END_BIN = 81,
+	CP_WAIT_TIMESTAMP = 20,
+	CP_THREAD_CONTROL = 23,
 };
 
 enum adreno_state_block {
@@ -482,6 +485,12 @@ enum reg_tracker {
 	UNK_EVENT_WRITE = 4,
 };
 
+enum cp_thread {
+	CP_SET_THREAD_BR = 1,
+	CP_SET_THREAD_BV = 2,
+	CP_SET_THREAD_BOTH = 3,
+};
+
 #define REG_CP_LOAD_STATE_0					0x00000000
 #define CP_LOAD_STATE_0_DST_OFF__MASK				0x0000ffff
 #define CP_LOAD_STATE_0_DST_OFF__SHIFT				0
@@ -2361,5 +2370,33 @@ static inline uint32_t CP_SMMU_TABLE_UPDATE_3_CONTEXTBANK(uint32_t val)
 
 #define REG_CP_START_BIN_BODY_DWORDS				0x00000004
 
+#define REG_CP_THREAD_CONTROL_0					0x00000000
+#define CP_THREAD_CONTROL_0_THREAD__MASK			0x00000003
+#define CP_THREAD_CONTROL_0_THREAD__SHIFT			0
+static inline uint32_t CP_THREAD_CONTROL_0_THREAD(enum cp_thread val)
+{
+	return ((val) << CP_THREAD_CONTROL_0_THREAD__SHIFT) & CP_THREAD_CONTROL_0_THREAD__MASK;
+}
+#define CP_THREAD_CONTROL_0_CONCURRENT_BIN_DISABLE		0x08000000
+#define CP_THREAD_CONTROL_0_SYNC_THREADS			0x80000000
+
+#define REG_CP_WAIT_TIMESTAMP_0					0x00000000
+#define CP_WAIT_TIMESTAMP_0_REF__MASK				0x00000003
+#define CP_WAIT_TIMESTAMP_0_REF__SHIFT				0
+static inline uint32_t CP_WAIT_TIMESTAMP_0_REF(uint32_t val)
+{
+	return ((val) << CP_WAIT_TIMESTAMP_0_REF__SHIFT) & CP_WAIT_TIMESTAMP_0_REF__MASK;
+}
+#define CP_WAIT_TIMESTAMP_0_MEMSPACE__MASK			0x00000010
+#define CP_WAIT_TIMESTAMP_0_MEMSPACE__SHIFT			4
+static inline uint32_t CP_WAIT_TIMESTAMP_0_MEMSPACE(uint32_t val)
+{
+	return ((val) << CP_WAIT_TIMESTAMP_0_MEMSPACE__SHIFT) & CP_WAIT_TIMESTAMP_0_MEMSPACE__MASK;
+}
+
+#define REG_CP_WAIT_TIMESTAMP_ADDR				0x00000001
+
+#define REG_CP_WAIT_TIMESTAMP_VALUE				0x00000003
+
 
 #endif /* ADRENO_PM4_XML */
-- 
2.26.1



* [PATCH 4/4] drm/msm/adreno: add support for a730
  2022-03-27 20:25 ` Jonathan Marek
@ 2022-03-27 20:25   ` Jonathan Marek
  -1 siblings, 0 replies; 12+ messages in thread
From: Jonathan Marek @ 2022-03-27 20:25 UTC (permalink / raw)
  To: freedreno
  Cc: Rob Clark, Sean Paul, Abhinav Kumar, David Airlie, Daniel Vetter,
	Akhil P Oommen, Bjorn Andersson, AngeloGioacchino Del Regno,
	Vladimir Lypak, Emma Anholt, open list,
	open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU

Based on a6xx_gpu.c, stripped down and updated for a7xx following the
downstream driver. Implements the minimum needed to submit commands to the
GPU and use it for userspace driver development. Notably, this does not
implement GMU support, which means the clock driver needs to support the
GPU core clock and turning on the GX rail, work that is normally offloaded
to the GMU.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
---
 drivers/gpu/drm/msm/Makefile                |   1 +
 drivers/gpu/drm/msm/adreno/a7xx_gpu.c       | 777 ++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a7xx_gpu.h       |  26 +
 drivers/gpu/drm/msm/adreno/adreno_device.c  |  12 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h     |   3 +-
 drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h |  20 +-
 drivers/gpu/drm/msm/msm_ringbuffer.h        |   1 +
 7 files changed, 820 insertions(+), 20 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/adreno/a7xx_gpu.c
 create mode 100644 drivers/gpu/drm/msm/adreno/a7xx_gpu.h

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index e9cc7d8ac301e..b91e543e42265 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -16,6 +16,7 @@ msm-y := \
 	adreno/a6xx_gpu.o \
 	adreno/a6xx_gmu.o \
 	adreno/a6xx_hfi.o \
+	adreno/a7xx_gpu.o \
 	hdmi/hdmi.o \
 	hdmi/hdmi_audio.o \
 	hdmi/hdmi_bridge.o \
diff --git a/drivers/gpu/drm/msm/adreno/a7xx_gpu.c b/drivers/gpu/drm/msm/adreno/a7xx_gpu.c
new file mode 100644
index 0000000000000..16bdce21b06f2
--- /dev/null
+++ b/drivers/gpu/drm/msm/adreno/a7xx_gpu.c
@@ -0,0 +1,777 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2017-2022 The Linux Foundation. All rights reserved. */
+
+#include "msm_gem.h"
+#include "msm_mmu.h"
+#include "msm_gpu_trace.h"
+#include "a7xx_gpu.h"
+
+#include <linux/bitfield.h>
+
+extern bool hang_debug;
+
+#define GPU_PAS_ID 13
+
+static inline bool _a7xx_check_idle(struct msm_gpu *gpu)
+{
+	/* Check that the CX master is idle */
+	if (gpu_read(gpu, REG_A7XX_RBBM_STATUS) & ~A7XX_RBBM_STATUS_CPAHBBUSYCXMASTER)
+		return false;
+
+	return !(gpu_read(gpu, REG_A7XX_RBBM_INT_0_STATUS) & A7XX_RBBM_INT_0_MASK_HANGDETECTINTERRUPT);
+}
+
+static bool a7xx_idle(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
+{
+	/* wait for CP to drain ringbuffer: */
+	if (!adreno_idle(gpu, ring))
+		return false;
+
+	if (spin_until(_a7xx_check_idle(gpu))) {
+		DRM_ERROR("%s: %ps: timeout waiting for GPU to idle: status %8.8X irq %8.8X rptr/wptr %d/%d\n",
+			gpu->name, __builtin_return_address(0),
+			gpu_read(gpu, REG_A7XX_RBBM_STATUS),
+			gpu_read(gpu, REG_A7XX_RBBM_INT_0_STATUS),
+			gpu_read(gpu, REG_A7XX_CP_RB_RPTR),
+			gpu_read(gpu, REG_A7XX_CP_RB_WPTR));
+		return false;
+	}
+
+	return true;
+}
+
+static void a7xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
+{
+	struct msm_ringbuffer *ring = submit->ring;
+	unsigned int i;
+
+	OUT_PKT7(ring, CP_THREAD_CONTROL, 1);
+	OUT_RING(ring, CP_SET_THREAD_BOTH);
+
+	OUT_PKT7(ring, CP_SET_MARKER, 1);
+	OUT_RING(ring, 0x101); /* IFPC disable */
+
+	OUT_PKT7(ring, CP_SET_MARKER, 1);
+	OUT_RING(ring, 0x00d); /* IB1LIST start */
+
+	/* Submit the commands */
+	for (i = 0; i < submit->nr_cmds; i++) {
+		switch (submit->cmd[i].type) {
+		case MSM_SUBMIT_CMD_IB_TARGET_BUF:
+			break;
+		case MSM_SUBMIT_CMD_CTX_RESTORE_BUF:
+			if (gpu->cur_ctx_seqno == submit->queue->ctx->seqno)
+				break;
+			fallthrough;
+		case MSM_SUBMIT_CMD_BUF:
+			OUT_PKT7(ring, CP_INDIRECT_BUFFER_PFE, 3);
+			OUT_RING(ring, lower_32_bits(submit->cmd[i].iova));
+			OUT_RING(ring, upper_32_bits(submit->cmd[i].iova));
+			OUT_RING(ring, submit->cmd[i].size);
+			break;
+		}
+	}
+
+	OUT_PKT7(ring, CP_SET_MARKER, 1);
+	OUT_RING(ring, 0x00e); /* IB1LIST end */
+
+	OUT_PKT7(ring, CP_THREAD_CONTROL, 1);
+	OUT_RING(ring, CP_SET_THREAD_BR);
+
+	OUT_PKT7(ring, CP_EVENT_WRITE, 1);
+	OUT_RING(ring, CCU_INVALIDATE_DEPTH);
+
+	OUT_PKT7(ring, CP_EVENT_WRITE, 1);
+	OUT_RING(ring, CCU_INVALIDATE_COLOR);
+
+	OUT_PKT7(ring, CP_THREAD_CONTROL, 1);
+	OUT_RING(ring, CP_SET_THREAD_BV);
+
+	/*
+	 * Make sure the timestamp is committed once BV pipe is
+	 * completely done with this submission.
+	 */
+	OUT_PKT7(ring, CP_EVENT_WRITE, 4);
+	OUT_RING(ring, CACHE_CLEAN | BIT(27));
+	OUT_RING(ring, lower_32_bits(rbmemptr(ring, bv_fence)));
+	OUT_RING(ring, upper_32_bits(rbmemptr(ring, bv_fence)));
+	OUT_RING(ring, submit->seqno);
+
+	OUT_PKT7(ring, CP_THREAD_CONTROL, 1);
+	OUT_RING(ring, CP_SET_THREAD_BR);
+
+	/*
+	 * This makes sure that BR doesn't race ahead and commit
+	 * timestamp to memstore while BV is still processing
+	 * this submission.
+	 */
+	OUT_PKT7(ring, CP_WAIT_TIMESTAMP, 4);
+	OUT_RING(ring, 0);
+	OUT_RING(ring, lower_32_bits(rbmemptr(ring, bv_fence)));
+	OUT_RING(ring, upper_32_bits(rbmemptr(ring, bv_fence)));
+	OUT_RING(ring, submit->seqno);
+
+	/* write the ringbuffer timestamp */
+	OUT_PKT7(ring, CP_EVENT_WRITE, 4);
+	OUT_RING(ring, CACHE_CLEAN | BIT(31) | BIT(27));
+	OUT_RING(ring, lower_32_bits(rbmemptr(ring, fence)));
+	OUT_RING(ring, upper_32_bits(rbmemptr(ring, fence)));
+	OUT_RING(ring, submit->seqno);
+
+	OUT_PKT7(ring, CP_THREAD_CONTROL, 1);
+	OUT_RING(ring, CP_SET_THREAD_BOTH);
+
+	OUT_PKT7(ring, CP_SET_MARKER, 1);
+	OUT_RING(ring, 0x100); /* IFPC enable */
+
+	trace_msm_gpu_submit_flush(submit, gpu_read64(gpu, REG_A7XX_CP_ALWAYS_ON_COUNTER));
+
+	adreno_flush(gpu, ring, REG_A7XX_CP_RB_WPTR);
+}
+
+const struct adreno_reglist a730_hwcg[] = {
+	{ REG_A7XX_RBBM_CLOCK_CNTL_SP0, 0x02222222 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL2_SP0, 0x02022222 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_SP0, 0x0000f3cf },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_SP0, 0x00000080 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL_TP0, 0x22222220 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL2_TP0, 0x22222222 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL3_TP0, 0x22222222 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL4_TP0, 0x00222222 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_TP0, 0x77777777 },
+	{ REG_A7XX_RBBM_CLOCK_HYST2_TP0, 0x77777777 },
+	{ REG_A7XX_RBBM_CLOCK_HYST3_TP0, 0x77777777 },
+	{ REG_A7XX_RBBM_CLOCK_HYST4_TP0, 0x00077777 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_TP0, 0x11111111 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY2_TP0, 0x11111111 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY3_TP0, 0x11111111 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY4_TP0, 0x00011111 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL_UCHE, 0x22222222 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_UCHE, 0x00000004 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_UCHE, 0x00000002 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL_RB0, 0x22222222 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL2_RB0, 0x01002222 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL_CCU0, 0x00002220 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_RB_CCU0, 0x44000f00 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL_RAC, 0x25222022 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL2_RAC, 0x00555555 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_RAC, 0x00000011 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_RAC, 0x00440044 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM, 0x04222222 },
+	{ REG_A7XX_RBBM_CLOCK_MODE2_GRAS, 0x00000222 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_BV_GRAS, 0x00222222 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_GPC, 0x02222223 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_VFD, 0x00002222 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_BV_GPC, 0x00222222 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_BV_VFD, 0x00002222 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM, 0x00000000 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_GPC, 0x04104004 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_VFD, 0x00000000 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM, 0x00004000 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_GPC, 0x00000200 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_VFD, 0x00002222 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_HLSQ, 0x00002222 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_HLSQ, 0x00000000 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_HLSQ, 0x00000000 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_HLSQ_2, 0x00000002 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_BV_LRZ, 0x55555552 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_CP, 0x00000223 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL, 0x8aa8aa82 },
+	{ REG_A7XX_RBBM_ISDB_CNT, 0x00000182 },
+	{ REG_A7XX_RBBM_RAC_THRESHOLD_CNT, 0x00000000 },
+	{ REG_A7XX_RBBM_SP_HYST_CNT, 0x00000000 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL_GMU_GX, 0x00000222 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_GMU_GX, 0x00000111 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_GMU_GX, 0x00000555 },
+	{},
+};
+
+#define RBBM_CLOCK_CNTL_ON 0x8aa8aa82
+
+static void a7xx_set_hwcg(struct msm_gpu *gpu, bool state)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	const struct adreno_reglist *reg;
+	unsigned int i;
+	u32 val;
+
+	if (!adreno_gpu->info->hwcg)
+		return;
+
+	val = gpu_read(gpu, REG_A7XX_RBBM_CLOCK_CNTL);
+
+	/* Don't re-program the registers if they are already correct */
+	if ((val == RBBM_CLOCK_CNTL_ON) == state)
+		return;
+
+	for (i = 0; (reg = &adreno_gpu->info->hwcg[i], reg->offset); i++)
+		gpu_write(gpu, reg->offset, state ? reg->value : 0);
+
+	gpu_write(gpu, REG_A7XX_RBBM_CLOCK_CNTL, state ? RBBM_CLOCK_CNTL_ON : 0);
+}
+
+static const u32 a730_protect[] = {
+	A6XX_PROTECT_RDONLY(0x00000, 0x04ff),
+	A6XX_PROTECT_RDONLY(0x0050b, 0x0058),
+	A6XX_PROTECT_NORDWR(0x0050e, 0x0000),
+	A6XX_PROTECT_NORDWR(0x00510, 0x0000),
+	A6XX_PROTECT_NORDWR(0x00534, 0x0000),
+	A6XX_PROTECT_RDONLY(0x005fb, 0x009d),
+	A6XX_PROTECT_NORDWR(0x00699, 0x01e9),
+	A6XX_PROTECT_NORDWR(0x008a0, 0x0008),
+	A6XX_PROTECT_NORDWR(0x008ab, 0x0024),
+	A6XX_PROTECT_RDONLY(0x008d0, 0x0170),
+	A6XX_PROTECT_NORDWR(0x00900, 0x004d),
+	A6XX_PROTECT_NORDWR(0x0098d, 0x00b2),
+	A6XX_PROTECT_NORDWR(0x00a41, 0x01be),
+	A6XX_PROTECT_NORDWR(0x00df0, 0x0001),
+	A6XX_PROTECT_NORDWR(0x00e01, 0x0000),
+	A6XX_PROTECT_NORDWR(0x00e07, 0x0008),
+	A6XX_PROTECT_NORDWR(0x03c00, 0x00c3),
+	A6XX_PROTECT_RDONLY(0x03cc4, 0x1fff),
+	A6XX_PROTECT_NORDWR(0x08630, 0x01cf),
+	A6XX_PROTECT_NORDWR(0x08e00, 0x0000),
+	A6XX_PROTECT_NORDWR(0x08e08, 0x0000),
+	A6XX_PROTECT_NORDWR(0x08e50, 0x001f),
+	A6XX_PROTECT_NORDWR(0x08e80, 0x0280),
+	A6XX_PROTECT_NORDWR(0x09624, 0x01db),
+	A6XX_PROTECT_NORDWR(0x09e40, 0x0000),
+	A6XX_PROTECT_NORDWR(0x09e64, 0x000d),
+	A6XX_PROTECT_NORDWR(0x09e78, 0x0187),
+	A6XX_PROTECT_NORDWR(0x0a630, 0x01cf),
+	A6XX_PROTECT_NORDWR(0x0ae02, 0x0000),
+	A6XX_PROTECT_NORDWR(0x0ae50, 0x000f),
+	A6XX_PROTECT_NORDWR(0x0ae66, 0x0003),
+	A6XX_PROTECT_NORDWR(0x0ae6f, 0x0003),
+	A6XX_PROTECT_NORDWR(0x0b604, 0x0003),
+	A6XX_PROTECT_NORDWR(0x0ec00, 0x0fff),
+	A6XX_PROTECT_RDONLY(0x0fc00, 0x1fff),
+	A6XX_PROTECT_NORDWR(0x18400, 0x0053),
+	A6XX_PROTECT_RDONLY(0x18454, 0x0004),
+	A6XX_PROTECT_NORDWR(0x18459, 0x1fff),
+	A6XX_PROTECT_NORDWR(0x1a459, 0x1fff),
+	A6XX_PROTECT_NORDWR(0x1c459, 0x1fff),
+	A6XX_PROTECT_NORDWR(0x1f400, 0x0443),
+	A6XX_PROTECT_RDONLY(0x1f844, 0x007b),
+	A6XX_PROTECT_NORDWR(0x1f860, 0x0000),
+	A6XX_PROTECT_NORDWR(0x1f878, 0x002a),
+	A6XX_PROTECT_NORDWR(0x1f8c0, 0x0000), /* note: infinite range */
+};
+
+static void a7xx_set_cp_protect(struct msm_gpu *gpu)
+{
+	const u32 *regs = a730_protect;
+	unsigned i, count, count_max;
+
+	count = ARRAY_SIZE(a730_protect);
+	count_max = 48;
+	BUILD_BUG_ON(ARRAY_SIZE(a730_protect) > 48);
+
+	/*
+	 * Enable access protection to privileged registers, fault on an access
+	 * protect violation and select the last span to protect from the start
+	 * address all the way to the end of the register address space
+	 */
+	gpu_write(gpu, REG_A7XX_CP_PROTECT_CNTL, BIT(0) | BIT(1) | BIT(3));
+
+	for (i = 0; i < count - 1; i++)
+		gpu_write(gpu, REG_A7XX_CP_PROTECT_REG(i), regs[i]);
+	/* program the last CP_PROTECT register so the final entry has "infinite" length */
+	gpu_write(gpu, REG_A7XX_CP_PROTECT_REG(count_max - 1), regs[i]);
+}
+
+static void a7xx_set_ubwc_config(struct msm_gpu *gpu)
+{
+	u32 lower_bit = 3;
+	u32 amsbc = 1;
+	u32 rgb565_predicator = 1;
+	u32 uavflagprd_inv = 2;
+
+	gpu_write(gpu, REG_A7XX_RB_NC_MODE_CNTL, rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
+	gpu_write(gpu, REG_A7XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
+	gpu_write(gpu, REG_A7XX_SP_NC_MODE_CNTL, uavflagprd_inv << 4 | lower_bit << 1);
+	gpu_write(gpu, REG_A7XX_GRAS_NC_MODE_CNTL, lower_bit << 5);
+	gpu_write(gpu, REG_A7XX_UCHE_MODE_CNTL, lower_bit << 21);
+}
+
+static int a7xx_cp_init(struct msm_gpu *gpu)
+{
+	struct msm_ringbuffer *ring = gpu->rb[0];
+
+	/* Disable concurrent binning before sending CP init */
+	OUT_PKT7(ring, CP_THREAD_CONTROL, 1);
+	OUT_RING(ring, BIT(27));
+
+	OUT_PKT7(ring, CP_ME_INIT, 7);
+	OUT_RING(ring, BIT(0) | /* Use multiple HW contexts */
+		BIT(1) | /* Enable error detection */
+		BIT(3) | /* Set default reset state */
+		BIT(6) | /* Disable save/restore of performance counters across preemption */
+		BIT(8)); /* Enable the register init list with the spinlock */
+	OUT_RING(ring, 0x00000003); /* Set number of HW contexts */
+	OUT_RING(ring, 0x20000000); /* Enable error detection */
+	OUT_RING(ring, 0x00000002); /* Operation mode mask */
+	/* Register initialization list with spinlock (TODO used for IFPC/preemption) */
+	OUT_RING(ring, 0);
+	OUT_RING(ring, 0);
+	OUT_RING(ring, 0);
+
+	adreno_flush(gpu, ring, REG_A7XX_CP_RB_WPTR);
+	return a7xx_idle(gpu, ring) ? 0 : -EINVAL;
+}
+
+static int a7xx_ucode_init(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a7xx_gpu *a7xx_gpu = to_a7xx_gpu(adreno_gpu);
+
+	if (!a7xx_gpu->sqe_bo) {
+		a7xx_gpu->sqe_bo = adreno_fw_create_bo(gpu,
+			adreno_gpu->fw[ADRENO_FW_SQE], &a7xx_gpu->sqe_iova);
+
+		if (IS_ERR(a7xx_gpu->sqe_bo)) {
+			int ret = PTR_ERR(a7xx_gpu->sqe_bo);
+
+			a7xx_gpu->sqe_bo = NULL;
+			DRM_DEV_ERROR(&gpu->pdev->dev,
+				"Could not allocate SQE ucode: %d\n", ret);
+
+			return ret;
+		}
+
+		msm_gem_object_set_name(a7xx_gpu->sqe_bo, "sqefw");
+	}
+
+	gpu_write64(gpu, REG_A7XX_CP_SQE_INSTR_BASE, a7xx_gpu->sqe_iova);
+
+	return 0;
+}
+
+static int a7xx_zap_shader_init(struct msm_gpu *gpu)
+{
+	static bool loaded;
+	int ret;
+
+	if (loaded)
+		return 0;
+
+	ret = adreno_zap_shader_load(gpu, GPU_PAS_ID);
+
+	loaded = !ret;
+	return ret;
+}
+
+#define A7XX_INT_MASK ( \
+	A7XX_RBBM_INT_0_MASK_AHBERROR | \
+	A7XX_RBBM_INT_0_MASK_ATBASYNCFIFOOVERFLOW | \
+	A7XX_RBBM_INT_0_MASK_GPCERROR | \
+	A7XX_RBBM_INT_0_MASK_SWINTERRUPT | \
+	A7XX_RBBM_INT_0_MASK_HWERROR | \
+	A7XX_RBBM_INT_0_MASK_PM4CPINTERRUPT | \
+	A7XX_RBBM_INT_0_MASK_RB_DONE_TS | \
+	A7XX_RBBM_INT_0_MASK_CACHE_CLEAN_TS | \
+	A7XX_RBBM_INT_0_MASK_ATBBUSOVERFLOW | \
+	A7XX_RBBM_INT_0_MASK_HANGDETECTINTERRUPT | \
+	A7XX_RBBM_INT_0_MASK_OUTOFBOUNDACCESS | \
+	A7XX_RBBM_INT_0_MASK_UCHETRAPINTERRUPT | \
+	A7XX_RBBM_INT_0_MASK_TSBWRITEERROR)
+
+/*
+ * All Gen7 targets support marking certain transactions as always privileged
+ * which allows us to mark more memory as privileged without having to
+ * explicitly set the APRIV bit. Choose the following transactions to be
+ * privileged by default:
+ * CDWRITE     [6:6] - Crashdumper writes
+ * CDREAD      [5:5] - Crashdumper reads
+ * RBRPWB      [3:3] - RPTR shadow writes
+ * RBPRIVLEVEL [2:2] - Memory accesses from PM4 packets in the ringbuffer
+ * RBFETCH     [1:1] - Ringbuffer reads
+ * ICACHE      [0:0] - Instruction cache fetches
+ */
+
+#define A7XX_APRIV_DEFAULT (BIT(3) | BIT(2) | BIT(1) | BIT(0))
+/* Add crashdumper permissions for the BR APRIV */
+#define A7XX_BR_APRIV_DEFAULT (A7XX_APRIV_DEFAULT | BIT(6) | BIT(5))
+
+static int a7xx_hw_init(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a7xx_gpu *a7xx_gpu = to_a7xx_gpu(adreno_gpu);
+	int ret;
+
+	/* Set up GBIF registers */
+	gpu_write(gpu, REG_A7XX_GBIF_QSB_SIDE0, 0x00071620);
+	gpu_write(gpu, REG_A7XX_GBIF_QSB_SIDE1, 0x00071620);
+	gpu_write(gpu, REG_A7XX_GBIF_QSB_SIDE2, 0x00071620);
+	gpu_write(gpu, REG_A7XX_GBIF_QSB_SIDE3, 0x00071620);
+	gpu_write(gpu, REG_A7XX_RBBM_GBIF_CLIENT_QOS_CNTL, 0x2120212);
+	gpu_write(gpu, REG_A7XX_UCHE_GBIF_GX_CONFIG, 0x10240e0);
+
+	/* Make all blocks contribute to the GPU BUSY perf counter */
+	gpu_write(gpu, REG_A7XX_RBBM_PERFCTR_GPU_BUSY_MASKED, 0xffffffff);
+
+	/*
+	 * Set UCHE_WRITE_THRU_BASE to the UCHE_TRAP_BASE, effectively
+	 * disabling L2 bypass
+	 */
+	gpu_write64(gpu, REG_A7XX_UCHE_TRAP_BASE, ~0ull);
+	gpu_write64(gpu, REG_A7XX_UCHE_WRITE_THRU_BASE, ~0ull);
+
+	gpu_write(gpu, REG_A7XX_UCHE_CACHE_WAYS, 0x800000);
+	gpu_write(gpu, REG_A7XX_UCHE_CMDQ_CONFIG, 6 << 16 | 6 << 12 | 9 << 8 | BIT(3) | BIT(2) | 2);
+
+	/* Set the AHB default slave response to "ERROR" */
+	gpu_write(gpu, REG_A7XX_CP_AHB_CNTL, 0x1);
+
+	/* Turn on performance counters */
+	gpu_write(gpu, REG_A7XX_RBBM_PERFCTR_CNTL, 0x1);
+
+	a7xx_set_ubwc_config(gpu);
+
+	gpu_write(gpu, REG_A7XX_RBBM_INTERFACE_HANG_INT_CNTL, BIT(30) | 0xcfffff);
+	gpu_write(gpu, REG_A7XX_UCHE_CLIENT_PF, BIT(0));
+
+	a7xx_set_cp_protect(gpu);
+
+	/* TODO: Configure LLCC */
+
+	gpu_write(gpu, REG_A7XX_CP_APRIV_CNTL, A7XX_BR_APRIV_DEFAULT);
+	gpu_write(gpu, REG_A7XX_CP_BV_APRIV_CNTL, A7XX_APRIV_DEFAULT);
+	gpu_write(gpu, REG_A7XX_CP_LPAC_APRIV_CNTL, A7XX_APRIV_DEFAULT);
+
+	gpu_write(gpu, REG_A7XX_RBBM_SECVID_TSB_CNTL, 0);
+	gpu_write64(gpu, REG_A7XX_RBBM_SECVID_TSB_TRUSTED_BASE, 0);
+	gpu_write(gpu, REG_A7XX_RBBM_SECVID_TSB_TRUSTED_SIZE, 0);
+
+	a7xx_set_hwcg(gpu, true);
+
+	/* Enable interrupts */
+	gpu_write(gpu, REG_A7XX_RBBM_INT_0_MASK, A7XX_INT_MASK);
+
+	ret = adreno_hw_init(gpu);
+	if (ret)
+		return ret;
+
+	/* reset the value of bv_fence too */
+	gpu->rb[0]->memptrs->bv_fence = gpu->rb[0]->fctx->completed_fence;
+
+	ret = a7xx_ucode_init(gpu);
+	if (ret)
+		return ret;
+
+	/* Set the ringbuffer address and setup rptr shadow */
+	gpu_write64(gpu, REG_A7XX_CP_RB_BASE, gpu->rb[0]->iova);
+
+	gpu_write(gpu, REG_A7XX_CP_RB_CNTL, MSM_GPU_RB_CNTL_DEFAULT);
+
+	if (!a7xx_gpu->shadow_bo) {
+		a7xx_gpu->shadow = msm_gem_kernel_new(gpu->dev,
+			sizeof(u32) * gpu->nr_rings,
+			MSM_BO_WC | MSM_BO_MAP_PRIV,
+			gpu->aspace, &a7xx_gpu->shadow_bo,
+			&a7xx_gpu->shadow_iova);
+
+		if (IS_ERR(a7xx_gpu->shadow))
+			return PTR_ERR(a7xx_gpu->shadow);
+
+		msm_gem_object_set_name(a7xx_gpu->shadow_bo, "shadow");
+	}
+
+	gpu_write64(gpu, REG_A7XX_CP_RB_RPTR_ADDR, shadowptr(a7xx_gpu, gpu->rb[0]));
+
+	gpu->cur_ctx_seqno = 0;
+
+	/* Enable the SQE to start the CP engine */
+	gpu_write(gpu, REG_A7XX_CP_SQE_CNTL, 1);
+
+	ret = a7xx_cp_init(gpu);
+	if (ret)
+		return ret;
+
+	/*
+	 * Try to load a zap shader into the secure world. If successful
+	 * we can use the CP to switch out of secure mode. If not then we
+	 * have no recourse but to try to switch ourselves out manually. If we
+	 * guessed wrong then access to the RBBM_SECVID_TRUST_CNTL register will
+	 * be blocked and a permissions violation will soon follow.
+	 */
+	ret = a7xx_zap_shader_init(gpu);
+	if (!ret) {
+		OUT_PKT7(gpu->rb[0], CP_SET_SECURE_MODE, 1);
+		OUT_RING(gpu->rb[0], 0x00000000);
+
+		adreno_flush(gpu, gpu->rb[0], REG_A7XX_CP_RB_WPTR);
+		if (!a7xx_idle(gpu, gpu->rb[0]))
+			return -EINVAL;
+	} else if (ret == -ENODEV) {
+		/*
+		 * This device does not use zap shader (but print a warning
+		 * just in case someone got their dt wrong.. hopefully they
+		 * have a debug UART to realize the error of their ways...
+		 * if you mess this up you are about to crash horribly)
+		 */
+		dev_warn_once(gpu->dev->dev,
+			"Zap shader not enabled - using SECVID_TRUST_CNTL instead\n");
+		gpu_write(gpu, REG_A7XX_RBBM_SECVID_TRUST_CNTL, 0x0);
+		ret = 0;
+	} else {
+		return ret;
+	}
+
+	return ret;
+}
+
+static void a7xx_dump(struct msm_gpu *gpu)
+{
+	DRM_DEV_INFO(&gpu->pdev->dev, "status:   %08x\n",
+			gpu_read(gpu, REG_A7XX_RBBM_STATUS));
+	adreno_dump(gpu);
+}
+
+static void a7xx_recover(struct msm_gpu *gpu)
+{
+	adreno_dump_info(gpu);
+
+	if (hang_debug)
+		a7xx_dump(gpu);
+
+	gpu_write(gpu, REG_A7XX_RBBM_SW_RESET_CMD, 1);
+	/*
+	 * Do a dummy read to get a brief read cycle delay for the
+	 * reset to take effect
+	 * (does this work as expected for a7xx?)
+	 */
+	gpu_read(gpu, REG_A7XX_RBBM_SW_RESET_CMD);
+	gpu_write(gpu, REG_A7XX_RBBM_SW_RESET_CMD, 0);
+
+	msm_gpu_hw_init(gpu);
+}
+
+static void a7xx_cp_hw_err_irq(struct msm_gpu *gpu)
+{
+	struct device *dev = &gpu->pdev->dev;
+	u32 status = gpu_read(gpu, REG_A7XX_CP_INTERRUPT_STATUS);
+	u32 val;
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_OPCODEERROR) {
+		gpu_write(gpu, REG_A7XX_CP_SQE_STAT_ADDR, 1);
+		val = gpu_read(gpu, REG_A7XX_CP_SQE_STAT_DATA);
+		dev_err_ratelimited(dev, "CP | opcode error | possible opcode=0x%8.8X\n", val);
+	}
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_UCODEERROR)
+		dev_err_ratelimited(dev, "CP ucode error interrupt\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_CPHWFAULT)
+		dev_err_ratelimited(dev, "CP | HW fault | status=0x%8.8X\n", gpu_read(gpu, REG_A7XX_CP_HW_FAULT));
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_REGISTERPROTECTION) {
+		val = gpu_read(gpu, REG_A7XX_CP_PROTECT_STATUS);
+		dev_err_ratelimited(dev,
+			"CP | protected mode error | %s | addr=0x%8.8X | status=0x%8.8X\n",
+			val & (1 << 20) ? "READ" : "WRITE",
+			(val & 0x3ffff), val);
+	}
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_VSDPARITYERROR)
+		dev_err_ratelimited(dev, "CP VSD decoder parity error\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_ILLEGALINSTRUCTION)
+		dev_err_ratelimited(dev, "CP illegal instruction error\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_OPCODEERRORLPAC)
+		dev_err_ratelimited(dev, "CP opcode error LPAC\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_UCODEERRORLPAC)
+		dev_err_ratelimited(dev, "CP ucode error LPAC\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_CPHWFAULTLPAC)
+		dev_err_ratelimited(dev, "CP hw fault LPAC\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_REGISTERPROTECTIONLPAC)
+		dev_err_ratelimited(dev, "CP register protection LPAC\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_ILLEGALINSTRUCTIONLPAC)
+		dev_err_ratelimited(dev, "CP illegal instruction LPAC\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_OPCODEERRORBV) {
+		gpu_write(gpu, REG_A7XX_CP_BV_SQE_STAT_ADDR, 1);
+		val = gpu_read(gpu, REG_A7XX_CP_BV_SQE_STAT_DATA);
+		dev_err_ratelimited(dev, "CP opcode error BV | opcode=0x%8.8x\n", val);
+	}
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_UCODEERRORBV)
+		dev_err_ratelimited(dev, "CP ucode error BV\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_CPHWFAULTBV) {
+		val = gpu_read(gpu, REG_A7XX_CP_BV_HW_FAULT);
+		dev_err_ratelimited(dev, "CP BV | Ringbuffer HW fault | status=%x\n", val);
+	}
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_REGISTERPROTECTIONBV) {
+		val = gpu_read(gpu, REG_A7XX_CP_BV_PROTECT_STATUS);
+		dev_err_ratelimited(dev,
+			"CP BV | protected mode error | %s | addr=0x%8.8X | status=0x%8.8X\n",
+			val & BIT(20) ? "READ" : "WRITE",
+			val & 0x3ffff, val);
+	}
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_ILLEGALINSTRUCTIONBV)
+		dev_err_ratelimited(dev, "CP illegal instruction BV\n");
+}
+
+static irqreturn_t a7xx_irq(struct msm_gpu *gpu)
+{
+	struct msm_drm_private *priv = gpu->dev->dev_private;
+	u32 status = gpu_read(gpu, REG_A7XX_RBBM_INT_0_STATUS);
+
+	gpu_write(gpu, REG_A7XX_RBBM_INT_CLEAR_CMD, status);
+
+	if (priv->disable_err_irq)
+		status &= A7XX_RBBM_INT_0_MASK_CACHE_CLEAN_TS;
+
+	/* TODO: print human-friendly strings for each error? */
+	if (status & ~A7XX_RBBM_INT_0_MASK_CACHE_CLEAN_TS)
+		dev_err_ratelimited(&gpu->pdev->dev, "unexpected irq status: 0x%8.8X\n", status);
+
+	if (status & A7XX_RBBM_INT_0_MASK_HWERROR)
+		a7xx_cp_hw_err_irq(gpu);
+
+	if (status & A7XX_RBBM_INT_0_MASK_CACHE_CLEAN_TS)
+		msm_gpu_retire(gpu);
+
+	return IRQ_HANDLED;
+}
+
+static int a7xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
+{
+	*value = gpu_read64(gpu, REG_A7XX_CP_ALWAYS_ON_COUNTER);
+	return 0;
+}
+
+static void a7xx_destroy(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a7xx_gpu *a7xx_gpu = to_a7xx_gpu(adreno_gpu);
+
+	if (a7xx_gpu->sqe_bo) {
+		msm_gem_unpin_iova(a7xx_gpu->sqe_bo, gpu->aspace);
+		drm_gem_object_put(a7xx_gpu->sqe_bo);
+	}
+
+	if (a7xx_gpu->shadow_bo) {
+		msm_gem_unpin_iova(a7xx_gpu->shadow_bo, gpu->aspace);
+		drm_gem_object_put(a7xx_gpu->shadow_bo);
+	}
+
+	adreno_gpu_cleanup(adreno_gpu);
+
+	kfree(a7xx_gpu);
+}
+
+static struct msm_gpu_state *a7xx_gpu_state_get(struct msm_gpu *gpu)
+{
+	struct msm_gpu_state *state = kzalloc(sizeof(*state), GFP_KERNEL);
+
+	if (!state)
+		return ERR_PTR(-ENOMEM);
+
+	adreno_gpu_state_get(gpu, state);
+
+	state->rbbm_status = gpu_read(gpu, REG_A7XX_RBBM_STATUS);
+
+	return state;
+}
+
+static struct msm_gem_address_space *
+a7xx_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev)
+{
+	struct iommu_domain *iommu;
+	struct msm_mmu *mmu;
+	struct msm_gem_address_space *aspace;
+	u64 start, size;
+
+	iommu = iommu_domain_alloc(&platform_bus_type);
+	if (!iommu)
+		return NULL;
+
+	mmu = msm_iommu_new(&pdev->dev, iommu);
+	if (IS_ERR(mmu)) {
+		iommu_domain_free(iommu);
+		return ERR_CAST(mmu);
+	}
+
+	/*
+	 * Use the aperture start or SZ_16M, whichever is greater. This will
+	 * ensure that we align with the allocated pagetable range while still
+	 * allowing room in the lower 32 bits for GMEM and whatnot
+	 */
+	start = max_t(u64, SZ_16M, iommu->geometry.aperture_start);
+	size = iommu->geometry.aperture_end - start + 1;
+
+	aspace = msm_gem_address_space_create(mmu, "gpu",
+		start & GENMASK_ULL(48, 0), size);
+
+	if (IS_ERR(aspace) && !IS_ERR(mmu))
+		mmu->funcs->destroy(mmu);
+
+	return aspace;
+}
+
+static uint32_t a7xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a7xx_gpu *a7xx_gpu = to_a7xx_gpu(adreno_gpu);
+
+	return a7xx_gpu->shadow[ring->id];
+}
+
+static const struct adreno_gpu_funcs funcs = {
+	.base = {
+		.get_param = adreno_get_param,
+		.set_param = adreno_set_param,
+		.hw_init = a7xx_hw_init,
+		.pm_suspend = msm_gpu_pm_suspend,
+		.pm_resume = msm_gpu_pm_resume,
+		.recover = a7xx_recover,
+		.submit = a7xx_submit,
+		.active_ring = adreno_active_ring,
+		.irq = a7xx_irq,
+		.destroy = a7xx_destroy,
+#if defined(CONFIG_DEBUG_FS) || defined(CONFIG_DEV_COREDUMP)
+		.show = adreno_show,
+#endif
+		.gpu_state_get = a7xx_gpu_state_get,
+		.gpu_state_put = adreno_gpu_state_put,
+		.create_address_space = a7xx_create_address_space,
+		.get_rptr = a7xx_get_rptr,
+	},
+	.get_timestamp = a7xx_get_timestamp,
+};
+
+struct msm_gpu *a7xx_gpu_init(struct drm_device *dev)
+{
+	struct msm_drm_private *priv = dev->dev_private;
+	struct platform_device *pdev = priv->gpu_pdev;
+	struct a7xx_gpu *a7xx_gpu;
+	struct adreno_gpu *adreno_gpu;
+	int ret;
+
+	a7xx_gpu = kzalloc(sizeof(*a7xx_gpu), GFP_KERNEL);
+	if (!a7xx_gpu)
+		return ERR_PTR(-ENOMEM);
+
+	adreno_gpu = &a7xx_gpu->base;
+	adreno_gpu->registers = NULL;
+	adreno_gpu->base.hw_apriv = true;
+
+	ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, 1);
+	if (ret) {
+		a7xx_destroy(&(a7xx_gpu->base.base));
+		return ERR_PTR(ret);
+	}
+
+	return &adreno_gpu->base;
+}
diff --git a/drivers/gpu/drm/msm/adreno/a7xx_gpu.h b/drivers/gpu/drm/msm/adreno/a7xx_gpu.h
new file mode 100644
index 0000000000000..ebb86d67b812d
--- /dev/null
+++ b/drivers/gpu/drm/msm/adreno/a7xx_gpu.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2017-2022 The Linux Foundation. All rights reserved. */
+
+#ifndef __A7XX_GPU_H__
+#define __A7XX_GPU_H__
+
+#include "adreno_gpu.h"
+#include "a7xx.xml.h"
+
+struct a7xx_gpu {
+	struct adreno_gpu base;
+
+	struct drm_gem_object *sqe_bo;
+	uint64_t sqe_iova;
+
+	struct drm_gem_object *shadow_bo;
+	uint64_t shadow_iova;
+	uint32_t *shadow;
+};
+
+#define to_a7xx_gpu(x) container_of(x, struct a7xx_gpu, base)
+
+#define shadowptr(_a7xx_gpu, _ring) ((_a7xx_gpu)->shadow_iova + \
+		((_ring)->id * sizeof(uint32_t)))
+
+#endif /* __A7XX_GPU_H__ */
diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 89cfd84760d7e..2e93a99a7160d 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -339,6 +339,18 @@ static const struct adreno_info gpulist[] = {
 		.init = a6xx_gpu_init,
 		.zapfw = "a640_zap.mdt",
 		.hwcg = a640_hwcg,
+	}, {
+		.rev = ADRENO_REV(7, 0, 0, ANY_ID),
+		.revn = 730,
+		.name = "A730",
+		.fw = {
+			[ADRENO_FW_SQE] = "a730_sqe.fw",
+		},
+		.gmem = SZ_2M,
+		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
+		.init = a7xx_gpu_init,
+		.zapfw = "a730_zap.mbn",
+		.hwcg = a730_hwcg,
 	},
 };
 
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 55c5433a4ea18..685a0ffb9b0f8 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -57,7 +57,7 @@ struct adreno_reglist {
 	u32 value;
 };
 
-extern const struct adreno_reglist a630_hwcg[], a640_hwcg[], a650_hwcg[], a660_hwcg[];
+extern const struct adreno_reglist a630_hwcg[], a640_hwcg[], a650_hwcg[], a660_hwcg[], a730_hwcg[];
 
 struct adreno_info {
 	struct adreno_rev rev;
@@ -395,6 +395,7 @@ struct msm_gpu *a3xx_gpu_init(struct drm_device *dev);
 struct msm_gpu *a4xx_gpu_init(struct drm_device *dev);
 struct msm_gpu *a5xx_gpu_init(struct drm_device *dev);
 struct msm_gpu *a6xx_gpu_init(struct drm_device *dev);
+struct msm_gpu *a7xx_gpu_init(struct drm_device *dev);
 
 static inline uint32_t get_wptr(struct msm_ringbuffer *ring)
 {
diff --git a/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h b/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h
index 9eeedd261b733..b9587c2892a35 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h
@@ -12,7 +12,7 @@ The rules-ng-ng source files this header was generated from are:
 - freedreno/registers/freedreno_copyright.xml        (   1572 bytes, from 2020-11-18 00:17:12)
 - freedreno/registers/adreno/a2xx.xml                (  90810 bytes, from 2021-08-06 17:44:41)
 - freedreno/registers/adreno/adreno_common.xml       (  14631 bytes, from 2022-03-27 14:52:08)
-- freedreno/registers/adreno/adreno_pm4.xml          (  70177 bytes, from 2022-03-27 20:02:31)
+- freedreno/registers/adreno/adreno_pm4.xml          (  69699 bytes, from 2022-03-27 20:17:52)
 - freedreno/registers/adreno/a3xx.xml                (  84231 bytes, from 2021-08-27 13:03:56)
 - freedreno/registers/adreno/a4xx.xml                ( 113474 bytes, from 2022-03-22 19:23:46)
 - freedreno/registers/adreno/a5xx.xml                ( 149512 bytes, from 2022-03-21 16:05:18)
@@ -2380,23 +2380,5 @@ static inline uint32_t CP_THREAD_CONTROL_0_THREAD(enum cp_thread val)
 #define CP_THREAD_CONTROL_0_CONCURRENT_BIN_DISABLE		0x08000000
 #define CP_THREAD_CONTROL_0_SYNC_THREADS			0x80000000
 
-#define REG_CP_WAIT_TIMESTAMP_0					0x00000000
-#define CP_WAIT_TIMESTAMP_0_REF__MASK				0x00000003
-#define CP_WAIT_TIMESTAMP_0_REF__SHIFT				0
-static inline uint32_t CP_WAIT_TIMESTAMP_0_REF(uint32_t val)
-{
-	return ((val) << CP_WAIT_TIMESTAMP_0_REF__SHIFT) & CP_WAIT_TIMESTAMP_0_REF__MASK;
-}
-#define CP_WAIT_TIMESTAMP_0_MEMSPACE__MASK			0x00000010
-#define CP_WAIT_TIMESTAMP_0_MEMSPACE__SHIFT			4
-static inline uint32_t CP_WAIT_TIMESTAMP_0_MEMSPACE(uint32_t val)
-{
-	return ((val) << CP_WAIT_TIMESTAMP_0_MEMSPACE__SHIFT) & CP_WAIT_TIMESTAMP_0_MEMSPACE__MASK;
-}
-
-#define REG_CP_WAIT_TIMESTAMP_ADDR				0x00000001
-
-#define REG_CP_WAIT_TIMESTAMP_VALUE				0x00000003
-
 
 #endif /* ADRENO_PM4_XML */
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
index d8c63df4e9ca9..ae7b35ccbeff9 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.h
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
@@ -30,6 +30,7 @@ struct msm_gpu_submit_stats {
 struct msm_rbmemptrs {
 	volatile uint32_t rptr;
 	volatile uint32_t fence;
+	volatile uint32_t bv_fence;
 
 	volatile struct msm_gpu_submit_stats stats[MSM_GPU_SUBMIT_STATS_COUNT];
 	volatile u64 ttbr0;
-- 
2.26.1


* [PATCH 4/4] drm/msm/adreno: add support for a730
+			break;
+		}
+	}
+
+	OUT_PKT7(ring, CP_SET_MARKER, 1);
+	OUT_RING(ring, 0x00e); /* IB1LIST end */
+
+	OUT_PKT7(ring, CP_THREAD_CONTROL, 1);
+	OUT_RING(ring, CP_SET_THREAD_BR);
+
+	OUT_PKT7(ring, CP_EVENT_WRITE, 1);
+	OUT_RING(ring, CCU_INVALIDATE_DEPTH);
+
+	OUT_PKT7(ring, CP_EVENT_WRITE, 1);
+	OUT_RING(ring, CCU_INVALIDATE_COLOR);
+
+	OUT_PKT7(ring, CP_THREAD_CONTROL, 1);
+	OUT_RING(ring, CP_SET_THREAD_BV);
+
+	/*
+	 * Make sure the timestamp is committed once BV pipe is
+	 * completely done with this submission.
+	 */
+	OUT_PKT7(ring, CP_EVENT_WRITE, 4);
+	OUT_RING(ring, CACHE_CLEAN | BIT(27));
+	OUT_RING(ring, lower_32_bits(rbmemptr(ring, bv_fence)));
+	OUT_RING(ring, upper_32_bits(rbmemptr(ring, bv_fence)));
+	OUT_RING(ring, submit->seqno);
+
+	OUT_PKT7(ring, CP_THREAD_CONTROL, 1);
+	OUT_RING(ring, CP_SET_THREAD_BR);
+
+	/*
+	 * This makes sure that BR doesn't race ahead and commit
+	 * timestamp to memstore while BV is still processing
+	 * this submission.
+	 */
+	OUT_PKT7(ring, CP_WAIT_TIMESTAMP, 4);
+	OUT_RING(ring, 0);
+	OUT_RING(ring, lower_32_bits(rbmemptr(ring, bv_fence)));
+	OUT_RING(ring, upper_32_bits(rbmemptr(ring, bv_fence)));
+	OUT_RING(ring, submit->seqno);
+
+	/* write the ringbuffer timestamp */
+	OUT_PKT7(ring, CP_EVENT_WRITE, 4);
+	OUT_RING(ring, CACHE_CLEAN | BIT(31) | BIT(27));
+	OUT_RING(ring, lower_32_bits(rbmemptr(ring, fence)));
+	OUT_RING(ring, upper_32_bits(rbmemptr(ring, fence)));
+	OUT_RING(ring, submit->seqno);
+
+	OUT_PKT7(ring, CP_THREAD_CONTROL, 1);
+	OUT_RING(ring, CP_SET_THREAD_BOTH);
+
+	OUT_PKT7(ring, CP_SET_MARKER, 1);
+	OUT_RING(ring, 0x100); /* IFPC enable */
+
+	trace_msm_gpu_submit_flush(submit, gpu_read64(gpu, REG_A7XX_CP_ALWAYS_ON_COUNTER));
+
+	adreno_flush(gpu, ring, REG_A7XX_CP_RB_WPTR);
+}
+
+const struct adreno_reglist a730_hwcg[] = {
+	{ REG_A7XX_RBBM_CLOCK_CNTL_SP0, 0x02222222 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL2_SP0, 0x02022222 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_SP0, 0x0000f3cf },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_SP0, 0x00000080 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL_TP0, 0x22222220 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL2_TP0, 0x22222222 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL3_TP0, 0x22222222 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL4_TP0, 0x00222222 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_TP0, 0x77777777 },
+	{ REG_A7XX_RBBM_CLOCK_HYST2_TP0, 0x77777777 },
+	{ REG_A7XX_RBBM_CLOCK_HYST3_TP0, 0x77777777 },
+	{ REG_A7XX_RBBM_CLOCK_HYST4_TP0, 0x00077777 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_TP0, 0x11111111 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY2_TP0, 0x11111111 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY3_TP0, 0x11111111 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY4_TP0, 0x00011111 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL_UCHE, 0x22222222 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_UCHE, 0x00000004 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_UCHE, 0x00000002 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL_RB0, 0x22222222 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL2_RB0, 0x01002222 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL_CCU0, 0x00002220 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_RB_CCU0, 0x44000f00 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL_RAC, 0x25222022 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL2_RAC, 0x00555555 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_RAC, 0x00000011 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_RAC, 0x00440044 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM, 0x04222222 },
+	{ REG_A7XX_RBBM_CLOCK_MODE2_GRAS, 0x00000222 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_BV_GRAS, 0x00222222 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_GPC, 0x02222223 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_VFD, 0x00002222 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_BV_GPC, 0x00222222 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_BV_VFD, 0x00002222 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM, 0x00000000 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_GPC, 0x04104004 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_VFD, 0x00000000 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM, 0x00004000 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_GPC, 0x00000200 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_VFD, 0x00002222 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_HLSQ, 0x00002222 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_HLSQ, 0x00000000 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_HLSQ, 0x00000000 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_HLSQ_2, 0x00000002 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_BV_LRZ, 0x55555552 },
+	{ REG_A7XX_RBBM_CLOCK_MODE_CP, 0x00000223 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL, 0x8aa8aa82 },
+	{ REG_A7XX_RBBM_ISDB_CNT, 0x00000182 },
+	{ REG_A7XX_RBBM_RAC_THRESHOLD_CNT, 0x00000000 },
+	{ REG_A7XX_RBBM_SP_HYST_CNT, 0x00000000 },
+	{ REG_A7XX_RBBM_CLOCK_CNTL_GMU_GX, 0x00000222 },
+	{ REG_A7XX_RBBM_CLOCK_DELAY_GMU_GX, 0x00000111 },
+	{ REG_A7XX_RBBM_CLOCK_HYST_GMU_GX, 0x00000555 },
+	{},
+};
+
+#define RBBM_CLOCK_CNTL_ON 0x8aa8aa82
+
+static void a7xx_set_hwcg(struct msm_gpu *gpu, bool state)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	const struct adreno_reglist *reg;
+	unsigned int i;
+	u32 val;
+
+	if (!adreno_gpu->info->hwcg)
+		return;
+
+	val = gpu_read(gpu, REG_A7XX_RBBM_CLOCK_CNTL);
+
+	/* Don't re-program the registers if they are already correct */
+	if ((val == RBBM_CLOCK_CNTL_ON) == state)
+		return;
+
+	for (i = 0; (reg = &adreno_gpu->info->hwcg[i], reg->offset); i++)
+		gpu_write(gpu, reg->offset, state ? reg->value : 0);
+
+	gpu_write(gpu, REG_A7XX_RBBM_CLOCK_CNTL, state ? RBBM_CLOCK_CNTL_ON : 0);
+}
+
+static const u32 a730_protect[] = {
+	A6XX_PROTECT_RDONLY(0x00000, 0x04ff),
+	A6XX_PROTECT_RDONLY(0x0050b, 0x0058),
+	A6XX_PROTECT_NORDWR(0x0050e, 0x0000),
+	A6XX_PROTECT_NORDWR(0x00510, 0x0000),
+	A6XX_PROTECT_NORDWR(0x00534, 0x0000),
+	A6XX_PROTECT_RDONLY(0x005fb, 0x009d),
+	A6XX_PROTECT_NORDWR(0x00699, 0x01e9),
+	A6XX_PROTECT_NORDWR(0x008a0, 0x0008),
+	A6XX_PROTECT_NORDWR(0x008ab, 0x0024),
+	A6XX_PROTECT_RDONLY(0x008d0, 0x0170),
+	A6XX_PROTECT_NORDWR(0x00900, 0x004d),
+	A6XX_PROTECT_NORDWR(0x0098d, 0x00b2),
+	A6XX_PROTECT_NORDWR(0x00a41, 0x01be),
+	A6XX_PROTECT_NORDWR(0x00df0, 0x0001),
+	A6XX_PROTECT_NORDWR(0x00e01, 0x0000),
+	A6XX_PROTECT_NORDWR(0x00e07, 0x0008),
+	A6XX_PROTECT_NORDWR(0x03c00, 0x00c3),
+	A6XX_PROTECT_RDONLY(0x03cc4, 0x1fff),
+	A6XX_PROTECT_NORDWR(0x08630, 0x01cf),
+	A6XX_PROTECT_NORDWR(0x08e00, 0x0000),
+	A6XX_PROTECT_NORDWR(0x08e08, 0x0000),
+	A6XX_PROTECT_NORDWR(0x08e50, 0x001f),
+	A6XX_PROTECT_NORDWR(0x08e80, 0x0280),
+	A6XX_PROTECT_NORDWR(0x09624, 0x01db),
+	A6XX_PROTECT_NORDWR(0x09e40, 0x0000),
+	A6XX_PROTECT_NORDWR(0x09e64, 0x000d),
+	A6XX_PROTECT_NORDWR(0x09e78, 0x0187),
+	A6XX_PROTECT_NORDWR(0x0a630, 0x01cf),
+	A6XX_PROTECT_NORDWR(0x0ae02, 0x0000),
+	A6XX_PROTECT_NORDWR(0x0ae50, 0x000f),
+	A6XX_PROTECT_NORDWR(0x0ae66, 0x0003),
+	A6XX_PROTECT_NORDWR(0x0ae6f, 0x0003),
+	A6XX_PROTECT_NORDWR(0x0b604, 0x0003),
+	A6XX_PROTECT_NORDWR(0x0ec00, 0x0fff),
+	A6XX_PROTECT_RDONLY(0x0fc00, 0x1fff),
+	A6XX_PROTECT_NORDWR(0x18400, 0x0053),
+	A6XX_PROTECT_RDONLY(0x18454, 0x0004),
+	A6XX_PROTECT_NORDWR(0x18459, 0x1fff),
+	A6XX_PROTECT_NORDWR(0x1a459, 0x1fff),
+	A6XX_PROTECT_NORDWR(0x1c459, 0x1fff),
+	A6XX_PROTECT_NORDWR(0x1f400, 0x0443),
+	A6XX_PROTECT_RDONLY(0x1f844, 0x007b),
+	A6XX_PROTECT_NORDWR(0x1f860, 0x0000),
+	A6XX_PROTECT_NORDWR(0x1f878, 0x002a),
+	A6XX_PROTECT_NORDWR(0x1f8c0, 0x0000), /* note: infinite range */
+};
+
+static void a7xx_set_cp_protect(struct msm_gpu *gpu)
+{
+	const u32 *regs = a730_protect;
+	unsigned i, count, count_max;
+
+	count = ARRAY_SIZE(a730_protect);
+	count_max = 48;
+	BUILD_BUG_ON(ARRAY_SIZE(a730_protect) > 48);
+
+	/*
+	 * Enable access protection to privileged registers, fault on an access
+	 * protect violation and select the last span to protect from the start
+	 * address all the way to the end of the register address space
+	 */
+	gpu_write(gpu, REG_A7XX_CP_PROTECT_CNTL, BIT(0) | BIT(1) | BIT(3));
+
+	for (i = 0; i < count - 1; i++)
+		gpu_write(gpu, REG_A7XX_CP_PROTECT_REG(i), regs[i]);
+	/* the last CP_PROTECT register has an "infinite" span, use it for the last entry */
+	gpu_write(gpu, REG_A7XX_CP_PROTECT_REG(count_max - 1), regs[i]);
+}
+
+static void a7xx_set_ubwc_config(struct msm_gpu *gpu)
+{
+	u32 lower_bit = 3;
+	u32 amsbc = 1;
+	u32 rgb565_predicator = 1;
+	u32 uavflagprd_inv = 2;
+
+	gpu_write(gpu, REG_A7XX_RB_NC_MODE_CNTL, rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
+	gpu_write(gpu, REG_A7XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
+	gpu_write(gpu, REG_A7XX_SP_NC_MODE_CNTL, uavflagprd_inv << 4 | lower_bit << 1);
+	gpu_write(gpu, REG_A7XX_GRAS_NC_MODE_CNTL, lower_bit << 5);
+	gpu_write(gpu, REG_A7XX_UCHE_MODE_CNTL, lower_bit << 21);
+}
+
+static int a7xx_cp_init(struct msm_gpu *gpu)
+{
+	struct msm_ringbuffer *ring = gpu->rb[0];
+
+	/* Disable concurrent binning before sending CP init */
+	OUT_PKT7(ring, CP_THREAD_CONTROL, 1);
+	OUT_RING(ring, BIT(27));
+
+	OUT_PKT7(ring, CP_ME_INIT, 7);
+	OUT_RING(ring, BIT(0) | /* Use multiple HW contexts */
+		BIT(1) | /* Enable error detection */
+		BIT(3) | /* Set default reset state */
+		BIT(6) | /* Disable save/restore of performance counters across preemption */
+		BIT(8)); /* Enable the register init list with the spinlock */
+	OUT_RING(ring, 0x00000003); /* Set number of HW contexts */
+	OUT_RING(ring, 0x20000000); /* Enable error detection */
+	OUT_RING(ring, 0x00000002); /* Operation mode mask */
+	/* Register initialization list with spinlock (TODO: used for IFPC/preemption) */
+	OUT_RING(ring, 0);
+	OUT_RING(ring, 0);
+	OUT_RING(ring, 0);
+
+	adreno_flush(gpu, ring, REG_A7XX_CP_RB_WPTR);
+	return a7xx_idle(gpu, ring) ? 0 : -EINVAL;
+}
+
+static int a7xx_ucode_init(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a7xx_gpu *a7xx_gpu = to_a7xx_gpu(adreno_gpu);
+
+	if (!a7xx_gpu->sqe_bo) {
+		a7xx_gpu->sqe_bo = adreno_fw_create_bo(gpu,
+			adreno_gpu->fw[ADRENO_FW_SQE], &a7xx_gpu->sqe_iova);
+
+		if (IS_ERR(a7xx_gpu->sqe_bo)) {
+			int ret = PTR_ERR(a7xx_gpu->sqe_bo);
+
+			a7xx_gpu->sqe_bo = NULL;
+			DRM_DEV_ERROR(&gpu->pdev->dev,
+				"Could not allocate SQE ucode: %d\n", ret);
+
+			return ret;
+		}
+
+		msm_gem_object_set_name(a7xx_gpu->sqe_bo, "sqefw");
+	}
+
+	gpu_write64(gpu, REG_A7XX_CP_SQE_INSTR_BASE, a7xx_gpu->sqe_iova);
+
+	return 0;
+}
+
+static int a7xx_zap_shader_init(struct msm_gpu *gpu)
+{
+	static bool loaded;
+	int ret;
+
+	if (loaded)
+		return 0;
+
+	ret = adreno_zap_shader_load(gpu, GPU_PAS_ID);
+
+	loaded = !ret;
+	return ret;
+}
+
+#define A7XX_INT_MASK ( \
+	A7XX_RBBM_INT_0_MASK_AHBERROR | \
+	A7XX_RBBM_INT_0_MASK_ATBASYNCFIFOOVERFLOW | \
+	A7XX_RBBM_INT_0_MASK_GPCERROR | \
+	A7XX_RBBM_INT_0_MASK_SWINTERRUPT | \
+	A7XX_RBBM_INT_0_MASK_HWERROR | \
+	A7XX_RBBM_INT_0_MASK_PM4CPINTERRUPT | \
+	A7XX_RBBM_INT_0_MASK_RB_DONE_TS | \
+	A7XX_RBBM_INT_0_MASK_CACHE_CLEAN_TS | \
+	A7XX_RBBM_INT_0_MASK_ATBBUSOVERFLOW | \
+	A7XX_RBBM_INT_0_MASK_HANGDETECTINTERRUPT | \
+	A7XX_RBBM_INT_0_MASK_OUTOFBOUNDACCESS | \
+	A7XX_RBBM_INT_0_MASK_UCHETRAPINTERRUPT | \
+	A7XX_RBBM_INT_0_MASK_TSBWRITEERROR)
+
+/*
+ * All Gen7 targets support marking certain transactions as always privileged
+ * which allows us to mark more memory as privileged without having to
+ * explicitly set the APRIV bit. Choose the following transactions to be
+ * privileged by default:
+ * CDWRITE     [6:6] - Crashdumper writes
+ * CDREAD      [5:5] - Crashdumper reads
+ * RBRPWB      [3:3] - RPTR shadow writes
+ * RBPRIVLEVEL [2:2] - Memory accesses from PM4 packets in the ringbuffer
+ * RBFETCH     [1:1] - Ringbuffer reads
+ * ICACHE      [0:0] - Instruction cache fetches
+ */
+
+#define A7XX_APRIV_DEFAULT (BIT(3) | BIT(2) | BIT(1) | BIT(0))
+/* Add crashdumper permissions for the BR APRIV */
+#define A7XX_BR_APRIV_DEFAULT (A7XX_APRIV_DEFAULT | BIT(6) | BIT(5))
+
+static int a7xx_hw_init(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a7xx_gpu *a7xx_gpu = to_a7xx_gpu(adreno_gpu);
+	int ret;
+
+	/* Set up GBIF registers */
+	gpu_write(gpu, REG_A7XX_GBIF_QSB_SIDE0, 0x00071620);
+	gpu_write(gpu, REG_A7XX_GBIF_QSB_SIDE1, 0x00071620);
+	gpu_write(gpu, REG_A7XX_GBIF_QSB_SIDE2, 0x00071620);
+	gpu_write(gpu, REG_A7XX_GBIF_QSB_SIDE3, 0x00071620);
+	gpu_write(gpu, REG_A7XX_RBBM_GBIF_CLIENT_QOS_CNTL, 0x2120212);
+	gpu_write(gpu, REG_A7XX_UCHE_GBIF_GX_CONFIG, 0x10240e0);
+
+	/* Make all blocks contribute to the GPU BUSY perf counter */
+	gpu_write(gpu, REG_A7XX_RBBM_PERFCTR_GPU_BUSY_MASKED, 0xffffffff);
+
+	/*
+	 * Set UCHE_WRITE_THRU_BASE to UCHE_TRAP_BASE, effectively
+	 * disabling L2 bypass
+	 */
+	gpu_write64(gpu, REG_A7XX_UCHE_TRAP_BASE, ~0ull);
+	gpu_write64(gpu, REG_A7XX_UCHE_WRITE_THRU_BASE, ~0ull);
+
+	gpu_write(gpu, REG_A7XX_UCHE_CACHE_WAYS, 0x800000);
+	gpu_write(gpu, REG_A7XX_UCHE_CMDQ_CONFIG, 6 << 16 | 6 << 12 | 9 << 8 | BIT(3) | BIT(2) | 2);
+
+	/* Set the AHB default slave response to "ERROR" */
+	gpu_write(gpu, REG_A7XX_CP_AHB_CNTL, 0x1);
+
+	/* Turn on performance counters */
+	gpu_write(gpu, REG_A7XX_RBBM_PERFCTR_CNTL, 0x1);
+
+	a7xx_set_ubwc_config(gpu);
+
+	gpu_write(gpu, REG_A7XX_RBBM_INTERFACE_HANG_INT_CNTL, BIT(30) | 0xcfffff);
+	gpu_write(gpu, REG_A7XX_UCHE_CLIENT_PF, BIT(0));
+
+	a7xx_set_cp_protect(gpu);
+
+	/* TODO: Configure LLCC */
+
+	gpu_write(gpu, REG_A7XX_CP_APRIV_CNTL, A7XX_BR_APRIV_DEFAULT);
+	gpu_write(gpu, REG_A7XX_CP_BV_APRIV_CNTL, A7XX_APRIV_DEFAULT);
+	gpu_write(gpu, REG_A7XX_CP_LPAC_APRIV_CNTL, A7XX_APRIV_DEFAULT);
+
+	gpu_write(gpu, REG_A7XX_RBBM_SECVID_TSB_CNTL, 0);
+	gpu_write64(gpu, REG_A7XX_RBBM_SECVID_TSB_TRUSTED_BASE, 0);
+	gpu_write(gpu, REG_A7XX_RBBM_SECVID_TSB_TRUSTED_SIZE, 0);
+
+	a7xx_set_hwcg(gpu, true);
+
+	/* Enable interrupts */
+	gpu_write(gpu, REG_A7XX_RBBM_INT_0_MASK, A7XX_INT_MASK);
+
+	ret = adreno_hw_init(gpu);
+	if (ret)
+		return ret;
+
+	/* reset the value of bv_fence too */
+	gpu->rb[0]->memptrs->bv_fence = gpu->rb[0]->fctx->completed_fence;
+
+	ret = a7xx_ucode_init(gpu);
+	if (ret)
+		return ret;
+
+	/* Set the ringbuffer address and setup rptr shadow */
+	gpu_write64(gpu, REG_A7XX_CP_RB_BASE, gpu->rb[0]->iova);
+
+	gpu_write(gpu, REG_A7XX_CP_RB_CNTL, MSM_GPU_RB_CNTL_DEFAULT);
+
+	if (!a7xx_gpu->shadow_bo) {
+		a7xx_gpu->shadow = msm_gem_kernel_new(gpu->dev,
+			sizeof(u32) * gpu->nr_rings,
+			MSM_BO_WC | MSM_BO_MAP_PRIV,
+			gpu->aspace, &a7xx_gpu->shadow_bo,
+			&a7xx_gpu->shadow_iova);
+
+		if (IS_ERR(a7xx_gpu->shadow))
+			return PTR_ERR(a7xx_gpu->shadow);
+
+		msm_gem_object_set_name(a7xx_gpu->shadow_bo, "shadow");
+	}
+
+	gpu_write64(gpu, REG_A7XX_CP_RB_RPTR_ADDR, shadowptr(a7xx_gpu, gpu->rb[0]));
+
+	gpu->cur_ctx_seqno = 0;
+
+	/* Enable the SQE to start the CP engine */
+	gpu_write(gpu, REG_A7XX_CP_SQE_CNTL, 1);
+
+	ret = a7xx_cp_init(gpu);
+	if (ret)
+		return ret;
+
+	/*
+	 * Try to load a zap shader into the secure world. If successful,
+	 * we can use the CP to switch out of secure mode. If not, we have
+	 * no recourse but to try to switch ourselves out manually. If we
+	 * guessed wrong then access to the RBBM_SECVID_TRUST_CNTL register will
+	 * be blocked and a permissions violation will soon follow.
+	 */
+	ret = a7xx_zap_shader_init(gpu);
+	if (!ret) {
+		OUT_PKT7(gpu->rb[0], CP_SET_SECURE_MODE, 1);
+		OUT_RING(gpu->rb[0], 0x00000000);
+
+		adreno_flush(gpu, gpu->rb[0], REG_A7XX_CP_RB_WPTR);
+		if (!a7xx_idle(gpu, gpu->rb[0]))
+			return -EINVAL;
+	} else if (ret == -ENODEV) {
+		/*
+		 * This device does not use zap shader (but print a warning
+		 * just in case someone got their dt wrong.. hopefully they
+		 * have a debug UART to realize the error of their ways...
+		 * if you mess this up you are about to crash horribly)
+		 */
+		dev_warn_once(gpu->dev->dev,
+			"Zap shader not enabled - using SECVID_TRUST_CNTL instead\n");
+		gpu_write(gpu, REG_A7XX_RBBM_SECVID_TRUST_CNTL, 0x0);
+		ret = 0;
+	} else {
+		return ret;
+	}
+
+	return ret;
+}
+
+static void a7xx_dump(struct msm_gpu *gpu)
+{
+	DRM_DEV_INFO(&gpu->pdev->dev, "status:   %08x\n",
+			gpu_read(gpu, REG_A7XX_RBBM_STATUS));
+	adreno_dump(gpu);
+}
+
+static void a7xx_recover(struct msm_gpu *gpu)
+{
+	adreno_dump_info(gpu);
+
+	if (hang_debug)
+		a7xx_dump(gpu);
+
+	gpu_write(gpu, REG_A7XX_RBBM_SW_RESET_CMD, 1);
+	/*
+	 * Do a dummy read to get a brief read cycle delay for the
+	 * reset to take effect
+	 * (does this work as expected for a7xx?)
+	 */
+	gpu_read(gpu, REG_A7XX_RBBM_SW_RESET_CMD);
+	gpu_write(gpu, REG_A7XX_RBBM_SW_RESET_CMD, 0);
+
+	msm_gpu_hw_init(gpu);
+}
+
+static void a7xx_cp_hw_err_irq(struct msm_gpu *gpu)
+{
+	struct device *dev = &gpu->pdev->dev;
+	u32 status = gpu_read(gpu, REG_A7XX_CP_INTERRUPT_STATUS);
+	u32 val;
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_OPCODEERROR) {
+		gpu_write(gpu, REG_A7XX_CP_SQE_STAT_ADDR, 1);
+		val = gpu_read(gpu, REG_A7XX_CP_SQE_STAT_DATA);
+		dev_err_ratelimited(dev, "CP | opcode error | possible opcode=0x%8.8X\n", val);
+	}
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_UCODEERROR)
+		dev_err_ratelimited(dev, "CP ucode error interrupt\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_CPHWFAULT)
+		dev_err_ratelimited(dev, "CP | HW fault | status=0x%8.8X\n", gpu_read(gpu, REG_A7XX_CP_HW_FAULT));
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_REGISTERPROTECTION) {
+		val = gpu_read(gpu, REG_A7XX_CP_PROTECT_STATUS);
+		dev_err_ratelimited(dev,
+			"CP | protected mode error | %s | addr=0x%8.8X | status=0x%8.8X\n",
+			val & (1 << 20) ? "READ" : "WRITE",
+			(val & 0x3ffff), val);
+	}
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_VSDPARITYERROR)
+		dev_err_ratelimited(dev, "CP VSD decoder parity error\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_ILLEGALINSTRUCTION)
+		dev_err_ratelimited(dev, "CP illegal instruction error\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_OPCODEERRORLPAC)
+		dev_err_ratelimited(dev, "CP opcode error LPAC\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_UCODEERRORLPAC)
+		dev_err_ratelimited(dev, "CP ucode error LPAC\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_CPHWFAULTLPAC)
+		dev_err_ratelimited(dev, "CP hw fault LPAC\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_REGISTERPROTECTIONLPAC)
+		dev_err_ratelimited(dev, "CP register protection LPAC\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_ILLEGALINSTRUCTIONLPAC)
+		dev_err_ratelimited(dev, "CP illegal instruction LPAC\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_OPCODEERRORBV) {
+		gpu_write(gpu, REG_A7XX_CP_BV_SQE_STAT_ADDR, 1);
+		val = gpu_read(gpu, REG_A7XX_CP_BV_SQE_STAT_DATA);
+		dev_err_ratelimited(dev, "CP opcode error BV | opcode=0x%8.8x\n", val);
+	}
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_UCODEERRORBV)
+		dev_err_ratelimited(dev, "CP ucode error BV\n");
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_CPHWFAULTBV) {
+		val = gpu_read(gpu, REG_A7XX_CP_BV_HW_FAULT);
+		dev_err_ratelimited(dev, "CP BV | Ringbuffer HW fault | status=%x\n", val);
+	}
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_REGISTERPROTECTIONBV) {
+		val = gpu_read(gpu, REG_A7XX_CP_BV_PROTECT_STATUS);
+		dev_err_ratelimited(dev,
+			"CP BV | protected mode error | %s | addr=0x%8.8X | status=0x%8.8X\n",
+			val & BIT(20) ? "READ" : "WRITE",
+			val & 0x3ffff, val);
+	}
+
+	if (status & A7XX_CP_INTERRUPT_STATUS_ILLEGALINSTRUCTIONBV)
+		dev_err_ratelimited(dev, "CP illegal instruction BV\n");
+}
+
+static irqreturn_t a7xx_irq(struct msm_gpu *gpu)
+{
+	struct msm_drm_private *priv = gpu->dev->dev_private;
+	u32 status = gpu_read(gpu, REG_A7XX_RBBM_INT_0_STATUS);
+
+	gpu_write(gpu, REG_A7XX_RBBM_INT_CLEAR_CMD, status);
+
+	if (priv->disable_err_irq)
+		status &= A7XX_RBBM_INT_0_MASK_CACHE_CLEAN_TS;
+
+	/* TODO: print human friendly strings for each error ? */
+	if (status & ~A7XX_RBBM_INT_0_MASK_CACHE_CLEAN_TS)
+		dev_err_ratelimited(&gpu->pdev->dev, "unexpected irq status: 0x%8.8X\n", status);
+
+	if (status & A7XX_RBBM_INT_0_MASK_HWERROR)
+		a7xx_cp_hw_err_irq(gpu);
+
+	if (status & A7XX_RBBM_INT_0_MASK_CACHE_CLEAN_TS)
+		msm_gpu_retire(gpu);
+
+	return IRQ_HANDLED;
+}
+
+static int a7xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
+{
+	*value = gpu_read64(gpu, REG_A7XX_CP_ALWAYS_ON_COUNTER);
+	return 0;
+}
+
+static void a7xx_destroy(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a7xx_gpu *a7xx_gpu = to_a7xx_gpu(adreno_gpu);
+
+	if (a7xx_gpu->sqe_bo) {
+		msm_gem_unpin_iova(a7xx_gpu->sqe_bo, gpu->aspace);
+		drm_gem_object_put(a7xx_gpu->sqe_bo);
+	}
+
+	if (a7xx_gpu->shadow_bo) {
+		msm_gem_unpin_iova(a7xx_gpu->shadow_bo, gpu->aspace);
+		drm_gem_object_put(a7xx_gpu->shadow_bo);
+	}
+
+	adreno_gpu_cleanup(adreno_gpu);
+
+	kfree(a7xx_gpu);
+}
+
+static struct msm_gpu_state *a7xx_gpu_state_get(struct msm_gpu *gpu)
+{
+	struct msm_gpu_state *state = kzalloc(sizeof(*state), GFP_KERNEL);
+
+	if (!state)
+		return ERR_PTR(-ENOMEM);
+
+	adreno_gpu_state_get(gpu, state);
+
+	state->rbbm_status = gpu_read(gpu, REG_A7XX_RBBM_STATUS);
+
+	return state;
+}
+
+static struct msm_gem_address_space *
+a7xx_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev)
+{
+	struct iommu_domain *iommu;
+	struct msm_mmu *mmu;
+	struct msm_gem_address_space *aspace;
+	u64 start, size;
+
+	iommu = iommu_domain_alloc(&platform_bus_type);
+	if (!iommu)
+		return NULL;
+
+	mmu = msm_iommu_new(&pdev->dev, iommu);
+	if (IS_ERR(mmu)) {
+		iommu_domain_free(iommu);
+		return ERR_CAST(mmu);
+	}
+
+	/*
+	 * Use the aperture start or SZ_16M, whichever is greater. This will
+	 * ensure that we align with the allocated pagetable range while still
+	 * allowing room in the lower 32 bits for GMEM and whatnot
+	 */
+	start = max_t(u64, SZ_16M, iommu->geometry.aperture_start);
+	size = iommu->geometry.aperture_end - start + 1;
+
+	aspace = msm_gem_address_space_create(mmu, "gpu",
+		start & GENMASK_ULL(48, 0), size);
+
+	if (IS_ERR(aspace) && !IS_ERR(mmu))
+		mmu->funcs->destroy(mmu);
+
+	return aspace;
+}
+
+static uint32_t a7xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a7xx_gpu *a7xx_gpu = to_a7xx_gpu(adreno_gpu);
+
+	return a7xx_gpu->shadow[ring->id];
+}
+
+static const struct adreno_gpu_funcs funcs = {
+	.base = {
+		.get_param = adreno_get_param,
+		.set_param = adreno_set_param,
+		.hw_init = a7xx_hw_init,
+		.pm_suspend = msm_gpu_pm_suspend,
+		.pm_resume = msm_gpu_pm_resume,
+		.recover = a7xx_recover,
+		.submit = a7xx_submit,
+		.active_ring = adreno_active_ring,
+		.irq = a7xx_irq,
+		.destroy = a7xx_destroy,
+#if defined(CONFIG_DEBUG_FS) || defined(CONFIG_DEV_COREDUMP)
+		.show = adreno_show,
+#endif
+		.gpu_state_get = a7xx_gpu_state_get,
+		.gpu_state_put = adreno_gpu_state_put,
+		.create_address_space = a7xx_create_address_space,
+		.get_rptr = a7xx_get_rptr,
+	},
+	.get_timestamp = a7xx_get_timestamp,
+};
+
+struct msm_gpu *a7xx_gpu_init(struct drm_device *dev)
+{
+	struct msm_drm_private *priv = dev->dev_private;
+	struct platform_device *pdev = priv->gpu_pdev;
+	struct a7xx_gpu *a7xx_gpu;
+	struct adreno_gpu *adreno_gpu;
+	int ret;
+
+	a7xx_gpu = kzalloc(sizeof(*a7xx_gpu), GFP_KERNEL);
+	if (!a7xx_gpu)
+		return ERR_PTR(-ENOMEM);
+
+	adreno_gpu = &a7xx_gpu->base;
+	adreno_gpu->registers = NULL;
+	adreno_gpu->base.hw_apriv = true;
+
+	ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs, 1);
+	if (ret) {
+		a7xx_destroy(&(a7xx_gpu->base.base));
+		return ERR_PTR(ret);
+	}
+
+	return &adreno_gpu->base;
+}
diff --git a/drivers/gpu/drm/msm/adreno/a7xx_gpu.h b/drivers/gpu/drm/msm/adreno/a7xx_gpu.h
new file mode 100644
index 0000000000000..ebb86d67b812d
--- /dev/null
+++ b/drivers/gpu/drm/msm/adreno/a7xx_gpu.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2017-2022 The Linux Foundation. All rights reserved. */
+
+#ifndef __A7XX_GPU_H__
+#define __A7XX_GPU_H__
+
+#include "adreno_gpu.h"
+#include "a7xx.xml.h"
+
+struct a7xx_gpu {
+	struct adreno_gpu base;
+
+	struct drm_gem_object *sqe_bo;
+	uint64_t sqe_iova;
+
+	struct drm_gem_object *shadow_bo;
+	uint64_t shadow_iova;
+	uint32_t *shadow;
+};
+
+#define to_a7xx_gpu(x) container_of(x, struct a7xx_gpu, base)
+
+#define shadowptr(_a7xx_gpu, _ring) ((_a7xx_gpu)->shadow_iova + \
+		((_ring)->id * sizeof(uint32_t)))
+
+#endif /* __A7XX_GPU_H__ */
diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 89cfd84760d7e..2e93a99a7160d 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -339,6 +339,18 @@ static const struct adreno_info gpulist[] = {
 		.init = a6xx_gpu_init,
 		.zapfw = "a640_zap.mdt",
 		.hwcg = a640_hwcg,
+	}, {
+		.rev = ADRENO_REV(7, 0, 0, ANY_ID),
+		.revn = 730,
+		.name = "A730",
+		.fw = {
+			[ADRENO_FW_SQE] = "a730_sqe.fw",
+		},
+		.gmem = SZ_2M,
+		.inactive_period = DRM_MSM_INACTIVE_PERIOD,
+		.init = a7xx_gpu_init,
+		.zapfw = "a730_zap.mbn",
+		.hwcg = a730_hwcg,
 	},
 };
 
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 55c5433a4ea18..685a0ffb9b0f8 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -57,7 +57,7 @@ struct adreno_reglist {
 	u32 value;
 };
 
-extern const struct adreno_reglist a630_hwcg[], a640_hwcg[], a650_hwcg[], a660_hwcg[];
+extern const struct adreno_reglist a630_hwcg[], a640_hwcg[], a650_hwcg[], a660_hwcg[], a730_hwcg[];
 
 struct adreno_info {
 	struct adreno_rev rev;
@@ -395,6 +395,7 @@ struct msm_gpu *a3xx_gpu_init(struct drm_device *dev);
 struct msm_gpu *a4xx_gpu_init(struct drm_device *dev);
 struct msm_gpu *a5xx_gpu_init(struct drm_device *dev);
 struct msm_gpu *a6xx_gpu_init(struct drm_device *dev);
+struct msm_gpu *a7xx_gpu_init(struct drm_device *dev);
 
 static inline uint32_t get_wptr(struct msm_ringbuffer *ring)
 {
diff --git a/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h b/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h
index 9eeedd261b733..b9587c2892a35 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h
@@ -12,7 +12,7 @@ The rules-ng-ng source files this header was generated from are:
 - freedreno/registers/freedreno_copyright.xml        (   1572 bytes, from 2020-11-18 00:17:12)
 - freedreno/registers/adreno/a2xx.xml                (  90810 bytes, from 2021-08-06 17:44:41)
 - freedreno/registers/adreno/adreno_common.xml       (  14631 bytes, from 2022-03-27 14:52:08)
-- freedreno/registers/adreno/adreno_pm4.xml          (  70177 bytes, from 2022-03-27 20:02:31)
+- freedreno/registers/adreno/adreno_pm4.xml          (  69699 bytes, from 2022-03-27 20:17:52)
 - freedreno/registers/adreno/a3xx.xml                (  84231 bytes, from 2021-08-27 13:03:56)
 - freedreno/registers/adreno/a4xx.xml                ( 113474 bytes, from 2022-03-22 19:23:46)
 - freedreno/registers/adreno/a5xx.xml                ( 149512 bytes, from 2022-03-21 16:05:18)
@@ -2380,23 +2380,5 @@ static inline uint32_t CP_THREAD_CONTROL_0_THREAD(enum cp_thread val)
 #define CP_THREAD_CONTROL_0_CONCURRENT_BIN_DISABLE		0x08000000
 #define CP_THREAD_CONTROL_0_SYNC_THREADS			0x80000000
 
-#define REG_CP_WAIT_TIMESTAMP_0					0x00000000
-#define CP_WAIT_TIMESTAMP_0_REF__MASK				0x00000003
-#define CP_WAIT_TIMESTAMP_0_REF__SHIFT				0
-static inline uint32_t CP_WAIT_TIMESTAMP_0_REF(uint32_t val)
-{
-	return ((val) << CP_WAIT_TIMESTAMP_0_REF__SHIFT) & CP_WAIT_TIMESTAMP_0_REF__MASK;
-}
-#define CP_WAIT_TIMESTAMP_0_MEMSPACE__MASK			0x00000010
-#define CP_WAIT_TIMESTAMP_0_MEMSPACE__SHIFT			4
-static inline uint32_t CP_WAIT_TIMESTAMP_0_MEMSPACE(uint32_t val)
-{
-	return ((val) << CP_WAIT_TIMESTAMP_0_MEMSPACE__SHIFT) & CP_WAIT_TIMESTAMP_0_MEMSPACE__MASK;
-}
-
-#define REG_CP_WAIT_TIMESTAMP_ADDR				0x00000001
-
-#define REG_CP_WAIT_TIMESTAMP_VALUE				0x00000003
-
 
 #endif /* ADRENO_PM4_XML */
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.h b/drivers/gpu/drm/msm/msm_ringbuffer.h
index d8c63df4e9ca9..ae7b35ccbeff9 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.h
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.h
@@ -30,6 +30,7 @@ struct msm_gpu_submit_stats {
 struct msm_rbmemptrs {
 	volatile uint32_t rptr;
 	volatile uint32_t fence;
+	volatile uint32_t bv_fence;
 
 	volatile struct msm_gpu_submit_stats stats[MSM_GPU_SUBMIT_STATS_COUNT];
 	volatile u64 ttbr0;
-- 
2.26.1



* Re: [PATCH 2/4] drm/msm/adreno: use a single register offset for gpu_read64/gpu_write64
  2022-03-27 20:25   ` Jonathan Marek
@ 2022-04-02  1:39     ` Rob Clark
  -1 siblings, 0 replies; 12+ messages in thread
From: Rob Clark @ 2022-04-02  1:39 UTC (permalink / raw)
  To: Jonathan Marek
  Cc: freedreno, Sean Paul, Abhinav Kumar, David Airlie, Daniel Vetter,
	Dan Carpenter, Akhil P Oommen, Jordan Crouse, Vladimir Lypak,
	Yangtao Li, Christian König, Dmitry Baryshkov,
	Douglas Anderson, open list:DRM DRIVER FOR MSM ADRENO GPU,
	open list:DRM DRIVER FOR MSM ADRENO GPU, open list

On Sun, Mar 27, 2022 at 1:27 PM Jonathan Marek <jonathan@marek.ca> wrote:
>
> The high half of 64-bit registers is always at +1 offset, so change these
> helpers to be more convenient by removing the unnecessary argument.
>
> Signed-off-by: Jonathan Marek <jonathan@marek.ca>

I'd been meaning to do this for a while.. so I think I'll cherry-pick
this ahead of the rest of the series

Reviewed-by: Rob Clark <robdclark@gmail.com>

> ---
>  drivers/gpu/drm/msm/adreno/a4xx_gpu.c       |  3 +--
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c       | 27 ++++++++-------------
>  drivers/gpu/drm/msm/adreno/a5xx_preempt.c   |  4 +--
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c       | 25 ++++++-------------
>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  3 +--
>  drivers/gpu/drm/msm/msm_gpu.h               | 12 ++++-----
>  6 files changed, 27 insertions(+), 47 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> index 0c6b2a6d0b4c9..da5e18bd74a45 100644
> --- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
> @@ -606,8 +606,7 @@ static int a4xx_pm_suspend(struct msm_gpu *gpu) {
>
>  static int a4xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
>  {
> -       *value = gpu_read64(gpu, REG_A4XX_RBBM_PERFCTR_CP_0_LO,
> -               REG_A4XX_RBBM_PERFCTR_CP_0_HI);
> +       *value = gpu_read64(gpu, REG_A4XX_RBBM_PERFCTR_CP_0_LO);
>
>         return 0;
>  }
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index 407f50a15faa4..1916cb759cd5c 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -605,11 +605,9 @@ static int a5xx_ucode_init(struct msm_gpu *gpu)
>                 a5xx_ucode_check_version(a5xx_gpu, a5xx_gpu->pfp_bo);
>         }
>
> -       gpu_write64(gpu, REG_A5XX_CP_ME_INSTR_BASE_LO,
> -               REG_A5XX_CP_ME_INSTR_BASE_HI, a5xx_gpu->pm4_iova);
> +       gpu_write64(gpu, REG_A5XX_CP_ME_INSTR_BASE_LO, a5xx_gpu->pm4_iova);
>
> -       gpu_write64(gpu, REG_A5XX_CP_PFP_INSTR_BASE_LO,
> -               REG_A5XX_CP_PFP_INSTR_BASE_HI, a5xx_gpu->pfp_iova);
> +       gpu_write64(gpu, REG_A5XX_CP_PFP_INSTR_BASE_LO, a5xx_gpu->pfp_iova);
>
>         return 0;
>  }
> @@ -868,8 +866,7 @@ static int a5xx_hw_init(struct msm_gpu *gpu)
>          * memory rendering at this point in time and we don't want to block off
>          * part of the virtual memory space.
>          */
> -       gpu_write64(gpu, REG_A5XX_RBBM_SECVID_TSB_TRUSTED_BASE_LO,
> -               REG_A5XX_RBBM_SECVID_TSB_TRUSTED_BASE_HI, 0x00000000);
> +       gpu_write64(gpu, REG_A5XX_RBBM_SECVID_TSB_TRUSTED_BASE_LO, 0x00000000);
>         gpu_write(gpu, REG_A5XX_RBBM_SECVID_TSB_TRUSTED_SIZE, 0x00000000);
>
>         /* Put the GPU into 64 bit by default */
> @@ -908,8 +905,7 @@ static int a5xx_hw_init(struct msm_gpu *gpu)
>                 return ret;
>
>         /* Set the ringbuffer address */
> -       gpu_write64(gpu, REG_A5XX_CP_RB_BASE, REG_A5XX_CP_RB_BASE_HI,
> -               gpu->rb[0]->iova);
> +       gpu_write64(gpu, REG_A5XX_CP_RB_BASE, gpu->rb[0]->iova);
>
>         /*
>          * If the microcode supports the WHERE_AM_I opcode then we can use that
> @@ -936,7 +932,7 @@ static int a5xx_hw_init(struct msm_gpu *gpu)
>                 }
>
>                 gpu_write64(gpu, REG_A5XX_CP_RB_RPTR_ADDR,
> -                       REG_A5XX_CP_RB_RPTR_ADDR_HI, shadowptr(a5xx_gpu, gpu->rb[0]));
> +                       shadowptr(a5xx_gpu, gpu->rb[0]));
>         } else if (gpu->nr_rings > 1) {
>                 /* Disable preemption if WHERE_AM_I isn't available */
>                 a5xx_preempt_fini(gpu);
> @@ -1239,9 +1235,9 @@ static void a5xx_fault_detect_irq(struct msm_gpu *gpu)
>                 gpu_read(gpu, REG_A5XX_RBBM_STATUS),
>                 gpu_read(gpu, REG_A5XX_CP_RB_RPTR),
>                 gpu_read(gpu, REG_A5XX_CP_RB_WPTR),
> -               gpu_read64(gpu, REG_A5XX_CP_IB1_BASE, REG_A5XX_CP_IB1_BASE_HI),
> +               gpu_read64(gpu, REG_A5XX_CP_IB1_BASE),
>                 gpu_read(gpu, REG_A5XX_CP_IB1_BUFSZ),
> -               gpu_read64(gpu, REG_A5XX_CP_IB2_BASE, REG_A5XX_CP_IB2_BASE_HI),
> +               gpu_read64(gpu, REG_A5XX_CP_IB2_BASE),
>                 gpu_read(gpu, REG_A5XX_CP_IB2_BUFSZ));
>
>         /* Turn off the hangcheck timer to keep it from bothering us */
> @@ -1427,8 +1423,7 @@ static int a5xx_pm_suspend(struct msm_gpu *gpu)
>
>  static int a5xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
>  {
> -       *value = gpu_read64(gpu, REG_A5XX_RBBM_ALWAYSON_COUNTER_LO,
> -               REG_A5XX_RBBM_ALWAYSON_COUNTER_HI);
> +       *value = gpu_read64(gpu, REG_A5XX_RBBM_ALWAYSON_COUNTER_LO);
>
>         return 0;
>  }
> @@ -1465,8 +1460,7 @@ static int a5xx_crashdumper_run(struct msm_gpu *gpu,
>         if (IS_ERR_OR_NULL(dumper->ptr))
>                 return -EINVAL;
>
> -       gpu_write64(gpu, REG_A5XX_CP_CRASH_SCRIPT_BASE_LO,
> -               REG_A5XX_CP_CRASH_SCRIPT_BASE_HI, dumper->iova);
> +       gpu_write64(gpu, REG_A5XX_CP_CRASH_SCRIPT_BASE_LO, dumper->iova);
>
>         gpu_write(gpu, REG_A5XX_CP_CRASH_DUMP_CNTL, 1);
>
> @@ -1670,8 +1664,7 @@ static unsigned long a5xx_gpu_busy(struct msm_gpu *gpu)
>         if (pm_runtime_get_if_in_use(&gpu->pdev->dev) == 0)
>                 return 0;
>
> -       busy_cycles = gpu_read64(gpu, REG_A5XX_RBBM_PERFCTR_RBBM_0_LO,
> -                       REG_A5XX_RBBM_PERFCTR_RBBM_0_HI);
> +       busy_cycles = gpu_read64(gpu, REG_A5XX_RBBM_PERFCTR_RBBM_0_LO);
>
>         busy_time = busy_cycles - gpu->devfreq.busy_cycles;
>         do_div(busy_time, clk_get_rate(gpu->core_clk) / 1000000);
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> index 8abc9a2b114a2..7658e89844b46 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
> @@ -137,7 +137,6 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
>
>         /* Set the address of the incoming preemption record */
>         gpu_write64(gpu, REG_A5XX_CP_CONTEXT_SWITCH_RESTORE_ADDR_LO,
> -               REG_A5XX_CP_CONTEXT_SWITCH_RESTORE_ADDR_HI,
>                 a5xx_gpu->preempt_iova[ring->id]);
>
>         a5xx_gpu->next_ring = ring;
> @@ -211,8 +210,7 @@ void a5xx_preempt_hw_init(struct msm_gpu *gpu)
>         }
>
>         /* Write a 0 to signal that we aren't switching pagetables */
> -       gpu_write64(gpu, REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_LO,
> -               REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_HI, 0);
> +       gpu_write64(gpu, REG_A5XX_CP_CONTEXT_SWITCH_SMMU_INFO_LO, 0);
>
>         /* Reset the preemption state */
>         set_preempt_state(a5xx_gpu, PREEMPT_NONE);
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index 83c31b2ad865b..a624cb2df233b 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -246,8 +246,7 @@ static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
>         OUT_RING(ring, submit->seqno);
>
>         trace_msm_gpu_submit_flush(submit,
> -               gpu_read64(gpu, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO,
> -                       REG_A6XX_CP_ALWAYS_ON_COUNTER_HI));
> +               gpu_read64(gpu, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO));
>
>         a6xx_flush(gpu, ring);
>  }
> @@ -878,8 +877,7 @@ static int a6xx_ucode_init(struct msm_gpu *gpu)
>                 }
>         }
>
> -       gpu_write64(gpu, REG_A6XX_CP_SQE_INSTR_BASE,
> -               REG_A6XX_CP_SQE_INSTR_BASE+1, a6xx_gpu->sqe_iova);
> +       gpu_write64(gpu, REG_A6XX_CP_SQE_INSTR_BASE, a6xx_gpu->sqe_iova);
>
>         return 0;
>  }
> @@ -926,8 +924,7 @@ static int hw_init(struct msm_gpu *gpu)
>          * memory rendering at this point in time and we don't want to block off
>          * part of the virtual memory space.
>          */
> -       gpu_write64(gpu, REG_A6XX_RBBM_SECVID_TSB_TRUSTED_BASE_LO,
> -               REG_A6XX_RBBM_SECVID_TSB_TRUSTED_BASE_HI, 0x00000000);
> +       gpu_write64(gpu, REG_A6XX_RBBM_SECVID_TSB_TRUSTED_BASE_LO, 0x00000000);
>         gpu_write(gpu, REG_A6XX_RBBM_SECVID_TSB_TRUSTED_SIZE, 0x00000000);
>
>         /* Turn on 64 bit addressing for all blocks */
> @@ -976,11 +973,8 @@ static int hw_init(struct msm_gpu *gpu)
>
>         if (!adreno_is_a650_family(adreno_gpu)) {
>                 /* Set the GMEM VA range [0x100000:0x100000 + gpu->gmem - 1] */
> -               gpu_write64(gpu, REG_A6XX_UCHE_GMEM_RANGE_MIN_LO,
> -                       REG_A6XX_UCHE_GMEM_RANGE_MIN_HI, 0x00100000);
> -
> +               gpu_write64(gpu, REG_A6XX_UCHE_GMEM_RANGE_MIN_LO, 0x00100000);
>                 gpu_write64(gpu, REG_A6XX_UCHE_GMEM_RANGE_MAX_LO,
> -                       REG_A6XX_UCHE_GMEM_RANGE_MAX_HI,
>                         0x00100000 + adreno_gpu->gmem - 1);
>         }
>
> @@ -1072,8 +1066,7 @@ static int hw_init(struct msm_gpu *gpu)
>                 goto out;
>
>         /* Set the ringbuffer address */
> -       gpu_write64(gpu, REG_A6XX_CP_RB_BASE, REG_A6XX_CP_RB_BASE_HI,
> -               gpu->rb[0]->iova);
> +       gpu_write64(gpu, REG_A6XX_CP_RB_BASE, gpu->rb[0]->iova);
>
>         /* Targets that support extended APRIV can use the RPTR shadow from
>          * hardware but all the other ones need to disable the feature. Targets
> @@ -1105,7 +1098,6 @@ static int hw_init(struct msm_gpu *gpu)
>                 }
>
>                 gpu_write64(gpu, REG_A6XX_CP_RB_RPTR_ADDR_LO,
> -                       REG_A6XX_CP_RB_RPTR_ADDR_HI,
>                         shadowptr(a6xx_gpu, gpu->rb[0]));
>         }
>
> @@ -1394,9 +1386,9 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
>                 gpu_read(gpu, REG_A6XX_RBBM_STATUS),
>                 gpu_read(gpu, REG_A6XX_CP_RB_RPTR),
>                 gpu_read(gpu, REG_A6XX_CP_RB_WPTR),
> -               gpu_read64(gpu, REG_A6XX_CP_IB1_BASE, REG_A6XX_CP_IB1_BASE_HI),
> +               gpu_read64(gpu, REG_A6XX_CP_IB1_BASE),
>                 gpu_read(gpu, REG_A6XX_CP_IB1_REM_SIZE),
> -               gpu_read64(gpu, REG_A6XX_CP_IB2_BASE, REG_A6XX_CP_IB2_BASE_HI),
> +               gpu_read64(gpu, REG_A6XX_CP_IB2_BASE),
>                 gpu_read(gpu, REG_A6XX_CP_IB2_REM_SIZE));
>
>         /* Turn off the hangcheck timer to keep it from bothering us */
> @@ -1607,8 +1599,7 @@ static int a6xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
>         /* Force the GPU power on so we can read this register */
>         a6xx_gmu_set_oob(&a6xx_gpu->gmu, GMU_OOB_PERFCOUNTER_SET);
>
> -       *value = gpu_read64(gpu, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO,
> -                           REG_A6XX_CP_ALWAYS_ON_COUNTER_HI);
> +       *value = gpu_read64(gpu, REG_A6XX_CP_ALWAYS_ON_COUNTER_LO);
>
>         a6xx_gmu_clear_oob(&a6xx_gpu->gmu, GMU_OOB_PERFCOUNTER_SET);
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> index 55f443328d8e7..c61b233aff09b 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
> @@ -147,8 +147,7 @@ static int a6xx_crashdumper_run(struct msm_gpu *gpu,
>         /* Make sure all pending memory writes are posted */
>         wmb();
>
> -       gpu_write64(gpu, REG_A6XX_CP_CRASH_SCRIPT_BASE_LO,
> -               REG_A6XX_CP_CRASH_SCRIPT_BASE_HI, dumper->iova);
> +       gpu_write64(gpu, REG_A6XX_CP_CRASH_SCRIPT_BASE_LO, dumper->iova);
>
>         gpu_write(gpu, REG_A6XX_CP_CRASH_DUMP_CNTL, 1);
>
> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> index 02419f2ca2bc5..f7fca687d45de 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.h
> +++ b/drivers/gpu/drm/msm/msm_gpu.h
> @@ -503,7 +503,7 @@ static inline void gpu_rmw(struct msm_gpu *gpu, u32 reg, u32 mask, u32 or)
>         msm_rmw(gpu->mmio + (reg << 2), mask, or);
>  }
>
> -static inline u64 gpu_read64(struct msm_gpu *gpu, u32 lo, u32 hi)
> +static inline u64 gpu_read64(struct msm_gpu *gpu, u32 reg)
>  {
>         u64 val;
>
> @@ -521,17 +521,17 @@ static inline u64 gpu_read64(struct msm_gpu *gpu, u32 lo, u32 hi)
>          * when the lo is read, so make sure to read the lo first to trigger
>          * that
>          */
> -       val = (u64) msm_readl(gpu->mmio + (lo << 2));
> -       val |= ((u64) msm_readl(gpu->mmio + (hi << 2)) << 32);
> +       val = (u64) msm_readl(gpu->mmio + (reg << 2));
> +       val |= ((u64) msm_readl(gpu->mmio + ((reg + 1) << 2)) << 32);
>
>         return val;
>  }
>
> -static inline void gpu_write64(struct msm_gpu *gpu, u32 lo, u32 hi, u64 val)
> +static inline void gpu_write64(struct msm_gpu *gpu, u32 reg, u64 val)
>  {
>         /* Why not a writeq here? Read the screed above */
> -       msm_writel(lower_32_bits(val), gpu->mmio + (lo << 2));
> -       msm_writel(upper_32_bits(val), gpu->mmio + (hi << 2));
> +       msm_writel(lower_32_bits(val), gpu->mmio + (reg << 2));
> +       msm_writel(upper_32_bits(val), gpu->mmio + ((reg + 1) << 2));
>  }
>
>  int msm_gpu_pm_suspend(struct msm_gpu *gpu);
> --
> 2.26.1
>


end of thread, other threads:[~2022-04-02  1:38 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-27 20:25 [PATCH 0/4] drm/msm/adreno: add support for a730 Jonathan Marek
2022-03-27 20:25 ` Jonathan Marek
2022-03-27 20:25 ` [PATCH 1/4] drm/msm/adreno: move a6xx CP_PROTECT macros to common code Jonathan Marek
2022-03-27 20:25   ` Jonathan Marek
2022-03-27 20:25 ` [PATCH 2/4] drm/msm/adreno: use a single register offset for gpu_read64/gpu_write64 Jonathan Marek
2022-03-27 20:25   ` Jonathan Marek
2022-04-02  1:39   ` Rob Clark
2022-04-02  1:39     ` Rob Clark
2022-03-27 20:25 ` [PATCH 3/4] drm/msm/adreno: update headers Jonathan Marek
2022-03-27 20:25   ` Jonathan Marek
2022-03-27 20:25 ` [PATCH 4/4] drm/msm/adreno: add support for a730 Jonathan Marek
2022-03-27 20:25   ` Jonathan Marek
