All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/12] Adreno A5XX support
@ 2016-11-28 19:28 Jordan Crouse
  2016-11-28 19:28 ` [PATCH 01/12] drm/msm: gpu: Cut down the list of "generic" registers to the ones we use Jordan Crouse
                   ` (9 more replies)
  0 siblings, 10 replies; 34+ messages in thread
From: Jordan Crouse @ 2016-11-28 19:28 UTC (permalink / raw)
  To: freedreno; +Cc: linux-arm-msm, dri-devel

All, here is initial kernel support for the Adreno A5XX family of
GPUs found on the QTI Snapdragon 820 and 821 among others.

This stack turns on the A5XX hardware and initializes the GPMU
(Graphics Power Management Unit) which is a microcontroller
to assist with more independent power management. This initial
stack is mainly concerned with getting the GPU/GPMU operational.
Hopefully future improvements will actually tune it for better power.

The last three patches in the stack take the GPU out of "secure" mode.
When the GPU is in secure mode it can only read/write from memory
that is locked by the secure components of the SoC. By default the
GPU comes up in secure mode - to get out of it we need to use SCM to
load a special shader that "zaps" the persistent memory and registers
and then we can execute a GPU command to take us out of secure. This
code should compile cleanly against upstream but there are some changes
coming in on other trees to make it functional for the Snapdragon 820/821.

And since the GPU isn't very interesting unless it can draw a triangle
or two, Rob Clark has been hard at work on the user side of things and
has a great chunk of functionality ready for Mesa:

https://lists.freedesktop.org/archives/freedreno/2016-November/000852.html

So I think this should be everything we need to make the A5XX useful for
the community. Please enjoy.

Best regards,
Jordan

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 01/12] drm/msm: gpu: Cut down the list of "generic" registers to the ones we use
  2016-11-28 19:28 [PATCH 00/12] Adreno A5XX support Jordan Crouse
@ 2016-11-28 19:28 ` Jordan Crouse
       [not found] ` <1480361317-9937-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Jordan Crouse @ 2016-11-28 19:28 UTC (permalink / raw)
  To: freedreno; +Cc: linux-arm-msm, dri-devel

There are very few register accesses in the common code. Cut down
the list of common registers to just those that are used.  This
saves const space and saves us the effort of maintaining registers
for A3XX and A4XX that don't exist or are unused.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c   | 80 ---------------------------------
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c   | 76 -------------------------------
 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 59 ------------------------
 3 files changed, 215 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
index 156abf0..5c64607 100644
--- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
@@ -419,91 +419,11 @@ static void a3xx_dump(struct msm_gpu *gpu)
 }
 /* Register offset defines for A3XX */
 static const unsigned int a3xx_register_offsets[REG_ADRENO_REGISTER_MAX] = {
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_DEBUG, REG_AXXX_CP_DEBUG),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_ME_RAM_WADDR, REG_AXXX_CP_ME_RAM_WADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_ME_RAM_DATA, REG_AXXX_CP_ME_RAM_DATA),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_PFP_UCODE_DATA,
-			REG_A3XX_CP_PFP_UCODE_DATA),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_PFP_UCODE_ADDR,
-			REG_A3XX_CP_PFP_UCODE_ADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_WFI_PEND_CTR, REG_A3XX_CP_WFI_PEND_CTR),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_BASE, REG_AXXX_CP_RB_BASE),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_RPTR_ADDR, REG_AXXX_CP_RB_RPTR_ADDR),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_RPTR, REG_AXXX_CP_RB_RPTR),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_WPTR, REG_AXXX_CP_RB_WPTR),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_PROTECT_CTRL, REG_A3XX_CP_PROTECT_CTRL),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_ME_CNTL, REG_AXXX_CP_ME_CNTL),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_CNTL, REG_AXXX_CP_RB_CNTL),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_IB1_BASE, REG_AXXX_CP_IB1_BASE),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_IB1_BUFSZ, REG_AXXX_CP_IB1_BUFSZ),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_IB2_BASE, REG_AXXX_CP_IB2_BASE),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_IB2_BUFSZ, REG_AXXX_CP_IB2_BUFSZ),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_TIMESTAMP, REG_AXXX_CP_SCRATCH_REG0),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_ME_RAM_RADDR, REG_AXXX_CP_ME_RAM_RADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_SCRATCH_ADDR, REG_AXXX_SCRATCH_ADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_SCRATCH_UMSK, REG_AXXX_SCRATCH_UMSK),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_ROQ_ADDR, REG_A3XX_CP_ROQ_ADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_ROQ_DATA, REG_A3XX_CP_ROQ_DATA),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_MERCIU_ADDR, REG_A3XX_CP_MERCIU_ADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_MERCIU_DATA, REG_A3XX_CP_MERCIU_DATA),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_MERCIU_DATA2, REG_A3XX_CP_MERCIU_DATA2),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_MEQ_ADDR, REG_A3XX_CP_MEQ_ADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_MEQ_DATA, REG_A3XX_CP_MEQ_DATA),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_HW_FAULT, REG_A3XX_CP_HW_FAULT),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_PROTECT_STATUS,
-			REG_A3XX_CP_PROTECT_STATUS),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_STATUS, REG_A3XX_RBBM_STATUS),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_PERFCTR_CTL,
-			REG_A3XX_RBBM_PERFCTR_CTL),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_PERFCTR_LOAD_CMD0,
-			REG_A3XX_RBBM_PERFCTR_LOAD_CMD0),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_PERFCTR_LOAD_CMD1,
-			REG_A3XX_RBBM_PERFCTR_LOAD_CMD1),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_PERFCTR_PWR_1_LO,
-			REG_A3XX_RBBM_PERFCTR_PWR_1_LO),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_INT_0_MASK, REG_A3XX_RBBM_INT_0_MASK),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_INT_0_STATUS,
-			REG_A3XX_RBBM_INT_0_STATUS),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_AHB_ERROR_STATUS,
-			REG_A3XX_RBBM_AHB_ERROR_STATUS),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_AHB_CMD, REG_A3XX_RBBM_AHB_CMD),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_INT_CLEAR_CMD,
-			REG_A3XX_RBBM_INT_CLEAR_CMD),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_CLOCK_CTL, REG_A3XX_RBBM_CLOCK_CTL),
-	REG_ADRENO_DEFINE(REG_ADRENO_VPC_DEBUG_RAM_SEL,
-			REG_A3XX_VPC_VPC_DEBUG_RAM_SEL),
-	REG_ADRENO_DEFINE(REG_ADRENO_VPC_DEBUG_RAM_READ,
-			REG_A3XX_VPC_VPC_DEBUG_RAM_READ),
-	REG_ADRENO_DEFINE(REG_ADRENO_VSC_SIZE_ADDRESS,
-			REG_A3XX_VSC_SIZE_ADDRESS),
-	REG_ADRENO_DEFINE(REG_ADRENO_VFD_CONTROL_0, REG_A3XX_VFD_CONTROL_0),
-	REG_ADRENO_DEFINE(REG_ADRENO_VFD_INDEX_MAX, REG_A3XX_VFD_INDEX_MAX),
-	REG_ADRENO_DEFINE(REG_ADRENO_SP_VS_PVT_MEM_ADDR_REG,
-			REG_A3XX_SP_VS_PVT_MEM_ADDR_REG),
-	REG_ADRENO_DEFINE(REG_ADRENO_SP_FS_PVT_MEM_ADDR_REG,
-			REG_A3XX_SP_FS_PVT_MEM_ADDR_REG),
-	REG_ADRENO_DEFINE(REG_ADRENO_SP_VS_OBJ_START_REG,
-			REG_A3XX_SP_VS_OBJ_START_REG),
-	REG_ADRENO_DEFINE(REG_ADRENO_SP_FS_OBJ_START_REG,
-			REG_A3XX_SP_FS_OBJ_START_REG),
-	REG_ADRENO_DEFINE(REG_ADRENO_PA_SC_AA_CONFIG, REG_A3XX_PA_SC_AA_CONFIG),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_PM_OVERRIDE2,
-			REG_A3XX_RBBM_PM_OVERRIDE2),
-	REG_ADRENO_DEFINE(REG_ADRENO_SCRATCH_REG2, REG_AXXX_CP_SCRATCH_REG2),
-	REG_ADRENO_DEFINE(REG_ADRENO_SQ_GPR_MANAGEMENT,
-			REG_A3XX_SQ_GPR_MANAGEMENT),
-	REG_ADRENO_DEFINE(REG_ADRENO_SQ_INST_STORE_MANAGMENT,
-			REG_A3XX_SQ_INST_STORE_MANAGMENT),
-	REG_ADRENO_DEFINE(REG_ADRENO_TP0_CHICKEN, REG_A3XX_TP0_CHICKEN),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_RBBM_CTL, REG_A3XX_RBBM_RBBM_CTL),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_SW_RESET_CMD,
-			REG_A3XX_RBBM_SW_RESET_CMD),
-	REG_ADRENO_DEFINE(REG_ADRENO_UCHE_INVALIDATE0,
-			REG_A3XX_UCHE_CACHE_INVALIDATE0_REG),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_PERFCTR_LOAD_VALUE_LO,
-			REG_A3XX_RBBM_PERFCTR_LOAD_VALUE_LO),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_PERFCTR_LOAD_VALUE_HI,
-			REG_A3XX_RBBM_PERFCTR_LOAD_VALUE_HI),
 };
 
 static const struct adreno_gpu_funcs funcs = {
diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index 2dc9412..8ce2250 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -460,87 +460,11 @@ static void a4xx_show(struct msm_gpu *gpu, struct seq_file *m)
 
 /* Register offset defines for A4XX, in order of enum adreno_regs */
 static const unsigned int a4xx_register_offsets[REG_ADRENO_REGISTER_MAX] = {
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_DEBUG, REG_A4XX_CP_DEBUG),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_ME_RAM_WADDR, REG_A4XX_CP_ME_RAM_WADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_ME_RAM_DATA, REG_A4XX_CP_ME_RAM_DATA),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_PFP_UCODE_DATA,
-			REG_A4XX_CP_PFP_UCODE_DATA),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_PFP_UCODE_ADDR,
-			REG_A4XX_CP_PFP_UCODE_ADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_WFI_PEND_CTR, REG_A4XX_CP_WFI_PEND_CTR),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_BASE, REG_A4XX_CP_RB_BASE),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_RPTR_ADDR, REG_A4XX_CP_RB_RPTR_ADDR),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_RPTR, REG_A4XX_CP_RB_RPTR),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_WPTR, REG_A4XX_CP_RB_WPTR),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_PROTECT_CTRL, REG_A4XX_CP_PROTECT_CTRL),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_ME_CNTL, REG_A4XX_CP_ME_CNTL),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_CNTL, REG_A4XX_CP_RB_CNTL),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_IB1_BASE, REG_A4XX_CP_IB1_BASE),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_IB1_BUFSZ, REG_A4XX_CP_IB1_BUFSZ),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_IB2_BASE, REG_A4XX_CP_IB2_BASE),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_IB2_BUFSZ, REG_A4XX_CP_IB2_BUFSZ),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_TIMESTAMP, REG_AXXX_CP_SCRATCH_REG0),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_ME_RAM_RADDR, REG_A4XX_CP_ME_RAM_RADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_ROQ_ADDR, REG_A4XX_CP_ROQ_ADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_ROQ_DATA, REG_A4XX_CP_ROQ_DATA),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_MERCIU_ADDR, REG_A4XX_CP_MERCIU_ADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_MERCIU_DATA, REG_A4XX_CP_MERCIU_DATA),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_MERCIU_DATA2, REG_A4XX_CP_MERCIU_DATA2),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_MEQ_ADDR, REG_A4XX_CP_MEQ_ADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_MEQ_DATA, REG_A4XX_CP_MEQ_DATA),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_HW_FAULT, REG_A4XX_CP_HW_FAULT),
-	REG_ADRENO_DEFINE(REG_ADRENO_CP_PROTECT_STATUS,
-			REG_A4XX_CP_PROTECT_STATUS),
-	REG_ADRENO_DEFINE(REG_ADRENO_SCRATCH_ADDR, REG_A4XX_CP_SCRATCH_ADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_SCRATCH_UMSK, REG_A4XX_CP_SCRATCH_UMASK),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_STATUS, REG_A4XX_RBBM_STATUS),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_PERFCTR_CTL,
-			REG_A4XX_RBBM_PERFCTR_CTL),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_PERFCTR_LOAD_CMD0,
-			REG_A4XX_RBBM_PERFCTR_LOAD_CMD0),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_PERFCTR_LOAD_CMD1,
-			REG_A4XX_RBBM_PERFCTR_LOAD_CMD1),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_PERFCTR_LOAD_CMD2,
-			REG_A4XX_RBBM_PERFCTR_LOAD_CMD2),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_PERFCTR_PWR_1_LO,
-			REG_A4XX_RBBM_PERFCTR_PWR_1_LO),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_INT_0_MASK, REG_A4XX_RBBM_INT_0_MASK),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_INT_0_STATUS,
-			REG_A4XX_RBBM_INT_0_STATUS),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_AHB_ERROR_STATUS,
-			REG_A4XX_RBBM_AHB_ERROR_STATUS),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_AHB_CMD, REG_A4XX_RBBM_AHB_CMD),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_CLOCK_CTL, REG_A4XX_RBBM_CLOCK_CTL),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_AHB_ME_SPLIT_STATUS,
-			REG_A4XX_RBBM_AHB_ME_SPLIT_STATUS),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_AHB_PFP_SPLIT_STATUS,
-			REG_A4XX_RBBM_AHB_PFP_SPLIT_STATUS),
-	REG_ADRENO_DEFINE(REG_ADRENO_VPC_DEBUG_RAM_SEL,
-			REG_A4XX_VPC_DEBUG_RAM_SEL),
-	REG_ADRENO_DEFINE(REG_ADRENO_VPC_DEBUG_RAM_READ,
-			REG_A4XX_VPC_DEBUG_RAM_READ),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_INT_CLEAR_CMD,
-			REG_A4XX_RBBM_INT_CLEAR_CMD),
-	REG_ADRENO_DEFINE(REG_ADRENO_VSC_SIZE_ADDRESS,
-			REG_A4XX_VSC_SIZE_ADDRESS),
-	REG_ADRENO_DEFINE(REG_ADRENO_VFD_CONTROL_0, REG_A4XX_VFD_CONTROL_0),
-	REG_ADRENO_DEFINE(REG_ADRENO_SP_VS_PVT_MEM_ADDR_REG,
-			REG_A4XX_SP_VS_PVT_MEM_ADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_SP_FS_PVT_MEM_ADDR_REG,
-			REG_A4XX_SP_FS_PVT_MEM_ADDR),
-	REG_ADRENO_DEFINE(REG_ADRENO_SP_VS_OBJ_START_REG,
-			REG_A4XX_SP_VS_OBJ_START),
-	REG_ADRENO_DEFINE(REG_ADRENO_SP_FS_OBJ_START_REG,
-			REG_A4XX_SP_FS_OBJ_START),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_RBBM_CTL, REG_A4XX_RBBM_RBBM_CTL),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_SW_RESET_CMD,
-			REG_A4XX_RBBM_SW_RESET_CMD),
-	REG_ADRENO_DEFINE(REG_ADRENO_UCHE_INVALIDATE0,
-			REG_A4XX_UCHE_INVALIDATE0),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_PERFCTR_LOAD_VALUE_LO,
-			REG_A4XX_RBBM_PERFCTR_LOAD_VALUE_LO),
-	REG_ADRENO_DEFINE(REG_ADRENO_RBBM_PERFCTR_LOAD_VALUE_HI,
-			REG_A4XX_RBBM_PERFCTR_LOAD_VALUE_HI),
 };
 
 static void a4xx_dump(struct msm_gpu *gpu)
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 399f2f6..35c01cc 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -35,70 +35,11 @@
  * and are indexed by the enumeration values defined in this enum
  */
 enum adreno_regs {
-	REG_ADRENO_CP_DEBUG,
-	REG_ADRENO_CP_ME_RAM_WADDR,
-	REG_ADRENO_CP_ME_RAM_DATA,
-	REG_ADRENO_CP_PFP_UCODE_DATA,
-	REG_ADRENO_CP_PFP_UCODE_ADDR,
-	REG_ADRENO_CP_WFI_PEND_CTR,
 	REG_ADRENO_CP_RB_BASE,
 	REG_ADRENO_CP_RB_RPTR_ADDR,
 	REG_ADRENO_CP_RB_RPTR,
 	REG_ADRENO_CP_RB_WPTR,
-	REG_ADRENO_CP_PROTECT_CTRL,
-	REG_ADRENO_CP_ME_CNTL,
 	REG_ADRENO_CP_RB_CNTL,
-	REG_ADRENO_CP_IB1_BASE,
-	REG_ADRENO_CP_IB1_BUFSZ,
-	REG_ADRENO_CP_IB2_BASE,
-	REG_ADRENO_CP_IB2_BUFSZ,
-	REG_ADRENO_CP_TIMESTAMP,
-	REG_ADRENO_CP_ME_RAM_RADDR,
-	REG_ADRENO_CP_ROQ_ADDR,
-	REG_ADRENO_CP_ROQ_DATA,
-	REG_ADRENO_CP_MERCIU_ADDR,
-	REG_ADRENO_CP_MERCIU_DATA,
-	REG_ADRENO_CP_MERCIU_DATA2,
-	REG_ADRENO_CP_MEQ_ADDR,
-	REG_ADRENO_CP_MEQ_DATA,
-	REG_ADRENO_CP_HW_FAULT,
-	REG_ADRENO_CP_PROTECT_STATUS,
-	REG_ADRENO_SCRATCH_ADDR,
-	REG_ADRENO_SCRATCH_UMSK,
-	REG_ADRENO_SCRATCH_REG2,
-	REG_ADRENO_RBBM_STATUS,
-	REG_ADRENO_RBBM_PERFCTR_CTL,
-	REG_ADRENO_RBBM_PERFCTR_LOAD_CMD0,
-	REG_ADRENO_RBBM_PERFCTR_LOAD_CMD1,
-	REG_ADRENO_RBBM_PERFCTR_LOAD_CMD2,
-	REG_ADRENO_RBBM_PERFCTR_PWR_1_LO,
-	REG_ADRENO_RBBM_INT_0_MASK,
-	REG_ADRENO_RBBM_INT_0_STATUS,
-	REG_ADRENO_RBBM_AHB_ERROR_STATUS,
-	REG_ADRENO_RBBM_PM_OVERRIDE2,
-	REG_ADRENO_RBBM_AHB_CMD,
-	REG_ADRENO_RBBM_INT_CLEAR_CMD,
-	REG_ADRENO_RBBM_SW_RESET_CMD,
-	REG_ADRENO_RBBM_CLOCK_CTL,
-	REG_ADRENO_RBBM_AHB_ME_SPLIT_STATUS,
-	REG_ADRENO_RBBM_AHB_PFP_SPLIT_STATUS,
-	REG_ADRENO_VPC_DEBUG_RAM_SEL,
-	REG_ADRENO_VPC_DEBUG_RAM_READ,
-	REG_ADRENO_VSC_SIZE_ADDRESS,
-	REG_ADRENO_VFD_CONTROL_0,
-	REG_ADRENO_VFD_INDEX_MAX,
-	REG_ADRENO_SP_VS_PVT_MEM_ADDR_REG,
-	REG_ADRENO_SP_FS_PVT_MEM_ADDR_REG,
-	REG_ADRENO_SP_VS_OBJ_START_REG,
-	REG_ADRENO_SP_FS_OBJ_START_REG,
-	REG_ADRENO_PA_SC_AA_CONFIG,
-	REG_ADRENO_SQ_GPR_MANAGEMENT,
-	REG_ADRENO_SQ_INST_STORE_MANAGMENT,
-	REG_ADRENO_TP0_CHICKEN,
-	REG_ADRENO_RBBM_RBBM_CTL,
-	REG_ADRENO_UCHE_INVALIDATE0,
-	REG_ADRENO_RBBM_PERFCTR_LOAD_VALUE_LO,
-	REG_ADRENO_RBBM_PERFCTR_LOAD_VALUE_HI,
 	REG_ADRENO_REGISTER_MAX,
 };
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 02/12] drm/msm: gpu: Return error on hw_init failure
       [not found] ` <1480361317-9937-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2016-11-28 19:28   ` Jordan Crouse
  2016-11-28 19:28   ` [PATCH 05/12] drm/msm: gpu: Add OUT_TYPE4 and OUT_TYPE7 Jordan Crouse
  2016-11-28 19:28   ` [PATCH 06/12] drm/msm: Remove 'src_clk' from adreno configuration Jordan Crouse
  2 siblings, 0 replies; 34+ messages in thread
From: Jordan Crouse @ 2016-11-28 19:28 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

When the GPU hardware init function fails (like say, ME_INIT timed
out) return error instead of blindly continuing on. This gives us
a small chance of saving the system before it goes boom.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c   | 21 ++++++++++++---------
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c   | 20 +++++++++++---------
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 11 +++++------
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 +-
 drivers/gpu/drm/msm/msm_gpu.h           |  2 +-
 5 files changed, 30 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
index 5c64607..19a1839 100644
--- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
@@ -41,7 +41,7 @@ extern bool hang_debug;
 
 static void a3xx_dump(struct msm_gpu *gpu);
 
-static void a3xx_me_init(struct msm_gpu *gpu)
+static bool a3xx_me_init(struct msm_gpu *gpu)
 {
 	struct msm_ringbuffer *ring = gpu->rb;
 
@@ -65,7 +65,7 @@ static void a3xx_me_init(struct msm_gpu *gpu)
 	OUT_RING(ring, 0x00000000);
 
 	gpu->funcs->flush(gpu);
-	gpu->funcs->idle(gpu);
+	return gpu->funcs->idle(gpu);
 }
 
 static int a3xx_hw_init(struct msm_gpu *gpu)
@@ -294,9 +294,7 @@ static int a3xx_hw_init(struct msm_gpu *gpu)
 	/* clear ME_HALT to start micro engine */
 	gpu_write(gpu, REG_AXXX_CP_ME_CNTL, 0);
 
-	a3xx_me_init(gpu);
-
-	return 0;
+	return a3xx_me_init(gpu) ? 0 : -EINVAL;
 }
 
 static void a3xx_recover(struct msm_gpu *gpu)
@@ -330,17 +328,22 @@ static void a3xx_destroy(struct msm_gpu *gpu)
 	kfree(a3xx_gpu);
 }
 
-static void a3xx_idle(struct msm_gpu *gpu)
+static bool a3xx_idle(struct msm_gpu *gpu)
 {
 	/* wait for ringbuffer to drain: */
-	adreno_idle(gpu);
+	if (!adreno_idle(gpu))
+		return false;
 
 	/* then wait for GPU to finish: */
 	if (spin_until(!(gpu_read(gpu, REG_A3XX_RBBM_STATUS) &
-			A3XX_RBBM_STATUS_GPU_BUSY)))
+			A3XX_RBBM_STATUS_GPU_BUSY))) {
 		DRM_ERROR("%s: timeout waiting for GPU to idle!\n", gpu->name);
 
-	/* TODO maybe we need to reset GPU here to recover from hang? */
+		/* TODO maybe we need to reset GPU here to recover from hang? */
+		return false;
+	}
+
+	return true;
 }
 
 static irqreturn_t a3xx_irq(struct msm_gpu *gpu)
diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index 8ce2250..9e7f5b7 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -113,7 +113,7 @@ static void a4xx_enable_hwcg(struct msm_gpu *gpu)
 }
 
 
-static void a4xx_me_init(struct msm_gpu *gpu)
+static bool a4xx_me_init(struct msm_gpu *gpu)
 {
 	struct msm_ringbuffer *ring = gpu->rb;
 
@@ -137,7 +137,7 @@ static void a4xx_me_init(struct msm_gpu *gpu)
 	OUT_RING(ring, 0x00000000);
 
 	gpu->funcs->flush(gpu);
-	gpu->funcs->idle(gpu);
+	return gpu->funcs->idle(gpu);
 }
 
 static int a4xx_hw_init(struct msm_gpu *gpu)
@@ -292,9 +292,7 @@ static int a4xx_hw_init(struct msm_gpu *gpu)
 	/* clear ME_HALT to start micro engine */
 	gpu_write(gpu, REG_A4XX_CP_ME_CNTL, 0);
 
-	a4xx_me_init(gpu);
-
-	return 0;
+	return a4xx_me_init(gpu) ? 0 : -EINVAL;
 }
 
 static void a4xx_recover(struct msm_gpu *gpu)
@@ -328,17 +326,21 @@ static void a4xx_destroy(struct msm_gpu *gpu)
 	kfree(a4xx_gpu);
 }
 
-static void a4xx_idle(struct msm_gpu *gpu)
+static bool a4xx_idle(struct msm_gpu *gpu)
 {
 	/* wait for ringbuffer to drain: */
-	adreno_idle(gpu);
+	if (!adreno_idle(gpu))
+		return false;
 
 	/* then wait for GPU to finish: */
 	if (spin_until(!(gpu_read(gpu, REG_A4XX_RBBM_STATUS) &
-					A4XX_RBBM_STATUS_GPU_BUSY)))
+					A4XX_RBBM_STATUS_GPU_BUSY))) {
 		DRM_ERROR("%s: timeout waiting for GPU to idle!\n", gpu->name);
+		/* TODO maybe we need to reset GPU here to recover from hang? */
+		return false;
+	}
 
-	/* TODO maybe we need to reset GPU here to recover from hang? */
+	return true;
 }
 
 static irqreturn_t a4xx_irq(struct msm_gpu *gpu)
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index b468d2a..6436d54a 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -218,19 +218,18 @@ void adreno_flush(struct msm_gpu *gpu)
 	adreno_gpu_write(adreno_gpu, REG_ADRENO_CP_RB_WPTR, wptr);
 }
 
-void adreno_idle(struct msm_gpu *gpu)
+bool adreno_idle(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
 	uint32_t wptr = get_wptr(gpu->rb);
-	int ret;
 
 	/* wait for CP to drain ringbuffer: */
-	ret = spin_until(get_rptr(adreno_gpu) == wptr);
-
-	if (ret)
-		DRM_ERROR("%s: timeout waiting to drain ringbuffer!\n", gpu->name);
+	if (!spin_until(get_rptr(adreno_gpu) == wptr))
+		return true;
 
 	/* TODO maybe we need to reset GPU here to recover from hang? */
+	DRM_ERROR("%s: timeout waiting to drain ringbuffer!\n", gpu->name);
+	return false;
 }
 
 #ifdef CONFIG_DEBUG_FS
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 35c01cc..ab476f9 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -182,7 +182,7 @@ void adreno_recover(struct msm_gpu *gpu);
 void adreno_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 		struct msm_file_private *ctx);
 void adreno_flush(struct msm_gpu *gpu);
-void adreno_idle(struct msm_gpu *gpu);
+bool adreno_idle(struct msm_gpu *gpu);
 #ifdef CONFIG_DEBUG_FS
 void adreno_show(struct msm_gpu *gpu, struct seq_file *m);
 #endif
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 3c3be44..19a7254 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -50,7 +50,7 @@ struct msm_gpu_funcs {
 	void (*submit)(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 			struct msm_file_private *ctx);
 	void (*flush)(struct msm_gpu *gpu);
-	void (*idle)(struct msm_gpu *gpu);
+	bool (*idle)(struct msm_gpu *gpu);
 	irqreturn_t (*irq)(struct msm_gpu *irq);
 	uint32_t (*last_fence)(struct msm_gpu *gpu);
 	void (*recover)(struct msm_gpu *gpu);
-- 
1.9.1

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 03/12] drm/msm: gpu Add new gpu register read/write functions
  2016-11-28 19:28 [PATCH 00/12] Adreno A5XX support Jordan Crouse
  2016-11-28 19:28 ` [PATCH 01/12] drm/msm: gpu: Cut down the list of "generic" registers to the ones we use Jordan Crouse
       [not found] ` <1480361317-9937-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2016-11-28 19:28 ` Jordan Crouse
  2016-11-28 19:28 ` [PATCH 04/12] drm/msm: Add adreno_gpu_write64() Jordan Crouse
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Jordan Crouse @ 2016-11-28 19:28 UTC (permalink / raw)
  To: freedreno; +Cc: linux-arm-msm, dri-devel

Add some new functions to manipulate GPU registers.  gpu_read64 and
gpu_write64 can read/write a 64 bit value to two 32 bit registers.
For 4XX and older these are normally perfcounter registers, but
future targets will use 64 bit addressing so there will be many
more spots where a 64 bit read and write are needed.

gpu_rmw() does a read/modify/write on a 32 bit register given a mask
and bits to OR in.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 12 ++---------
 drivers/gpu/drm/msm/msm_gpu.h         | 39 +++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index 9e7f5b7..4f68b63 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -513,16 +513,8 @@ static int a4xx_pm_suspend(struct msm_gpu *gpu) {
 
 static int a4xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
 {
-	uint32_t hi, lo, tmp;
-
-	tmp = gpu_read(gpu, REG_A4XX_RBBM_PERFCTR_CP_0_HI);
-	do {
-		hi = tmp;
-		lo = gpu_read(gpu, REG_A4XX_RBBM_PERFCTR_CP_0_LO);
-		tmp = gpu_read(gpu, REG_A4XX_RBBM_PERFCTR_CP_0_HI);
-	} while (tmp != hi);
-
-	*value = (((uint64_t)hi) << 32) | lo;
+	*value = gpu_read64(gpu, REG_A4XX_RBBM_PERFCTR_CP_0_LO,
+		REG_A4XX_RBBM_PERFCTR_CP_0_HI);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 19a7254..baca428 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -154,6 +154,45 @@ static inline u32 gpu_read(struct msm_gpu *gpu, u32 reg)
 	return msm_readl(gpu->mmio + (reg << 2));
 }
 
+static inline void gpu_rmw(struct msm_gpu *gpu, u32 reg, u32 mask, u32 or)
+{
+	uint32_t val = gpu_read(gpu, reg);
+
+	val &= ~mask;
+	gpu_write(gpu, reg, val | or);
+}
+
+static inline u64 gpu_read64(struct msm_gpu *gpu, u32 lo, u32 hi)
+{
+	u64 val;
+
+	/*
+	 * Why not a readq here? Two reasons: 1) many of the LO registers are
+	 * not quad word aligned and 2) the GPU hardware designers have a bit
+	 * of a history of putting registers where they fit, especially in
+	 * spins. The longer a GPU family goes the higher the chance that
+	 * we'll get burned.  We could do a series of validity checks if we
+	 * wanted to, but really is a readq() that much better? Nah.
+	 */
+
+	/*
+	 * For some lo/hi registers (like perfcounters), the hi value is latched
+	 * when the lo is read, so make sure to read the lo first to trigger
+	 * that
+	 */
+	val = (u64) msm_readl(gpu->mmio + (lo << 2));
+	val |= ((u64) msm_readl(gpu->mmio + (hi << 2)) << 32);
+
+	return val;
+}
+
+static inline void gpu_write64(struct msm_gpu *gpu, u32 lo, u32 hi, u64 val)
+{
+	/* Why not a writeq here? Read the screed above */
+	msm_writel(lower_32_bits(val), gpu->mmio + (lo << 2));
+	msm_writel(upper_32_bits(val), gpu->mmio + (hi << 2));
+}
+
 int msm_gpu_pm_suspend(struct msm_gpu *gpu);
 int msm_gpu_pm_resume(struct msm_gpu *gpu);
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 04/12] drm/msm: Add adreno_gpu_write64()
  2016-11-28 19:28 [PATCH 00/12] Adreno A5XX support Jordan Crouse
                   ` (2 preceding siblings ...)
  2016-11-28 19:28 ` [PATCH 03/12] drm/msm: gpu Add new gpu register read/write functions Jordan Crouse
@ 2016-11-28 19:28 ` Jordan Crouse
  2016-11-28 19:28 ` [PATCH 07/12] drm/msm: Disable interrupts during init Jordan Crouse
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Jordan Crouse @ 2016-11-28 19:28 UTC (permalink / raw)
  To: freedreno; +Cc: linux-arm-msm, dri-devel

Add a new generic function to write a "64" bit value. This isn't
actually a 64 bit operation, it just writes the upper and lower
32 bit of a 64 bit value to a specified LO and HI register.  If
a particular target doesn't support one of the registers it can
mark that register as SKIP and writes/reads from that register
will be quietly dropped.

This can be immediately put in place for the ringbuffer base and
the RPTR address.  Both writes are converted to use
adreno_gpu_write64() with their respective high and low registers
and the high register appropriately marked as SKIP for both 32 bit
targets (a3xx and a4xx). When a5xx comes it will define valid target
registers for the 'hi' option and everything else will just work.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c   |  2 ++
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c   |  2 ++
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 11 +++++++----
 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 24 +++++++++++++++++++++++-
 4 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
index 19a1839..5a061ad 100644
--- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
@@ -423,7 +423,9 @@ static void a3xx_dump(struct msm_gpu *gpu)
 /* Register offset defines for A3XX */
 static const unsigned int a3xx_register_offsets[REG_ADRENO_REGISTER_MAX] = {
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_BASE, REG_AXXX_CP_RB_BASE),
+	REG_ADRENO_SKIP(REG_ADRENO_CP_RB_BASE_HI),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_RPTR_ADDR, REG_AXXX_CP_RB_RPTR_ADDR),
+	REG_ADRENO_SKIP(REG_ADRENO_CP_RB_RPTR_ADDR_HI),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_RPTR, REG_AXXX_CP_RB_RPTR),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_WPTR, REG_AXXX_CP_RB_WPTR),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_CNTL, REG_AXXX_CP_RB_CNTL),
diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index 4f68b63..29a1860 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -463,7 +463,9 @@ static void a4xx_show(struct msm_gpu *gpu, struct seq_file *m)
 /* Register offset defines for A4XX, in order of enum adreno_regs */
 static const unsigned int a4xx_register_offsets[REG_ADRENO_REGISTER_MAX] = {
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_BASE, REG_A4XX_CP_RB_BASE),
+	REG_ADRENO_SKIP(REG_ADRENO_CP_RB_BASE_HI),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_RPTR_ADDR, REG_A4XX_CP_RB_RPTR_ADDR),
+	REG_ADRENO_SKIP(REG_ADRENO_CP_RB_RPTR_ADDR_HI),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_RPTR, REG_A4XX_CP_RB_RPTR),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_WPTR, REG_A4XX_CP_RB_WPTR),
 	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_CNTL, REG_A4XX_CP_RB_CNTL),
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 6436d54a..3f8c730 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -79,11 +79,14 @@ int adreno_hw_init(struct msm_gpu *gpu)
 			(adreno_is_a430(adreno_gpu) ? AXXX_CP_RB_CNTL_NO_UPDATE : 0));
 
 	/* Setup ringbuffer address: */
-	adreno_gpu_write(adreno_gpu, REG_ADRENO_CP_RB_BASE, gpu->rb_iova);
+	adreno_gpu_write64(adreno_gpu, REG_ADRENO_CP_RB_BASE,
+		REG_ADRENO_CP_RB_BASE_HI, gpu->rb_iova);
 
-	if (!adreno_is_a430(adreno_gpu))
-		adreno_gpu_write(adreno_gpu, REG_ADRENO_CP_RB_RPTR_ADDR,
-						rbmemptr(adreno_gpu, rptr));
+	if (!adreno_is_a430(adreno_gpu)) {
+		adreno_gpu_write64(adreno_gpu, REG_ADRENO_CP_RB_RPTR_ADDR,
+			REG_ADRENO_CP_RB_RPTR_ADDR_HI,
+			rbmemptr(adreno_gpu, rptr));
+	}
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index ab476f9..50480bd 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -28,6 +28,9 @@
 #include "adreno_pm4.xml.h"
 
 #define REG_ADRENO_DEFINE(_offset, _reg) [_offset] = (_reg) + 1
+#define REG_SKIP ~0
+#define REG_ADRENO_SKIP(_offset) [_offset] = REG_SKIP
+
 /**
  * adreno_regs: List of registers that are used in across all
  * 3D devices. Each device type has different offset value for the same
@@ -36,7 +39,9 @@
  */
 enum adreno_regs {
 	REG_ADRENO_CP_RB_BASE,
+	REG_ADRENO_CP_RB_BASE_HI,
 	REG_ADRENO_CP_RB_RPTR_ADDR,
+	REG_ADRENO_CP_RB_RPTR_ADDR_HI,
 	REG_ADRENO_CP_RB_RPTR,
 	REG_ADRENO_CP_RB_WPTR,
 	REG_ADRENO_CP_RB_CNTL,
@@ -220,7 +225,7 @@ OUT_PKT3(struct msm_ringbuffer *ring, uint8_t opcode, uint16_t cnt)
 }
 
 /*
- * adreno_checkreg_off() - Checks the validity of a register enum
+ * adreno_reg_check() - Checks the validity of a register enum
  * @gpu:		Pointer to struct adreno_gpu
  * @offset_name:	The register enum that is checked
  */
@@ -231,6 +236,16 @@ static inline bool adreno_reg_check(struct adreno_gpu *gpu,
 			!gpu->reg_offsets[offset_name]) {
 		BUG();
 	}
+
+	/*
+	 * REG_SKIP is a special value that tell us that the register in
+	 * question isn't implemented on target but don't trigger a BUG(). This
+	 * is used to cleanly implement adreno_gpu_write64() and
+	 * adreno_gpu_read64() in a generic fashion
+	 */
+	if (gpu->reg_offsets[offset_name] == REG_SKIP)
+		return false;
+
 	return true;
 }
 
@@ -252,4 +267,11 @@ static inline void adreno_gpu_write(struct adreno_gpu *gpu,
 		gpu_write(&gpu->base, reg - 1, data);
 }
 
+static inline void adreno_gpu_write64(struct adreno_gpu *gpu,
+		enum adreno_regs lo, enum adreno_regs hi, u64 data)
+{
+	adreno_gpu_write(gpu, lo, lower_32_bits(data));
+	adreno_gpu_write(gpu, hi, upper_32_bits(data));
+}
+
 #endif /* __ADRENO_GPU_H__ */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 05/12] drm/msm: gpu: Add OUT_TYPE4 and OUT_TYPE7
       [not found] ` <1480361317-9937-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  2016-11-28 19:28   ` [PATCH 02/12] drm/msm: gpu: Return error on hw_init failure Jordan Crouse
@ 2016-11-28 19:28   ` Jordan Crouse
  2016-11-28 19:28   ` [PATCH 06/12] drm/msm: Remove 'src_clk' from adreno configuration Jordan Crouse
  2 siblings, 0 replies; 34+ messages in thread
From: Jordan Crouse @ 2016-11-28 19:28 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Add helper functions for TYPE4 and TYPE7 ME opcodes that replace
TYPE0 and TYPE3 starting with the A5XX targets.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 50480bd..b93366b 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -224,6 +224,36 @@ OUT_PKT3(struct msm_ringbuffer *ring, uint8_t opcode, uint16_t cnt)
 	OUT_RING(ring, CP_TYPE3_PKT | ((cnt-1) << 16) | ((opcode & 0xFF) << 8));
 }
 
+static inline u32 PM4_PARITY(u32 val)
+{
+	return (0x9669 >> (0xF & (val ^
+		(val >> 4) ^ (val >> 8) ^ (val >> 12) ^
+		(val >> 16) ^ ((val) >> 20) ^ (val >> 24) ^
+		(val >> 28)))) & 1;
+}
+
+/* Maximum number of values that can be executed for one opcode */
+#define TYPE4_MAX_PAYLOAD 127
+
+#define PKT4(_reg, _cnt) \
+	(CP_TYPE4_PKT | ((_cnt) << 0) | (PM4_PARITY((_cnt)) << 7) | \
+	 (((_reg) & 0x3FFFF) << 8) | (PM4_PARITY((_reg)) << 27))
+
+static inline void
+OUT_PKT4(struct msm_ringbuffer *ring, uint16_t regindx, uint16_t cnt)
+{
+	adreno_wait_ring(ring->gpu, cnt + 1);
+	OUT_RING(ring, PKT4(regindx, cnt));
+}
+
+static inline void
+OUT_PKT7(struct msm_ringbuffer *ring, uint8_t opcode, uint16_t cnt)
+{
+	adreno_wait_ring(ring->gpu, cnt + 1);
+	OUT_RING(ring, CP_TYPE7_PKT | (cnt << 0) | (PM4_PARITY(cnt) << 15) |
+		((opcode & 0x7F) << 16) | (PM4_PARITY(opcode) << 23));
+}
+
 /*
  * adreno_reg_check() - Checks the validity of a register enum
  * @gpu:		Pointer to struct adreno_gpu
-- 
1.9.1

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 06/12] drm/msm: Remove 'src_clk' from adreno configuration
       [not found] ` <1480361317-9937-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  2016-11-28 19:28   ` [PATCH 02/12] drm/msm: gpu: Return error on hw_init failure Jordan Crouse
  2016-11-28 19:28   ` [PATCH 05/12] drm/msm: gpu: Add OUT_TYPE4 and OUT_TYPE7 Jordan Crouse
@ 2016-11-28 19:28   ` Jordan Crouse
  2 siblings, 0 replies; 34+ messages in thread
From: Jordan Crouse @ 2016-11-28 19:28 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

The adreno code inherited a silly workaround from downstream
from the bad old days before decent clock control. grp_clk[0]
(named 'src_clk') doesn't actually exist - it was used as a proxy
for whatever the core clock actually was (usually 'core_clk').

All targets should be able to correctly request 'core_clk' and
get the right thing back so zap the anachronism and directly
use grp_clk[0] to control the clock rate.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/msm_gpu.c | 36 +++++++++++++-----------------------
 drivers/gpu/drm/msm/msm_gpu.h |  2 +-
 2 files changed, 14 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index f493931..e981224 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -91,21 +91,16 @@ static int disable_pwrrail(struct msm_gpu *gpu)
 
 static int enable_clk(struct msm_gpu *gpu)
 {
-	struct clk *rate_clk = NULL;
 	int i;
 
-	/* NOTE: kgsl_pwrctrl_clk() ignores grp_clks[0].. */
-	for (i = ARRAY_SIZE(gpu->grp_clks) - 1; i > 0; i--) {
-		if (gpu->grp_clks[i]) {
-			clk_prepare(gpu->grp_clks[i]);
-			rate_clk = gpu->grp_clks[i];
-		}
-	}
+	if (gpu->grp_clks[0] && gpu->fast_rate)
+		clk_set_rate(gpu->grp_clks[0], gpu->fast_rate);
 
-	if (rate_clk && gpu->fast_rate)
-		clk_set_rate(rate_clk, gpu->fast_rate);
+	for (i = ARRAY_SIZE(gpu->grp_clks) - 1; i >= 0; i--)
+		if (gpu->grp_clks[i])
+			clk_prepare(gpu->grp_clks[i]);
 
-	for (i = ARRAY_SIZE(gpu->grp_clks) - 1; i > 0; i--)
+	for (i = ARRAY_SIZE(gpu->grp_clks) - 1; i >= 0; i--)
 		if (gpu->grp_clks[i])
 			clk_enable(gpu->grp_clks[i]);
 
@@ -114,24 +109,19 @@ static int enable_clk(struct msm_gpu *gpu)
 
 static int disable_clk(struct msm_gpu *gpu)
 {
-	struct clk *rate_clk = NULL;
 	int i;
 
-	/* NOTE: kgsl_pwrctrl_clk() ignores grp_clks[0].. */
-	for (i = ARRAY_SIZE(gpu->grp_clks) - 1; i > 0; i--) {
-		if (gpu->grp_clks[i]) {
+	for (i = ARRAY_SIZE(gpu->grp_clks) - 1; i >= 0; i--)
+		if (gpu->grp_clks[i])
 			clk_disable(gpu->grp_clks[i]);
-			rate_clk = gpu->grp_clks[i];
-		}
-	}
 
-	if (rate_clk && gpu->slow_rate)
-		clk_set_rate(rate_clk, gpu->slow_rate);
-
-	for (i = ARRAY_SIZE(gpu->grp_clks) - 1; i > 0; i--)
+	for (i = ARRAY_SIZE(gpu->grp_clks) - 1; i >= 0; i--)
 		if (gpu->grp_clks[i])
 			clk_unprepare(gpu->grp_clks[i]);
 
+	if (gpu->grp_clks[0] && gpu->slow_rate)
+		clk_set_rate(gpu->grp_clks[0], gpu->slow_rate);
+
 	return 0;
 }
 
@@ -572,7 +562,7 @@ static irqreturn_t irq_handler(int irq, void *data)
 }
 
 static const char *clk_names[] = {
-		"src_clk", "core_clk", "iface_clk", "mem_clk", "mem_iface_clk",
+		"core_clk", "iface_clk", "mem_clk", "mem_iface_clk",
 		"alt_mem_iface_clk",
 };
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index baca428..bfe13d3 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -103,7 +103,7 @@ struct msm_gpu {
 
 	/* Power Control: */
 	struct regulator *gpu_reg, *gpu_cx;
-	struct clk *ebi1_clk, *grp_clks[6];
+	struct clk *ebi1_clk, *grp_clks[5];
 	uint32_t fast_rate, slow_rate, bus_freq;
 
 #ifdef DOWNSTREAM_CONFIG_MSM_BUS_SCALING
-- 
1.9.1

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 07/12] drm/msm: Disable interrupts during init
  2016-11-28 19:28 [PATCH 00/12] Adreno A5XX support Jordan Crouse
                   ` (3 preceding siblings ...)
  2016-11-28 19:28 ` [PATCH 04/12] drm/msm: Add adreno_gpu_write64() Jordan Crouse
@ 2016-11-28 19:28 ` Jordan Crouse
  2016-11-28 19:28 ` [PATCH 08/12] drm/msm: gpu: Add A5XX target support Jordan Crouse
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Jordan Crouse @ 2016-11-28 19:28 UTC (permalink / raw)
  To: freedreno; +Cc: linux-arm-msm, dri-devel

Disable the interrupt during the init sequence to avoid having
interrupts fired for errors and other things that we are not
ready to handle while initializing.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/adreno_device.c | 4 ++++
 drivers/gpu/drm/msm/adreno/adreno_gpu.c    | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 5127b75..3321d33 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -148,12 +148,16 @@ struct msm_gpu *adreno_load_gpu(struct drm_device *dev)
 		mutex_lock(&dev->struct_mutex);
 		gpu->funcs->pm_resume(gpu);
 		mutex_unlock(&dev->struct_mutex);
+
+		disable_irq(gpu->irq);
+
 		ret = gpu->funcs->hw_init(gpu);
 		if (ret) {
 			dev_err(dev->dev, "gpu hw init failed: %d\n", ret);
 			gpu->funcs->destroy(gpu);
 			gpu = NULL;
 		} else {
+			enable_irq(gpu->irq);
 			/* give inactive pm a chance to kick in: */
 			msm_gpu_retire(gpu);
 		}
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 3f8c730..644be52 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -129,11 +129,14 @@ void adreno_recover(struct msm_gpu *gpu)
 	adreno_gpu->memptrs->wptr  = 0;
 
 	gpu->funcs->pm_resume(gpu);
+
+	disable_irq(gpu->irq);
 	ret = gpu->funcs->hw_init(gpu);
 	if (ret) {
 		dev_err(dev->dev, "gpu hw init failed: %d\n", ret);
 		/* hmm, oh well? */
 	}
+	enable_irq(gpu->irq);
 }
 
 void adreno_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 08/12] drm/msm: gpu: Add A5XX target support
  2016-11-28 19:28 [PATCH 00/12] Adreno A5XX support Jordan Crouse
                   ` (4 preceding siblings ...)
  2016-11-28 19:28 ` [PATCH 07/12] drm/msm: Disable interrupts during init Jordan Crouse
@ 2016-11-28 19:28 ` Jordan Crouse
  2016-11-28 19:28 ` [PATCH 09/12] drm/msm: gpu: Add support for the GPMU Jordan Crouse
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Jordan Crouse @ 2016-11-28 19:28 UTC (permalink / raw)
  To: freedreno; +Cc: linux-arm-msm, dri-devel

Add support for the A5XX family of Adreno GPUs.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/Makefile               |   1 +
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c      | 823 +++++++++++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a5xx_gpu.h      |  37 ++
 drivers/gpu/drm/msm/adreno/adreno_device.c |  25 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c    |   6 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h    |  40 ++
 drivers/gpu/drm/msm/msm_gpu.c              |  13 +-
 drivers/gpu/drm/msm/msm_gpu.h              |   2 +-
 8 files changed, 938 insertions(+), 9 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/adreno/a5xx_gpu.c
 create mode 100644 drivers/gpu/drm/msm/adreno/a5xx_gpu.h

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index fb5be3e..4abba41 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -6,6 +6,7 @@ msm-y := \
 	adreno/adreno_gpu.o \
 	adreno/a3xx_gpu.o \
 	adreno/a4xx_gpu.o \
+	adreno/a5xx_gpu.o \
 	hdmi/hdmi.o \
 	hdmi/hdmi_audio.o \
 	hdmi/hdmi_bridge.o \
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
new file mode 100644
index 0000000..63f67a8
--- /dev/null
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -0,0 +1,823 @@
+/* Copyright (c) 2016 The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include "msm_gem.h"
+#include "a5xx_gpu.h"
+
+extern bool hang_debug;
+static void a5xx_dump(struct msm_gpu *gpu);
+
+static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
+	struct msm_file_private *ctx)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct msm_drm_private *priv = gpu->dev->dev_private;
+	struct msm_ringbuffer *ring = gpu->rb;
+	unsigned int i, ibs = 0;
+
+	for (i = 0; i < submit->nr_cmds; i++) {
+		switch (submit->cmd[i].type) {
+		case MSM_SUBMIT_CMD_IB_TARGET_BUF:
+			break;
+		case MSM_SUBMIT_CMD_CTX_RESTORE_BUF:
+			if (priv->lastctx == ctx)
+				break;
+		case MSM_SUBMIT_CMD_BUF:
+			OUT_PKT7(ring, CP_INDIRECT_BUFFER_PFE, 3);
+			OUT_RING(ring, lower_32_bits(submit->cmd[i].iova));
+			OUT_RING(ring, upper_32_bits(submit->cmd[i].iova));
+			OUT_RING(ring, submit->cmd[i].size);
+			ibs++;
+			break;
+		}
+	}
+
+	OUT_PKT4(ring, REG_A5XX_CP_SCRATCH_REG(2), 1);
+	OUT_RING(ring, submit->fence->seqno);
+
+	OUT_PKT7(ring, CP_EVENT_WRITE, 4);
+	OUT_RING(ring, CACHE_FLUSH_TS | (1 << 31));
+	OUT_RING(ring, lower_32_bits(rbmemptr(adreno_gpu, fence)));
+	OUT_RING(ring, upper_32_bits(rbmemptr(adreno_gpu, fence)));
+	OUT_RING(ring, submit->fence->seqno);
+
+	gpu->funcs->flush(gpu);
+}
+
+struct a5xx_hwcg {
+	u32 offset;
+	u32 value;
+};
+
+static const struct a5xx_hwcg a530_hwcg[] = {
+	{REG_A5XX_RBBM_CLOCK_CNTL_SP0, 0x02222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL_SP1, 0x02222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL_SP2, 0x02222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL_SP3, 0x02222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL2_SP0, 0x02222220},
+	{REG_A5XX_RBBM_CLOCK_CNTL2_SP1, 0x02222220},
+	{REG_A5XX_RBBM_CLOCK_CNTL2_SP2, 0x02222220},
+	{REG_A5XX_RBBM_CLOCK_CNTL2_SP3, 0x02222220},
+	{REG_A5XX_RBBM_CLOCK_HYST_SP0, 0x0000F3CF},
+	{REG_A5XX_RBBM_CLOCK_HYST_SP1, 0x0000F3CF},
+	{REG_A5XX_RBBM_CLOCK_HYST_SP2, 0x0000F3CF},
+	{REG_A5XX_RBBM_CLOCK_HYST_SP3, 0x0000F3CF},
+	{REG_A5XX_RBBM_CLOCK_DELAY_SP0, 0x00000080},
+	{REG_A5XX_RBBM_CLOCK_DELAY_SP1, 0x00000080},
+	{REG_A5XX_RBBM_CLOCK_DELAY_SP2, 0x00000080},
+	{REG_A5XX_RBBM_CLOCK_DELAY_SP3, 0x00000080},
+	{REG_A5XX_RBBM_CLOCK_CNTL_TP0, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL_TP1, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL_TP2, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL_TP3, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL2_TP0, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL2_TP1, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL2_TP2, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL2_TP3, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL3_TP0, 0x00002222},
+	{REG_A5XX_RBBM_CLOCK_CNTL3_TP1, 0x00002222},
+	{REG_A5XX_RBBM_CLOCK_CNTL3_TP2, 0x00002222},
+	{REG_A5XX_RBBM_CLOCK_CNTL3_TP3, 0x00002222},
+	{REG_A5XX_RBBM_CLOCK_HYST_TP0, 0x77777777},
+	{REG_A5XX_RBBM_CLOCK_HYST_TP1, 0x77777777},
+	{REG_A5XX_RBBM_CLOCK_HYST_TP2, 0x77777777},
+	{REG_A5XX_RBBM_CLOCK_HYST_TP3, 0x77777777},
+	{REG_A5XX_RBBM_CLOCK_HYST2_TP0, 0x77777777},
+	{REG_A5XX_RBBM_CLOCK_HYST2_TP1, 0x77777777},
+	{REG_A5XX_RBBM_CLOCK_HYST2_TP2, 0x77777777},
+	{REG_A5XX_RBBM_CLOCK_HYST2_TP3, 0x77777777},
+	{REG_A5XX_RBBM_CLOCK_HYST3_TP0, 0x00007777},
+	{REG_A5XX_RBBM_CLOCK_HYST3_TP1, 0x00007777},
+	{REG_A5XX_RBBM_CLOCK_HYST3_TP2, 0x00007777},
+	{REG_A5XX_RBBM_CLOCK_HYST3_TP3, 0x00007777},
+	{REG_A5XX_RBBM_CLOCK_DELAY_TP0, 0x11111111},
+	{REG_A5XX_RBBM_CLOCK_DELAY_TP1, 0x11111111},
+	{REG_A5XX_RBBM_CLOCK_DELAY_TP2, 0x11111111},
+	{REG_A5XX_RBBM_CLOCK_DELAY_TP3, 0x11111111},
+	{REG_A5XX_RBBM_CLOCK_DELAY2_TP0, 0x11111111},
+	{REG_A5XX_RBBM_CLOCK_DELAY2_TP1, 0x11111111},
+	{REG_A5XX_RBBM_CLOCK_DELAY2_TP2, 0x11111111},
+	{REG_A5XX_RBBM_CLOCK_DELAY2_TP3, 0x11111111},
+	{REG_A5XX_RBBM_CLOCK_DELAY3_TP0, 0x00001111},
+	{REG_A5XX_RBBM_CLOCK_DELAY3_TP1, 0x00001111},
+	{REG_A5XX_RBBM_CLOCK_DELAY3_TP2, 0x00001111},
+	{REG_A5XX_RBBM_CLOCK_DELAY3_TP3, 0x00001111},
+	{REG_A5XX_RBBM_CLOCK_CNTL_UCHE, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL2_UCHE, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL3_UCHE, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL4_UCHE, 0x00222222},
+	{REG_A5XX_RBBM_CLOCK_HYST_UCHE, 0x00444444},
+	{REG_A5XX_RBBM_CLOCK_DELAY_UCHE, 0x00000002},
+	{REG_A5XX_RBBM_CLOCK_CNTL_RB0, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL_RB1, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL_RB2, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL_RB3, 0x22222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL2_RB0, 0x00222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL2_RB1, 0x00222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL2_RB2, 0x00222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL2_RB3, 0x00222222},
+	{REG_A5XX_RBBM_CLOCK_CNTL_CCU0, 0x00022220},
+	{REG_A5XX_RBBM_CLOCK_CNTL_CCU1, 0x00022220},
+	{REG_A5XX_RBBM_CLOCK_CNTL_CCU2, 0x00022220},
+	{REG_A5XX_RBBM_CLOCK_CNTL_CCU3, 0x00022220},
+	{REG_A5XX_RBBM_CLOCK_CNTL_RAC, 0x05522222},
+	{REG_A5XX_RBBM_CLOCK_CNTL2_RAC, 0x00505555},
+	{REG_A5XX_RBBM_CLOCK_HYST_RB_CCU0, 0x04040404},
+	{REG_A5XX_RBBM_CLOCK_HYST_RB_CCU1, 0x04040404},
+	{REG_A5XX_RBBM_CLOCK_HYST_RB_CCU2, 0x04040404},
+	{REG_A5XX_RBBM_CLOCK_HYST_RB_CCU3, 0x04040404},
+	{REG_A5XX_RBBM_CLOCK_HYST_RAC, 0x07444044},
+	{REG_A5XX_RBBM_CLOCK_DELAY_RB_CCU_L1_0, 0x00000002},
+	{REG_A5XX_RBBM_CLOCK_DELAY_RB_CCU_L1_1, 0x00000002},
+	{REG_A5XX_RBBM_CLOCK_DELAY_RB_CCU_L1_2, 0x00000002},
+	{REG_A5XX_RBBM_CLOCK_DELAY_RB_CCU_L1_3, 0x00000002},
+	{REG_A5XX_RBBM_CLOCK_DELAY_RAC, 0x00010011},
+	{REG_A5XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM, 0x04222222},
+	{REG_A5XX_RBBM_CLOCK_MODE_GPC, 0x02222222},
+	{REG_A5XX_RBBM_CLOCK_MODE_VFD, 0x00002222},
+	{REG_A5XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM, 0x00000000},
+	{REG_A5XX_RBBM_CLOCK_HYST_GPC, 0x04104004},
+	{REG_A5XX_RBBM_CLOCK_HYST_VFD, 0x00000000},
+	{REG_A5XX_RBBM_CLOCK_DELAY_HLSQ, 0x00000000},
+	{REG_A5XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM, 0x00004000},
+	{REG_A5XX_RBBM_CLOCK_DELAY_GPC, 0x00000200},
+	{REG_A5XX_RBBM_CLOCK_DELAY_VFD, 0x00002222}
+};
+
+static const struct {
+	int (*test)(struct adreno_gpu *gpu);
+	const struct a5xx_hwcg *regs;
+	unsigned int count;
+} a5xx_hwcg_regs[] = {
+	{ adreno_is_a530, a530_hwcg, ARRAY_SIZE(a530_hwcg), },
+};
+
+static void _a5xx_enable_hwcg(struct msm_gpu *gpu,
+		const struct a5xx_hwcg *regs, unsigned int count)
+{
+	unsigned int i;
+
+	for (i = 0; i < count; i++)
+		gpu_write(gpu, regs[i].offset, regs[i].value);
+
+	gpu_write(gpu, REG_A5XX_RBBM_CLOCK_CNTL, 0xAAA8AA00);
+	gpu_write(gpu, REG_A5XX_RBBM_ISDB_CNT, 0x182);
+}
+
+static void a5xx_enable_hwcg(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(a5xx_hwcg_regs); i++) {
+		if (a5xx_hwcg_regs[i].test(adreno_gpu)) {
+			_a5xx_enable_hwcg(gpu, a5xx_hwcg_regs[i].regs,
+				a5xx_hwcg_regs[i].count);
+			return;
+		}
+	}
+}
+
+static int a5xx_me_init(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct msm_ringbuffer *ring = gpu->rb;
+
+	OUT_PKT7(ring, CP_ME_INIT, 8);
+
+	OUT_RING(ring, 0x0000002F);
+
+	/* Enable multiple hardware contexts */
+	OUT_RING(ring, 0x00000003);
+
+	/* Enable error detection */
+	OUT_RING(ring, 0x20000000);
+
+	/* Don't enable header dump */
+	OUT_RING(ring, 0x00000000);
+	OUT_RING(ring, 0x00000000);
+
+	/* Specify workarounds for various microcode issues */
+	if (adreno_is_a530(adreno_gpu)) {
+		/* Workaround for token end syncs
+		 * Force a WFI after every direct-render 3D mode draw and every
+		 * 2D mode 3 draw
+		 */
+		OUT_RING(ring, 0x0000000B);
+	} else {
+		/* No workarounds enabled */
+		OUT_RING(ring, 0x00000000);
+	}
+
+	OUT_RING(ring, 0x00000000);
+	OUT_RING(ring, 0x00000000);
+
+	gpu->funcs->flush(gpu);
+
+	return gpu->funcs->idle(gpu) ? 0 : -EINVAL;
+}
+
+static struct drm_gem_object *a5xx_ucode_load_bo(struct msm_gpu *gpu,
+		const struct firmware *fw, u64 *iova)
+{
+	struct drm_device *drm = gpu->dev;
+	struct drm_gem_object *bo;
+	void *ptr;
+
+	mutex_lock(&drm->struct_mutex);
+	bo = msm_gem_new(drm, fw->size - 4, MSM_BO_UNCACHED);
+	mutex_unlock(&drm->struct_mutex);
+
+	if (IS_ERR(bo))
+		return bo;
+
+	ptr = msm_gem_get_vaddr(bo);
+	if (!ptr) {
+		drm_gem_object_unreference_unlocked(bo);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	if (iova) {
+		int ret = msm_gem_get_iova(bo, gpu->id, iova);
+
+		if (ret) {
+			drm_gem_object_unreference_unlocked(bo);
+			return ERR_PTR(ret);
+		}
+	}
+
+	memcpy(ptr, &fw->data[4], fw->size - 4);
+
+	msm_gem_put_vaddr(bo);
+	return bo;
+}
+
+static int a5xx_ucode_init(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
+	int ret;
+
+	if (!a5xx_gpu->pm4_bo) {
+		a5xx_gpu->pm4_bo = a5xx_ucode_load_bo(gpu, adreno_gpu->pm4,
+			&a5xx_gpu->pm4_iova);
+
+		if (IS_ERR(a5xx_gpu->pm4_bo)) {
+			ret = PTR_ERR(a5xx_gpu->pm4_bo);
+			a5xx_gpu->pm4_bo = NULL;
+			dev_err(gpu->dev->dev, "could not allocate PM4: %d\n",
+				ret);
+			return ret;
+		}
+	}
+
+	if (!a5xx_gpu->pfp_bo) {
+		a5xx_gpu->pfp_bo = a5xx_ucode_load_bo(gpu, adreno_gpu->pfp,
+			&a5xx_gpu->pfp_iova);
+
+		if (IS_ERR(a5xx_gpu->pfp_bo)) {
+			ret = PTR_ERR(a5xx_gpu->pfp_bo);
+			a5xx_gpu->pfp_bo = NULL;
+			dev_err(gpu->dev->dev, "could not allocate PFP: %d\n",
+				ret);
+			return ret;
+		}
+	}
+
+	gpu_write64(gpu, REG_A5XX_CP_ME_INSTR_BASE_LO,
+		REG_A5XX_CP_ME_INSTR_BASE_HI, a5xx_gpu->pm4_iova);
+
+	gpu_write64(gpu, REG_A5XX_CP_PFP_INSTR_BASE_LO,
+		REG_A5XX_CP_PFP_INSTR_BASE_HI, a5xx_gpu->pfp_iova);
+
+	return 0;
+}
+
+#define A5XX_INT_MASK (A5XX_RBBM_INT_0_MASK_RBBM_AHB_ERROR | \
+	  A5XX_RBBM_INT_0_MASK_RBBM_TRANSFER_TIMEOUT | \
+	  A5XX_RBBM_INT_0_MASK_RBBM_ME_MS_TIMEOUT | \
+	  A5XX_RBBM_INT_0_MASK_RBBM_PFP_MS_TIMEOUT | \
+	  A5XX_RBBM_INT_0_MASK_RBBM_ETS_MS_TIMEOUT | \
+	  A5XX_RBBM_INT_0_MASK_RBBM_ATB_ASYNC_OVERFLOW | \
+	  A5XX_RBBM_INT_0_MASK_CP_HW_ERROR | \
+	  A5XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS | \
+	  A5XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
+	  A5XX_RBBM_INT_0_MASK_GPMU_VOLTAGE_DROOP)
+
+static int a5xx_hw_init(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	int ret;
+
+	gpu_write(gpu, REG_A5XX_VBIF_ROUND_ROBIN_QOS_ARB, 0x00000003);
+
+	/* Make all blocks contribute to the GPU BUSY perf counter */
+	gpu_write(gpu, REG_A5XX_RBBM_PERFCTR_GPU_BUSY_MASKED, 0xFFFFFFFF);
+
+	/* Enable RBBM error reporting bits */
+	gpu_write(gpu, REG_A5XX_RBBM_AHB_CNTL0, 0x00000001);
+
+	if (adreno_gpu->quirks & ADRENO_QUIRK_FAULT_DETECT_MASK) {
+		/*
+		 * Mask out the activity signals from RB1-3 to avoid false
+		 * positives
+		 */
+
+		gpu_write(gpu, REG_A5XX_RBBM_INTERFACE_HANG_MASK_CNTL11,
+			0xF0000000);
+		gpu_write(gpu, REG_A5XX_RBBM_INTERFACE_HANG_MASK_CNTL12,
+			0xFFFFFFFF);
+		gpu_write(gpu, REG_A5XX_RBBM_INTERFACE_HANG_MASK_CNTL13,
+			0xFFFFFFFF);
+		gpu_write(gpu, REG_A5XX_RBBM_INTERFACE_HANG_MASK_CNTL14,
+			0xFFFFFFFF);
+		gpu_write(gpu, REG_A5XX_RBBM_INTERFACE_HANG_MASK_CNTL15,
+			0xFFFFFFFF);
+		gpu_write(gpu, REG_A5XX_RBBM_INTERFACE_HANG_MASK_CNTL16,
+			0xFFFFFFFF);
+		gpu_write(gpu, REG_A5XX_RBBM_INTERFACE_HANG_MASK_CNTL17,
+			0xFFFFFFFF);
+		gpu_write(gpu, REG_A5XX_RBBM_INTERFACE_HANG_MASK_CNTL18,
+			0xFFFFFFFF);
+	}
+
+	/* Enable fault detection */
+	gpu_write(gpu, REG_A5XX_RBBM_INTERFACE_HANG_INT_CNTL,
+		(1 << 30) | 0xFFFF);
+
+	/* Turn on performance counters */
+	gpu_write(gpu, REG_A5XX_RBBM_PERFCTR_CNTL, 0x01);
+
+	/* Increase VFD cache access so LRZ and other data gets evicted less */
+	gpu_write(gpu, REG_A5XX_UCHE_CACHE_WAYS, 0x02);
+
+	/* Disable L2 bypass in the UCHE */
+	gpu_write(gpu, REG_A5XX_UCHE_TRAP_BASE_LO, 0xFFFF0000);
+	gpu_write(gpu, REG_A5XX_UCHE_TRAP_BASE_HI, 0x0001FFFF);
+	gpu_write(gpu, REG_A5XX_UCHE_WRITE_THRU_BASE_LO, 0xFFFF0000);
+	gpu_write(gpu, REG_A5XX_UCHE_WRITE_THRU_BASE_HI, 0x0001FFFF);
+
+	/* Set the GMEM VA range (0 to gpu->gmem) */
+	gpu_write(gpu, REG_A5XX_UCHE_GMEM_RANGE_MIN_LO, 0x00100000);
+	gpu_write(gpu, REG_A5XX_UCHE_GMEM_RANGE_MIN_HI, 0x00000000);
+	gpu_write(gpu, REG_A5XX_UCHE_GMEM_RANGE_MAX_LO,
+		0x00100000 + adreno_gpu->gmem - 1);
+	gpu_write(gpu, REG_A5XX_UCHE_GMEM_RANGE_MAX_HI, 0x00000000);
+
+	gpu_write(gpu, REG_A5XX_CP_MEQ_THRESHOLDS, 0x40);
+	gpu_write(gpu, REG_A5XX_CP_MERCIU_SIZE, 0x40);
+	gpu_write(gpu, REG_A5XX_CP_ROQ_THRESHOLDS_2, 0x80000060);
+	gpu_write(gpu, REG_A5XX_CP_ROQ_THRESHOLDS_1, 0x40201B16);
+
+	gpu_write(gpu, REG_A5XX_PC_DBG_ECO_CNTL, (0x400 << 11 | 0x300 << 22));
+
+	if (adreno_gpu->quirks & ADRENO_QUIRK_TWO_PASS_USE_WFI)
+		gpu_rmw(gpu, REG_A5XX_PC_DBG_ECO_CNTL, 0, (1 << 8));
+
+	gpu_write(gpu, REG_A5XX_PC_DBG_ECO_CNTL, 0xc0200100);
+
+	/* Enable USE_RETENTION_FLOPS */
+	gpu_write(gpu, REG_A5XX_CP_CHICKEN_DBG, 0x02000000);
+
+	/* Enable ME/PFP split notification */
+	gpu_write(gpu, REG_A5XX_RBBM_AHB_CNTL1, 0xA6FFFFFF);
+
+	/* Enable HWCG */
+	a5xx_enable_hwcg(gpu);
+
+	gpu_write(gpu, REG_A5XX_RBBM_AHB_CNTL2, 0x0000003F);
+
+	/* Set the highest bank bit */
+	gpu_write(gpu, REG_A5XX_TPL1_MODE_CNTL, 2 << 7);
+	gpu_write(gpu, REG_A5XX_RB_MODE_CNTL, 2 << 1);
+
+	/* Protect registers from the CP */
+	gpu_write(gpu, REG_A5XX_CP_PROTECT_CNTL, 0x00000007);
+
+	/* RBBM */
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(0), ADRENO_PROTECT_RW(0x04, 4));
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(1), ADRENO_PROTECT_RW(0x08, 8));
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(2), ADRENO_PROTECT_RW(0x10, 16));
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(3), ADRENO_PROTECT_RW(0x20, 32));
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(4), ADRENO_PROTECT_RW(0x40, 64));
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(5), ADRENO_PROTECT_RW(0x80, 64));
+
+	/* Content protect */
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(6),
+		ADRENO_PROTECT_RW(REG_A5XX_RBBM_SECVID_TSB_TRUSTED_BASE_LO,
+			16));
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(7),
+		ADRENO_PROTECT_RW(REG_A5XX_RBBM_SECVID_TRUST_CNTL, 2));
+
+	/* CP */
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(8), ADRENO_PROTECT_RW(0x800, 64));
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(9), ADRENO_PROTECT_RW(0x840, 8));
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(10), ADRENO_PROTECT_RW(0x880, 32));
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(11), ADRENO_PROTECT_RW(0xAA0, 1));
+
+	/* RB */
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(12), ADRENO_PROTECT_RW(0xCC0, 1));
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(13), ADRENO_PROTECT_RW(0xCF0, 2));
+
+	/* VPC */
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(14), ADRENO_PROTECT_RW(0xE68, 8));
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(15), ADRENO_PROTECT_RW(0xE70, 4));
+
+	/* UCHE */
+	gpu_write(gpu, REG_A5XX_CP_PROTECT(16), ADRENO_PROTECT_RW(0xE80, 16));
+
+	if (adreno_is_a530(adreno_gpu))
+		gpu_write(gpu, REG_A5XX_CP_PROTECT(17),
+			ADRENO_PROTECT_RW(0x10000, 0x8000));
+
+	gpu_write(gpu, REG_A5XX_RBBM_SECVID_TSB_CNTL, 0);
+	/*
+	 * Disable the trusted memory range - we don't actually supported secure
+	 * memory rendering at this point in time and we don't want to block off
+	 * part of the virtual memory space.
+	 */
+	gpu_write64(gpu, REG_A5XX_RBBM_SECVID_TSB_TRUSTED_BASE_LO,
+		REG_A5XX_RBBM_SECVID_TSB_TRUSTED_BASE_HI, 0x00000000);
+	gpu_write(gpu, REG_A5XX_RBBM_SECVID_TSB_TRUSTED_SIZE, 0x00000000);
+
+	ret = adreno_hw_init(gpu);
+	if (ret)
+		return ret;
+
+	ret = a5xx_ucode_init(gpu);
+	if (ret)
+		return ret;
+
+	/* Disable the interrupts through the initial bringup stage */
+	gpu_write(gpu, REG_A5XX_RBBM_INT_0_MASK, A5XX_INT_MASK);
+
+	/* Clear ME_HALT to start the micro engine */
+	gpu_write(gpu, REG_A5XX_CP_PFP_ME_CNTL, 0);
+	ret = a5xx_me_init(gpu);
+	if (ret)
+		return ret;
+
+	/* Put the GPU into insecure mode */
+	gpu_write(gpu, REG_A5XX_RBBM_SECVID_TRUST_CNTL, 0x0);
+
+	/*
+	 * Send a pipeline event stat to get misbehaving counters to start
+	 * ticking correctly
+	 */
+	if (adreno_is_a530(adreno_gpu)) {
+		OUT_PKT7(gpu->rb, CP_EVENT_WRITE, 1);
+		OUT_RING(gpu->rb, 0x0F);
+
+		gpu->funcs->flush(gpu);
+		if (!gpu->funcs->idle(gpu))
+			return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void a5xx_recover(struct msm_gpu *gpu)
+{
+	adreno_dump_info(gpu);
+
+	if (hang_debug)
+		a5xx_dump(gpu);
+
+	gpu_write(gpu, REG_A5XX_RBBM_SW_RESET_CMD, 1);
+	gpu_read(gpu, REG_A5XX_RBBM_SW_RESET_CMD);
+	gpu_write(gpu, REG_A5XX_RBBM_SW_RESET_CMD, 0);
+	adreno_recover(gpu);
+}
+
+static void a5xx_destroy(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
+
+	DBG("%s", gpu->name);
+
+	if (a5xx_gpu->pm4_bo) {
+		if (a5xx_gpu->pm4_iova)
+			msm_gem_put_iova(a5xx_gpu->pm4_bo, gpu->id);
+		drm_gem_object_unreference_unlocked(a5xx_gpu->pm4_bo);
+	}
+
+	if (a5xx_gpu->pfp_bo) {
+		if (a5xx_gpu->pfp_iova)
+			msm_gem_put_iova(a5xx_gpu->pfp_bo, gpu->id);
+		drm_gem_object_unreference_unlocked(a5xx_gpu->pfp_bo);
+	}
+
+	adreno_gpu_cleanup(adreno_gpu);
+	kfree(a5xx_gpu);
+}
+
+static inline bool _a5xx_check_idle(struct msm_gpu *gpu)
+{
+	if (gpu_read(gpu, REG_A5XX_RBBM_STATUS) & ~A5XX_RBBM_STATUS_HI_BUSY)
+		return false;
+
+	/*
+	 * Nearly every abnormality ends up pausing the GPU and triggering a
+	 * fault so we can safely just watch for this one interrupt to fire
+	 */
+	return !(gpu_read(gpu, REG_A5XX_RBBM_INT_0_STATUS) &
+		A5XX_RBBM_INT_0_MASK_MISC_HANG_DETECT);
+}
+
+static bool a5xx_idle(struct msm_gpu *gpu)
+{
+	/* wait for CP to drain ringbuffer: */
+	if (!adreno_idle(gpu))
+		return false;
+
+	if (spin_until(_a5xx_check_idle(gpu))) {
+		DRM_ERROR("%s: %ps: timeout waiting for GPU to idle: status %8.8X irq %8.8X\n",
+			gpu->name, __builtin_return_address(0),
+			gpu_read(gpu, REG_A5XX_RBBM_STATUS),
+			gpu_read(gpu, REG_A5XX_RBBM_INT_0_STATUS));
+
+		return false;
+	}
+
+	return true;
+}
+
+static void a5xx_cp_err_irq(struct msm_gpu *gpu)
+{
+	u32 status = gpu_read(gpu, REG_A5XX_CP_INTERRUPT_STATUS);
+
+	if (status & A5XX_CP_INT_CP_OPCODE_ERROR) {
+		u32 val;
+
+		gpu_write(gpu, REG_A5XX_CP_PFP_STAT_ADDR, 0);
+
+		/*
+		 * REG_A5XX_CP_PFP_STAT_DATA is indexed, and we want index 1 so
+		 * read it twice
+		 */
+
+		gpu_read(gpu, REG_A5XX_CP_PFP_STAT_DATA);
+		val = gpu_read(gpu, REG_A5XX_CP_PFP_STAT_DATA);
+
+		dev_err_ratelimited(gpu->dev->dev, "CP | opcode error | possible opcode=0x%8.8X\n",
+			val);
+	}
+
+	if (status & A5XX_CP_INT_CP_HW_FAULT_ERROR)
+		dev_err_ratelimited(gpu->dev->dev, "CP | HW fault | status=0x%8.8X\n",
+			gpu_read(gpu, REG_A5XX_CP_HW_FAULT));
+
+	if (status & A5XX_CP_INT_CP_DMA_ERROR)
+		dev_err_ratelimited(gpu->dev->dev, "CP | DMA error\n");
+
+	if (status & A5XX_CP_INT_CP_REGISTER_PROTECTION_ERROR) {
+		u32 val = gpu_read(gpu, REG_A5XX_CP_PROTECT_STATUS);
+
+		dev_err_ratelimited(gpu->dev->dev,
+			"CP | protected mode error | %s | addr=0x%8.8X | status=0x%8.8X\n",
+			val & (1 << 24) ? "WRITE" : "READ",
+			(val & 0xFFFFF) >> 2, val);
+	}
+
+	if (status & A5XX_CP_INT_CP_AHB_ERROR) {
+		u32 status = gpu_read(gpu, REG_A5XX_CP_AHB_FAULT);
+		const char *access[16] = { "reserved", "reserved",
+			"timestamp lo", "timestamp hi", "pfp read", "pfp write",
+			"", "", "me read", "me write", "", "", "crashdump read",
+			"crashdump write" };
+
+		dev_err_ratelimited(gpu->dev->dev,
+			"CP | AHB error | addr=%X access=%s error=%d | status=0x%8.8X\n",
+			status & 0xFFFFF, access[(status >> 24) & 0xF],
+			(status & (1 << 31)), status);
+	}
+}
+
+static void a5xx_rbbm_err_irq(struct msm_gpu *gpu)
+{
+	u32 status = gpu_read(gpu, REG_A5XX_RBBM_INT_0_STATUS);
+
+	if (status & A5XX_RBBM_INT_0_MASK_RBBM_AHB_ERROR) {
+		u32 val = gpu_read(gpu, REG_A5XX_RBBM_AHB_ERROR_STATUS);
+
+		dev_err_ratelimited(gpu->dev->dev,
+			"RBBM | AHB bus error | %s | addr=0x%X | ports=0x%X:0x%X\n",
+			val & (1 << 28) ? "WRITE" : "READ",
+			(val & 0xFFFFF) >> 2, (val >> 20) & 0x3,
+			(val >> 24) & 0xF);
+
+		/* Clear the error */
+		gpu_write(gpu, REG_A5XX_RBBM_AHB_CMD, (1 << 4));
+	}
+
+	if (status & A5XX_RBBM_INT_0_MASK_RBBM_TRANSFER_TIMEOUT)
+		dev_err_ratelimited(gpu->dev->dev, "RBBM | AHB transfer timeout\n");
+
+	if (status & A5XX_RBBM_INT_0_MASK_RBBM_ME_MS_TIMEOUT)
+		dev_err_ratelimited(gpu->dev->dev, "RBBM | ME master split | status=0x%X\n",
+			gpu_read(gpu, REG_A5XX_RBBM_AHB_ME_SPLIT_STATUS));
+
+	if (status & A5XX_RBBM_INT_0_MASK_RBBM_PFP_MS_TIMEOUT)
+		dev_err_ratelimited(gpu->dev->dev, "RBBM | PFP master split | status=0x%X\n",
+			gpu_read(gpu, REG_A5XX_RBBM_AHB_PFP_SPLIT_STATUS));
+
+	if (status & A5XX_RBBM_INT_0_MASK_RBBM_ETS_MS_TIMEOUT)
+		dev_err_ratelimited(gpu->dev->dev, "RBBM | ETS master split | status=0x%X\n",
+			gpu_read(gpu, REG_A5XX_RBBM_AHB_ETS_SPLIT_STATUS));
+
+	if (status & A5XX_RBBM_INT_0_MASK_RBBM_ATB_ASYNC_OVERFLOW)
+		dev_err_ratelimited(gpu->dev->dev, "RBBM | ATB ASYNC overflow\n");
+
+	if (status & A5XX_RBBM_INT_0_MASK_RBBM_ATB_BUS_OVERFLOW)
+		dev_err_ratelimited(gpu->dev->dev, "RBBM | ATB bus overflow\n");
+}
+
+static void a5xx_uche_err_irq(struct msm_gpu *gpu)
+{
+	uint64_t addr = (uint64_t) gpu_read(gpu, REG_A5XX_UCHE_TRAP_LOG_HI);
+
+	addr |= gpu_read(gpu, REG_A5XX_UCHE_TRAP_LOG_LO);
+
+	dev_err_ratelimited(gpu->dev->dev, "UCHE | Out of bounds access | addr=0x%llX\n",
+		addr);
+}
+
+static void a5xx_gpmu_err_irq(struct msm_gpu *gpu)
+{
+	dev_err_ratelimited(gpu->dev->dev, "GPMU | voltage droop\n");
+}
+
+#define RBBM_ERROR_MASK \
+	(A5XX_RBBM_INT_0_MASK_RBBM_AHB_ERROR | \
+	A5XX_RBBM_INT_0_MASK_RBBM_TRANSFER_TIMEOUT | \
+	A5XX_RBBM_INT_0_MASK_RBBM_ME_MS_TIMEOUT | \
+	A5XX_RBBM_INT_0_MASK_RBBM_PFP_MS_TIMEOUT | \
+	A5XX_RBBM_INT_0_MASK_RBBM_ETS_MS_TIMEOUT | \
+	A5XX_RBBM_INT_0_MASK_RBBM_ATB_ASYNC_OVERFLOW)
+
+static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
+{
+	u32 status = gpu_read(gpu, REG_A5XX_RBBM_INT_0_STATUS);
+
+	gpu_write(gpu, REG_A5XX_RBBM_INT_CLEAR_CMD, status);
+
+	if (status & RBBM_ERROR_MASK)
+		a5xx_rbbm_err_irq(gpu);
+
+	if (status & A5XX_RBBM_INT_0_MASK_CP_HW_ERROR)
+		a5xx_cp_err_irq(gpu);
+
+	if (status & A5XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
+		a5xx_uche_err_irq(gpu);
+
+	if (status & A5XX_RBBM_INT_0_MASK_GPMU_VOLTAGE_DROOP)
+		a5xx_gpmu_err_irq(gpu);
+
+	if (status & A5XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
+		msm_gpu_retire(gpu);
+
+	return IRQ_HANDLED;
+}
+
+static const u32 a5xx_register_offsets[REG_ADRENO_REGISTER_MAX] = {
+	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_BASE, REG_A5XX_CP_RB_BASE),
+	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_BASE_HI, REG_A5XX_CP_RB_BASE_HI),
+	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_RPTR_ADDR, REG_A5XX_CP_RB_RPTR_ADDR),
+	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_RPTR_ADDR_HI,
+		REG_A5XX_CP_RB_RPTR_ADDR_HI),
+	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_RPTR, REG_A5XX_CP_RB_RPTR),
+	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_WPTR, REG_A5XX_CP_RB_WPTR),
+	REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_CNTL, REG_A5XX_CP_RB_CNTL),
+};
+
+static const u32 a5xx_registers[] = {
+	0x0000, 0x0002, 0x0004, 0x0020, 0x0022, 0x0026, 0x0029, 0x002B,
+	0x002E, 0x0035, 0x0038, 0x0042, 0x0044, 0x0044, 0x0047, 0x0095,
+	0x0097, 0x00BB, 0x03A0, 0x0464, 0x0469, 0x046F, 0x04D2, 0x04D3,
+	0x04E0, 0x0533, 0x0540, 0x0555, 0xF400, 0xF400, 0xF800, 0xF807,
+	0x0800, 0x081A, 0x081F, 0x0841, 0x0860, 0x0860, 0x0880, 0x08A0,
+	0x0B00, 0x0B12, 0x0B15, 0x0B28, 0x0B78, 0x0B7F, 0x0BB0, 0x0BBD,
+	0x0BC0, 0x0BC6, 0x0BD0, 0x0C53, 0x0C60, 0x0C61, 0x0C80, 0x0C82,
+	0x0C84, 0x0C85, 0x0C90, 0x0C98, 0x0CA0, 0x0CA0, 0x0CB0, 0x0CB2,
+	0x2180, 0x2185, 0x2580, 0x2585, 0x0CC1, 0x0CC1, 0x0CC4, 0x0CC7,
+	0x0CCC, 0x0CCC, 0x0CD0, 0x0CD8, 0x0CE0, 0x0CE5, 0x0CE8, 0x0CE8,
+	0x0CEC, 0x0CF1, 0x0CFB, 0x0D0E, 0x2100, 0x211E, 0x2140, 0x2145,
+	0x2500, 0x251E, 0x2540, 0x2545, 0x0D10, 0x0D17, 0x0D20, 0x0D23,
+	0x0D30, 0x0D30, 0x20C0, 0x20C0, 0x24C0, 0x24C0, 0x0E40, 0x0E43,
+	0x0E4A, 0x0E4A, 0x0E50, 0x0E57, 0x0E60, 0x0E7C, 0x0E80, 0x0E8E,
+	0x0E90, 0x0E96, 0x0EA0, 0x0EA8, 0x0EB0, 0x0EB2, 0xE140, 0xE147,
+	0xE150, 0xE187, 0xE1A0, 0xE1A9, 0xE1B0, 0xE1B6, 0xE1C0, 0xE1C7,
+	0xE1D0, 0xE1D1, 0xE200, 0xE201, 0xE210, 0xE21C, 0xE240, 0xE268,
+	0xE000, 0xE006, 0xE010, 0xE09A, 0xE0A0, 0xE0A4, 0xE0AA, 0xE0EB,
+	0xE100, 0xE105, 0xE380, 0xE38F, 0xE3B0, 0xE3B0, 0xE400, 0xE405,
+	0xE408, 0xE4E9, 0xE4F0, 0xE4F0, 0xE280, 0xE280, 0xE282, 0xE2A3,
+	0xE2A5, 0xE2C2, 0xE940, 0xE947, 0xE950, 0xE987, 0xE9A0, 0xE9A9,
+	0xE9B0, 0xE9B6, 0xE9C0, 0xE9C7, 0xE9D0, 0xE9D1, 0xEA00, 0xEA01,
+	0xEA10, 0xEA1C, 0xEA40, 0xEA68, 0xE800, 0xE806, 0xE810, 0xE89A,
+	0xE8A0, 0xE8A4, 0xE8AA, 0xE8EB, 0xE900, 0xE905, 0xEB80, 0xEB8F,
+	0xEBB0, 0xEBB0, 0xEC00, 0xEC05, 0xEC08, 0xECE9, 0xECF0, 0xECF0,
+	0xEA80, 0xEA80, 0xEA82, 0xEAA3, 0xEAA5, 0xEAC2, 0xA800, 0xA8FF,
+	0xAC60, 0xAC60, 0xB000, 0xB97F, 0xB9A0, 0xB9BF,
+	~0
+};
+
+static void a5xx_dump(struct msm_gpu *gpu)
+{
+	dev_info(gpu->dev->dev, "status:   %08x\n",
+		gpu_read(gpu, REG_A5XX_RBBM_STATUS));
+	adreno_dump(gpu);
+}
+
+static int a5xx_pm_resume(struct msm_gpu *gpu)
+{
+	return msm_gpu_pm_resume(gpu);
+}
+
+static int a5xx_pm_suspend(struct msm_gpu *gpu)
+{
+	return msm_gpu_pm_suspend(gpu);
+}
+
+static int a5xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
+{
+	*value = gpu_read64(gpu, REG_A5XX_RBBM_PERFCTR_CP_0_LO,
+		REG_A5XX_RBBM_PERFCTR_CP_0_HI);
+
+	return 0;
+}
+
+#ifdef CONFIG_DEBUG_FS
+static void a5xx_show(struct msm_gpu *gpu, struct seq_file *m)
+{
+	gpu->funcs->pm_resume(gpu);
+
+	seq_printf(m, "status:   %08x\n",
+			gpu_read(gpu, REG_A5XX_RBBM_STATUS));
+	gpu->funcs->pm_suspend(gpu);
+
+	adreno_show(gpu, m);
+}
+#endif
+
+static const struct adreno_gpu_funcs funcs = {
+	.base = {
+		.get_param = adreno_get_param,
+		.hw_init = a5xx_hw_init,
+		.pm_suspend = a5xx_pm_suspend,
+		.pm_resume = a5xx_pm_resume,
+		.recover = a5xx_recover,
+		.last_fence = adreno_last_fence,
+		.submit = a5xx_submit,
+		.flush = adreno_flush,
+		.idle = a5xx_idle,
+		.irq = a5xx_irq,
+		.destroy = a5xx_destroy,
+		.show = a5xx_show,
+	},
+	.get_timestamp = a5xx_get_timestamp,
+};
+
+struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
+{
+	struct msm_drm_private *priv = dev->dev_private;
+	struct platform_device *pdev = priv->gpu_pdev;
+	struct a5xx_gpu *a5xx_gpu = NULL;
+	struct adreno_gpu *adreno_gpu;
+	struct msm_gpu *gpu;
+	int ret;
+
+	if (!pdev) {
+		dev_err(dev->dev, "No A5XX device is defined\n");
+		return ERR_PTR(-ENXIO);
+	}
+
+	a5xx_gpu = kzalloc(sizeof(*a5xx_gpu), GFP_KERNEL);
+	if (!a5xx_gpu)
+		return ERR_PTR(-ENOMEM);
+
+	adreno_gpu = &a5xx_gpu->base;
+	gpu = &adreno_gpu->base;
+
+	a5xx_gpu->pdev = pdev;
+	adreno_gpu->registers = a5xx_registers;
+	adreno_gpu->reg_offsets = a5xx_register_offsets;
+
+	ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs);
+	if (ret) {
+		a5xx_destroy(&(a5xx_gpu->base.base));
+		return ERR_PTR(ret);
+	}
+
+	return gpu;
+}
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
new file mode 100644
index 0000000..39a07f4
--- /dev/null
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
@@ -0,0 +1,37 @@
+/* Copyright (c) 2016 The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+#ifndef __A5XX_GPU_H__
+#define __A5XX_GPU_H__
+
+#include "adreno_gpu.h"
+
+/* Bringing over the hack from the previous targets */
+#undef ROP_COPY
+#undef ROP_XOR
+
+#include "a5xx.xml.h"
+
+struct a5xx_gpu {
+	struct adreno_gpu base;
+	struct platform_device *pdev;
+
+	struct drm_gem_object *pm4_bo;
+	uint64_t pm4_iova;
+
+	struct drm_gem_object *pfp_bo;
+	uint64_t pfp_iova;
+};
+
+#define to_a5xx_gpu(x) container_of(x, struct a5xx_gpu, base)
+
+#endif /* __A5XX_GPU_H__ */
diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 3321d33..7749b60 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -27,6 +27,7 @@ module_param_named(hang_debug, hang_debug, bool, 0600);
 
 struct msm_gpu *a3xx_gpu_init(struct drm_device *dev);
 struct msm_gpu *a4xx_gpu_init(struct drm_device *dev);
+struct msm_gpu *a5xx_gpu_init(struct drm_device *dev);
 
 static const struct adreno_info gpulist[] = {
 	{
@@ -77,6 +78,14 @@ static const struct adreno_info gpulist[] = {
 		.pfpfw = "a420_pfp.fw",
 		.gmem  = (SZ_1M + SZ_512K),
 		.init  = a4xx_gpu_init,
+	}, {
+		.rev = ADRENO_REV(5, 3, 0, ANY_ID),
+		.revn = 530,
+		.name = "A530",
+		.pm4fw = "a530_pm4.fw",
+		.pfpfw = "a530_pfp.fw",
+		.gmem = SZ_1M,
+		.init = a5xx_gpu_init,
 	},
 };
 
@@ -86,6 +95,8 @@ MODULE_FIRMWARE("a330_pm4.fw");
 MODULE_FIRMWARE("a330_pfp.fw");
 MODULE_FIRMWARE("a420_pm4.fw");
 MODULE_FIRMWARE("a420_pfp.fw");
+MODULE_FIRMWARE("a530_fm4.fw");
+MODULE_FIRMWARE("a530_pfp.fw");
 
 static inline bool _rev_match(uint8_t entry, uint8_t id)
 {
@@ -173,12 +184,20 @@ static void set_gpu_pdev(struct drm_device *dev,
 	priv->gpu_pdev = pdev;
 }
 
+static const struct {
+	const char *str;
+	uint32_t flag;
+} quirks[] = {
+	{ "qcom,gpu-quirk-two-pass-use-wfi", ADRENO_QUIRK_TWO_PASS_USE_WFI },
+	{ "qcom,gpu-quirk-fault-detect-mask", ADRENO_QUIRK_FAULT_DETECT_MASK },
+};
+
 static int adreno_bind(struct device *dev, struct device *master, void *data)
 {
 	static struct adreno_platform_config config = {};
 	struct device_node *child, *node = dev->of_node;
 	u32 val;
-	int ret;
+	int ret, i;
 
 	ret = of_property_read_u32(node, "qcom,chipid", &val);
 	if (ret) {
@@ -212,6 +231,10 @@ static int adreno_bind(struct device *dev, struct device *master, void *data)
 		return -ENXIO;
 	}
 
+	for (i = 0; i < ARRAY_SIZE(quirks); i++)
+		if (of_property_read_bool(node, quirks[i].str))
+			config.quirks |= quirks[i].flag;
+
 	dev->platform_data = &config;
 	set_gpu_pdev(dev_get_drvdata(master), to_platform_device(dev));
 	return 0;
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 644be52..e240790 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -22,7 +22,7 @@
 #include "msm_mmu.h"
 
 #define RB_SIZE    SZ_32K
-#define RB_BLKSIZE 16
+#define RB_BLKSIZE 32
 
 int adreno_get_param(struct msm_gpu *gpu, uint32_t param, uint64_t *value)
 {
@@ -54,9 +54,6 @@ int adreno_get_param(struct msm_gpu *gpu, uint32_t param, uint64_t *value)
 	}
 }
 
-#define rbmemptr(adreno_gpu, member)  \
-	((adreno_gpu)->memptrs_iova + offsetof(struct adreno_rbmemptrs, member))
-
 int adreno_hw_init(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -355,6 +352,7 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 	adreno_gpu->gmem = adreno_gpu->info->gmem;
 	adreno_gpu->revn = adreno_gpu->info->revn;
 	adreno_gpu->rev = config->rev;
+	adreno_gpu->quirks = config->quirks;
 
 	gpu->fast_rate = config->fast_rate;
 	gpu->slow_rate = config->slow_rate;
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index b93366b..d67e78f 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -48,6 +48,11 @@ enum adreno_regs {
 	REG_ADRENO_REGISTER_MAX,
 };
 
+enum adreno_quirks {
+	ADRENO_QUIRK_TWO_PASS_USE_WFI = 1,
+	ADRENO_QUIRK_FAULT_DETECT_MASK = 2,
+};
+
 struct adreno_rev {
 	uint8_t  core;
 	uint8_t  major;
@@ -74,6 +79,9 @@ struct adreno_info {
 
 const struct adreno_info *adreno_info(struct adreno_rev rev);
 
+#define rbmemptr(adreno_gpu, member)  \
+	((adreno_gpu)->memptrs_iova + offsetof(struct adreno_rbmemptrs, member))
+
 struct adreno_rbmemptrs {
 	volatile uint32_t rptr;
 	volatile uint32_t wptr;
@@ -107,6 +115,8 @@ struct adreno_gpu {
 	 * code (a3xx_gpu.c) and stored in this common location.
 	 */
 	const unsigned int *reg_offsets;
+
+	uint32_t quirks;
 };
 #define to_adreno_gpu(x) container_of(x, struct adreno_gpu, base)
 
@@ -117,6 +127,7 @@ struct adreno_platform_config {
 #ifdef DOWNSTREAM_CONFIG_MSM_BUS_SCALING
 	struct msm_bus_scale_pdata *bus_scale_table;
 #endif
+	uint32_t quirks;
 };
 
 #define ADRENO_IDLE_TIMEOUT msecs_to_jiffies(1000)
@@ -180,6 +191,11 @@ static inline int adreno_is_a430(struct adreno_gpu *gpu)
        return gpu->revn == 430;
 }
 
+static inline int adreno_is_a530(struct adreno_gpu *gpu)
+{
+	return gpu->revn == 530;
+}
+
 int adreno_get_param(struct msm_gpu *gpu, uint32_t param, uint64_t *value);
 int adreno_hw_init(struct msm_gpu *gpu);
 uint32_t adreno_last_fence(struct msm_gpu *gpu);
@@ -304,4 +320,28 @@ static inline void adreno_gpu_write64(struct adreno_gpu *gpu,
 	adreno_gpu_write(gpu, hi, upper_32_bits(data));
 }
 
+/*
+ * Given a register and a count, return a value to program into
+ * REG_CP_PROTECT_REG(n) - this will block both reads and writes for _len
+ * registers starting at _reg.
+ *
+ * The register base needs to be a multiple of the length. If it is not, the
+ * hardware will quietly mask off the bits for you and shift the size. For
+ * example, if you intend the protection to start at 0x07 for a length of 4
+ * (0x07-0x0A) the hardware will actually protect (0x04-0x07) which might
+ * expose registers you intended to protect!
+ */
+#define ADRENO_PROTECT_RW(_reg, _len) \
+	((1 << 30) | (1 << 29) | \
+	((ilog2((_len)) & 0x1F) << 24) | (((_reg) << 2) & 0xFFFFF))
+
+/*
+ * Same as above, but allow reads over the range. For areas of mixed use (such
+ * as performance counters) this allows us to protect a much larger range with a
+ * single register
+ */
+#define ADRENO_PROTECT_RDONLY(_reg, _len) \
+	((1 << 29) \
+	((ilog2((_len)) & 0x1F) << 24) | (((_reg) << 2) & 0xFFFFF))
+
 #endif /* __ADRENO_GPU_H__ */
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index e981224..a95c43e 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -96,6 +96,10 @@ static int enable_clk(struct msm_gpu *gpu)
 	if (gpu->grp_clks[0] && gpu->fast_rate)
 		clk_set_rate(gpu->grp_clks[0], gpu->fast_rate);
 
+	/* Set the RBBM timer rate to 19.2Mhz */
+	if (gpu->grp_clks[2])
+		clk_set_rate(gpu->grp_clks[2], 19200000);
+
 	for (i = ARRAY_SIZE(gpu->grp_clks) - 1; i >= 0; i--)
 		if (gpu->grp_clks[i])
 			clk_prepare(gpu->grp_clks[i]);
@@ -122,6 +126,9 @@ static int disable_clk(struct msm_gpu *gpu)
 	if (gpu->grp_clks[0] && gpu->slow_rate)
 		clk_set_rate(gpu->grp_clks[0], gpu->slow_rate);
 
+	if (gpu->grp_clks[2])
+		clk_set_rate(gpu->grp_clks[2], 0);
+
 	return 0;
 }
 
@@ -562,8 +569,8 @@ static irqreturn_t irq_handler(int irq, void *data)
 }
 
 static const char *clk_names[] = {
-		"core_clk", "iface_clk", "mem_clk", "mem_iface_clk",
-		"alt_mem_iface_clk",
+		"core_clk", "iface_clk", "rbbmtimer_clk", "mem_clk",
+		"mem_iface_clk", "alt_mem_iface_clk",
 };
 
 int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
@@ -656,7 +663,7 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 	iommu = iommu_domain_alloc(&platform_bus_type);
 	if (iommu) {
 		/* TODO 32b vs 64b address space.. */
-		iommu->geometry.aperture_start = 0x1000;
+		iommu->geometry.aperture_start = SZ_16M;
 		iommu->geometry.aperture_end = 0xffffffff;
 
 		dev_info(drm->dev, "%s: using IOMMU\n", name);
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index bfe13d3..baca428 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -103,7 +103,7 @@ struct msm_gpu {
 
 	/* Power Control: */
 	struct regulator *gpu_reg, *gpu_cx;
-	struct clk *ebi1_clk, *grp_clks[5];
+	struct clk *ebi1_clk, *grp_clks[6];
 	uint32_t fast_rate, slow_rate, bus_freq;
 
 #ifdef DOWNSTREAM_CONFIG_MSM_BUS_SCALING
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 09/12] drm/msm: gpu: Add support for the GPMU
  2016-11-28 19:28 [PATCH 00/12] Adreno A5XX support Jordan Crouse
                   ` (5 preceding siblings ...)
  2016-11-28 19:28 ` [PATCH 08/12] drm/msm: gpu: Add A5XX target support Jordan Crouse
@ 2016-11-28 19:28 ` Jordan Crouse
  2016-11-28 19:28 ` [PATCH 10/12] firmware: qcom_scm: Add qcom_scm_gpu_zap_resume() Jordan Crouse
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Jordan Crouse @ 2016-11-28 19:28 UTC (permalink / raw)
  To: freedreno; +Cc: linux-arm-msm, dri-devel

Most 5XX targets have GPMU (Graphics Power Management Unit) that
handles a lot of the heavy lifting for power management including
thermal and limits management and dynamic power collapse. While
the GPMU itself is optional, it is usually nessesary to hit
aggressive power targets.

The GPMU firmware needs to be loaded into the GPMU at init time via a
shared hardware block of registers. Using the GPU to write the microcode
is more efficient than using the CPU so at first load create an indirect
buffer that can be executed during subsequent initalization sequences.

After loading the GPMU gets initalized through a shared register
interface and then we mostly get out of its way and let it do
its thing.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/Makefile               |   1 +
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c      |  64 +++++-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.h      |  23 ++
 drivers/gpu/drm/msm/adreno/a5xx_power.c    | 344 +++++++++++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/adreno_device.c |   1 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h    |   1 +
 6 files changed, 431 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/adreno/a5xx_power.c

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index 4abba41..d969a48 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -7,6 +7,7 @@ msm-y := \
 	adreno/a3xx_gpu.o \
 	adreno/a4xx_gpu.o \
 	adreno/a5xx_gpu.o \
+	adreno/a5xx_power.o \
 	hdmi/hdmi.o \
 	hdmi/hdmi_audio.o \
 	hdmi/hdmi_bridge.o \
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 63f67a8..3824bc4 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -450,6 +450,9 @@ static int a5xx_hw_init(struct msm_gpu *gpu)
 		REG_A5XX_RBBM_SECVID_TSB_TRUSTED_BASE_HI, 0x00000000);
 	gpu_write(gpu, REG_A5XX_RBBM_SECVID_TSB_TRUSTED_SIZE, 0x00000000);
 
+	/* Load the GPMU firmware before starting the HW init */
+	a5xx_gpmu_ucode_init(gpu);
+
 	ret = adreno_hw_init(gpu);
 	if (ret)
 		return ret;
@@ -467,8 +470,9 @@ static int a5xx_hw_init(struct msm_gpu *gpu)
 	if (ret)
 		return ret;
 
-	/* Put the GPU into insecure mode */
-	gpu_write(gpu, REG_A5XX_RBBM_SECVID_TRUST_CNTL, 0x0);
+	ret = a5xx_power_init(gpu);
+	if (ret)
+		return ret;
 
 	/*
 	 * Send a pipeline event stat to get misbehaving counters to start
@@ -483,6 +487,9 @@ static int a5xx_hw_init(struct msm_gpu *gpu)
 			return -EINVAL;
 	}
 
+	/* Put the GPU into unsecure mode */
+	gpu_write(gpu, REG_A5XX_RBBM_SECVID_TRUST_CNTL, 0x0);
+
 	return 0;
 }
 
@@ -518,6 +525,12 @@ static void a5xx_destroy(struct msm_gpu *gpu)
 		drm_gem_object_unreference_unlocked(a5xx_gpu->pfp_bo);
 	}
 
+	if (a5xx_gpu->gpmu_bo) {
+		if (a5xx_gpu->gpmu_bo)
+			msm_gem_put_iova(a5xx_gpu->gpmu_bo, gpu->id);
+		drm_gem_object_unreference_unlocked(a5xx_gpu->gpmu_bo);
+	}
+
 	adreno_gpu_cleanup(adreno_gpu);
 	kfree(a5xx_gpu);
 }
@@ -741,11 +754,54 @@ static void a5xx_dump(struct msm_gpu *gpu)
 
 static int a5xx_pm_resume(struct msm_gpu *gpu)
 {
-	return msm_gpu_pm_resume(gpu);
+	int ret;
+
+	/* Turn on the core power */
+	ret = msm_gpu_pm_resume(gpu);
+	if (ret)
+		return ret;
+
+	/* Turn the RBCCU domain first to limit the chances of voltage droop */
+	gpu_write(gpu, REG_A5XX_GPMU_RBCCU_POWER_CNTL, 0x778000);
+
+	/* Wait 3 usecs before polling */
+	udelay(3);
+
+	ret = spin_usecs(gpu, 20, REG_A5XX_GPMU_RBCCU_PWR_CLK_STATUS,
+		(1 << 20), (1 << 20));
+	if (ret) {
+		DRM_ERROR("%s: timeout waiting for RBCCU GDSC enable: %X\n",
+			gpu->name,
+			gpu_read(gpu, REG_A5XX_GPMU_RBCCU_PWR_CLK_STATUS));
+		return ret;
+	}
+
+	/* Turn on the SP domain */
+	gpu_write(gpu, REG_A5XX_GPMU_SP_POWER_CNTL, 0x778000);
+	ret = spin_usecs(gpu, 20, REG_A5XX_GPMU_SP_PWR_CLK_STATUS,
+		(1 << 20), (1 << 20));
+	if (ret)
+		DRM_ERROR("%s: timeout waiting for SP GDSC enable\n",
+			gpu->name);
+
+	return ret;
 }
 
 static int a5xx_pm_suspend(struct msm_gpu *gpu)
 {
+	/* Clear the VBIF pipe before shutting down */
+	gpu_write(gpu, REG_A5XX_VBIF_XIN_HALT_CTRL0, 0xF);
+	spin_until((gpu_read(gpu, REG_A5XX_VBIF_XIN_HALT_CTRL1) & 0xF) == 0xF);
+
+	gpu_write(gpu, REG_A5XX_VBIF_XIN_HALT_CTRL0, 0);
+
+	/*
+	 * Reset the VBIF before power collapse to avoid issue with FIFO
+	 * entries
+	 */
+	gpu_write(gpu, REG_A5XX_RBBM_BLOCK_SW_RESET_CMD, 0x003C0000);
+	gpu_write(gpu, REG_A5XX_RBBM_BLOCK_SW_RESET_CMD, 0x00000000);
+
 	return msm_gpu_pm_suspend(gpu);
 }
 
@@ -813,6 +869,8 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
 	adreno_gpu->registers = a5xx_registers;
 	adreno_gpu->reg_offsets = a5xx_register_offsets;
 
+	a5xx_gpu->lm_leakage = 0x4E001A;
+
 	ret = adreno_gpu_init(dev, pdev, adreno_gpu, &funcs);
 	if (ret) {
 		a5xx_destroy(&(a5xx_gpu->base.base));
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
index 39a07f4..1590f84 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
@@ -30,8 +30,31 @@ struct a5xx_gpu {
 
 	struct drm_gem_object *pfp_bo;
 	uint64_t pfp_iova;
+
+	struct drm_gem_object *gpmu_bo;
+	uint64_t gpmu_iova;
+	uint32_t gpmu_dwords;
+
+	uint32_t lm_leakage;
 };
 
 #define to_a5xx_gpu(x) container_of(x, struct a5xx_gpu, base)
 
+int a5xx_power_init(struct msm_gpu *gpu);
+void a5xx_gpmu_ucode_init(struct msm_gpu *gpu);
+
+static inline int spin_usecs(struct msm_gpu *gpu, uint32_t usecs,
+		uint32_t reg, uint32_t mask, uint32_t value)
+{
+	while (usecs--) {
+		udelay(1);
+		if ((gpu_read(gpu, reg) & mask) == value)
+			return 0;
+		cpu_relax();
+	}
+
+	return -ETIMEDOUT;
+}
+
+
 #endif /* __A5XX_GPU_H__ */
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_power.c b/drivers/gpu/drm/msm/adreno/a5xx_power.c
new file mode 100644
index 0000000..72d52c7
--- /dev/null
+++ b/drivers/gpu/drm/msm/adreno/a5xx_power.c
@@ -0,0 +1,344 @@
+/* Copyright (c) 2016 The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/pm_opp.h>
+#include "a5xx_gpu.h"
+
+/*
+ * The GPMU data block is a block of shared registers that can be used to
+ * communicate back and forth. These "registers" are by convention with the GPMU
+ * firwmare and not bound to any specific hardware design
+ */
+
+#define AGC_INIT_BASE REG_A5XX_GPMU_DATA_RAM_BASE
+#define AGC_INIT_MSG_MAGIC (AGC_INIT_BASE + 5)
+#define AGC_MSG_BASE (AGC_INIT_BASE + 7)
+
+#define AGC_MSG_STATE (AGC_MSG_BASE + 0)
+#define AGC_MSG_COMMAND (AGC_MSG_BASE + 1)
+#define AGC_MSG_PAYLOAD_SIZE (AGC_MSG_BASE + 3)
+#define AGC_MSG_PAYLOAD(_o) ((AGC_MSG_BASE + 5) + (_o))
+
+#define AGC_POWER_CONFIG_PRODUCTION_ID 1
+#define AGC_INIT_MSG_VALUE 0xBABEFACE
+
+static struct {
+	uint32_t reg;
+	uint32_t value;
+} a5xx_sequence_regs[] = {
+	{ 0xB9A1, 0x00010303 },
+	{ 0xB9A2, 0x13000000 },
+	{ 0xB9A3, 0x00460020 },
+	{ 0xB9A4, 0x10000000 },
+	{ 0xB9A5, 0x040A1707 },
+	{ 0xB9A6, 0x00010000 },
+	{ 0xB9A7, 0x0E000904 },
+	{ 0xB9A8, 0x10000000 },
+	{ 0xB9A9, 0x01165000 },
+	{ 0xB9AA, 0x000E0002 },
+	{ 0xB9AB, 0x03884141 },
+	{ 0xB9AC, 0x10000840 },
+	{ 0xB9AD, 0x572A5000 },
+	{ 0xB9AE, 0x00000003 },
+	{ 0xB9AF, 0x00000000 },
+	{ 0xB9B0, 0x10000000 },
+	{ 0xB828, 0x6C204010 },
+	{ 0xB829, 0x6C204011 },
+	{ 0xB82A, 0x6C204012 },
+	{ 0xB82B, 0x6C204013 },
+	{ 0xB82C, 0x6C204014 },
+	{ 0xB90F, 0x00000004 },
+	{ 0xB910, 0x00000002 },
+	{ 0xB911, 0x00000002 },
+	{ 0xB912, 0x00000002 },
+	{ 0xB913, 0x00000002 },
+	{ 0xB92F, 0x00000004 },
+	{ 0xB930, 0x00000005 },
+	{ 0xB931, 0x00000005 },
+	{ 0xB932, 0x00000005 },
+	{ 0xB933, 0x00000005 },
+	{ 0xB96F, 0x00000001 },
+	{ 0xB970, 0x00000003 },
+	{ 0xB94F, 0x00000004 },
+	{ 0xB950, 0x0000000B },
+	{ 0xB951, 0x0000000B },
+	{ 0xB952, 0x0000000B },
+	{ 0xB953, 0x0000000B },
+	{ 0xB907, 0x00000019 },
+	{ 0xB927, 0x00000019 },
+	{ 0xB947, 0x00000019 },
+	{ 0xB967, 0x00000019 },
+	{ 0xB987, 0x00000019 },
+	{ 0xB906, 0x00220001 },
+	{ 0xB926, 0x00220001 },
+	{ 0xB946, 0x00220001 },
+	{ 0xB966, 0x00220001 },
+	{ 0xB986, 0x00300000 },
+	{ 0xAC40, 0x0340FF41 },
+	{ 0xAC41, 0x03BEFED0 },
+	{ 0xAC42, 0x00331FED },
+	{ 0xAC43, 0x021FFDD3 },
+	{ 0xAC44, 0x5555AAAA },
+	{ 0xAC45, 0x5555AAAA },
+	{ 0xB9BA, 0x00000008 },
+};
+
+/*
+ * Get the actual voltage value for the operating point at the specified
+ * frequency
+ */
+static inline uint32_t _get_mvolts(struct msm_gpu *gpu, uint32_t freq)
+{
+	struct drm_device *dev = gpu->dev;
+	struct msm_drm_private *priv = dev->dev_private;
+	struct platform_device *pdev = priv->gpu_pdev;
+	struct dev_pm_opp *opp;
+
+	opp = dev_pm_opp_find_freq_exact(&pdev->dev, freq, true);
+
+	return (!IS_ERR(opp)) ? dev_pm_opp_get_voltage(opp) / 1000 : 0;
+}
+
+/* Setup thermal limit management */
+static void a5xx_lm_setup(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
+	unsigned int i;
+
+	/* Write the block of sequence registers */
+	for (i = 0; i < ARRAY_SIZE(a5xx_sequence_regs); i++)
+		gpu_write(gpu, a5xx_sequence_regs[i].reg,
+			a5xx_sequence_regs[i].value);
+
+	/* Hard code the A530 GPU thermal sensor ID for the GPMU */
+	gpu_write(gpu, REG_A5XX_GPMU_TEMP_SENSOR_ID, 0x60007);
+	gpu_write(gpu, REG_A5XX_GPMU_DELTA_TEMP_THRESHOLD, 0x01);
+	gpu_write(gpu, REG_A5XX_GPMU_TEMP_SENSOR_CONFIG, 0x01);
+
+	/* Until we get clock scaling 0 is always the active power level */
+	gpu_write(gpu, REG_A5XX_GPMU_GPMU_VOLTAGE, 0x80000000 | 0);
+
+	gpu_write(gpu, REG_A5XX_GPMU_BASE_LEAKAGE, a5xx_gpu->lm_leakage);
+
+	/* The threshold is fixed at 6000 for A530 */
+	gpu_write(gpu, REG_A5XX_GPMU_GPMU_PWR_THRESHOLD, 0x80000000 | 6000);
+
+	gpu_write(gpu, REG_A5XX_GPMU_BEC_ENABLE, 0x10001FFF);
+	gpu_write(gpu, REG_A5XX_GDPM_CONFIG1, 0x00201FF1);
+
+	/* Write the voltage table */
+	gpu_write(gpu, REG_A5XX_GPMU_BEC_ENABLE, 0x10001FFF);
+	gpu_write(gpu, REG_A5XX_GDPM_CONFIG1, 0x201FF1);
+
+	gpu_write(gpu, AGC_MSG_STATE, 1);
+	gpu_write(gpu, AGC_MSG_COMMAND, AGC_POWER_CONFIG_PRODUCTION_ID);
+
+	/* Write the max power - hard coded to 5448 for A530 */
+	gpu_write(gpu, AGC_MSG_PAYLOAD(0), 5448);
+	gpu_write(gpu, AGC_MSG_PAYLOAD(1), 1);
+
+	/*
+	 * For now just write the one voltage level - we will do more when we
+	 * can do scaling
+	 */
+	gpu_write(gpu, AGC_MSG_PAYLOAD(2), _get_mvolts(gpu, gpu->fast_rate));
+	gpu_write(gpu, AGC_MSG_PAYLOAD(3), gpu->fast_rate / 1000000);
+
+	gpu_write(gpu, AGC_MSG_PAYLOAD_SIZE, 4 * sizeof(uint32_t));
+	gpu_write(gpu, AGC_INIT_MSG_MAGIC, AGC_INIT_MSG_VALUE);
+}
+
+/* Enable SP/TP cpower collapse */
+static void a5xx_pc_init(struct msm_gpu *gpu)
+{
+	gpu_write(gpu, REG_A5XX_GPMU_PWR_COL_INTER_FRAME_CTRL, 0x7F);
+	gpu_write(gpu, REG_A5XX_GPMU_PWR_COL_BINNING_CTRL, 0);
+	gpu_write(gpu, REG_A5XX_GPMU_PWR_COL_INTER_FRAME_HYST, 0xA0080);
+	gpu_write(gpu, REG_A5XX_GPMU_PWR_COL_STAGGER_DELAY, 0x600040);
+}
+
+/* Enable the GPMU microcontroller */
+static int a5xx_gpmu_init(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
+	struct msm_ringbuffer *ring = gpu->rb;
+
+	if (!a5xx_gpu->gpmu_dwords)
+		return 0;
+
+	/* Turn off protected mode for this operation */
+	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+	OUT_RING(ring, 0);
+
+	/* Kick off the IB to load the GPMU microcode */
+	OUT_PKT7(ring, CP_INDIRECT_BUFFER_PFE, 3);
+	OUT_RING(ring, lower_32_bits(a5xx_gpu->gpmu_iova));
+	OUT_RING(ring, upper_32_bits(a5xx_gpu->gpmu_iova));
+	OUT_RING(ring, a5xx_gpu->gpmu_dwords);
+
+	/* Turn back on protected mode */
+	OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+	OUT_RING(ring, 1);
+
+	gpu->funcs->flush(gpu);
+
+	if (!gpu->funcs->idle(gpu)) {
+		DRM_ERROR("%s: Unable to load GPMU firmware. GPMU will not be active\n",
+			gpu->name);
+		return -EINVAL;
+	}
+
+	gpu_write(gpu, REG_A5XX_GPMU_WFI_CONFIG, 0x4014);
+
+	/* Kick off the GPMU */
+	gpu_write(gpu, REG_A5XX_GPMU_CM3_SYSRESET, 0x0);
+
+	/*
+	 * Wait for the GPMU to respond. It isn't fatal if it doesn't, we just
+	 * won't have advanced power collapse.
+	 */
+	if (spin_usecs(gpu, 25, REG_A5XX_GPMU_GENERAL_0, 0xFFFFFFFF,
+		0xBABEFACE))
+		DRM_ERROR("%s: GPMU firmware initialization timed out\n",
+			gpu->name);
+
+	return 0;
+}
+
+/* Enable limits management */
+static void a5xx_lm_enable(struct msm_gpu *gpu)
+{
+	gpu_write(gpu, REG_A5XX_GDPM_INT_MASK, 0x0);
+	gpu_write(gpu, REG_A5XX_GDPM_INT_EN, 0x0A);
+	gpu_write(gpu, REG_A5XX_GPMU_GPMU_VOLTAGE_INTR_EN_MASK, 0x01);
+	gpu_write(gpu, REG_A5XX_GPMU_TEMP_THRESHOLD_INTR_EN_MASK, 0x50000);
+	gpu_write(gpu, REG_A5XX_GPMU_THROTTLE_UNMASK_FORCE_CTRL, 0x30000);
+
+	gpu_write(gpu, REG_A5XX_GPMU_CLOCK_THROTTLE_CTRL, 0x011);
+}
+
+int a5xx_power_init(struct msm_gpu *gpu)
+{
+	int ret;
+
+	/* Set up the limits management */
+	a5xx_lm_setup(gpu);
+
+	/* Set up SP/TP power collpase */
+	a5xx_pc_init(gpu);
+
+	/* Start the GPMU */
+	ret = a5xx_gpmu_init(gpu);
+	if (ret)
+		return ret;
+
+	/* Start the limits management */
+	a5xx_lm_enable(gpu);
+
+	return 0;
+}
+
+void a5xx_gpmu_ucode_init(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
+	struct drm_device *drm = gpu->dev;
+	const struct firmware *fw;
+	uint32_t dwords = 0, offset = 0, bosize;
+	unsigned int *data, *ptr, *cmds;
+	unsigned int cmds_size;
+
+	if (a5xx_gpu->gpmu_bo)
+		return;
+
+	/* Get the firmware */
+	if (request_firmware(&fw, adreno_gpu->info->gpmufw, drm->dev)) {
+		DRM_ERROR("%s: Could not get GPMU firmware. GPMU will not be active\n",
+			gpu->name);
+		return;
+	}
+
+	data = (unsigned int *) fw->data;
+
+	/*
+	 * The first dword is the size of the remaining data in dwords. Use it
+	 * as a checksum of sorts and make sure it matches the actual size of
+	 * the firmware that we read
+	 */
+
+	if (fw->size < 8 || (data[0] < 2) || (data[0] >= (fw->size >> 2)))
+		goto out;
+
+	/* The second dword is an ID - look for 2 (GPMU_FIRMWARE_ID) */
+	if (data[1] != 2)
+		goto out;
+
+	cmds = data + data[2] + 3;
+	cmds_size = data[0] - data[2] - 2;
+
+	/*
+	 * A single type4 opcode can only have so many values attached so
+	 * add enough opcodes to load the all the commands
+	 */
+	bosize = (cmds_size + (cmds_size / TYPE4_MAX_PAYLOAD) + 1) << 2;
+
+	mutex_lock(&drm->struct_mutex);
+	a5xx_gpu->gpmu_bo = msm_gem_new(drm, bosize, MSM_BO_UNCACHED);
+	mutex_unlock(&drm->struct_mutex);
+
+	if (IS_ERR(a5xx_gpu->gpmu_bo))
+		goto err;
+
+	if (msm_gem_get_iova(a5xx_gpu->gpmu_bo, gpu->id, &a5xx_gpu->gpmu_iova))
+		goto err;
+
+	ptr = msm_gem_get_vaddr(a5xx_gpu->gpmu_bo);
+	if (!ptr)
+		goto err;
+
+	while (cmds_size > 0) {
+		int i;
+		uint32_t _size = cmds_size > TYPE4_MAX_PAYLOAD ?
+			TYPE4_MAX_PAYLOAD : cmds_size;
+
+		ptr[dwords++] = PKT4(REG_A5XX_GPMU_INST_RAM_BASE + offset,
+			_size);
+
+		for (i = 0; i < _size; i++)
+			ptr[dwords++] = *cmds++;
+
+		offset += _size;
+		cmds_size -= _size;
+	}
+
+	msm_gem_put_vaddr(a5xx_gpu->gpmu_bo);
+	a5xx_gpu->gpmu_dwords = dwords;
+
+	goto out;
+
+err:
+	if (a5xx_gpu->gpmu_iova)
+		msm_gem_put_iova(a5xx_gpu->gpmu_bo, gpu->id);
+	if (a5xx_gpu->gpmu_bo)
+		drm_gem_object_unreference_unlocked(a5xx_gpu->gpmu_bo);
+
+	a5xx_gpu->gpmu_bo = NULL;
+	a5xx_gpu->gpmu_iova = 0;
+	a5xx_gpu->gpmu_dwords = 0;
+
+out:
+	/* No need to keep that firmware laying around anymore */
+	release_firmware(fw);
+}
diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 7749b60..0d9ceb2 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -86,6 +86,7 @@ static const struct adreno_info gpulist[] = {
 		.pfpfw = "a530_pfp.fw",
 		.gmem = SZ_1M,
 		.init = a5xx_gpu_init,
+		.gpmufw = "a530v3_gpmu.fw2",
 	},
 };
 
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index d67e78f..cbcd55b 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -73,6 +73,7 @@ struct adreno_info {
 	uint32_t revn;
 	const char *name;
 	const char *pm4fw, *pfpfw;
+	const char *gpmufw;
 	uint32_t gmem;
 	struct msm_gpu *(*init)(struct drm_device *dev);
 };
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 10/12] firmware: qcom_scm: Add qcom_scm_gpu_zap_resume()
  2016-11-28 19:28 [PATCH 00/12] Adreno A5XX support Jordan Crouse
                   ` (6 preceding siblings ...)
  2016-11-28 19:28 ` [PATCH 09/12] drm/msm: gpu: Add support for the GPMU Jordan Crouse
@ 2016-11-28 19:28 ` Jordan Crouse
  2017-01-13 17:12   ` Andy Gross
       [not found]   ` <1480361317-9937-11-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  2016-11-28 19:28 ` [PATCH 11/12] drm/msm: Add a quick and dirty PIL loader Jordan Crouse
  2016-11-28 19:28 ` [PATCH 12/12] drm/msm: gpu: Use the zap shader on 5XX if we can Jordan Crouse
  9 siblings, 2 replies; 34+ messages in thread
From: Jordan Crouse @ 2016-11-28 19:28 UTC (permalink / raw)
  To: freedreno; +Cc: linux-arm-msm, dri-devel

Add an interface to trigger the remote processor to reinitialize the GPU
zap shader on power-up.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/firmware/qcom_scm-32.c |  5 +++++
 drivers/firmware/qcom_scm-64.c | 15 +++++++++++++++
 drivers/firmware/qcom_scm.c    |  6 ++++++
 drivers/firmware/qcom_scm.h    |  2 ++
 include/linux/qcom_scm.h       |  2 ++
 5 files changed, 30 insertions(+)

diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
index c6aeedb..1a0876c 100644
--- a/drivers/firmware/qcom_scm-32.c
+++ b/drivers/firmware/qcom_scm-32.c
@@ -560,3 +560,8 @@ int __qcom_scm_pas_mss_reset(struct device *dev, bool reset)
 
 	return ret ? : le32_to_cpu(out);
 }
+
+int __qcom_scm_gpu_zap_resume(struct device *dev)
+{
+	return -ENOTSUPP;
+}
diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index cdf422f2..82aba97 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -363,3 +363,18 @@ int __qcom_scm_pas_mss_reset(struct device *dev, bool reset)
 
 	return ret ? : res.a1;
 }
+
+int __qcom_scm_gpu_zap_resume(struct device *dev)
+{
+	struct qcom_scm_desc desc = {0};
+	struct arm_smccc_res res;
+	int ret;
+
+	desc.args[0] = 0;
+	desc.args[1] = 13;
+	desc.arginfo = QCOM_SCM_ARGS(2);
+
+	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, 0x0A, &desc, &res);
+
+	return ret ? : res.a1;
+}
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index 143edbc..e26c7f4 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -312,6 +312,12 @@ static const struct reset_control_ops qcom_scm_pas_reset_ops = {
 	.deassert = qcom_scm_pas_reset_deassert,
 };
 
+int qcom_scm_gpu_zap_resume(void)
+{
+	return __qcom_scm_gpu_zap_resume(__scm->dev);
+}
+EXPORT_SYMBOL(qcom_scm_gpu_zap_resume);
+
 /**
  * qcom_scm_is_available() - Checks if SCM is available
  */
diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
index 3584b00..581f934 100644
--- a/drivers/firmware/qcom_scm.h
+++ b/drivers/firmware/qcom_scm.h
@@ -56,6 +56,8 @@ extern int  __qcom_scm_pas_auth_and_reset(struct device *dev, u32 peripheral);
 extern int  __qcom_scm_pas_shutdown(struct device *dev, u32 peripheral);
 extern int  __qcom_scm_pas_mss_reset(struct device *dev, bool reset);
 
+extern int __qcom_scm_gpu_zap_resume(struct device *dev);
+
 /* common error codes */
 #define QCOM_SCM_V2_EBUSY	-12
 #define QCOM_SCM_ENOMEM		-5
diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
index cc32ab8..e1729fb 100644
--- a/include/linux/qcom_scm.h
+++ b/include/linux/qcom_scm.h
@@ -37,6 +37,8 @@ extern int qcom_scm_pas_mem_setup(u32 peripheral, phys_addr_t addr,
 extern int qcom_scm_pas_auth_and_reset(u32 peripheral);
 extern int qcom_scm_pas_shutdown(u32 peripheral);
 
+extern int qcom_scm_gpu_zap_resume(void);
+
 #define QCOM_SCM_CPU_PWR_DOWN_L2_ON	0x0
 #define QCOM_SCM_CPU_PWR_DOWN_L2_OFF	0x1
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 11/12] drm/msm: Add a quick and dirty PIL loader
  2016-11-28 19:28 [PATCH 00/12] Adreno A5XX support Jordan Crouse
                   ` (7 preceding siblings ...)
  2016-11-28 19:28 ` [PATCH 10/12] firmware: qcom_scm: Add qcom_scm_gpu_zap_resume() Jordan Crouse
@ 2016-11-28 19:28 ` Jordan Crouse
  2016-12-05 19:57   ` Bjorn Andersson
  2016-11-28 19:28 ` [PATCH 12/12] drm/msm: gpu: Use the zap shader on 5XX if we can Jordan Crouse
  9 siblings, 1 reply; 34+ messages in thread
From: Jordan Crouse @ 2016-11-28 19:28 UTC (permalink / raw)
  To: freedreno; +Cc: linux-arm-msm, dri-devel

In order to switch the GPU out of secure mode on most systems we
need to load a zap shader into memory and get it authenticated
and into the secure world.  All the bits and pieces to do
the load are scattered throughout the kernel, but we need to
bring everything together.

Add a semi-custom loader that will read a MDT file and get
it loaded and authenticated through SCM.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 166 ++++++++++++++++++++++++++++++++++
 1 file changed, 166 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 3824bc4..eefe197 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -11,12 +11,178 @@
  *
  */
 
+#include <linux/elf.h>
+#include <linux/types.h>
+#include <linux/cpumask.h>
+#include <linux/qcom_scm.h>
+#include <linux/dma-mapping.h>
+#include <linux/of_reserved_mem.h>
 #include "msm_gem.h"
 #include "a5xx_gpu.h"
 
 extern bool hang_debug;
 static void a5xx_dump(struct msm_gpu *gpu);
 
+static inline bool _check_segment(const struct elf32_phdr *phdr)
+{
+	return ((phdr->p_type == PT_LOAD) &&
+		((phdr->p_flags & (7 << 24)) != (2 << 24)) &&
+		phdr->p_memsz);
+}
+
+static int __pil_tz_load_image(struct platform_device *pdev,
+		const struct firmware *mdt, const char *fwname,
+		void *fwptr, size_t fw_size, unsigned long fw_min_addr)
+{
+	char str[64] = { 0 };
+	const struct elf32_hdr *ehdr = (struct elf32_hdr *) mdt->data;
+	const struct elf32_phdr *phdrs = (struct elf32_phdr *) (ehdr + 1);
+	const struct firmware *fw;
+	int i, ret = 0;
+
+	for (i = 0; i < ehdr->e_phnum; i++) {
+		const struct elf32_phdr *phdr = &phdrs[i];
+		size_t offset;
+
+		/* Make sure the segment is loadable */
+		if (!_check_segment(phdr))
+			continue;
+
+		/* Get the offset of the segment within the region */
+		offset = (phdr->p_paddr - fw_min_addr);
+
+		/* Request the file containing the segment */
+		snprintf(str, sizeof(str) - 1, "%s.b%02d", fwname, i);
+
+		ret = request_firmware(&fw, str, &pdev->dev);
+		if (ret) {
+			dev_err(&pdev->dev, "Failed to load segment %s\n", str);
+			break;
+		}
+
+		if (offset + fw->size > fw_size) {
+			dev_err(&pdev->dev, "Segment %s is too big\n", str);
+			ret = -EINVAL;
+			release_firmware(fw);
+			break;
+		}
+
+		/* Copy the segment into place */
+		memcpy(fwptr + offset, fw->data, fw->size);
+		release_firmware(fw);
+	}
+
+	return ret;
+}
+
+static int _pil_tz_load_image(struct platform_device *pdev)
+{
+	char str[64] = { 0 };
+	const char *fwname;
+	const struct elf32_hdr *ehdr;
+	const struct elf32_phdr *phdrs;
+	const struct firmware *mdt;
+	phys_addr_t fw_min_addr, fw_max_addr;
+	dma_addr_t fw_phys;
+	size_t fw_size;
+	u32 pas_id;
+	void *ptr;
+	int i, ret;
+
+	if (pdev == NULL)
+		return -ENODEV;
+
+	if (!qcom_scm_is_available()) {
+		dev_err(&pdev->dev, "SCM is not available\n");
+		return -EINVAL;
+	}
+
+	ret = of_reserved_mem_device_init(&pdev->dev);
+
+	if (ret) {
+		dev_err(&pdev->dev, "Unable to set up the reserved memory\n");
+		return ret;
+	}
+
+	/* Get the firmware and PAS id from the device node */
+	if (of_property_read_string(pdev->dev.of_node, "qcom,firmware",
+		&fwname)) {
+		dev_err(&pdev->dev, "Could not read a firmware name\n");
+		return -EINVAL;
+	}
+
+	if (of_property_read_u32(pdev->dev.of_node, "qcom,pas-id", &pas_id)) {
+		dev_err(&pdev->dev, "Could not read the pas ID\n");
+		return -EINVAL;
+	}
+
+	snprintf(str, sizeof(str) - 1, "%s.mdt", fwname);
+
+	/* Request the MDT file for the firmware */
+	ret = request_firmware(&mdt, str, &pdev->dev);
+	if (ret) {
+		dev_err(&pdev->dev, "Unable to load %s\n", str);
+		return ret;
+	}
+
+	ehdr = (struct elf32_hdr *) mdt->data;
+	phdrs = (struct elf32_phdr *) (ehdr + 1);
+
+	/* Get the extents of the firmware image */
+
+	fw_min_addr = (phys_addr_t) ULLONG_MAX;
+	fw_max_addr = 0;
+
+	for (i = 0; i < ehdr->e_phnum; i++) {
+		const struct elf32_phdr *phdr = &phdrs[i];
+
+		if (!_check_segment(phdr))
+			continue;
+
+		fw_min_addr = min_t(phys_addr_t, fw_min_addr, phdr->p_paddr);
+		fw_max_addr = max_t(phys_addr_t, fw_max_addr,
+			PAGE_ALIGN(phdr->p_paddr + phdr->p_memsz));
+	}
+
+	if (fw_min_addr == (phys_addr_t) ULLONG_MAX && fw_max_addr == 0)
+		goto out;
+
+	fw_size = (size_t) (fw_max_addr - fw_min_addr);
+
+	/* Verify the MDT header */
+	ret = qcom_scm_pas_init_image(pas_id, mdt->data, mdt->size);
+	if (ret) {
+		dev_err(&pdev->dev, "Invalid firmware metadata\n");
+		goto out;
+	}
+
+	/* allocate some memory */
+	ptr = dma_alloc_coherent(&pdev->dev, fw_size, &fw_phys, GFP_KERNEL);
+	if (ptr == NULL)
+		goto out;
+
+	/* Set up the newly allocated memory region */
+	ret = qcom_scm_pas_mem_setup(pas_id, fw_phys, fw_size);
+	if (ret) {
+		dev_err(&pdev->dev, "Unable to set up firmware memory\n");
+		goto out;
+	}
+
+	ret = __pil_tz_load_image(pdev, mdt, fwname, ptr, fw_size, fw_min_addr);
+	if (!ret) {
+		ret = qcom_scm_pas_auth_and_reset(pas_id);
+		if (ret)
+			dev_err(&pdev->dev, "Unable to authorize the image\n");
+	}
+
+out:
+	if (ret && ptr)
+		dma_free_coherent(&pdev->dev, fw_size, ptr, fw_phys);
+
+	release_firmware(mdt);
+	return ret;
+}
+
 static void a5xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit,
 	struct msm_file_private *ctx)
 {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 12/12] drm/msm: gpu: Use the zap shader on 5XX if we can
  2016-11-28 19:28 [PATCH 00/12] Adreno A5XX support Jordan Crouse
                   ` (8 preceding siblings ...)
  2016-11-28 19:28 ` [PATCH 11/12] drm/msm: Add a quick and dirty PIL loader Jordan Crouse
@ 2016-11-28 19:28 ` Jordan Crouse
  2016-12-05 19:57   ` Bjorn Andersson
  9 siblings, 1 reply; 34+ messages in thread
From: Jordan Crouse @ 2016-11-28 19:28 UTC (permalink / raw)
  To: freedreno; +Cc: linux-arm-msm, dri-devel

The A5XX GPU powers on in "secure" mode. In secure mode the GPU can
only render to buffers that are marked as secure and inaccessible
to the kernel and user through a series of hardware protections. In
practice secure mode is used to draw things like a UI on a secure
video frame.

In order to switch out of secure mode the GPU executes a special
shader that clears out the GMEM and other sensitve registers and
then writes a register. Because the kernel can't be trusted the
shader binary is signed and verified and programmed by the
secure world. To do this we need to read the MDT header and the
segments from the firmware location and put them in memory and
present them for approval.

For targets without secure support there is an out: if the
secure world doesn't support secure then there are no hardware
protections and we can freely write the SECVID_TRUST register from
the CPU. We don't have 100% confidence that we can query the
secure capabilities at run time but we have enough calls that
need to go right to give us some confidence that we're at least doing
something useful.

Of course if we guess wrong you trigger a permissions violation
which usually ends up in a system crash but thats a problem
that shows up immediately.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 72 ++++++++++++++++++++++++++++++++++-
 1 file changed, 70 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index eefe197..a7a58ec 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -469,6 +469,55 @@ static int a5xx_ucode_init(struct msm_gpu *gpu)
 	return 0;
 }
 
+static int a5xx_zap_shader_resume(struct msm_gpu *gpu)
+{
+	int ret;
+
+	ret = qcom_scm_gpu_zap_resume();
+	if (ret)
+		DRM_ERROR("%s: zap-shader resume failed: %d\n",
+			gpu->name, ret);
+
+	return ret;
+}
+
+static int a5xx_zap_shader_init(struct msm_gpu *gpu)
+{
+	static bool loaded;
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
+	struct platform_device *pdev = a5xx_gpu->pdev;
+	struct device_node *node;
+	int ret;
+
+	/*
+	 * If the zap shader is already loaded into memory we just need to kick
+	 * the remote processor to reinitialize it
+	 */
+	if (loaded)
+		return a5xx_zap_shader_resume(gpu);
+
+	/* Populate the sub-nodes if they haven't already been done */
+	of_platform_populate(pdev->dev.of_node, NULL, NULL, &pdev->dev);
+
+	/* Find the sub-node for the zap shader */
+	node = of_find_node_by_name(pdev->dev.of_node, "qcom,zap-shader");
+	if (!node) {
+		DRM_ERROR("%s: qcom,zap-shader not found in device tree\n",
+			gpu->name);
+		return -ENODEV;
+	}
+
+	ret = _pil_tz_load_image(of_find_device_by_node(node));
+	if (ret)
+		DRM_ERROR("%s: Unable to load the zap shader\n",
+			gpu->name);
+
+	loaded = !ret;
+
+	return ret;
+}
+
 #define A5XX_INT_MASK (A5XX_RBBM_INT_0_MASK_RBBM_AHB_ERROR | \
 	  A5XX_RBBM_INT_0_MASK_RBBM_TRANSFER_TIMEOUT | \
 	  A5XX_RBBM_INT_0_MASK_RBBM_ME_MS_TIMEOUT | \
@@ -653,8 +702,27 @@ static int a5xx_hw_init(struct msm_gpu *gpu)
 			return -EINVAL;
 	}
 
-	/* Put the GPU into unsecure mode */
-	gpu_write(gpu, REG_A5XX_RBBM_SECVID_TRUST_CNTL, 0x0);
+	/*
+	 * Try to load a zap shader into the secure world. If successful
+	 * we can use the CP to switch out of secure mode. If not then we
+	 * have no resource but to try to switch ourselves out manually. If we
+	 * guessed wrong then access to the RBBM_SECVID_TRUST_CNTL register will
+	 * be blocked and a permissions violation will soon follow.
+	 */
+	ret = a5xx_zap_shader_init(gpu);
+	if (!ret) {
+		OUT_PKT7(gpu->rb, CP_SET_SECURE_MODE, 1);
+		OUT_RING(gpu->rb, 0x00000000);
+
+		gpu->funcs->flush(gpu);
+		if (!gpu->funcs->idle(gpu))
+			return -EINVAL;
+	} else {
+		/* Print a warning so if we die, we know why */
+		dev_warn_once(gpu->dev->dev,
+			"Zap shader not enabled - using SECVID_TRUST_CNTL instead\n");
+		gpu_write(gpu, REG_A5XX_RBBM_SECVID_TRUST_CNTL, 0x0);
+	}
 
 	return 0;
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 12/12] drm/msm: gpu: Use the zap shader on 5XX if we can
  2016-11-28 19:28 ` [PATCH 12/12] drm/msm: gpu: Use the zap shader on 5XX if we can Jordan Crouse
@ 2016-12-05 19:57   ` Bjorn Andersson
  2016-12-05 20:10     ` Bjorn Andersson
  2016-12-06 15:35     ` Jordan Crouse
  0 siblings, 2 replies; 34+ messages in thread
From: Bjorn Andersson @ 2016-12-05 19:57 UTC (permalink / raw)
  To: Jordan Crouse; +Cc: freedreno, linux-arm-msm, dri-devel

On Mon 28 Nov 11:28 PST 2016, Jordan Crouse wrote:

> The A5XX GPU powers on in "secure" mode. In secure mode the GPU can
> only render to buffers that are marked as secure and inaccessible
> to the kernel and user through a series of hardware protections. In
> practice secure mode is used to draw things like a UI on a secure
> video frame.
> 
> In order to switch out of secure mode the GPU executes a special
> shader that clears out the GMEM and other sensitve registers and
> then writes a register. Because the kernel can't be trusted the
> shader binary is signed and verified and programmed by the
> secure world. To do this we need to read the MDT header and the
> segments from the firmware location and put them in memory and
> present them for approval.
> 
> For targets without secure support there is an out: if the
> secure world doesn't support secure then there are no hardware
> protections and we can freely write the SECVID_TRUST register from
> the CPU. We don't have 100% confidence that we can query the
> secure capabilities at run time but we have enough calls that
> need to go right to give us some confidence that we're at least doing
> something useful.
> 
> Of course if we guess wrong you trigger a permissions violation
> which usually ends up in a system crash but thats a problem
> that shows up immediately.
> 
> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 72 ++++++++++++++++++++++++++++++++++-
>  1 file changed, 70 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index eefe197..a7a58ec 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -469,6 +469,55 @@ static int a5xx_ucode_init(struct msm_gpu *gpu)
>  	return 0;
>  }
>  
> +static int a5xx_zap_shader_resume(struct msm_gpu *gpu)
> +{
> +	int ret;
> +
> +	ret = qcom_scm_gpu_zap_resume();
> +	if (ret)
> +		DRM_ERROR("%s: zap-shader resume failed: %d\n",
> +			gpu->name, ret);
> +
> +	return ret;
> +}
> +
> +static int a5xx_zap_shader_init(struct msm_gpu *gpu)
> +{
> +	static bool loaded;
> +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> +	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
> +	struct platform_device *pdev = a5xx_gpu->pdev;
> +	struct device_node *node;
> +	int ret;
> +
> +	/*
> +	 * If the zap shader is already loaded into memory we just need to kick
> +	 * the remote processor to reinitialize it
> +	 */
> +	if (loaded)

Why is this handling needed? Why can init be called multiple times?

> +		return a5xx_zap_shader_resume(gpu);
> +
> +	/* Populate the sub-nodes if they haven't already been done */
> +	of_platform_populate(pdev->dev.of_node, NULL, NULL, &pdev->dev);

I haven't been able to find the qcom,zap-shader platform driver, but I
presume you have something like:

adreno {
	qcom,zap-shader {
		compatible = "qcom,zap-shader";

		firmware = "zapfw";
		memory-region = <&zap_region>;
	};
};

I presume this is done to not "taint" the adreno device's with the zap
memory region, but I don't think you should (ab)use a platform driver
for this.

You should rather add a struct device zap_dev to your adreno context, do
minimal initialization (name and a parent I think is enough), call
device_register(&zap_dev);, of_reserved_mem_device_init() and then use
that for your dma allocation.

This saves you from creating a platform_driver, instantiating a
platform_device and the worry of the race between the creation of that
device and the of_find_device_by_node() below.

> +
> +	/* Find the sub-node for the zap shader */
> +	node = of_find_node_by_name(pdev->dev.of_node, "qcom,zap-shader");

If you're looking for immediate children use of_get_child_by_name()

And no "qcom," in node names please.

> +	if (!node) {
> +		DRM_ERROR("%s: qcom,zap-shader not found in device tree\n",
> +			gpu->name);
> +		return -ENODEV;
> +	}
> +
> +	ret = _pil_tz_load_image(of_find_device_by_node(node));
> +	if (ret)
> +		DRM_ERROR("%s: Unable to load the zap shader\n",
> +			gpu->name);
> +
> +	loaded = !ret;
> +
> +	return ret;
> +}

Regards,
Bjorn

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 11/12] drm/msm: Add a quick and dirty PIL loader
  2016-11-28 19:28 ` [PATCH 11/12] drm/msm: Add a quick and dirty PIL loader Jordan Crouse
@ 2016-12-05 19:57   ` Bjorn Andersson
  2016-12-06 17:49     ` Jordan Crouse
  0 siblings, 1 reply; 34+ messages in thread
From: Bjorn Andersson @ 2016-12-05 19:57 UTC (permalink / raw)
  To: Jordan Crouse; +Cc: freedreno, linux-arm-msm, ML dri-devel

On Mon 28 Nov 11:28 PST 2016, Jordan Crouse wrote:

> In order to switch the GPU out of secure mode on most systems we
> need to load a zap shader into memory and get it authenticated
> and into the secure world.  All the bits and pieces to do
> the load are scattered throughout the kernel, but we need to
> bring everything together.
>
> Add a semi-custom loader that will read a MDT file and get
> it loaded and authenticated through SCM.
>

I've been trying to figure out a reasonable way to provide a
scm/pas/mdt-loader, but haven't come up with anything sane yet. Perhaps
it's better to provide helpers for the scm case and open code the MDT
loading in the non-scm driver. I think this is an okay approach for now.

But it's not a "PIL loader", that's just a side effect of reusing parts
of the existing PIL load in the Qualcomm tree. I would suggest just
making the subject "Add ZAP MDT loader".

Some minor comments below.

> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---
>  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 166 ++++++++++++++++++++++++++++++++++
>  1 file changed, 166 insertions(+)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> index 3824bc4..eefe197 100644
> --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> @@ -11,12 +11,178 @@
>   *
>   */
>  
> +#include <linux/elf.h>
> +#include <linux/types.h>
> +#include <linux/cpumask.h>
> +#include <linux/qcom_scm.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/of_reserved_mem.h>
>  #include "msm_gem.h"
>  #include "a5xx_gpu.h"
>  
>  extern bool hang_debug;
>  static void a5xx_dump(struct msm_gpu *gpu);
>  
> +static inline bool _check_segment(const struct elf32_phdr *phdr)
> +{
> + return ((phdr->p_type == PT_LOAD) &&
> + ((phdr->p_flags & (7 << 24)) != (2 << 24)) &&
> + phdr->p_memsz);
> +}
> +
> +static int __pil_tz_load_image(struct platform_device *pdev,

Rather than throwing more underscores at it I would have named this
"zap_load_segments".

> + const struct firmware *mdt, const char *fwname,
> + void *fwptr, size_t fw_size, unsigned long fw_min_addr)
> +{
> + char str[64] = { 0 };

Name this filename or similar and no need to initialize it.

> + const struct elf32_hdr *ehdr = (struct elf32_hdr *) mdt->data;
> + const struct elf32_phdr *phdrs = (struct elf32_phdr *) (ehdr + 1);
> + const struct firmware *fw;
> + int i, ret = 0;
> +
> + for (i = 0; i < ehdr->e_phnum; i++) {
> + const struct elf32_phdr *phdr = &phdrs[i];
> + size_t offset;
> +
> + /* Make sure the segment is loadable */
> + if (!_check_segment(phdr))
> + continue; > +
> + /* Get the offset of the segment within the region */
> + offset = (phdr->p_paddr - fw_min_addr);
> +
> + /* Request the file containing the segment */
> + snprintf(str, sizeof(str) - 1, "%s.b%02d", fwname, i);

snprintf(, size, ) writes at most size bytes and includes null
termination, so drop the "- 1".

> +
> + ret = request_firmware(&fw, str, &pdev->dev);
> + if (ret) {
> + dev_err(&pdev->dev, "Failed to load segment %s\n", str);
> + break;
> + }
> +
> + if (offset + fw->size > fw_size) {
> + dev_err(&pdev->dev, "Segment %s is too big\n", str);
> + ret = -EINVAL;
> + release_firmware(fw);
> + break;
> + }
> +
> + /* Copy the segment into place */
> + memcpy(fwptr + offset, fw->data, fw->size);

Just to be on the safe side, please add:

if (phdr->p_memsz > phdr->p_filesz)
memset(fwptr + fw->size, 0, phdr->p_memsz - phdr->p_filesz);

> + release_firmware(fw);
> + }
> +
> + return ret;
> +}
> +
> +static int _pil_tz_load_image(struct platform_device *pdev)

zap_load_mdt() ?

> +{
> + char str[64] = { 0 };
> + const char *fwname;
> + const struct elf32_hdr *ehdr;
> + const struct elf32_phdr *phdrs;
> + const struct firmware *mdt;
> + phys_addr_t fw_min_addr, fw_max_addr;
> + dma_addr_t fw_phys;
> + size_t fw_size;
> + u32 pas_id;
> + void *ptr;
> + int i, ret;
> +
> + if (pdev == NULL)
> + return -ENODEV;

This should not happen.

> +
> + if (!qcom_scm_is_available()) {
> + dev_err(&pdev->dev, "SCM is not available\n");
> + return -EINVAL;

We generally return -EPROBE_DEFER here.

> + }
> +
> + ret = of_reserved_mem_device_init(&pdev->dev);
> +

Please drop the extra newline.

> + if (ret) {
> + dev_err(&pdev->dev, "Unable to set up the reserved memory\n");
> + return ret;
> + }
> +
> + /* Get the firmware and PAS id from the device node */
> + if (of_property_read_string(pdev->dev.of_node, "qcom,firmware",
> + &fwname)) {
> + dev_err(&pdev->dev, "Could not read a firmware name\n");
> + return -EINVAL;
> + }
> +
> + if (of_property_read_u32(pdev->dev.of_node, "qcom,pas-id", &pas_id)) {

This is constant, so define it in the driver.

> + dev_err(&pdev->dev, "Could not read the pas ID\n");
> + return -EINVAL;
> + }
> +
> + snprintf(str, sizeof(str) - 1, "%s.mdt", fwname);

As above, re "- 1", name of str and initialization.

> +
> + /* Request the MDT file for the firmware */
> + ret = request_firmware(&mdt, str, &pdev->dev);
> + if (ret) {
> + dev_err(&pdev->dev, "Unable to load %s\n", str);
> + return ret;
> + }
> +
> + ehdr = (struct elf32_hdr *) mdt->data;
> + phdrs = (struct elf32_phdr *) (ehdr + 1);
> +
> + /* Get the extents of the firmware image */
> +
> + fw_min_addr = (phys_addr_t) ULLONG_MAX;
> + fw_max_addr = 0;
> +
> + for (i = 0; i < ehdr->e_phnum; i++) {
> + const struct elf32_phdr *phdr = &phdrs[i];
> +
> + if (!_check_segment(phdr))
> + continue;
> +
> + fw_min_addr = min_t(phys_addr_t, fw_min_addr, phdr->p_paddr);
> + fw_max_addr = max_t(phys_addr_t, fw_max_addr,
> + PAGE_ALIGN(phdr->p_paddr + phdr->p_memsz));
> + }
> +
> + if (fw_min_addr == (phys_addr_t) ULLONG_MAX && fw_max_addr == 0)
> + goto out;
> +
> + fw_size = (size_t) (fw_max_addr - fw_min_addr);
> +
> + /* Verify the MDT header */
> + ret = qcom_scm_pas_init_image(pas_id, mdt->data, mdt->size);
> + if (ret) {
> + dev_err(&pdev->dev, "Invalid firmware metadata\n");
> + goto out;
> + }
> +
> + /* allocate some memory */
> + ptr = dma_alloc_coherent(&pdev->dev, fw_size, &fw_phys, GFP_KERNEL);
> + if (ptr == NULL)
> + goto out;
> +
> + /* Set up the newly allocated memory region */
> + ret = qcom_scm_pas_mem_setup(pas_id, fw_phys, fw_size);
> + if (ret) {
> + dev_err(&pdev->dev, "Unable to set up firmware memory\n");
> + goto out;
> + }
> +
> + ret = __pil_tz_load_image(pdev, mdt, fwname, ptr, fw_size, fw_min_addr);
> + if (!ret) {

Please don't mix style like this, goto out on error.

> + ret = qcom_scm_pas_auth_and_reset(pas_id);
> + if (ret)
> + dev_err(&pdev->dev, "Unable to authorize the image\n");
> + }
> +
> +out:
> + if (ret && ptr)
> + dma_free_coherent(&pdev->dev, fw_size, ptr, fw_phys);
> +
> + release_firmware(mdt);
> + return ret;
> +}
> +

Regards,
Bjorn

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 12/12] drm/msm: gpu: Use the zap shader on 5XX if we can
  2016-12-05 19:57   ` Bjorn Andersson
@ 2016-12-05 20:10     ` Bjorn Andersson
  2016-12-06 15:35     ` Jordan Crouse
  1 sibling, 0 replies; 34+ messages in thread
From: Bjorn Andersson @ 2016-12-05 20:10 UTC (permalink / raw)
  To: Jordan Crouse; +Cc: freedreno, linux-arm-msm, dri-devel

On Mon 05 Dec 11:57 PST 2016, Bjorn Andersson wrote:

[..]
> You should rather add a struct device zap_dev to your adreno context, do
> minimal initialization (name and a parent I think is enough), call
> device_register(&zap_dev);, of_reserved_mem_device_init() and then use
> that for your dma allocation.

You would of course need to set the of_node on the zap_dev for
of_reserved_mem_device_init() to work. Sorry for missing that.

Regards,
Bjorn

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 12/12] drm/msm: gpu: Use the zap shader on 5XX if we can
  2016-12-05 19:57   ` Bjorn Andersson
  2016-12-05 20:10     ` Bjorn Andersson
@ 2016-12-06 15:35     ` Jordan Crouse
  2016-12-06 16:37       ` [Freedreno] " Rob Clark
       [not found]       ` <20161206153501.GA25541-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
  1 sibling, 2 replies; 34+ messages in thread
From: Jordan Crouse @ 2016-12-06 15:35 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Mon, Dec 05, 2016 at 11:57:12AM -0800, Bjorn Andersson wrote:
> On Mon 28 Nov 11:28 PST 2016, Jordan Crouse wrote:
> 
> > The A5XX GPU powers on in "secure" mode. In secure mode the GPU can
> > only render to buffers that are marked as secure and inaccessible
> > to the kernel and user through a series of hardware protections. In
> > practice secure mode is used to draw things like a UI on a secure
> > video frame.
> > 
> > In order to switch out of secure mode the GPU executes a special
> > shader that clears out the GMEM and other sensitve registers and
> > then writes a register. Because the kernel can't be trusted the
> > shader binary is signed and verified and programmed by the
> > secure world. To do this we need to read the MDT header and the
> > segments from the firmware location and put them in memory and
> > present them for approval.
> > 
> > For targets without secure support there is an out: if the
> > secure world doesn't support secure then there are no hardware
> > protections and we can freely write the SECVID_TRUST register from
> > the CPU. We don't have 100% confidence that we can query the
> > secure capabilities at run time but we have enough calls that
> > need to go right to give us some confidence that we're at least doing
> > something useful.
> > 
> > Of course if we guess wrong you trigger a permissions violation
> > which usually ends up in a system crash but thats a problem
> > that shows up immediately.
> > 
> > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > ---
> >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 72 ++++++++++++++++++++++++++++++++++-
> >  1 file changed, 70 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > index eefe197..a7a58ec 100644
> > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > @@ -469,6 +469,55 @@ static int a5xx_ucode_init(struct msm_gpu *gpu)
> >  	return 0;
> >  }
> >  
> > +static int a5xx_zap_shader_resume(struct msm_gpu *gpu)
> > +{
> > +	int ret;
> > +
> > +	ret = qcom_scm_gpu_zap_resume();
> > +	if (ret)
> > +		DRM_ERROR("%s: zap-shader resume failed: %d\n",
> > +			gpu->name, ret);
> > +
> > +	return ret;
> > +}
> > +
> > +static int a5xx_zap_shader_init(struct msm_gpu *gpu)
> > +{
> > +	static bool loaded;
> > +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> > +	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
> > +	struct platform_device *pdev = a5xx_gpu->pdev;
> > +	struct device_node *node;
> > +	int ret;
> > +
> > +	/*
> > +	 * If the zap shader is already loaded into memory we just need to kick
> > +	 * the remote processor to reinitialize it
> > +	 */
> > +	if (loaded)
> 
> Why is this handling needed? Why can init be called multiple times?

This is for resume - if we suspend and resume the device without losing state
the secure zone we can't load it again, so we have to call a different
operation to "resume" it. This will be much more heavily used when we have more
aggressive power management.

> > +		return a5xx_zap_shader_resume(gpu);
> > +
> > +	/* Populate the sub-nodes if they haven't already been done */
> > +	of_platform_populate(pdev->dev.of_node, NULL, NULL, &pdev->dev);
> 
> I haven't been able to find the qcom,zap-shader platform driver, but I
> presume you have something like:
> 
> adreno {
> 	qcom,zap-shader {
> 		compatible = "qcom,zap-shader";
> 
> 		firmware = "zapfw";
> 		memory-region = <&zap_region>;
> 	};
> };
> 
> I presume this is done to not "taint" the adreno device's with the zap
> memory region, but I don't think you should (ab)use a platform driver
> for this.
> 
> You should rather add a struct device zap_dev to your adreno context, do
> minimal initialization (name and a parent I think is enough), call
> device_register(&zap_dev);, of_reserved_mem_device_init() and then use
> that for your dma allocation.
> 
> This saves you from creating a platform_driver, instantiating a
> platform_device and the worry of the race between the creation of that
> device and the of_find_device_by_node() below.

As far as I know, of_platform_populate() just fleshed out the platform devices
for the sub-nodes. We are not creating a new platform driver or doing any sort
of probe code, we're just setting up the useful memory to attach the
memory-region to. As far as I can tell, using of_platform_populate() +
of_find_device_by_node() does a lot of the heavy lifting of what you
describe.

> > +
> > +	/* Find the sub-node for the zap shader */
> > +	node = of_find_node_by_name(pdev->dev.of_node, "qcom,zap-shader");
> 
> If you're looking for immediate children use of_get_child_by_name()
> 
> And no "qcom," in node names please.

Okay, can do.

> > +	if (!node) {
> > +		DRM_ERROR("%s: qcom,zap-shader not found in device tree\n",
> > +			gpu->name);
> > +		return -ENODEV;
> > +	}
> > +
> > +	ret = _pil_tz_load_image(of_find_device_by_node(node));
> > +	if (ret)
> > +		DRM_ERROR("%s: Unable to load the zap shader\n",
> > +			gpu->name);
> > +
> > +	loaded = !ret;
> > +
> > +	return ret;
> > +}
> 
> Regards,
> Bjorn

Jordan
-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Freedreno] [PATCH 12/12] drm/msm: gpu: Use the zap shader on 5XX if we can
  2016-12-06 15:35     ` Jordan Crouse
@ 2016-12-06 16:37       ` Rob Clark
       [not found]       ` <20161206153501.GA25541-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
  1 sibling, 0 replies; 34+ messages in thread
From: Rob Clark @ 2016-12-06 16:37 UTC (permalink / raw)
  To: Bjorn Andersson, linux-arm-msm, freedreno, dri-devel

On Tue, Dec 6, 2016 at 10:35 AM, Jordan Crouse <jcrouse@codeaurora.org> wrote:
> On Mon, Dec 05, 2016 at 11:57:12AM -0800, Bjorn Andersson wrote:
>> On Mon 28 Nov 11:28 PST 2016, Jordan Crouse wrote:
>>
>> > The A5XX GPU powers on in "secure" mode. In secure mode the GPU can
>> > only render to buffers that are marked as secure and inaccessible
>> > to the kernel and user through a series of hardware protections. In
>> > practice secure mode is used to draw things like a UI on a secure
>> > video frame.
>> >
>> > In order to switch out of secure mode the GPU executes a special
>> > shader that clears out the GMEM and other sensitve registers and
>> > then writes a register. Because the kernel can't be trusted the
>> > shader binary is signed and verified and programmed by the
>> > secure world. To do this we need to read the MDT header and the
>> > segments from the firmware location and put them in memory and
>> > present them for approval.
>> >
>> > For targets without secure support there is an out: if the
>> > secure world doesn't support secure then there are no hardware
>> > protections and we can freely write the SECVID_TRUST register from
>> > the CPU. We don't have 100% confidence that we can query the
>> > secure capabilities at run time but we have enough calls that
>> > need to go right to give us some confidence that we're at least doing
>> > something useful.
>> >
>> > Of course if we guess wrong you trigger a permissions violation
>> > which usually ends up in a system crash but thats a problem
>> > that shows up immediately.
>> >
>> > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
>> > ---
>> >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 72 ++++++++++++++++++++++++++++++++++-
>> >  1 file changed, 70 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
>> > index eefe197..a7a58ec 100644
>> > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
>> > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
>> > @@ -469,6 +469,55 @@ static int a5xx_ucode_init(struct msm_gpu *gpu)
>> >     return 0;
>> >  }
>> >
>> > +static int a5xx_zap_shader_resume(struct msm_gpu *gpu)
>> > +{
>> > +   int ret;
>> > +
>> > +   ret = qcom_scm_gpu_zap_resume();
>> > +   if (ret)
>> > +           DRM_ERROR("%s: zap-shader resume failed: %d\n",
>> > +                   gpu->name, ret);
>> > +
>> > +   return ret;
>> > +}
>> > +
>> > +static int a5xx_zap_shader_init(struct msm_gpu *gpu)
>> > +{
>> > +   static bool loaded;
>> > +   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>> > +   struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
>> > +   struct platform_device *pdev = a5xx_gpu->pdev;
>> > +   struct device_node *node;
>> > +   int ret;
>> > +
>> > +   /*
>> > +    * If the zap shader is already loaded into memory we just need to kick
>> > +    * the remote processor to reinitialize it
>> > +    */
>> > +   if (loaded)
>>
>> Why is this handling needed? Why can init be called multiple times?
>
> This is for resume - if we suspend and resume the device without losing state
> the secure zone we can't load it again, so we have to call a different
> operation to "resume" it. This will be much more heavily used when we have more
> aggressive power management.

also for recovery when I crash the gpu ;-)

BR,
-R

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 12/12] drm/msm: gpu: Use the zap shader on 5XX if we can
       [not found]       ` <20161206153501.GA25541-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
@ 2016-12-06 17:18         ` Bjorn Andersson
  0 siblings, 0 replies; 34+ messages in thread
From: Bjorn Andersson @ 2016-12-06 17:18 UTC (permalink / raw)
  To: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Tue 06 Dec 07:35 PST 2016, Jordan Crouse wrote:

> On Mon, Dec 05, 2016 at 11:57:12AM -0800, Bjorn Andersson wrote:
> > On Mon 28 Nov 11:28 PST 2016, Jordan Crouse wrote:
> > 
> > > The A5XX GPU powers on in "secure" mode. In secure mode the GPU can
> > > only render to buffers that are marked as secure and inaccessible
> > > to the kernel and user through a series of hardware protections. In
> > > practice secure mode is used to draw things like a UI on a secure
> > > video frame.
> > > 
> > > In order to switch out of secure mode the GPU executes a special
> > > shader that clears out the GMEM and other sensitve registers and
> > > then writes a register. Because the kernel can't be trusted the
> > > shader binary is signed and verified and programmed by the
> > > secure world. To do this we need to read the MDT header and the
> > > segments from the firmware location and put them in memory and
> > > present them for approval.
> > > 
> > > For targets without secure support there is an out: if the
> > > secure world doesn't support secure then there are no hardware
> > > protections and we can freely write the SECVID_TRUST register from
> > > the CPU. We don't have 100% confidence that we can query the
> > > secure capabilities at run time but we have enough calls that
> > > need to go right to give us some confidence that we're at least doing
> > > something useful.
> > > 
> > > Of course if we guess wrong you trigger a permissions violation
> > > which usually ends up in a system crash but thats a problem
> > > that shows up immediately.
> > > 
> > > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > > ---
> > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 72 ++++++++++++++++++++++++++++++++++-
> > >  1 file changed, 70 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > index eefe197..a7a58ec 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > @@ -469,6 +469,55 @@ static int a5xx_ucode_init(struct msm_gpu *gpu)
> > >  	return 0;
> > >  }
> > >  
> > > +static int a5xx_zap_shader_resume(struct msm_gpu *gpu)
> > > +{
> > > +	int ret;
> > > +
> > > +	ret = qcom_scm_gpu_zap_resume();
> > > +	if (ret)
> > > +		DRM_ERROR("%s: zap-shader resume failed: %d\n",
> > > +			gpu->name, ret);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static int a5xx_zap_shader_init(struct msm_gpu *gpu)
> > > +{
> > > +	static bool loaded;
> > > +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> > > +	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
> > > +	struct platform_device *pdev = a5xx_gpu->pdev;
> > > +	struct device_node *node;
> > > +	int ret;
> > > +
> > > +	/*
> > > +	 * If the zap shader is already loaded into memory we just need to kick
> > > +	 * the remote processor to reinitialize it
> > > +	 */
> > > +	if (loaded)
> > 
> > Why is this handling needed? Why can init be called multiple times?
> 
> This is for resume - if we suspend and resume the device without
> losing state the secure zone we can't load it again, so we have to
> call a different operation to "resume" it. This will be much more
> heavily used when we have more aggressive power management.
> 

I presume then that the rest of the resume path is the same as the
initialization.

> > > +		return a5xx_zap_shader_resume(gpu);
> > > +
> > > +	/* Populate the sub-nodes if they haven't already been done */
> > > +	of_platform_populate(pdev->dev.of_node, NULL, NULL, &pdev->dev);
> > 
> > I haven't been able to find the qcom,zap-shader platform driver, but I
> > presume you have something like:
> > 
> > adreno {
> > 	qcom,zap-shader {
> > 		compatible = "qcom,zap-shader";
> > 
> > 		firmware = "zapfw";
> > 		memory-region = <&zap_region>;
> > 	};
> > };
> > 
> > I presume this is done to not "taint" the adreno device's with the zap
> > memory region, but I don't think you should (ab)use a platform driver
> > for this.
> > 
> > You should rather add a struct device zap_dev to your adreno context, do
> > minimal initialization (name and a parent I think is enough), call
> > device_register(&zap_dev);, of_reserved_mem_device_init() and then use
> > that for your dma allocation.
> > 
> > This saves you from creating a platform_driver, instantiating a
> > platform_device and the worry of the race between the creation of that
> > device and the of_find_device_by_node() below.
> 
> As far as I know, of_platform_populate() just fleshed out the platform
> devices for the sub-nodes.

Correct, it will scan all subnodes and try to match their compatible
with platform_drivers. For each match it finds it probes that driver.

> We are not creating a new platform driver
> or doing any sort of probe code, we're just setting up the useful
> memory to attach the memory-region to. As far as I can tell, using
> of_platform_populate() + of_find_device_by_node() does a lot of the
> heavy lifting of what you describe.

But as far as I know the compatible of node that you name
"qcom,zap-shader" must actually match a platform_driver for there to be
a device that of_find_device_by_node() will be able to match against.

The heavy lifting I'm talking about is (untested):

	struct device zap_dev;

	zap_dev.of_node = of_get_child_by_name("zap-shader");
	zap_dev.parent = &pdev->dev;
	dev_set_name(&zap_dev, "zap-shader");

	ret = device_register(&zap_dev);
	if (ret < 0) {
		dev_err();
		return ret;
	}

	of_reserved_mem_device_init(&zap_dev);


And then you need a device_put(&zap_dev) when you're done with the
device and if you set zap_dev.release you can have that clean things
up...

Regards,
Bjorn
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 11/12] drm/msm: Add a quick and dirty PIL loader
  2016-12-05 19:57   ` Bjorn Andersson
@ 2016-12-06 17:49     ` Jordan Crouse
  0 siblings, 0 replies; 34+ messages in thread
From: Jordan Crouse @ 2016-12-06 17:49 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: linux-arm-msm, freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, ML dri-devel

On Mon, Dec 05, 2016 at 11:57:43AM -0800, Bjorn Andersson wrote:
> > + if (of_property_read_u32(pdev->dev.of_node, "qcom,pas-id", &pas_id)) {
> 
> This is constant, so define it in the driver.

Little bit concerned it might not always be constant but I suppose we can cross
that bridge when we get to it.

> Regards,
> Bjorn

Jordan
-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 10/12] firmware: qcom_scm: Add qcom_scm_gpu_zap_resume()
  2016-11-28 19:28 ` [PATCH 10/12] firmware: qcom_scm: Add qcom_scm_gpu_zap_resume() Jordan Crouse
@ 2017-01-13 17:12   ` Andy Gross
       [not found]     ` <20170113171241.GH5710-3KkwrOJo9xYlRp7syxWybdHuzzzSOjJt@public.gmane.org>
  2017-01-13 23:24     ` [Freedreno] " Jordan Crouse
       [not found]   ` <1480361317-9937-11-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
  1 sibling, 2 replies; 34+ messages in thread
From: Andy Gross @ 2017-01-13 17:12 UTC (permalink / raw)
  To: Jordan Crouse; +Cc: freedreno, linux-arm-msm, dri-devel

On Mon, Nov 28, 2016 at 12:28:35PM -0700, Jordan Crouse wrote:
> Add an interface to trigger the remote processor to reinitialize the GPU
> zap shader on power-up.
> 
> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> ---

<snip>

> +int __qcom_scm_gpu_zap_resume(struct device *dev)
> +{
> +	struct qcom_scm_desc desc = {0};
> +	struct arm_smccc_res res;
> +	int ret;
> +
> +	desc.args[0] = 0;
> +	desc.args[1] = 13;

Can I get a define here for these two?  Or maybe a comment on what these values
are?

> +	desc.arginfo = QCOM_SCM_ARGS(2);
> +
> +	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, 0x0A, &desc, &res);

Same with the 0xA.  We usually throw a #define in for the command definitions.

Otherwise this all looks fine.  If you can get back to me with either the values
or a new patch I can include this in the next pull.

Thanks.

Andy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 10/12] firmware: qcom_scm: Add qcom_scm_gpu_zap_resume()
       [not found]     ` <20170113171241.GH5710-3KkwrOJo9xYlRp7syxWybdHuzzzSOjJt@public.gmane.org>
@ 2017-01-13 17:22       ` Jordan Crouse
       [not found]         ` <20170113172244.GA28592-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
  0 siblings, 1 reply; 34+ messages in thread
From: Jordan Crouse @ 2017-01-13 17:22 UTC (permalink / raw)
  To: Andy Gross
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Fri, Jan 13, 2017 at 11:12:41AM -0600, Andy Gross wrote:
> On Mon, Nov 28, 2016 at 12:28:35PM -0700, Jordan Crouse wrote:
> > Add an interface to trigger the remote processor to reinitialize the GPU
> > zap shader on power-up.
> > 
> > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > ---
> 
> <snip>
> 
> > +int __qcom_scm_gpu_zap_resume(struct device *dev)
> > +{
> > +	struct qcom_scm_desc desc = {0};
> > +	struct arm_smccc_res res;
> > +	int ret;
> > +
> > +	desc.args[0] = 0;
> > +	desc.args[1] = 13;
> 
> Can I get a define here for these two?  Or maybe a comment on what these values
> are?
> 
> > +	desc.arginfo = QCOM_SCM_ARGS(2);
> > +
> > +	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, 0x0A, &desc, &res);
> 
> Same with the 0xA.  We usually throw a #define in for the command definitions.
> 
> Otherwise this all looks fine.  If you can get back to me with either the values
> or a new patch I can include this in the next pull.

Thanks for reaching out. Since this isn't my usual area I'll need to work out
some terminology with the SCM folks to make sure I don't accidentally do
anything that will get me in trouble.

Jordan
-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 10/12] firmware: qcom_scm: Add qcom_scm_gpu_zap_resume()
       [not found]         ` <20170113172244.GA28592-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
@ 2017-01-13 17:45           ` Andy Gross
  0 siblings, 0 replies; 34+ messages in thread
From: Andy Gross @ 2017-01-13 17:45 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Fri, Jan 13, 2017 at 10:22:45AM -0700, Jordan Crouse wrote:
> On Fri, Jan 13, 2017 at 11:12:41AM -0600, Andy Gross wrote:
> > On Mon, Nov 28, 2016 at 12:28:35PM -0700, Jordan Crouse wrote:
> > > Add an interface to trigger the remote processor to reinitialize the GPU
> > > zap shader on power-up.
> > > 
> > > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > > ---
> > 
> > <snip>
> > 
> > > +int __qcom_scm_gpu_zap_resume(struct device *dev)
> > > +{
> > > +	struct qcom_scm_desc desc = {0};
> > > +	struct arm_smccc_res res;
> > > +	int ret;
> > > +
> > > +	desc.args[0] = 0;
> > > +	desc.args[1] = 13;
> > 
> > Can I get a define here for these two?  Or maybe a comment on what these values
> > are?
> > 
> > > +	desc.arginfo = QCOM_SCM_ARGS(2);
> > > +
> > > +	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, 0x0A, &desc, &res);
> > 
> > Same with the 0xA.  We usually throw a #define in for the command definitions.
> > 
> > Otherwise this all looks fine.  If you can get back to me with either the values
> > or a new patch I can include this in the next pull.
> 
> Thanks for reaching out. Since this isn't my usual area I'll need to work out
> some terminology with the SCM folks to make sure I don't accidentally do
> anything that will get me in trouble.

Yeah I did a brief look through CAF.   I saw basically the same information.  If
you can't get an answer, let me know and I'll just take it and put in a comment.


Andy
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Freedreno] [PATCH 10/12] firmware: qcom_scm: Add qcom_scm_gpu_zap_resume()
  2017-01-13 17:12   ` Andy Gross
       [not found]     ` <20170113171241.GH5710-3KkwrOJo9xYlRp7syxWybdHuzzzSOjJt@public.gmane.org>
@ 2017-01-13 23:24     ` Jordan Crouse
       [not found]       ` <20170113232438.GA24139-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
  1 sibling, 1 reply; 34+ messages in thread
From: Jordan Crouse @ 2017-01-13 23:24 UTC (permalink / raw)
  To: Andy Gross; +Cc: linux-arm-msm, freedreno, dri-devel

On Fri, Jan 13, 2017 at 11:12:41AM -0600, Andy Gross wrote:
> On Mon, Nov 28, 2016 at 12:28:35PM -0700, Jordan Crouse wrote:
> > Add an interface to trigger the remote processor to reinitialize the GPU
> > zap shader on power-up.
> > 
> > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > ---
> 
> <snip>
> 
> > +int __qcom_scm_gpu_zap_resume(struct device *dev)
> > +{
> > +	struct qcom_scm_desc desc = {0};
> > +	struct arm_smccc_res res;
> > +	int ret;
> > +
> > +	desc.args[0] = 0;

This is an opcode to force the state to resume.

QCOM_SCM_BOOT_SET_STATE_RESUME perhaps?  Or something similar but shorter.

> > +	desc.args[1] = 13;

This is the same as the SCM id of the GPU but I think that is a coincidence.
We've always used it to identify the GPU in this call.

QCOM_SCM_BOOT_SET_STATE_GPU would be fine here - or something similar.

> Can I get a define here for these two?  Or maybe a comment on what these values
> are?
> 
> > +	desc.arginfo = QCOM_SCM_ARGS(2);
> > +
> > +	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, 0x0A, &desc, &res);
> 
> Same with the 0xA.  We usually throw a #define in for the command definitions.

0x0A sets the state of the device - for us it is always 0 (resume) and always
the GPU.

#define  QCOM_SCM_BOOT_SET_STATE 0x0A

> Otherwise this all looks fine.  If you can get back to me with either the values
> or a new patch I can include this in the next pull.

I'll make the changes and start the song and dance, but you'll no doubt be
faster than I.

Jordan
-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 10/12] firmware: qcom_scm: Add qcom_scm_gpu_zap_resume()
       [not found]       ` <20170113232438.GA24139-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
@ 2017-01-15  3:49         ` Andy Gross
  2017-01-15  5:20           ` [Freedreno] " Andy Gross
  0 siblings, 1 reply; 34+ messages in thread
From: Andy Gross @ 2017-01-15  3:49 UTC (permalink / raw)
  To: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Fri, Jan 13, 2017 at 04:24:38PM -0700, Jordan Crouse wrote:
> On Fri, Jan 13, 2017 at 11:12:41AM -0600, Andy Gross wrote:
> > On Mon, Nov 28, 2016 at 12:28:35PM -0700, Jordan Crouse wrote:
> > > Add an interface to trigger the remote processor to reinitialize the GPU
> > > zap shader on power-up.
> > > 
> > > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > > ---
> > 
> > <snip>
> > 
> > > +int __qcom_scm_gpu_zap_resume(struct device *dev)
> > > +{
> > > +	struct qcom_scm_desc desc = {0};
> > > +	struct arm_smccc_res res;
> > > +	int ret;
> > > +
> > > +	desc.args[0] = 0;
> 
> This is an opcode to force the state to resume.
> 
> QCOM_SCM_BOOT_SET_STATE_RESUME perhaps?  Or something similar but shorter.
> 
> > > +	desc.args[1] = 13;
> 
> This is the same as the SCM id of the GPU but I think that is a coincidence.
> We've always used it to identify the GPU in this call.
> 
> QCOM_SCM_BOOT_SET_STATE_GPU would be fine here - or something similar.
> 
> > Can I get a define here for these two?  Or maybe a comment on what these values
> > are?
> > 
> > > +	desc.arginfo = QCOM_SCM_ARGS(2);
> > > +
> > > +	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, 0x0A, &desc, &res);
> > 
> > Same with the 0xA.  We usually throw a #define in for the command definitions.
> 
> 0x0A sets the state of the device - for us it is always 0 (resume) and always
> the GPU.
> 
> #define  QCOM_SCM_BOOT_SET_STATE 0x0A
> 
> > Otherwise this all looks fine.  If you can get back to me with either the values
> > or a new patch I can include this in the next pull.
> 
> I'll make the changes and start the song and dance, but you'll no doubt be
> faster than I.

I can just fix up the patch with the above.  Thanks for the additional details.

Andy
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Freedreno] [PATCH 10/12] firmware: qcom_scm: Add qcom_scm_gpu_zap_resume()
  2017-01-15  3:49         ` Andy Gross
@ 2017-01-15  5:20           ` Andy Gross
  2017-01-16 15:13             ` Stanimir Varbanov
  2017-01-17 17:04             ` Jordan Crouse
  0 siblings, 2 replies; 34+ messages in thread
From: Andy Gross @ 2017-01-15  5:20 UTC (permalink / raw)
  To: linux-arm-msm, freedreno, dri-devel; +Cc: stanimir.varbanov

+ Stanimir

On Sat, Jan 14, 2017 at 09:49:01PM -0600, Andy Gross wrote:
> On Fri, Jan 13, 2017 at 04:24:38PM -0700, Jordan Crouse wrote:
> > On Fri, Jan 13, 2017 at 11:12:41AM -0600, Andy Gross wrote:
> > > On Mon, Nov 28, 2016 at 12:28:35PM -0700, Jordan Crouse wrote:
> > > > Add an interface to trigger the remote processor to reinitialize the GPU
> > > > zap shader on power-up.
> > > > 
> > > > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > > > ---
> > > 
> > > <snip>
> > > 
> > > > +int __qcom_scm_gpu_zap_resume(struct device *dev)
> > > > +{
> > > > +	struct qcom_scm_desc desc = {0};
> > > > +	struct arm_smccc_res res;
> > > > +	int ret;
> > > > +
> > > > +	desc.args[0] = 0;
> > 
> > This is an opcode to force the state to resume.
> > 
> > QCOM_SCM_BOOT_SET_STATE_RESUME perhaps?  Or something similar but shorter.
> > 
> > > > +	desc.args[1] = 13;
> > 
> > This is the same as the SCM id of the GPU but I think that is a coincidence.
> > We've always used it to identify the GPU in this call.
> > 
> > QCOM_SCM_BOOT_SET_STATE_GPU would be fine here - or something similar.
> > 
> > > Can I get a define here for these two?  Or maybe a comment on what these values
> > > are?
> > > 
> > > > +	desc.arginfo = QCOM_SCM_ARGS(2);
> > > > +
> > > > +	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, 0x0A, &desc, &res);
> > > 
> > > Same with the 0xA.  We usually throw a #define in for the command definitions.
> > 
> > 0x0A sets the state of the device - for us it is always 0 (resume) and always
> > the GPU.
> > 
> > #define  QCOM_SCM_BOOT_SET_STATE 0x0A
> > 
> > > Otherwise this all looks fine.  If you can get back to me with either the values
> > > or a new patch I can include this in the next pull.
> > 
> > I'll make the changes and start the song and dance, but you'll no doubt be
> > faster than I.
> 
> I can just fix up the patch with the above.  Thanks for the additional details.

The plot thickens.  So I have a patch from Stanimir concerning another SCM call
that is using the same command and number of arguments.  And it also concerns
setting state.  I think that we need to roll a common API for setting the state
and then both of you can call it.  That way we can kill two birds with one
stone.

Something along the lines of a function prototype:
int qcom_scm_set_remote_state(u32 state, u32 id)
{
	return __qcom_scm_set_remote_state(__scm->dev, state, id);
}
EXPORT_SYMBOL(qcom_scm_set_remote_state);

where state is the state you want set, and id is the identifier of the remote
proc.

Does this make sense for both of your use cases?

Andy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Freedreno] [PATCH 10/12] firmware: qcom_scm: Add qcom_scm_gpu_zap_resume()
  2017-01-15  5:20           ` [Freedreno] " Andy Gross
@ 2017-01-16 15:13             ` Stanimir Varbanov
  2017-01-17 17:04             ` Jordan Crouse
  1 sibling, 0 replies; 34+ messages in thread
From: Stanimir Varbanov @ 2017-01-16 15:13 UTC (permalink / raw)
  To: Andy Gross, linux-arm-msm, freedreno, dri-devel

Hi Andy,

On 01/15/2017 07:20 AM, Andy Gross wrote:
> + Stanimir
> 
> On Sat, Jan 14, 2017 at 09:49:01PM -0600, Andy Gross wrote:
>> On Fri, Jan 13, 2017 at 04:24:38PM -0700, Jordan Crouse wrote:
>>> On Fri, Jan 13, 2017 at 11:12:41AM -0600, Andy Gross wrote:
>>>> On Mon, Nov 28, 2016 at 12:28:35PM -0700, Jordan Crouse wrote:
>>>>> Add an interface to trigger the remote processor to reinitialize the GPU
>>>>> zap shader on power-up.
>>>>>
>>>>> Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
>>>>> ---
>>>>
>>>> <snip>
>>>>
>>>>> +int __qcom_scm_gpu_zap_resume(struct device *dev)
>>>>> +{
>>>>> +	struct qcom_scm_desc desc = {0};
>>>>> +	struct arm_smccc_res res;
>>>>> +	int ret;
>>>>> +
>>>>> +	desc.args[0] = 0;
>>>
>>> This is an opcode to force the state to resume.
>>>
>>> QCOM_SCM_BOOT_SET_STATE_RESUME perhaps?  Or something similar but shorter.
>>>
>>>>> +	desc.args[1] = 13;
>>>
>>> This is the same as the SCM id of the GPU but I think that is a coincidence.
>>> We've always used it to identify the GPU in this call.
>>>
>>> QCOM_SCM_BOOT_SET_STATE_GPU would be fine here - or something similar.
>>>
>>>> Can I get a define here for these two?  Or maybe a comment on what these values
>>>> are?
>>>>
>>>>> +	desc.arginfo = QCOM_SCM_ARGS(2);
>>>>> +
>>>>> +	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, 0x0A, &desc, &res);
>>>>
>>>> Same with the 0xA.  We usually throw a #define in for the command definitions.
>>>
>>> 0x0A sets the state of the device - for us it is always 0 (resume) and always
>>> the GPU.
>>>
>>> #define  QCOM_SCM_BOOT_SET_STATE 0x0A
>>>
>>>> Otherwise this all looks fine.  If you can get back to me with either the values
>>>> or a new patch I can include this in the next pull.
>>>
>>> I'll make the changes and start the song and dance, but you'll no doubt be
>>> faster than I.
>>
>> I can just fix up the patch with the above.  Thanks for the additional details.
> 
> The plot thickens.  So I have a patch from Stanimir concerning another SCM call
> that is using the same command and number of arguments.  And it also concerns
> setting state.  I think that we need to roll a common API for setting the state
> and then both of you can call it.  That way we can kill two birds with one
> stone.
> 
> Something along the lines of a function prototype:
> int qcom_scm_set_remote_state(u32 state, u32 id)
> {
> 	return __qcom_scm_set_remote_state(__scm->dev, state, id);
> }
> EXPORT_SYMBOL(qcom_scm_set_remote_state);
> 
> where state is the state you want set, and id is the identifier of the remote
> proc.
> 
> Does this make sense for both of your use cases?

I'm fine with that.

-- 
regards,
Stan
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH] firmware: qcom_scm: Add set remote state API
       [not found]   ` <1480361317-9937-11-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
@ 2017-01-17  5:56     ` Andy Gross
       [not found]       ` <1484632578-4539-1-git-send-email-andy.gross-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
                         ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Andy Gross @ 2017-01-17  5:56 UTC (permalink / raw)
  To: freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Andy Gross, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	jcrouse-sgV2jX0FEOL9JmXXK+q4OQ,
	stanimir.varbanov-QSEj5FYQhm4dnm+yROfE0A,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

This patch adds a set remote state SCM API.  This will be used by the
Venus and GPU subsystems to set state on the remote processors.

This work was based on two patch sets by Jordan Crouse and Stanimir
Varbanov.

Signed-off-by: Andy Gross <andy.gross@linaro.org>
---
 drivers/firmware/qcom_scm-32.c | 18 ++++++++++++++++++
 drivers/firmware/qcom_scm-64.c | 16 ++++++++++++++++
 drivers/firmware/qcom_scm.c    |  6 ++++++
 drivers/firmware/qcom_scm.h    |  2 ++
 include/linux/qcom_scm.h       |  4 +++-
 5 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
index c6aeedb..8ad226c 100644
--- a/drivers/firmware/qcom_scm-32.c
+++ b/drivers/firmware/qcom_scm-32.c
@@ -560,3 +560,21 @@ int __qcom_scm_pas_mss_reset(struct device *dev, bool reset)
 
 	return ret ? : le32_to_cpu(out);
 }
+
+int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id)
+{
+	struct {
+		__le32 state;
+		__le32 id;
+	} req;
+	__le32 scm_ret = 0;
+	int ret;
+
+	req.state = cpu_to_le32(state);
+	req.id = cpu_to_le32(id);
+
+	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, QCOM_SCM_SET_REMOTE_STATE,
+			    &req, sizeof(req), &scm_ret, sizeof(scm_ret));
+
+	return ret ? : le32_to_cpu(scm_ret);
+}
diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index 4a0f5ea..4b220ab 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -358,3 +358,19 @@ int __qcom_scm_pas_mss_reset(struct device *dev, bool reset)
 
 	return ret ? : res.a1;
 }
+
+int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id)
+{
+	struct qcom_scm_desc desc = {0};
+	struct arm_smccc_res res;
+	int ret;
+
+	desc.args[0] = state;
+	desc.args[1] = id;
+	desc.arginfo = QCOM_SCM_ARGS(2);
+
+	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, QCOM_SCM_SET_REMOTE_STATE,
+			    &desc, &res);
+
+	return ret ? : res.a1;
+}
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index 65d0d9d..d987bcc 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -324,6 +324,12 @@ bool qcom_scm_is_available(void)
 }
 EXPORT_SYMBOL(qcom_scm_is_available);
 
+int qcom_scm_set_remote_state(u32 state, u32 id)
+{
+	return __qcom_scm_set_remote_state(__scm->dev, state, id);
+}
+EXPORT_SYMBOL(qcom_scm_set_remote_state);
+
 static int qcom_scm_probe(struct platform_device *pdev)
 {
 	struct qcom_scm *scm;
diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
index 3584b00..6a0f154 100644
--- a/drivers/firmware/qcom_scm.h
+++ b/drivers/firmware/qcom_scm.h
@@ -15,6 +15,8 @@
 #define QCOM_SCM_SVC_BOOT		0x1
 #define QCOM_SCM_BOOT_ADDR		0x1
 #define QCOM_SCM_BOOT_ADDR_MC		0x11
+#define QCOM_SCM_SET_REMOTE_STATE	0xa
+extern int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id);
 
 #define QCOM_SCM_FLAG_HLOS		0x01
 #define QCOM_SCM_FLAG_COLDBOOT_MC	0x02
diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
index 7e004fb..d32f6f1 100644
--- a/include/linux/qcom_scm.h
+++ b/include/linux/qcom_scm.h
@@ -39,6 +39,7 @@ extern int qcom_scm_pas_mem_setup(u32 peripheral, phys_addr_t addr,
 extern int qcom_scm_pas_shutdown(u32 peripheral);
 extern void qcom_scm_cpu_power_down(u32 flags);
 extern u32 qcom_scm_get_version(void);
+extern int qcom_scm_set_remote_state(u32 state, u32 id);
 #else
 static inline
 int qcom_scm_set_cold_boot_addr(void *entry, const cpumask_t *cpus)
@@ -64,6 +65,7 @@ static inline int qcom_scm_pas_mem_setup(u32 peripheral, phys_addr_t addr,
 static inline int qcom_scm_pas_shutdown(u32 peripheral) { return -ENODEV; }
 static inline void qcom_scm_cpu_power_down(u32 flags) {}
 static inline u32 qcom_scm_get_version(void) { return 0; }
+static inline u32
+qcom_scm_set_remote_state(u32 state,u32 id) { return -ENODEV; }
 #endif
-
 #endif
-- 
1.9.1

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [Freedreno] [PATCH 10/12] firmware: qcom_scm: Add qcom_scm_gpu_zap_resume()
  2017-01-15  5:20           ` [Freedreno] " Andy Gross
  2017-01-16 15:13             ` Stanimir Varbanov
@ 2017-01-17 17:04             ` Jordan Crouse
       [not found]               ` <20170117170459.GA29647-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
  1 sibling, 1 reply; 34+ messages in thread
From: Jordan Crouse @ 2017-01-17 17:04 UTC (permalink / raw)
  To: Andy Gross; +Cc: linux-arm-msm, freedreno, stanimir.varbanov, dri-devel

On Sat, Jan 14, 2017 at 11:20:58PM -0600, Andy Gross wrote:
> + Stanimir
> 
> On Sat, Jan 14, 2017 at 09:49:01PM -0600, Andy Gross wrote:
> > On Fri, Jan 13, 2017 at 04:24:38PM -0700, Jordan Crouse wrote:
> > > On Fri, Jan 13, 2017 at 11:12:41AM -0600, Andy Gross wrote:
> > > > On Mon, Nov 28, 2016 at 12:28:35PM -0700, Jordan Crouse wrote:
> > > > > Add an interface to trigger the remote processor to reinitialize the GPU
> > > > > zap shader on power-up.
> > > > > 
> > > > > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > > > > ---
> > > > 
> > > > <snip>
> > > > 
> > > > > +int __qcom_scm_gpu_zap_resume(struct device *dev)
> > > > > +{
> > > > > +	struct qcom_scm_desc desc = {0};
> > > > > +	struct arm_smccc_res res;
> > > > > +	int ret;
> > > > > +
> > > > > +	desc.args[0] = 0;
> > > 
> > > This is an opcode to force the state to resume.
> > > 
> > > QCOM_SCM_BOOT_SET_STATE_RESUME perhaps?  Or something similar but shorter.
> > > 
> > > > > +	desc.args[1] = 13;
> > > 
> > > This is the same as the SCM id of the GPU but I think that is a coincidence.
> > > We've always used it to identify the GPU in this call.
> > > 
> > > QCOM_SCM_BOOT_SET_STATE_GPU would be fine here - or something similar.
> > > 
> > > > Can I get a define here for these two?  Or maybe a comment on what these values
> > > > are?
> > > > 
> > > > > +	desc.arginfo = QCOM_SCM_ARGS(2);
> > > > > +
> > > > > +	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, 0x0A, &desc, &res);
> > > > 
> > > > Same with the 0xA.  We usually throw a #define in for the command definitions.
> > > 
> > > 0x0A sets the state of the device - for us it is always 0 (resume) and always
> > > the GPU.
> > > 
> > > #define  QCOM_SCM_BOOT_SET_STATE 0x0A
> > > 
> > > > Otherwise this all looks fine.  If you can get back to me with either the values
> > > > or a new patch I can include this in the next pull.
> > > 
> > > I'll make the changes and start the song and dance, but you'll no doubt be
> > > faster than I.
> > 
> > I can just fix up the patch with the above.  Thanks for the additional details.
> 
> The plot thickens.  So I have a patch from Stanimir concerning another SCM call
> that is using the same command and number of arguments.  And it also concerns
> setting state.  I think that we need to roll a common API for setting the state
> and then both of you can call it.  That way we can kill two birds with one
> stone.

I was worried about this, but not this quickly - glad to see that the other
drivers are coming along too.

If you look carefully, you'll note that the values are inverted - the video
state uses 0 to suspend and 1 to resume, and GPU uses 0 to resume and no
suspend. I asked why and got a bunch of shrugging back. Its not a big deal
because your API accounts for this but expect GPU to have a RESUME #define
that is a different value.

> Something along the lines of a function prototype:
> int qcom_scm_set_remote_state(u32 state, u32 id)
> {
> 	return __qcom_scm_set_remote_state(__scm->dev, state, id);
> }
> EXPORT_SYMBOL(qcom_scm_set_remote_state);
> 
> where state is the state you want set, and id is the identifier of the remote
> proc.
>
> Does this make sense for both of your use cases?

Yep - this is fine.

Jordan
-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 10/12] firmware: qcom_scm: Add qcom_scm_gpu_zap_resume()
       [not found]               ` <20170117170459.GA29647-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
@ 2017-01-17 19:31                 ` Andy Gross
  0 siblings, 0 replies; 34+ messages in thread
From: Andy Gross @ 2017-01-17 19:31 UTC (permalink / raw)
  To: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	stanimir.varbanov-QSEj5FYQhm4dnm+yROfE0A

On Tue, Jan 17, 2017 at 10:04:59AM -0700, Jordan Crouse wrote:
> On Sat, Jan 14, 2017 at 11:20:58PM -0600, Andy Gross wrote:
> > + Stanimir
> > 
> > On Sat, Jan 14, 2017 at 09:49:01PM -0600, Andy Gross wrote:
> > > On Fri, Jan 13, 2017 at 04:24:38PM -0700, Jordan Crouse wrote:
> > > > On Fri, Jan 13, 2017 at 11:12:41AM -0600, Andy Gross wrote:
> > > > > On Mon, Nov 28, 2016 at 12:28:35PM -0700, Jordan Crouse wrote:
> > > > > > Add an interface to trigger the remote processor to reinitialize the GPU
> > > > > > zap shader on power-up.
> > > > > > 
> > > > > > Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
> > > > > > ---
> > > > > 
> > > > > <snip>
> > > > > 
> > > > > > +int __qcom_scm_gpu_zap_resume(struct device *dev)
> > > > > > +{
> > > > > > +	struct qcom_scm_desc desc = {0};
> > > > > > +	struct arm_smccc_res res;
> > > > > > +	int ret;
> > > > > > +
> > > > > > +	desc.args[0] = 0;
> > > > 
> > > > This is an opcode to force the state to resume.
> > > > 
> > > > QCOM_SCM_BOOT_SET_STATE_RESUME perhaps?  Or something similar but shorter.
> > > > 
> > > > > > +	desc.args[1] = 13;
> > > > 
> > > > This is the same as the SCM id of the GPU but I think that is a coincidence.
> > > > We've always used it to identify the GPU in this call.
> > > > 
> > > > QCOM_SCM_BOOT_SET_STATE_GPU would be fine here - or something similar.
> > > > 
> > > > > Can I get a define here for these two?  Or maybe a comment on what these values
> > > > > are?
> > > > > 
> > > > > > +	desc.arginfo = QCOM_SCM_ARGS(2);
> > > > > > +
> > > > > > +	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, 0x0A, &desc, &res);
> > > > > 
> > > > > Same with the 0xA.  We usually throw a #define in for the command definitions.
> > > > 
> > > > 0x0A sets the state of the device - for us it is always 0 (resume) and always
> > > > the GPU.
> > > > 
> > > > #define  QCOM_SCM_BOOT_SET_STATE 0x0A
> > > > 
> > > > > Otherwise this all looks fine.  If you can get back to me with either the values
> > > > > or a new patch I can include this in the next pull.
> > > > 
> > > > I'll make the changes and start the song and dance, but you'll no doubt be
> > > > faster than I.
> > > 
> > > I can just fix up the patch with the above.  Thanks for the additional details.
> > 
> > The plot thickens.  So I have a patch from Stanimir concerning another SCM call
> > that is using the same command and number of arguments.  And it also concerns
> > setting state.  I think that we need to roll a common API for setting the state
> > and then both of you can call it.  That way we can kill two birds with one
> > stone.
> 
> I was worried about this, but not this quickly - glad to see that the other
> drivers are coming along too.
> 
> If you look carefully, you'll note that the values are inverted - the video
> state uses 0 to suspend and 1 to resume, and GPU uses 0 to resume and no
> suspend. I asked why and got a bunch of shrugging back. Its not a big deal
> because your API accounts for this but expect GPU to have a RESUME #define
> that is a different value.
> 
> > Something along the lines of a function prototype:
> > int qcom_scm_set_remote_state(u32 state, u32 id)
> > {
> > 	return __qcom_scm_set_remote_state(__scm->dev, state, id);
> > }
> > EXPORT_SYMBOL(qcom_scm_set_remote_state);
> > 
> > where state is the state you want set, and id is the identifier of the remote
> > proc.
> >
> > Does this make sense for both of your use cases?
> 
> Yep - this is fine.

If you guys can ack/sob that patch I'd appreciate it.

As for the #defines, you guys can define whatever you like in your drivers.
This API just pushes all that into your client drivers.


Andy

_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] firmware: qcom_scm: Add set remote state API
       [not found]       ` <1484632578-4539-1-git-send-email-andy.gross-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
@ 2017-01-18 16:51         ` Jordan Crouse
  0 siblings, 0 replies; 34+ messages in thread
From: Jordan Crouse @ 2017-01-18 16:51 UTC (permalink / raw)
  To: Andy Gross
  Cc: linux-arm-msm-u79uwXL29TY76Z2rM5mHXA,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	stanimir.varbanov-QSEj5FYQhm4dnm+yROfE0A,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Mon, Jan 16, 2017 at 11:56:18PM -0600, Andy Gross wrote:
> This patch adds a set remote state SCM API.  This will be used by the
> Venus and GPU subsystems to set state on the remote processors.
> 
> This work was based on two patch sets by Jordan Crouse and Stanimir
> Varbanov.
> 
> Signed-off-by: Andy Gross <andy.gross@linaro.org>

Acked-by: Jordan Crouse <jcrouse@codeaurora.org>

> ---
>  drivers/firmware/qcom_scm-32.c | 18 ++++++++++++++++++
>  drivers/firmware/qcom_scm-64.c | 16 ++++++++++++++++
>  drivers/firmware/qcom_scm.c    |  6 ++++++
>  drivers/firmware/qcom_scm.h    |  2 ++
>  include/linux/qcom_scm.h       |  4 +++-
>  5 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
> index c6aeedb..8ad226c 100644
> --- a/drivers/firmware/qcom_scm-32.c
> +++ b/drivers/firmware/qcom_scm-32.c
> @@ -560,3 +560,21 @@ int __qcom_scm_pas_mss_reset(struct device *dev, bool reset)
>  
>  	return ret ? : le32_to_cpu(out);
>  }
> +
> +int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id)
> +{
> +	struct {
> +		__le32 state;
> +		__le32 id;
> +	} req;
> +	__le32 scm_ret = 0;
> +	int ret;
> +
> +	req.state = cpu_to_le32(state);
> +	req.id = cpu_to_le32(id);
> +
> +	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, QCOM_SCM_SET_REMOTE_STATE,
> +			    &req, sizeof(req), &scm_ret, sizeof(scm_ret));
> +
> +	return ret ? : le32_to_cpu(scm_ret);
> +}
> diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
> index 4a0f5ea..4b220ab 100644
> --- a/drivers/firmware/qcom_scm-64.c
> +++ b/drivers/firmware/qcom_scm-64.c
> @@ -358,3 +358,19 @@ int __qcom_scm_pas_mss_reset(struct device *dev, bool reset)
>  
>  	return ret ? : res.a1;
>  }
> +
> +int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id)
> +{
> +	struct qcom_scm_desc desc = {0};
> +	struct arm_smccc_res res;
> +	int ret;
> +
> +	desc.args[0] = state;
> +	desc.args[1] = id;
> +	desc.arginfo = QCOM_SCM_ARGS(2);
> +
> +	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, QCOM_SCM_SET_REMOTE_STATE,
> +			    &desc, &res);
> +
> +	return ret ? : res.a1;
> +}
> diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
> index 65d0d9d..d987bcc 100644
> --- a/drivers/firmware/qcom_scm.c
> +++ b/drivers/firmware/qcom_scm.c
> @@ -324,6 +324,12 @@ bool qcom_scm_is_available(void)
>  }
>  EXPORT_SYMBOL(qcom_scm_is_available);
>  
> +int qcom_scm_set_remote_state(u32 state, u32 id)
> +{
> +	return __qcom_scm_set_remote_state(__scm->dev, state, id);
> +}
> +EXPORT_SYMBOL(qcom_scm_set_remote_state);
> +
>  static int qcom_scm_probe(struct platform_device *pdev)
>  {
>  	struct qcom_scm *scm;
> diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
> index 3584b00..6a0f154 100644
> --- a/drivers/firmware/qcom_scm.h
> +++ b/drivers/firmware/qcom_scm.h
> @@ -15,6 +15,8 @@
>  #define QCOM_SCM_SVC_BOOT		0x1
>  #define QCOM_SCM_BOOT_ADDR		0x1
>  #define QCOM_SCM_BOOT_ADDR_MC		0x11
> +#define QCOM_SCM_SET_REMOTE_STATE	0xa
> +extern int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id);
>  
>  #define QCOM_SCM_FLAG_HLOS		0x01
>  #define QCOM_SCM_FLAG_COLDBOOT_MC	0x02
> diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
> index 7e004fb..d32f6f1 100644
> --- a/include/linux/qcom_scm.h
> +++ b/include/linux/qcom_scm.h
> @@ -39,6 +39,7 @@ extern int qcom_scm_pas_mem_setup(u32 peripheral, phys_addr_t addr,
>  extern int qcom_scm_pas_shutdown(u32 peripheral);
>  extern void qcom_scm_cpu_power_down(u32 flags);
>  extern u32 qcom_scm_get_version(void);
> +extern int qcom_scm_set_remote_state(u32 state, u32 id);
>  #else
>  static inline
>  int qcom_scm_set_cold_boot_addr(void *entry, const cpumask_t *cpus)
> @@ -64,6 +65,7 @@ static inline int qcom_scm_pas_mem_setup(u32 peripheral, phys_addr_t addr,
>  static inline int qcom_scm_pas_shutdown(u32 peripheral) { return -ENODEV; }
>  static inline void qcom_scm_cpu_power_down(u32 flags) {}
>  static inline u32 qcom_scm_get_version(void) { return 0; }
> +static inline u32
> +qcom_scm_set_remote_state(u32 state,u32 id) { return -ENODEV; }
>  #endif
> -
>  #endif
> -- 
> 1.9.1
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] firmware: qcom_scm: Add set remote state API
  2017-01-17  5:56     ` [PATCH] firmware: qcom_scm: Add set remote state API Andy Gross
       [not found]       ` <1484632578-4539-1-git-send-email-andy.gross-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
@ 2017-01-18 17:37       ` Stanimir Varbanov
  2017-01-24  9:54       ` Stanimir Varbanov
  2 siblings, 0 replies; 34+ messages in thread
From: Stanimir Varbanov @ 2017-01-18 17:37 UTC (permalink / raw)
  To: Andy Gross, freedreno; +Cc: linux-arm-msm, dri-devel

Thanks Andy,

On 01/17/2017 07:56 AM, Andy Gross wrote:
> This patch adds a set remote state SCM API.  This will be used by the
> Venus and GPU subsystems to set state on the remote processors.
> 
> This work was based on two patch sets by Jordan Crouse and Stanimir
> Varbanov.
> 
> Signed-off-by: Andy Gross <andy.gross@linaro.org>
> ---
>  drivers/firmware/qcom_scm-32.c | 18 ++++++++++++++++++
>  drivers/firmware/qcom_scm-64.c | 16 ++++++++++++++++
>  drivers/firmware/qcom_scm.c    |  6 ++++++
>  drivers/firmware/qcom_scm.h    |  2 ++
>  include/linux/qcom_scm.h       |  4 +++-
>  5 files changed, 45 insertions(+), 1 deletion(-)

Acked-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>


-- 
regards,
Stan
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] firmware: qcom_scm: Add set remote state API
  2017-01-17  5:56     ` [PATCH] firmware: qcom_scm: Add set remote state API Andy Gross
       [not found]       ` <1484632578-4539-1-git-send-email-andy.gross-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
  2017-01-18 17:37       ` Stanimir Varbanov
@ 2017-01-24  9:54       ` Stanimir Varbanov
  2017-01-24 16:11         ` Andy Gross
  2 siblings, 1 reply; 34+ messages in thread
From: Stanimir Varbanov @ 2017-01-24  9:54 UTC (permalink / raw)
  To: Andy Gross, freedreno
  Cc: jcrouse, linux-arm-msm, dri-devel, stanimir.varbanov

Andy,

Could you queue this patch for 4.11?

On 01/17/2017 07:56 AM, Andy Gross wrote:
> This patch adds a set remote state SCM API.  This will be used by the
> Venus and GPU subsystems to set state on the remote processors.
> 
> This work was based on two patch sets by Jordan Crouse and Stanimir
> Varbanov.
> 
> Signed-off-by: Andy Gross <andy.gross@linaro.org>
> ---
>  drivers/firmware/qcom_scm-32.c | 18 ++++++++++++++++++
>  drivers/firmware/qcom_scm-64.c | 16 ++++++++++++++++
>  drivers/firmware/qcom_scm.c    |  6 ++++++
>  drivers/firmware/qcom_scm.h    |  2 ++
>  include/linux/qcom_scm.h       |  4 +++-
>  5 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
> index c6aeedb..8ad226c 100644
> --- a/drivers/firmware/qcom_scm-32.c
> +++ b/drivers/firmware/qcom_scm-32.c
> @@ -560,3 +560,21 @@ int __qcom_scm_pas_mss_reset(struct device *dev, bool reset)
>  
>  	return ret ? : le32_to_cpu(out);
>  }
> +
> +int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id)
> +{
> +	struct {
> +		__le32 state;
> +		__le32 id;
> +	} req;
> +	__le32 scm_ret = 0;
> +	int ret;
> +
> +	req.state = cpu_to_le32(state);
> +	req.id = cpu_to_le32(id);
> +
> +	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, QCOM_SCM_SET_REMOTE_STATE,
> +			    &req, sizeof(req), &scm_ret, sizeof(scm_ret));
> +
> +	return ret ? : le32_to_cpu(scm_ret);
> +}
> diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
> index 4a0f5ea..4b220ab 100644
> --- a/drivers/firmware/qcom_scm-64.c
> +++ b/drivers/firmware/qcom_scm-64.c
> @@ -358,3 +358,19 @@ int __qcom_scm_pas_mss_reset(struct device *dev, bool reset)
>  
>  	return ret ? : res.a1;
>  }
> +
> +int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id)
> +{
> +	struct qcom_scm_desc desc = {0};
> +	struct arm_smccc_res res;
> +	int ret;
> +
> +	desc.args[0] = state;
> +	desc.args[1] = id;
> +	desc.arginfo = QCOM_SCM_ARGS(2);
> +
> +	ret = qcom_scm_call(dev, QCOM_SCM_SVC_BOOT, QCOM_SCM_SET_REMOTE_STATE,
> +			    &desc, &res);
> +
> +	return ret ? : res.a1;
> +}
> diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
> index 65d0d9d..d987bcc 100644
> --- a/drivers/firmware/qcom_scm.c
> +++ b/drivers/firmware/qcom_scm.c
> @@ -324,6 +324,12 @@ bool qcom_scm_is_available(void)
>  }
>  EXPORT_SYMBOL(qcom_scm_is_available);
>  
> +int qcom_scm_set_remote_state(u32 state, u32 id)
> +{
> +	return __qcom_scm_set_remote_state(__scm->dev, state, id);
> +}
> +EXPORT_SYMBOL(qcom_scm_set_remote_state);
> +
>  static int qcom_scm_probe(struct platform_device *pdev)
>  {
>  	struct qcom_scm *scm;
> diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
> index 3584b00..6a0f154 100644
> --- a/drivers/firmware/qcom_scm.h
> +++ b/drivers/firmware/qcom_scm.h
> @@ -15,6 +15,8 @@
>  #define QCOM_SCM_SVC_BOOT		0x1
>  #define QCOM_SCM_BOOT_ADDR		0x1
>  #define QCOM_SCM_BOOT_ADDR_MC		0x11
> +#define QCOM_SCM_SET_REMOTE_STATE	0xa
> +extern int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id);
>  
>  #define QCOM_SCM_FLAG_HLOS		0x01
>  #define QCOM_SCM_FLAG_COLDBOOT_MC	0x02
> diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
> index 7e004fb..d32f6f1 100644
> --- a/include/linux/qcom_scm.h
> +++ b/include/linux/qcom_scm.h
> @@ -39,6 +39,7 @@ extern int qcom_scm_pas_mem_setup(u32 peripheral, phys_addr_t addr,
>  extern int qcom_scm_pas_shutdown(u32 peripheral);
>  extern void qcom_scm_cpu_power_down(u32 flags);
>  extern u32 qcom_scm_get_version(void);
> +extern int qcom_scm_set_remote_state(u32 state, u32 id);
>  #else
>  static inline
>  int qcom_scm_set_cold_boot_addr(void *entry, const cpumask_t *cpus)
> @@ -64,6 +65,7 @@ static inline int qcom_scm_pas_mem_setup(u32 peripheral, phys_addr_t addr,
>  static inline int qcom_scm_pas_shutdown(u32 peripheral) { return -ENODEV; }
>  static inline void qcom_scm_cpu_power_down(u32 flags) {}
>  static inline u32 qcom_scm_get_version(void) { return 0; }
> +static inline u32
> +qcom_scm_set_remote_state(u32 state,u32 id) { return -ENODEV; }
>  #endif
> -
>  #endif
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] firmware: qcom_scm: Add set remote state API
  2017-01-24  9:54       ` Stanimir Varbanov
@ 2017-01-24 16:11         ` Andy Gross
  0 siblings, 0 replies; 34+ messages in thread
From: Andy Gross @ 2017-01-24 16:11 UTC (permalink / raw)
  To: Stanimir Varbanov
  Cc: freedreno, Jordan Crouse, linux-arm-msm, dri-devel, Stanimir Varbanov

On 24 January 2017 at 03:54, Stanimir Varbanov <svarbanov@mm-sol.com> wrote:
> Andy,
>
> Could you queue this patch for 4.11?

This patch is in the qcom-drivers-for-4.11 pull request I sent 9 hours
ago.  So you should be good.

https://www.spinics.net/lists/arm-kernel/msg557325.html

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2017-01-24 16:11 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-28 19:28 [PATCH 00/12] Adreno A5XX support Jordan Crouse
2016-11-28 19:28 ` [PATCH 01/12] drm/msm: gpu: Cut down the list of "generic" registers to the ones we use Jordan Crouse
     [not found] ` <1480361317-9937-1-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2016-11-28 19:28   ` [PATCH 02/12] drm/msm: gpu: Return error on hw_init failure Jordan Crouse
2016-11-28 19:28   ` [PATCH 05/12] drm/msm: gpu: Add OUT_TYPE4 and OUT_TYPE7 Jordan Crouse
2016-11-28 19:28   ` [PATCH 06/12] drm/msm: Remove 'src_clk' from adreno configuration Jordan Crouse
2016-11-28 19:28 ` [PATCH 03/12] drm/msm: gpu Add new gpu register read/write functions Jordan Crouse
2016-11-28 19:28 ` [PATCH 04/12] drm/msm: Add adreno_gpu_write64() Jordan Crouse
2016-11-28 19:28 ` [PATCH 07/12] drm/msm: Disable interrupts during init Jordan Crouse
2016-11-28 19:28 ` [PATCH 08/12] drm/msm: gpu: Add A5XX target support Jordan Crouse
2016-11-28 19:28 ` [PATCH 09/12] drm/msm: gpu: Add support for the GPMU Jordan Crouse
2016-11-28 19:28 ` [PATCH 10/12] firmware: qcom_scm: Add qcom_scm_gpu_zap_resume() Jordan Crouse
2017-01-13 17:12   ` Andy Gross
     [not found]     ` <20170113171241.GH5710-3KkwrOJo9xYlRp7syxWybdHuzzzSOjJt@public.gmane.org>
2017-01-13 17:22       ` Jordan Crouse
     [not found]         ` <20170113172244.GA28592-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
2017-01-13 17:45           ` Andy Gross
2017-01-13 23:24     ` [Freedreno] " Jordan Crouse
     [not found]       ` <20170113232438.GA24139-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
2017-01-15  3:49         ` Andy Gross
2017-01-15  5:20           ` [Freedreno] " Andy Gross
2017-01-16 15:13             ` Stanimir Varbanov
2017-01-17 17:04             ` Jordan Crouse
     [not found]               ` <20170117170459.GA29647-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
2017-01-17 19:31                 ` Andy Gross
     [not found]   ` <1480361317-9937-11-git-send-email-jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2017-01-17  5:56     ` [PATCH] firmware: qcom_scm: Add set remote state API Andy Gross
     [not found]       ` <1484632578-4539-1-git-send-email-andy.gross-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2017-01-18 16:51         ` Jordan Crouse
2017-01-18 17:37       ` Stanimir Varbanov
2017-01-24  9:54       ` Stanimir Varbanov
2017-01-24 16:11         ` Andy Gross
2016-11-28 19:28 ` [PATCH 11/12] drm/msm: Add a quick and dirty PIL loader Jordan Crouse
2016-12-05 19:57   ` Bjorn Andersson
2016-12-06 17:49     ` Jordan Crouse
2016-11-28 19:28 ` [PATCH 12/12] drm/msm: gpu: Use the zap shader on 5XX if we can Jordan Crouse
2016-12-05 19:57   ` Bjorn Andersson
2016-12-05 20:10     ` Bjorn Andersson
2016-12-06 15:35     ` Jordan Crouse
2016-12-06 16:37       ` [Freedreno] " Rob Clark
     [not found]       ` <20161206153501.GA25541-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>
2016-12-06 17:18         ` Bjorn Andersson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.