All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/12] Assorted KFD fixes
@ 2018-05-01 21:56 Felix Kuehling
       [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Felix Kuehling @ 2018-05-01 21:56 UTC (permalink / raw)
  To: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Felix Kuehling

These are some random patches I noticed when comparing amdkfd-next against
amd-kfd-staging.

Ben Goz (1):
  drm/amdkfd: Locking PM mutex while allocating IB buffer

Felix Kuehling (4):
  drm/amdkfd: Remove redundant include of amd-iommu.h
  drm/amdkfd: Fix signal handling performance again
  drm/amdkfd: Remove initialization of cp_hqd_ib_control on CIK
  drm/amdkfd: Add sanity checks in IRQ handlers

Jay Cornwall (2):
  drm/amdkfd: Reduce priority of context-saving waves before spin-wait
  drm/amdkfd: Use volatile MTYPE in default/alternate apertures

Oak Zeng (1):
  drm/amdkfd: Dump HQD of HIQ

Philip Yang (1):
  drm/amdkfd: use %px to print user space address instead of %p

Shaoyun Liu (1):
  drm/amdkfd: Remove queue node when destroy queue failed

Yong Zhao (2):
  drm/amdkfd: Separate trap handler assembly code and its hex values
  drm/amdkfd: Fix CP soft hang on APUs

 drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c   |  20 +-
 drivers/gpu/drm/amd/amdkfd/cik_regs.h              |   3 +-
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h     | 560 +++++++++++++++++++++
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm  | 274 +---------
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm  | 307 +----------
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            |   6 +-
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  12 +
 drivers/gpu/drm/amd/amdkfd/kfd_events.c            |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c    |  40 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |   4 -
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    |   7 +-
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  10 +-
 drivers/gpu/drm/amd/amdkfd/kfd_queue.c             |   8 +-
 14 files changed, 659 insertions(+), 596 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h

-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 01/12] drm/amdkfd: Dump HQD of HIQ
       [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-05-01 21:56   ` Felix Kuehling
  2018-05-01 21:56   ` [PATCH 02/12] drm/amdkfd: Reduce priority of context-saving waves before spin-wait Felix Kuehling
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Felix Kuehling @ 2018-05-01 21:56 UTC (permalink / raw)
  To: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Oak Zeng, Felix Kuehling

From: Oak Zeng <Oak.Zeng@amd.com>

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 9af94b1..668ad07 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1713,6 +1713,18 @@ int dqm_debugfs_hqds(struct seq_file *m, void *data)
 	int pipe, queue;
 	int r = 0;
 
+	r = dqm->dev->kfd2kgd->hqd_dump(dqm->dev->kgd,
+		KFD_CIK_HIQ_PIPE, KFD_CIK_HIQ_QUEUE, &dump, &n_regs);
+	if (!r) {
+		seq_printf(m, "  HIQ on MEC %d Pipe %d Queue %d\n",
+				KFD_CIK_HIQ_PIPE/get_pipes_per_mec(dqm)+1,
+				KFD_CIK_HIQ_PIPE%get_pipes_per_mec(dqm),
+				KFD_CIK_HIQ_QUEUE);
+		seq_reg_dump(m, dump, n_regs);
+
+		kfree(dump);
+	}
+
 	for (pipe = 0; pipe < get_pipes_per_mec(dqm); pipe++) {
 		int pipe_offset = pipe * get_queues_per_pipe(dqm);
 
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 02/12] drm/amdkfd: Reduce priority of context-saving waves before spin-wait
       [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-05-01 21:56   ` [PATCH 01/12] drm/amdkfd: Dump HQD of HIQ Felix Kuehling
@ 2018-05-01 21:56   ` Felix Kuehling
  2018-05-01 21:56   ` [PATCH 03/12] drm/amdkfd: Use volatile MTYPE in default/alternate apertures Felix Kuehling
                     ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Felix Kuehling @ 2018-05-01 21:56 UTC (permalink / raw)
  To: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Felix Kuehling, Jay Cornwall

From: Jay Cornwall <Jay.Cornwall@amd.com>

Synchronization between context-saving wavefronts is achieved by
sending a SAVEWAVE message to the SPI and then spin-waiting for a
response. These spin-waiting wavefronts may inhibit the progress
of other wavefronts in the context save handler, leading to the
synchronization condition never being achieved.

Before spin-waiting reduce the priority of each wavefront to
guarantee foward progress in the others.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm | 10 ++++++++--
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm |  8 +++++++-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
index 997a383d..34eabcd 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
@@ -98,6 +98,7 @@ var SWIZZLE_EN                      =   0                   //whether we use swi
 /**************************************************************************/
 var SQ_WAVE_STATUS_INST_ATC_SHIFT  = 23
 var SQ_WAVE_STATUS_INST_ATC_MASK   = 0x00800000
+var SQ_WAVE_STATUS_SPI_PRIO_SHIFT  = 1
 var SQ_WAVE_STATUS_SPI_PRIO_MASK   = 0x00000006
 
 var SQ_WAVE_LDS_ALLOC_LDS_SIZE_SHIFT    = 12
@@ -319,6 +320,10 @@ end
         s_sendmsg   sendmsg(MSG_SAVEWAVE)  //send SPI a message and wait for SPI's write to EXEC
     end
 
+    // Set SPI_PRIO=2 to avoid starving instruction fetch in the waves we're waiting for.
+    s_or_b32 s_save_tmp, s_save_status, (2 << SQ_WAVE_STATUS_SPI_PRIO_SHIFT)
+    s_setreg_b32 hwreg(HW_REG_STATUS), s_save_tmp
+
   L_SLEEP:
     s_sleep 0x2                // sleep 1 (64clk) is not enough for 8 waves per SIMD, which will cause SQ hang, since the 7,8th wave could not get arbit to exec inst, while other waves are stuck into the sleep-loop and waiting for wrexec!=0
 
@@ -1132,7 +1137,7 @@ end
 #endif
 
 static const uint32_t cwsr_trap_gfx8_hex[] = {
-	0xbf820001, 0xbf820123,
+	0xbf820001, 0xbf820125,
 	0xb8f4f802, 0x89748674,
 	0xb8f5f803, 0x8675ff75,
 	0x00000400, 0xbf850011,
@@ -1158,7 +1163,8 @@ static const uint32_t cwsr_trap_gfx8_hex[] = {
 	0x867aff7a, 0x00007fff,
 	0xb97af807, 0xbef2007e,
 	0xbef3007f, 0xbefe0180,
-	0xbf900004, 0xbf8e0002,
+	0xbf900004, 0x877a8474,
+	0xb97af802, 0xbf8e0002,
 	0xbf88fffe, 0xbef8007e,
 	0x8679ff7f, 0x0000ffff,
 	0x8779ff79, 0x00040000,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
index da09794..8fc3698 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
@@ -97,6 +97,7 @@ var ACK_SQC_STORE		    =	1		    //workaround for suspected SQC store bug causing
 /**************************************************************************/
 var SQ_WAVE_STATUS_INST_ATC_SHIFT  = 23
 var SQ_WAVE_STATUS_INST_ATC_MASK   = 0x00800000
+var SQ_WAVE_STATUS_SPI_PRIO_SHIFT  = 1
 var SQ_WAVE_STATUS_SPI_PRIO_MASK   = 0x00000006
 var SQ_WAVE_STATUS_HALT_MASK       = 0x2000
 
@@ -362,6 +363,10 @@ end
 	s_sendmsg   sendmsg(MSG_SAVEWAVE)  //send SPI a message and wait for SPI's write to EXEC
     end
 
+    // Set SPI_PRIO=2 to avoid starving instruction fetch in the waves we're waiting for.
+    s_or_b32 s_save_tmp, s_save_status, (2 << SQ_WAVE_STATUS_SPI_PRIO_SHIFT)
+    s_setreg_b32 hwreg(HW_REG_STATUS), s_save_tmp
+
   L_SLEEP:
     s_sleep 0x2		       // sleep 1 (64clk) is not enough for 8 waves per SIMD, which will cause SQ hang, since the 7,8th wave could not get arbit to exec inst, while other waves are stuck into the sleep-loop and waiting for wrexec!=0
 
@@ -1210,7 +1215,7 @@ end
 #endif
 
 static const uint32_t cwsr_trap_gfx9_hex[] = {
-	0xbf820001, 0xbf820158,
+	0xbf820001, 0xbf82015a,
 	0xb8f8f802, 0x89788678,
 	0xb8f1f803, 0x866eff71,
 	0x00000400, 0xbf850034,
@@ -1249,6 +1254,7 @@ static const uint32_t cwsr_trap_gfx9_hex[] = {
 	0x00007fff, 0xb970f807,
 	0xbeee007e, 0xbeef007f,
 	0xbefe0180, 0xbf900004,
+	0x87708478, 0xb970f802,
 	0xbf8e0002, 0xbf88fffe,
 	0xb8f02a05, 0x80708170,
 	0x8e708a70, 0xb8f11605,
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 03/12] drm/amdkfd: Use volatile MTYPE in default/alternate apertures
       [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-05-01 21:56   ` [PATCH 01/12] drm/amdkfd: Dump HQD of HIQ Felix Kuehling
  2018-05-01 21:56   ` [PATCH 02/12] drm/amdkfd: Reduce priority of context-saving waves before spin-wait Felix Kuehling
@ 2018-05-01 21:56   ` Felix Kuehling
  2018-05-01 21:56   ` [PATCH 04/12] drm/amdkfd: use %px to print user space address instead of %p Felix Kuehling
                     ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Felix Kuehling @ 2018-05-01 21:56 UTC (permalink / raw)
  To: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Felix Kuehling, Jay Cornwall

From: Jay Cornwall <Jay.Cornwall@amd.com>

MTYPE_NC_NV (0) marks scalar/vector L1 cache lines as non-volatile.
Cache lines loaded through these apertures are intended to be
invalidated before (and sometimes during) a dispatch. The non-volatile
qualifier prevents these cache lines from being distinguished from
those loaded through the private aperture.

Use MTYPE_NC (1) instead on both Gfx7 and Gfx8. This allows the
compiler to use the BUFFER_WBINVL1_VOL instruction and is a precursor
to automatic per-dispatch scalar/vector L1 volatile invalidation.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/cik_regs.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cik_regs.h b/drivers/gpu/drm/amd/amdkfd/cik_regs.h
index 48769d1..37ce6dd 100644
--- a/drivers/gpu/drm/amd/amdkfd/cik_regs.h
+++ b/drivers/gpu/drm/amd/amdkfd/cik_regs.h
@@ -33,7 +33,8 @@
 #define	APE1_MTYPE(x)					((x) << 7)
 
 /* valid for both DEFAULT_MTYPE and APE1_MTYPE */
-#define	MTYPE_CACHED					0
+#define	MTYPE_CACHED_NV					0
+#define	MTYPE_CACHED					1
 #define	MTYPE_NONCACHED					3
 
 #define	DEFAULT_CP_HQD_PERSISTENT_STATE			(0x33U << 8)
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 04/12] drm/amdkfd: use %px to print user space address instead of %p
       [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (2 preceding siblings ...)
  2018-05-01 21:56   ` [PATCH 03/12] drm/amdkfd: Use volatile MTYPE in default/alternate apertures Felix Kuehling
@ 2018-05-01 21:56   ` Felix Kuehling
  2018-05-01 21:56   ` [PATCH 05/12] drm/amdkfd: Remove redundant include of amd-iommu.h Felix Kuehling
                     ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Felix Kuehling @ 2018-05-01 21:56 UTC (permalink / raw)
  To: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Philip Yang, Felix Kuehling

From: Philip Yang <Philip.Yang@amd.com>

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_queue.c   | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 1a4d8dc..4ced5e9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -233,7 +233,7 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
 	pr_debug("Queue Size: 0x%llX, %u\n",
 			q_properties->queue_size, args->ring_size);
 
-	pr_debug("Queue r/w Pointers: %p, %p\n",
+	pr_debug("Queue r/w Pointers: %px, %px\n",
 			q_properties->read_ptr,
 			q_properties->write_ptr);
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
index a5315d4..6dcd621 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
@@ -36,8 +36,8 @@ void print_queue_properties(struct queue_properties *q)
 	pr_debug("Queue Address: 0x%llX\n", q->queue_address);
 	pr_debug("Queue Id: %u\n", q->queue_id);
 	pr_debug("Queue Process Vmid: %u\n", q->vmid);
-	pr_debug("Queue Read Pointer: 0x%p\n", q->read_ptr);
-	pr_debug("Queue Write Pointer: 0x%p\n", q->write_ptr);
+	pr_debug("Queue Read Pointer: 0x%px\n", q->read_ptr);
+	pr_debug("Queue Write Pointer: 0x%px\n", q->write_ptr);
 	pr_debug("Queue Doorbell Pointer: 0x%p\n", q->doorbell_ptr);
 	pr_debug("Queue Doorbell Offset: %u\n", q->doorbell_off);
 }
@@ -53,8 +53,8 @@ void print_queue(struct queue *q)
 	pr_debug("Queue Address: 0x%llX\n", q->properties.queue_address);
 	pr_debug("Queue Id: %u\n", q->properties.queue_id);
 	pr_debug("Queue Process Vmid: %u\n", q->properties.vmid);
-	pr_debug("Queue Read Pointer: 0x%p\n", q->properties.read_ptr);
-	pr_debug("Queue Write Pointer: 0x%p\n", q->properties.write_ptr);
+	pr_debug("Queue Read Pointer: 0x%px\n", q->properties.read_ptr);
+	pr_debug("Queue Write Pointer: 0x%px\n", q->properties.write_ptr);
 	pr_debug("Queue Doorbell Pointer: 0x%p\n", q->properties.doorbell_ptr);
 	pr_debug("Queue Doorbell Offset: %u\n", q->properties.doorbell_off);
 	pr_debug("Queue MQD Address: 0x%p\n", q->mqd);
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 05/12] drm/amdkfd: Remove redundant include of amd-iommu.h
       [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (3 preceding siblings ...)
  2018-05-01 21:56   ` [PATCH 04/12] drm/amdkfd: use %px to print user space address instead of %p Felix Kuehling
@ 2018-05-01 21:56   ` Felix Kuehling
  2018-05-01 21:56   ` [PATCH 06/12] drm/amdkfd: Separate trap handler assembly code and its hex values Felix Kuehling
                     ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Felix Kuehling @ 2018-05-01 21:56 UTC (permalink / raw)
  To: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Felix Kuehling

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index fb4a72d..17de4ac 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -20,9 +20,6 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
-#include <linux/amd-iommu.h>
-#endif
 #include <linux/bsearch.h>
 #include <linux/pci.h>
 #include <linux/slab.h>
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 06/12] drm/amdkfd: Separate trap handler assembly code and its hex values
       [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (4 preceding siblings ...)
  2018-05-01 21:56   ` [PATCH 05/12] drm/amdkfd: Remove redundant include of amd-iommu.h Felix Kuehling
@ 2018-05-01 21:56   ` Felix Kuehling
  2018-05-01 21:56   ` [PATCH 07/12] drm/amdkfd: Fix CP soft hang on APUs Felix Kuehling
                     ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Felix Kuehling @ 2018-05-01 21:56 UTC (permalink / raw)
  To: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Yong Zhao, Felix Kuehling

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 33616 bytes --]

From: Yong Zhao <yong.zhao-5C7GfCeVMHo@public.gmane.org>

Since the assembly code is inside "#if 0", it is ineffective. Despite that,
during debugging, we need to change the assembly code, extract it into
a separate file and compile the new file into hex values using sp3.
That process also requires us to remove "#if 0" and modify lines starting
with "#", so that sp3 can successfully compile the new file.

With this change, all the above chore is no longer needed, and
cwsr_trap_handler_gfx*.asm can be directly used by sp3 to generate its
hex values.

Signed-off-by: Yong Zhao <yong.zhao-5C7GfCeVMHo@public.gmane.org>
Reviewed-by: Felix Kuehling <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
Signed-off-by: Felix Kuehling <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h     | 560 +++++++++++++++++++++
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm  | 267 +---------
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm  | 300 +----------
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            |   3 +-
 4 files changed, 575 insertions(+), 555 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
new file mode 100644
index 0000000..a546a21
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -0,0 +1,560 @@
+/*
+ * Copyright 2018 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+static const uint32_t cwsr_trap_gfx8_hex[] = {
+	0xbf820001, 0xbf820125,
+	0xb8f4f802, 0x89748674,
+	0xb8f5f803, 0x8675ff75,
+	0x00000400, 0xbf850011,
+	0xc00a1e37, 0x00000000,
+	0xbf8c007f, 0x87777978,
+	0xbf840002, 0xb974f802,
+	0xbe801d78, 0xb8f5f803,
+	0x8675ff75, 0x000001ff,
+	0xbf850002, 0x80708470,
+	0x82718071, 0x8671ff71,
+	0x0000ffff, 0xb974f802,
+	0xbe801f70, 0xb8f5f803,
+	0x8675ff75, 0x00000100,
+	0xbf840006, 0xbefa0080,
+	0xb97a0203, 0x8671ff71,
+	0x0000ffff, 0x80f08870,
+	0x82f18071, 0xbefa0080,
+	0xb97a0283, 0xbef60068,
+	0xbef70069, 0xb8fa1c07,
+	0x8e7a9c7a, 0x87717a71,
+	0xb8fa03c7, 0x8e7a9b7a,
+	0x87717a71, 0xb8faf807,
+	0x867aff7a, 0x00007fff,
+	0xb97af807, 0xbef2007e,
+	0xbef3007f, 0xbefe0180,
+	0xbf900004, 0x877a8474,
+	0xb97af802, 0xbf8e0002,
+	0xbf88fffe, 0xbef8007e,
+	0x8679ff7f, 0x0000ffff,
+	0x8779ff79, 0x00040000,
+	0xbefa0080, 0xbefb00ff,
+	0x00807fac, 0x867aff7f,
+	0x08000000, 0x8f7a837a,
+	0x877b7a7b, 0x867aff7f,
+	0x70000000, 0x8f7a817a,
+	0x877b7a7b, 0xbeef007c,
+	0xbeee0080, 0xb8ee2a05,
+	0x806e816e, 0x8e6e8a6e,
+	0xb8fa1605, 0x807a817a,
+	0x8e7a867a, 0x806e7a6e,
+	0xbefa0084, 0xbefa00ff,
+	0x01000000, 0xbefe007c,
+	0xbefc006e, 0xc0611bfc,
+	0x0000007c, 0x806e846e,
+	0xbefc007e, 0xbefe007c,
+	0xbefc006e, 0xc0611c3c,
+	0x0000007c, 0x806e846e,
+	0xbefc007e, 0xbefe007c,
+	0xbefc006e, 0xc0611c7c,
+	0x0000007c, 0x806e846e,
+	0xbefc007e, 0xbefe007c,
+	0xbefc006e, 0xc0611cbc,
+	0x0000007c, 0x806e846e,
+	0xbefc007e, 0xbefe007c,
+	0xbefc006e, 0xc0611cfc,
+	0x0000007c, 0x806e846e,
+	0xbefc007e, 0xbefe007c,
+	0xbefc006e, 0xc0611d3c,
+	0x0000007c, 0x806e846e,
+	0xbefc007e, 0xb8f5f803,
+	0xbefe007c, 0xbefc006e,
+	0xc0611d7c, 0x0000007c,
+	0x806e846e, 0xbefc007e,
+	0xbefe007c, 0xbefc006e,
+	0xc0611dbc, 0x0000007c,
+	0x806e846e, 0xbefc007e,
+	0xbefe007c, 0xbefc006e,
+	0xc0611dfc, 0x0000007c,
+	0x806e846e, 0xbefc007e,
+	0xb8eff801, 0xbefe007c,
+	0xbefc006e, 0xc0611bfc,
+	0x0000007c, 0x806e846e,
+	0xbefc007e, 0xbefe007c,
+	0xbefc006e, 0xc0611b3c,
+	0x0000007c, 0x806e846e,
+	0xbefc007e, 0xbefe007c,
+	0xbefc006e, 0xc0611b7c,
+	0x0000007c, 0x806e846e,
+	0xbefc007e, 0x867aff7f,
+	0x04000000, 0xbef30080,
+	0x8773737a, 0xb8ee2a05,
+	0x806e816e, 0x8e6e8a6e,
+	0xb8f51605, 0x80758175,
+	0x8e758475, 0x8e7a8275,
+	0xbefa00ff, 0x01000000,
+	0xbef60178, 0x80786e78,
+	0x82798079, 0xbefc0080,
+	0xbe802b00, 0xbe822b02,
+	0xbe842b04, 0xbe862b06,
+	0xbe882b08, 0xbe8a2b0a,
+	0xbe8c2b0c, 0xbe8e2b0e,
+	0xc06b003c, 0x00000000,
+	0xc06b013c, 0x00000010,
+	0xc06b023c, 0x00000020,
+	0xc06b033c, 0x00000030,
+	0x8078c078, 0x82798079,
+	0x807c907c, 0xbf0a757c,
+	0xbf85ffeb, 0xbef80176,
+	0xbeee0080, 0xbefe00c1,
+	0xbeff00c1, 0xbefa00ff,
+	0x01000000, 0xe0724000,
+	0x6e1e0000, 0xe0724100,
+	0x6e1e0100, 0xe0724200,
+	0x6e1e0200, 0xe0724300,
+	0x6e1e0300, 0xbefe00c1,
+	0xbeff00c1, 0xb8f54306,
+	0x8675c175, 0xbf84002c,
+	0xbf8a0000, 0x867aff73,
+	0x04000000, 0xbf840028,
+	0x8e758675, 0x8e758275,
+	0xbefa0075, 0xb8ee2a05,
+	0x806e816e, 0x8e6e8a6e,
+	0xb8fa1605, 0x807a817a,
+	0x8e7a867a, 0x806e7a6e,
+	0x806eff6e, 0x00000080,
+	0xbefa00ff, 0x01000000,
+	0xbefc0080, 0xd28c0002,
+	0x000100c1, 0xd28d0003,
+	0x000204c1, 0xd1060002,
+	0x00011103, 0x7e0602ff,
+	0x00000200, 0xbefc00ff,
+	0x00010000, 0xbe80007b,
+	0x867bff7b, 0xff7fffff,
+	0x877bff7b, 0x00058000,
+	0xd8ec0000, 0x00000002,
+	0xbf8c007f, 0xe0765000,
+	0x6e1e0002, 0x32040702,
+	0xd0c9006a, 0x0000eb02,
+	0xbf87fff7, 0xbefb0000,
+	0xbeee00ff, 0x00000400,
+	0xbefe00c1, 0xbeff00c1,
+	0xb8f52a05, 0x80758175,
+	0x8e758275, 0x8e7a8875,
+	0xbefa00ff, 0x01000000,
+	0xbefc0084, 0xbf0a757c,
+	0xbf840015, 0xbf11017c,
+	0x8075ff75, 0x00001000,
+	0x7e000300, 0x7e020301,
+	0x7e040302, 0x7e060303,
+	0xe0724000, 0x6e1e0000,
+	0xe0724100, 0x6e1e0100,
+	0xe0724200, 0x6e1e0200,
+	0xe0724300, 0x6e1e0300,
+	0x807c847c, 0x806eff6e,
+	0x00000400, 0xbf0a757c,
+	0xbf85ffef, 0xbf9c0000,
+	0xbf8200ca, 0xbef8007e,
+	0x8679ff7f, 0x0000ffff,
+	0x8779ff79, 0x00040000,
+	0xbefa0080, 0xbefb00ff,
+	0x00807fac, 0x8676ff7f,
+	0x08000000, 0x8f768376,
+	0x877b767b, 0x8676ff7f,
+	0x70000000, 0x8f768176,
+	0x877b767b, 0x8676ff7f,
+	0x04000000, 0xbf84001e,
+	0xbefe00c1, 0xbeff00c1,
+	0xb8f34306, 0x8673c173,
+	0xbf840019, 0x8e738673,
+	0x8e738273, 0xbefa0073,
+	0xb8f22a05, 0x80728172,
+	0x8e728a72, 0xb8f61605,
+	0x80768176, 0x8e768676,
+	0x80727672, 0x8072ff72,
+	0x00000080, 0xbefa00ff,
+	0x01000000, 0xbefc0080,
+	0xe0510000, 0x721e0000,
+	0xe0510100, 0x721e0000,
+	0x807cff7c, 0x00000200,
+	0x8072ff72, 0x00000200,
+	0xbf0a737c, 0xbf85fff6,
+	0xbef20080, 0xbefe00c1,
+	0xbeff00c1, 0xb8f32a05,
+	0x80738173, 0x8e738273,
+	0x8e7a8873, 0xbefa00ff,
+	0x01000000, 0xbef60072,
+	0x8072ff72, 0x00000400,
+	0xbefc0084, 0xbf11087c,
+	0x8073ff73, 0x00008000,
+	0xe0524000, 0x721e0000,
+	0xe0524100, 0x721e0100,
+	0xe0524200, 0x721e0200,
+	0xe0524300, 0x721e0300,
+	0xbf8c0f70, 0x7e000300,
+	0x7e020301, 0x7e040302,
+	0x7e060303, 0x807c847c,
+	0x8072ff72, 0x00000400,
+	0xbf0a737c, 0xbf85ffee,
+	0xbf9c0000, 0xe0524000,
+	0x761e0000, 0xe0524100,
+	0x761e0100, 0xe0524200,
+	0x761e0200, 0xe0524300,
+	0x761e0300, 0xb8f22a05,
+	0x80728172, 0x8e728a72,
+	0xb8f61605, 0x80768176,
+	0x8e768676, 0x80727672,
+	0x80f2c072, 0xb8f31605,
+	0x80738173, 0x8e738473,
+	0x8e7a8273, 0xbefa00ff,
+	0x01000000, 0xbefc0073,
+	0xc031003c, 0x00000072,
+	0x80f2c072, 0xbf8c007f,
+	0x80fc907c, 0xbe802d00,
+	0xbe822d02, 0xbe842d04,
+	0xbe862d06, 0xbe882d08,
+	0xbe8a2d0a, 0xbe8c2d0c,
+	0xbe8e2d0e, 0xbf06807c,
+	0xbf84fff1, 0xb8f22a05,
+	0x80728172, 0x8e728a72,
+	0xb8f61605, 0x80768176,
+	0x8e768676, 0x80727672,
+	0xbefa0084, 0xbefa00ff,
+	0x01000000, 0xc0211cfc,
+	0x00000072, 0x80728472,
+	0xc0211c3c, 0x00000072,
+	0x80728472, 0xc0211c7c,
+	0x00000072, 0x80728472,
+	0xc0211bbc, 0x00000072,
+	0x80728472, 0xc0211bfc,
+	0x00000072, 0x80728472,
+	0xc0211d3c, 0x00000072,
+	0x80728472, 0xc0211d7c,
+	0x00000072, 0x80728472,
+	0xc0211a3c, 0x00000072,
+	0x80728472, 0xc0211a7c,
+	0x00000072, 0x80728472,
+	0xc0211dfc, 0x00000072,
+	0x80728472, 0xc0211b3c,
+	0x00000072, 0x80728472,
+	0xc0211b7c, 0x00000072,
+	0x80728472, 0xbf8c007f,
+	0x8671ff71, 0x0000ffff,
+	0xbefc0073, 0xbefe006e,
+	0xbeff006f, 0x867375ff,
+	0x000003ff, 0xb9734803,
+	0x867375ff, 0xfffff800,
+	0x8f738b73, 0xb973a2c3,
+	0xb977f801, 0x8673ff71,
+	0xf0000000, 0x8f739c73,
+	0x8e739073, 0xbef60080,
+	0x87767376, 0x8673ff71,
+	0x08000000, 0x8f739b73,
+	0x8e738f73, 0x87767376,
+	0x8673ff74, 0x00800000,
+	0x8f739773, 0xb976f807,
+	0x86fe7e7e, 0x86ea6a6a,
+	0xb974f802, 0xbf8a0000,
+	0x95807370, 0xbf810000,
+};
+
+
+static const uint32_t cwsr_trap_gfx9_hex[] = {
+	0xbf820001, 0xbf82015a,
+	0xb8f8f802, 0x89788678,
+	0xb8f1f803, 0x866eff71,
+	0x00000400, 0xbf850034,
+	0x866eff71, 0x00000800,
+	0xbf850003, 0x866eff71,
+	0x00000100, 0xbf840008,
+	0x866eff78, 0x00002000,
+	0xbf840001, 0xbf810000,
+	0x8778ff78, 0x00002000,
+	0x80ec886c, 0x82ed806d,
+	0xb8eef807, 0x866fff6e,
+	0x001f8000, 0x8e6f8b6f,
+	0x8977ff77, 0xfc000000,
+	0x87776f77, 0x896eff6e,
+	0x001f8000, 0xb96ef807,
+	0xb8f0f812, 0xb8f1f813,
+	0x8ef08870, 0xc0071bb8,
+	0x00000000, 0xbf8cc07f,
+	0xc0071c38, 0x00000008,
+	0xbf8cc07f, 0x86ee6e6e,
+	0xbf840001, 0xbe801d6e,
+	0xb8f1f803, 0x8671ff71,
+	0x000001ff, 0xbf850002,
+	0x806c846c, 0x826d806d,
+	0x866dff6d, 0x0000ffff,
+	0x8f6e8b77, 0x866eff6e,
+	0x001f8000, 0xb96ef807,
+	0x86fe7e7e, 0x86ea6a6a,
+	0xb978f802, 0xbe801f6c,
+	0x866dff6d, 0x0000ffff,
+	0xbef00080, 0xb9700283,
+	0xb8f02407, 0x8e709c70,
+	0x876d706d, 0xb8f003c7,
+	0x8e709b70, 0x876d706d,
+	0xb8f0f807, 0x8670ff70,
+	0x00007fff, 0xb970f807,
+	0xbeee007e, 0xbeef007f,
+	0xbefe0180, 0xbf900004,
+	0x87708478, 0xb970f802,
+	0xbf8e0002, 0xbf88fffe,
+	0xb8f02a05, 0x80708170,
+	0x8e708a70, 0xb8f11605,
+	0x80718171, 0x8e718671,
+	0x80707170, 0x80707e70,
+	0x8271807f, 0x8671ff71,
+	0x0000ffff, 0xc0471cb8,
+	0x00000040, 0xbf8cc07f,
+	0xc04b1d38, 0x00000048,
+	0xbf8cc07f, 0xc0431e78,
+	0x00000058, 0xbf8cc07f,
+	0xc0471eb8, 0x0000005c,
+	0xbf8cc07f, 0xbef4007e,
+	0x8675ff7f, 0x0000ffff,
+	0x8775ff75, 0x00040000,
+	0xbef60080, 0xbef700ff,
+	0x00807fac, 0x8670ff7f,
+	0x08000000, 0x8f708370,
+	0x87777077, 0x8670ff7f,
+	0x70000000, 0x8f708170,
+	0x87777077, 0xbefb007c,
+	0xbefa0080, 0xb8fa2a05,
+	0x807a817a, 0x8e7a8a7a,
+	0xb8f01605, 0x80708170,
+	0x8e708670, 0x807a707a,
+	0xbef60084, 0xbef600ff,
+	0x01000000, 0xbefe007c,
+	0xbefc007a, 0xc0611efa,
+	0x0000007c, 0xbf8cc07f,
+	0x807a847a, 0xbefc007e,
+	0xbefe007c, 0xbefc007a,
+	0xc0611b3a, 0x0000007c,
+	0xbf8cc07f, 0x807a847a,
+	0xbefc007e, 0xbefe007c,
+	0xbefc007a, 0xc0611b7a,
+	0x0000007c, 0xbf8cc07f,
+	0x807a847a, 0xbefc007e,
+	0xbefe007c, 0xbefc007a,
+	0xc0611bba, 0x0000007c,
+	0xbf8cc07f, 0x807a847a,
+	0xbefc007e, 0xbefe007c,
+	0xbefc007a, 0xc0611bfa,
+	0x0000007c, 0xbf8cc07f,
+	0x807a847a, 0xbefc007e,
+	0xbefe007c, 0xbefc007a,
+	0xc0611e3a, 0x0000007c,
+	0xbf8cc07f, 0x807a847a,
+	0xbefc007e, 0xb8f1f803,
+	0xbefe007c, 0xbefc007a,
+	0xc0611c7a, 0x0000007c,
+	0xbf8cc07f, 0x807a847a,
+	0xbefc007e, 0xbefe007c,
+	0xbefc007a, 0xc0611a3a,
+	0x0000007c, 0xbf8cc07f,
+	0x807a847a, 0xbefc007e,
+	0xbefe007c, 0xbefc007a,
+	0xc0611a7a, 0x0000007c,
+	0xbf8cc07f, 0x807a847a,
+	0xbefc007e, 0xb8fbf801,
+	0xbefe007c, 0xbefc007a,
+	0xc0611efa, 0x0000007c,
+	0xbf8cc07f, 0x807a847a,
+	0xbefc007e, 0x8670ff7f,
+	0x04000000, 0xbeef0080,
+	0x876f6f70, 0xb8fa2a05,
+	0x807a817a, 0x8e7a8a7a,
+	0xb8f11605, 0x80718171,
+	0x8e718471, 0x8e768271,
+	0xbef600ff, 0x01000000,
+	0xbef20174, 0x80747a74,
+	0x82758075, 0xbefc0080,
+	0xbf800000, 0xbe802b00,
+	0xbe822b02, 0xbe842b04,
+	0xbe862b06, 0xbe882b08,
+	0xbe8a2b0a, 0xbe8c2b0c,
+	0xbe8e2b0e, 0xc06b003a,
+	0x00000000, 0xbf8cc07f,
+	0xc06b013a, 0x00000010,
+	0xbf8cc07f, 0xc06b023a,
+	0x00000020, 0xbf8cc07f,
+	0xc06b033a, 0x00000030,
+	0xbf8cc07f, 0x8074c074,
+	0x82758075, 0x807c907c,
+	0xbf0a717c, 0xbf85ffe7,
+	0xbef40172, 0xbefa0080,
+	0xbefe00c1, 0xbeff00c1,
+	0xbee80080, 0xbee90080,
+	0xbef600ff, 0x01000000,
+	0xe0724000, 0x7a1d0000,
+	0xe0724100, 0x7a1d0100,
+	0xe0724200, 0x7a1d0200,
+	0xe0724300, 0x7a1d0300,
+	0xbefe00c1, 0xbeff00c1,
+	0xb8f14306, 0x8671c171,
+	0xbf84002c, 0xbf8a0000,
+	0x8670ff6f, 0x04000000,
+	0xbf840028, 0x8e718671,
+	0x8e718271, 0xbef60071,
+	0xb8fa2a05, 0x807a817a,
+	0x8e7a8a7a, 0xb8f01605,
+	0x80708170, 0x8e708670,
+	0x807a707a, 0x807aff7a,
+	0x00000080, 0xbef600ff,
+	0x01000000, 0xbefc0080,
+	0xd28c0002, 0x000100c1,
+	0xd28d0003, 0x000204c1,
+	0xd1060002, 0x00011103,
+	0x7e0602ff, 0x00000200,
+	0xbefc00ff, 0x00010000,
+	0xbe800077, 0x8677ff77,
+	0xff7fffff, 0x8777ff77,
+	0x00058000, 0xd8ec0000,
+	0x00000002, 0xbf8cc07f,
+	0xe0765000, 0x7a1d0002,
+	0x68040702, 0xd0c9006a,
+	0x0000e302, 0xbf87fff7,
+	0xbef70000, 0xbefa00ff,
+	0x00000400, 0xbefe00c1,
+	0xbeff00c1, 0xb8f12a05,
+	0x80718171, 0x8e718271,
+	0x8e768871, 0xbef600ff,
+	0x01000000, 0xbefc0084,
+	0xbf0a717c, 0xbf840015,
+	0xbf11017c, 0x8071ff71,
+	0x00001000, 0x7e000300,
+	0x7e020301, 0x7e040302,
+	0x7e060303, 0xe0724000,
+	0x7a1d0000, 0xe0724100,
+	0x7a1d0100, 0xe0724200,
+	0x7a1d0200, 0xe0724300,
+	0x7a1d0300, 0x807c847c,
+	0x807aff7a, 0x00000400,
+	0xbf0a717c, 0xbf85ffef,
+	0xbf9c0000, 0xbf8200d9,
+	0xbef4007e, 0x8675ff7f,
+	0x0000ffff, 0x8775ff75,
+	0x00040000, 0xbef60080,
+	0xbef700ff, 0x00807fac,
+	0x866eff7f, 0x08000000,
+	0x8f6e836e, 0x87776e77,
+	0x866eff7f, 0x70000000,
+	0x8f6e816e, 0x87776e77,
+	0x866eff7f, 0x04000000,
+	0xbf84001e, 0xbefe00c1,
+	0xbeff00c1, 0xb8ef4306,
+	0x866fc16f, 0xbf840019,
+	0x8e6f866f, 0x8e6f826f,
+	0xbef6006f, 0xb8f82a05,
+	0x80788178, 0x8e788a78,
+	0xb8ee1605, 0x806e816e,
+	0x8e6e866e, 0x80786e78,
+	0x8078ff78, 0x00000080,
+	0xbef600ff, 0x01000000,
+	0xbefc0080, 0xe0510000,
+	0x781d0000, 0xe0510100,
+	0x781d0000, 0x807cff7c,
+	0x00000200, 0x8078ff78,
+	0x00000200, 0xbf0a6f7c,
+	0xbf85fff6, 0xbef80080,
+	0xbefe00c1, 0xbeff00c1,
+	0xb8ef2a05, 0x806f816f,
+	0x8e6f826f, 0x8e76886f,
+	0xbef600ff, 0x01000000,
+	0xbeee0078, 0x8078ff78,
+	0x00000400, 0xbefc0084,
+	0xbf11087c, 0x806fff6f,
+	0x00008000, 0xe0524000,
+	0x781d0000, 0xe0524100,
+	0x781d0100, 0xe0524200,
+	0x781d0200, 0xe0524300,
+	0x781d0300, 0xbf8c0f70,
+	0x7e000300, 0x7e020301,
+	0x7e040302, 0x7e060303,
+	0x807c847c, 0x8078ff78,
+	0x00000400, 0xbf0a6f7c,
+	0xbf85ffee, 0xbf9c0000,
+	0xe0524000, 0x6e1d0000,
+	0xe0524100, 0x6e1d0100,
+	0xe0524200, 0x6e1d0200,
+	0xe0524300, 0x6e1d0300,
+	0xb8f82a05, 0x80788178,
+	0x8e788a78, 0xb8ee1605,
+	0x806e816e, 0x8e6e866e,
+	0x80786e78, 0x80f8c078,
+	0xb8ef1605, 0x806f816f,
+	0x8e6f846f, 0x8e76826f,
+	0xbef600ff, 0x01000000,
+	0xbefc006f, 0xc031003a,
+	0x00000078, 0x80f8c078,
+	0xbf8cc07f, 0x80fc907c,
+	0xbf800000, 0xbe802d00,
+	0xbe822d02, 0xbe842d04,
+	0xbe862d06, 0xbe882d08,
+	0xbe8a2d0a, 0xbe8c2d0c,
+	0xbe8e2d0e, 0xbf06807c,
+	0xbf84fff0, 0xb8f82a05,
+	0x80788178, 0x8e788a78,
+	0xb8ee1605, 0x806e816e,
+	0x8e6e866e, 0x80786e78,
+	0xbef60084, 0xbef600ff,
+	0x01000000, 0xc0211bfa,
+	0x00000078, 0x80788478,
+	0xc0211b3a, 0x00000078,
+	0x80788478, 0xc0211b7a,
+	0x00000078, 0x80788478,
+	0xc0211eba, 0x00000078,
+	0x80788478, 0xc0211efa,
+	0x00000078, 0x80788478,
+	0xc0211c3a, 0x00000078,
+	0x80788478, 0xc0211c7a,
+	0x00000078, 0x80788478,
+	0xc0211a3a, 0x00000078,
+	0x80788478, 0xc0211a7a,
+	0x00000078, 0x80788478,
+	0xc0211cfa, 0x00000078,
+	0x80788478, 0xbf8cc07f,
+	0x866dff6d, 0x0000ffff,
+	0xbefc006f, 0xbefe007a,
+	0xbeff007b, 0x866f71ff,
+	0x000003ff, 0xb96f4803,
+	0x866f71ff, 0xfffff800,
+	0x8f6f8b6f, 0xb96fa2c3,
+	0xb973f801, 0xb8ee2a05,
+	0x806e816e, 0x8e6e8a6e,
+	0xb8ef1605, 0x806f816f,
+	0x8e6f866f, 0x806e6f6e,
+	0x806e746e, 0x826f8075,
+	0x866fff6f, 0x0000ffff,
+	0xc0071cb7, 0x00000040,
+	0xc00b1d37, 0x00000048,
+	0xc0031e77, 0x00000058,
+	0xc0071eb7, 0x0000005c,
+	0xbf8cc07f, 0x866fff6d,
+	0xf0000000, 0x8f6f9c6f,
+	0x8e6f906f, 0xbeee0080,
+	0x876e6f6e, 0x866fff6d,
+	0x08000000, 0x8f6f9b6f,
+	0x8e6f8f6f, 0x876e6f6e,
+	0x866fff70, 0x00800000,
+	0x8f6f976f, 0xb96ef807,
+	0x86fe7e7e, 0x86ea6a6a,
+	0xb970f802, 0xbf8a0000,
+	0x95806f6c, 0xbf810000,
+};
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
index 34eabcd..658a4c6 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
@@ -20,9 +20,12 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-#if 0
-HW (VI) source code for CWSR trap handler
-#Version 18 + multiple trap handler
+/* To compile this assembly code:
+ * PROJECT=vi ./sp3 cwsr_trap_handler_gfx8.asm -hex tmp.hex
+ */
+
+/* HW (VI) source code for CWSR trap handler */
+/* Version 18 + multiple trap handler */
 
 // this performance-optimal version was originally from Seven Xu at SRDC
 
@@ -150,7 +153,7 @@ var s_save_spi_init_lo              =   exec_lo
 var s_save_spi_init_hi              =   exec_hi
 
                                                 //tba_lo and tba_hi need to be saved/restored
-var s_save_pc_lo            =   ttmp0           //{TTMP1, TTMP0} = {3??h0,pc_rewind[3:0], HT[0],trapID[7:0], PC[47:0]}
+var s_save_pc_lo            =   ttmp0           //{TTMP1, TTMP0} = {3'h0,pc_rewind[3:0], HT[0],trapID[7:0], PC[47:0]}
 var s_save_pc_hi            =   ttmp1
 var s_save_exec_lo          =   ttmp2
 var s_save_exec_hi          =   ttmp3
@@ -1132,259 +1135,3 @@ end
 function get_hwreg_size_bytes
     return 128 //HWREG size 128 bytes
 end
-
-
-#endif
-
-static const uint32_t cwsr_trap_gfx8_hex[] = {
-	0xbf820001, 0xbf820125,
-	0xb8f4f802, 0x89748674,
-	0xb8f5f803, 0x8675ff75,
-	0x00000400, 0xbf850011,
-	0xc00a1e37, 0x00000000,
-	0xbf8c007f, 0x87777978,
-	0xbf840002, 0xb974f802,
-	0xbe801d78, 0xb8f5f803,
-	0x8675ff75, 0x000001ff,
-	0xbf850002, 0x80708470,
-	0x82718071, 0x8671ff71,
-	0x0000ffff, 0xb974f802,
-	0xbe801f70, 0xb8f5f803,
-	0x8675ff75, 0x00000100,
-	0xbf840006, 0xbefa0080,
-	0xb97a0203, 0x8671ff71,
-	0x0000ffff, 0x80f08870,
-	0x82f18071, 0xbefa0080,
-	0xb97a0283, 0xbef60068,
-	0xbef70069, 0xb8fa1c07,
-	0x8e7a9c7a, 0x87717a71,
-	0xb8fa03c7, 0x8e7a9b7a,
-	0x87717a71, 0xb8faf807,
-	0x867aff7a, 0x00007fff,
-	0xb97af807, 0xbef2007e,
-	0xbef3007f, 0xbefe0180,
-	0xbf900004, 0x877a8474,
-	0xb97af802, 0xbf8e0002,
-	0xbf88fffe, 0xbef8007e,
-	0x8679ff7f, 0x0000ffff,
-	0x8779ff79, 0x00040000,
-	0xbefa0080, 0xbefb00ff,
-	0x00807fac, 0x867aff7f,
-	0x08000000, 0x8f7a837a,
-	0x877b7a7b, 0x867aff7f,
-	0x70000000, 0x8f7a817a,
-	0x877b7a7b, 0xbeef007c,
-	0xbeee0080, 0xb8ee2a05,
-	0x806e816e, 0x8e6e8a6e,
-	0xb8fa1605, 0x807a817a,
-	0x8e7a867a, 0x806e7a6e,
-	0xbefa0084, 0xbefa00ff,
-	0x01000000, 0xbefe007c,
-	0xbefc006e, 0xc0611bfc,
-	0x0000007c, 0x806e846e,
-	0xbefc007e, 0xbefe007c,
-	0xbefc006e, 0xc0611c3c,
-	0x0000007c, 0x806e846e,
-	0xbefc007e, 0xbefe007c,
-	0xbefc006e, 0xc0611c7c,
-	0x0000007c, 0x806e846e,
-	0xbefc007e, 0xbefe007c,
-	0xbefc006e, 0xc0611cbc,
-	0x0000007c, 0x806e846e,
-	0xbefc007e, 0xbefe007c,
-	0xbefc006e, 0xc0611cfc,
-	0x0000007c, 0x806e846e,
-	0xbefc007e, 0xbefe007c,
-	0xbefc006e, 0xc0611d3c,
-	0x0000007c, 0x806e846e,
-	0xbefc007e, 0xb8f5f803,
-	0xbefe007c, 0xbefc006e,
-	0xc0611d7c, 0x0000007c,
-	0x806e846e, 0xbefc007e,
-	0xbefe007c, 0xbefc006e,
-	0xc0611dbc, 0x0000007c,
-	0x806e846e, 0xbefc007e,
-	0xbefe007c, 0xbefc006e,
-	0xc0611dfc, 0x0000007c,
-	0x806e846e, 0xbefc007e,
-	0xb8eff801, 0xbefe007c,
-	0xbefc006e, 0xc0611bfc,
-	0x0000007c, 0x806e846e,
-	0xbefc007e, 0xbefe007c,
-	0xbefc006e, 0xc0611b3c,
-	0x0000007c, 0x806e846e,
-	0xbefc007e, 0xbefe007c,
-	0xbefc006e, 0xc0611b7c,
-	0x0000007c, 0x806e846e,
-	0xbefc007e, 0x867aff7f,
-	0x04000000, 0xbef30080,
-	0x8773737a, 0xb8ee2a05,
-	0x806e816e, 0x8e6e8a6e,
-	0xb8f51605, 0x80758175,
-	0x8e758475, 0x8e7a8275,
-	0xbefa00ff, 0x01000000,
-	0xbef60178, 0x80786e78,
-	0x82798079, 0xbefc0080,
-	0xbe802b00, 0xbe822b02,
-	0xbe842b04, 0xbe862b06,
-	0xbe882b08, 0xbe8a2b0a,
-	0xbe8c2b0c, 0xbe8e2b0e,
-	0xc06b003c, 0x00000000,
-	0xc06b013c, 0x00000010,
-	0xc06b023c, 0x00000020,
-	0xc06b033c, 0x00000030,
-	0x8078c078, 0x82798079,
-	0x807c907c, 0xbf0a757c,
-	0xbf85ffeb, 0xbef80176,
-	0xbeee0080, 0xbefe00c1,
-	0xbeff00c1, 0xbefa00ff,
-	0x01000000, 0xe0724000,
-	0x6e1e0000, 0xe0724100,
-	0x6e1e0100, 0xe0724200,
-	0x6e1e0200, 0xe0724300,
-	0x6e1e0300, 0xbefe00c1,
-	0xbeff00c1, 0xb8f54306,
-	0x8675c175, 0xbf84002c,
-	0xbf8a0000, 0x867aff73,
-	0x04000000, 0xbf840028,
-	0x8e758675, 0x8e758275,
-	0xbefa0075, 0xb8ee2a05,
-	0x806e816e, 0x8e6e8a6e,
-	0xb8fa1605, 0x807a817a,
-	0x8e7a867a, 0x806e7a6e,
-	0x806eff6e, 0x00000080,
-	0xbefa00ff, 0x01000000,
-	0xbefc0080, 0xd28c0002,
-	0x000100c1, 0xd28d0003,
-	0x000204c1, 0xd1060002,
-	0x00011103, 0x7e0602ff,
-	0x00000200, 0xbefc00ff,
-	0x00010000, 0xbe80007b,
-	0x867bff7b, 0xff7fffff,
-	0x877bff7b, 0x00058000,
-	0xd8ec0000, 0x00000002,
-	0xbf8c007f, 0xe0765000,
-	0x6e1e0002, 0x32040702,
-	0xd0c9006a, 0x0000eb02,
-	0xbf87fff7, 0xbefb0000,
-	0xbeee00ff, 0x00000400,
-	0xbefe00c1, 0xbeff00c1,
-	0xb8f52a05, 0x80758175,
-	0x8e758275, 0x8e7a8875,
-	0xbefa00ff, 0x01000000,
-	0xbefc0084, 0xbf0a757c,
-	0xbf840015, 0xbf11017c,
-	0x8075ff75, 0x00001000,
-	0x7e000300, 0x7e020301,
-	0x7e040302, 0x7e060303,
-	0xe0724000, 0x6e1e0000,
-	0xe0724100, 0x6e1e0100,
-	0xe0724200, 0x6e1e0200,
-	0xe0724300, 0x6e1e0300,
-	0x807c847c, 0x806eff6e,
-	0x00000400, 0xbf0a757c,
-	0xbf85ffef, 0xbf9c0000,
-	0xbf8200ca, 0xbef8007e,
-	0x8679ff7f, 0x0000ffff,
-	0x8779ff79, 0x00040000,
-	0xbefa0080, 0xbefb00ff,
-	0x00807fac, 0x8676ff7f,
-	0x08000000, 0x8f768376,
-	0x877b767b, 0x8676ff7f,
-	0x70000000, 0x8f768176,
-	0x877b767b, 0x8676ff7f,
-	0x04000000, 0xbf84001e,
-	0xbefe00c1, 0xbeff00c1,
-	0xb8f34306, 0x8673c173,
-	0xbf840019, 0x8e738673,
-	0x8e738273, 0xbefa0073,
-	0xb8f22a05, 0x80728172,
-	0x8e728a72, 0xb8f61605,
-	0x80768176, 0x8e768676,
-	0x80727672, 0x8072ff72,
-	0x00000080, 0xbefa00ff,
-	0x01000000, 0xbefc0080,
-	0xe0510000, 0x721e0000,
-	0xe0510100, 0x721e0000,
-	0x807cff7c, 0x00000200,
-	0x8072ff72, 0x00000200,
-	0xbf0a737c, 0xbf85fff6,
-	0xbef20080, 0xbefe00c1,
-	0xbeff00c1, 0xb8f32a05,
-	0x80738173, 0x8e738273,
-	0x8e7a8873, 0xbefa00ff,
-	0x01000000, 0xbef60072,
-	0x8072ff72, 0x00000400,
-	0xbefc0084, 0xbf11087c,
-	0x8073ff73, 0x00008000,
-	0xe0524000, 0x721e0000,
-	0xe0524100, 0x721e0100,
-	0xe0524200, 0x721e0200,
-	0xe0524300, 0x721e0300,
-	0xbf8c0f70, 0x7e000300,
-	0x7e020301, 0x7e040302,
-	0x7e060303, 0x807c847c,
-	0x8072ff72, 0x00000400,
-	0xbf0a737c, 0xbf85ffee,
-	0xbf9c0000, 0xe0524000,
-	0x761e0000, 0xe0524100,
-	0x761e0100, 0xe0524200,
-	0x761e0200, 0xe0524300,
-	0x761e0300, 0xb8f22a05,
-	0x80728172, 0x8e728a72,
-	0xb8f61605, 0x80768176,
-	0x8e768676, 0x80727672,
-	0x80f2c072, 0xb8f31605,
-	0x80738173, 0x8e738473,
-	0x8e7a8273, 0xbefa00ff,
-	0x01000000, 0xbefc0073,
-	0xc031003c, 0x00000072,
-	0x80f2c072, 0xbf8c007f,
-	0x80fc907c, 0xbe802d00,
-	0xbe822d02, 0xbe842d04,
-	0xbe862d06, 0xbe882d08,
-	0xbe8a2d0a, 0xbe8c2d0c,
-	0xbe8e2d0e, 0xbf06807c,
-	0xbf84fff1, 0xb8f22a05,
-	0x80728172, 0x8e728a72,
-	0xb8f61605, 0x80768176,
-	0x8e768676, 0x80727672,
-	0xbefa0084, 0xbefa00ff,
-	0x01000000, 0xc0211cfc,
-	0x00000072, 0x80728472,
-	0xc0211c3c, 0x00000072,
-	0x80728472, 0xc0211c7c,
-	0x00000072, 0x80728472,
-	0xc0211bbc, 0x00000072,
-	0x80728472, 0xc0211bfc,
-	0x00000072, 0x80728472,
-	0xc0211d3c, 0x00000072,
-	0x80728472, 0xc0211d7c,
-	0x00000072, 0x80728472,
-	0xc0211a3c, 0x00000072,
-	0x80728472, 0xc0211a7c,
-	0x00000072, 0x80728472,
-	0xc0211dfc, 0x00000072,
-	0x80728472, 0xc0211b3c,
-	0x00000072, 0x80728472,
-	0xc0211b7c, 0x00000072,
-	0x80728472, 0xbf8c007f,
-	0x8671ff71, 0x0000ffff,
-	0xbefc0073, 0xbefe006e,
-	0xbeff006f, 0x867375ff,
-	0x000003ff, 0xb9734803,
-	0x867375ff, 0xfffff800,
-	0x8f738b73, 0xb973a2c3,
-	0xb977f801, 0x8673ff71,
-	0xf0000000, 0x8f739c73,
-	0x8e739073, 0xbef60080,
-	0x87767376, 0x8673ff71,
-	0x08000000, 0x8f739b73,
-	0x8e738f73, 0x87767376,
-	0x8673ff74, 0x00800000,
-	0x8f739773, 0xb976f807,
-	0x86fe7e7e, 0x86ea6a6a,
-	0xb974f802, 0xbf8a0000,
-	0x95807370, 0xbf810000,
-};
-
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
index 8fc3698..065f55a 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
@@ -20,9 +20,12 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-#if 0
-HW (GFX9) source code for CWSR trap handler
-#Version 18 + multiple trap handler
+/* To compile this assembly code:
+ * PROJECT=greenland ./sp3 cwsr_trap_handler_gfx9.asm -hex tmp.hex
+ */
+
+/* HW (GFX9) source code for CWSR trap handler */
+/* Version 18 + multiple trap handler */
 
 // this performance-optimal version was originally from Seven Xu at SRDC
 
@@ -151,7 +154,7 @@ var S_SAVE_PC_HI_FIRST_REPLAY_MASK	=   0x08000000		//FIXME
 var s_save_spi_init_lo		    =	exec_lo
 var s_save_spi_init_hi		    =	exec_hi
 
-var s_save_pc_lo	    =	ttmp0		//{TTMP1, TTMP0} = {3¡¯h0,pc_rewind[3:0], HT[0],trapID[7:0], PC[47:0]}
+var s_save_pc_lo	    =	ttmp0		//{TTMP1, TTMP0} = {3'h0,pc_rewind[3:0], HT[0],trapID[7:0], PC[47:0]}
 var s_save_pc_hi	    =	ttmp1
 var s_save_exec_lo	    =	ttmp2
 var s_save_exec_hi	    =	ttmp3
@@ -1210,292 +1213,3 @@ function ack_sqc_store_workaround
         s_waitcnt lgkmcnt(0)
     end
 end
-
-
-#endif
-
-static const uint32_t cwsr_trap_gfx9_hex[] = {
-	0xbf820001, 0xbf82015a,
-	0xb8f8f802, 0x89788678,
-	0xb8f1f803, 0x866eff71,
-	0x00000400, 0xbf850034,
-	0x866eff71, 0x00000800,
-	0xbf850003, 0x866eff71,
-	0x00000100, 0xbf840008,
-	0x866eff78, 0x00002000,
-	0xbf840001, 0xbf810000,
-	0x8778ff78, 0x00002000,
-	0x80ec886c, 0x82ed806d,
-	0xb8eef807, 0x866fff6e,
-	0x001f8000, 0x8e6f8b6f,
-	0x8977ff77, 0xfc000000,
-	0x87776f77, 0x896eff6e,
-	0x001f8000, 0xb96ef807,
-	0xb8f0f812, 0xb8f1f813,
-	0x8ef08870, 0xc0071bb8,
-	0x00000000, 0xbf8cc07f,
-	0xc0071c38, 0x00000008,
-	0xbf8cc07f, 0x86ee6e6e,
-	0xbf840001, 0xbe801d6e,
-	0xb8f1f803, 0x8671ff71,
-	0x000001ff, 0xbf850002,
-	0x806c846c, 0x826d806d,
-	0x866dff6d, 0x0000ffff,
-	0x8f6e8b77, 0x866eff6e,
-	0x001f8000, 0xb96ef807,
-	0x86fe7e7e, 0x86ea6a6a,
-	0xb978f802, 0xbe801f6c,
-	0x866dff6d, 0x0000ffff,
-	0xbef00080, 0xb9700283,
-	0xb8f02407, 0x8e709c70,
-	0x876d706d, 0xb8f003c7,
-	0x8e709b70, 0x876d706d,
-	0xb8f0f807, 0x8670ff70,
-	0x00007fff, 0xb970f807,
-	0xbeee007e, 0xbeef007f,
-	0xbefe0180, 0xbf900004,
-	0x87708478, 0xb970f802,
-	0xbf8e0002, 0xbf88fffe,
-	0xb8f02a05, 0x80708170,
-	0x8e708a70, 0xb8f11605,
-	0x80718171, 0x8e718671,
-	0x80707170, 0x80707e70,
-	0x8271807f, 0x8671ff71,
-	0x0000ffff, 0xc0471cb8,
-	0x00000040, 0xbf8cc07f,
-	0xc04b1d38, 0x00000048,
-	0xbf8cc07f, 0xc0431e78,
-	0x00000058, 0xbf8cc07f,
-	0xc0471eb8, 0x0000005c,
-	0xbf8cc07f, 0xbef4007e,
-	0x8675ff7f, 0x0000ffff,
-	0x8775ff75, 0x00040000,
-	0xbef60080, 0xbef700ff,
-	0x00807fac, 0x8670ff7f,
-	0x08000000, 0x8f708370,
-	0x87777077, 0x8670ff7f,
-	0x70000000, 0x8f708170,
-	0x87777077, 0xbefb007c,
-	0xbefa0080, 0xb8fa2a05,
-	0x807a817a, 0x8e7a8a7a,
-	0xb8f01605, 0x80708170,
-	0x8e708670, 0x807a707a,
-	0xbef60084, 0xbef600ff,
-	0x01000000, 0xbefe007c,
-	0xbefc007a, 0xc0611efa,
-	0x0000007c, 0xbf8cc07f,
-	0x807a847a, 0xbefc007e,
-	0xbefe007c, 0xbefc007a,
-	0xc0611b3a, 0x0000007c,
-	0xbf8cc07f, 0x807a847a,
-	0xbefc007e, 0xbefe007c,
-	0xbefc007a, 0xc0611b7a,
-	0x0000007c, 0xbf8cc07f,
-	0x807a847a, 0xbefc007e,
-	0xbefe007c, 0xbefc007a,
-	0xc0611bba, 0x0000007c,
-	0xbf8cc07f, 0x807a847a,
-	0xbefc007e, 0xbefe007c,
-	0xbefc007a, 0xc0611bfa,
-	0x0000007c, 0xbf8cc07f,
-	0x807a847a, 0xbefc007e,
-	0xbefe007c, 0xbefc007a,
-	0xc0611e3a, 0x0000007c,
-	0xbf8cc07f, 0x807a847a,
-	0xbefc007e, 0xb8f1f803,
-	0xbefe007c, 0xbefc007a,
-	0xc0611c7a, 0x0000007c,
-	0xbf8cc07f, 0x807a847a,
-	0xbefc007e, 0xbefe007c,
-	0xbefc007a, 0xc0611a3a,
-	0x0000007c, 0xbf8cc07f,
-	0x807a847a, 0xbefc007e,
-	0xbefe007c, 0xbefc007a,
-	0xc0611a7a, 0x0000007c,
-	0xbf8cc07f, 0x807a847a,
-	0xbefc007e, 0xb8fbf801,
-	0xbefe007c, 0xbefc007a,
-	0xc0611efa, 0x0000007c,
-	0xbf8cc07f, 0x807a847a,
-	0xbefc007e, 0x8670ff7f,
-	0x04000000, 0xbeef0080,
-	0x876f6f70, 0xb8fa2a05,
-	0x807a817a, 0x8e7a8a7a,
-	0xb8f11605, 0x80718171,
-	0x8e718471, 0x8e768271,
-	0xbef600ff, 0x01000000,
-	0xbef20174, 0x80747a74,
-	0x82758075, 0xbefc0080,
-	0xbf800000, 0xbe802b00,
-	0xbe822b02, 0xbe842b04,
-	0xbe862b06, 0xbe882b08,
-	0xbe8a2b0a, 0xbe8c2b0c,
-	0xbe8e2b0e, 0xc06b003a,
-	0x00000000, 0xbf8cc07f,
-	0xc06b013a, 0x00000010,
-	0xbf8cc07f, 0xc06b023a,
-	0x00000020, 0xbf8cc07f,
-	0xc06b033a, 0x00000030,
-	0xbf8cc07f, 0x8074c074,
-	0x82758075, 0x807c907c,
-	0xbf0a717c, 0xbf85ffe7,
-	0xbef40172, 0xbefa0080,
-	0xbefe00c1, 0xbeff00c1,
-	0xbee80080, 0xbee90080,
-	0xbef600ff, 0x01000000,
-	0xe0724000, 0x7a1d0000,
-	0xe0724100, 0x7a1d0100,
-	0xe0724200, 0x7a1d0200,
-	0xe0724300, 0x7a1d0300,
-	0xbefe00c1, 0xbeff00c1,
-	0xb8f14306, 0x8671c171,
-	0xbf84002c, 0xbf8a0000,
-	0x8670ff6f, 0x04000000,
-	0xbf840028, 0x8e718671,
-	0x8e718271, 0xbef60071,
-	0xb8fa2a05, 0x807a817a,
-	0x8e7a8a7a, 0xb8f01605,
-	0x80708170, 0x8e708670,
-	0x807a707a, 0x807aff7a,
-	0x00000080, 0xbef600ff,
-	0x01000000, 0xbefc0080,
-	0xd28c0002, 0x000100c1,
-	0xd28d0003, 0x000204c1,
-	0xd1060002, 0x00011103,
-	0x7e0602ff, 0x00000200,
-	0xbefc00ff, 0x00010000,
-	0xbe800077, 0x8677ff77,
-	0xff7fffff, 0x8777ff77,
-	0x00058000, 0xd8ec0000,
-	0x00000002, 0xbf8cc07f,
-	0xe0765000, 0x7a1d0002,
-	0x68040702, 0xd0c9006a,
-	0x0000e302, 0xbf87fff7,
-	0xbef70000, 0xbefa00ff,
-	0x00000400, 0xbefe00c1,
-	0xbeff00c1, 0xb8f12a05,
-	0x80718171, 0x8e718271,
-	0x8e768871, 0xbef600ff,
-	0x01000000, 0xbefc0084,
-	0xbf0a717c, 0xbf840015,
-	0xbf11017c, 0x8071ff71,
-	0x00001000, 0x7e000300,
-	0x7e020301, 0x7e040302,
-	0x7e060303, 0xe0724000,
-	0x7a1d0000, 0xe0724100,
-	0x7a1d0100, 0xe0724200,
-	0x7a1d0200, 0xe0724300,
-	0x7a1d0300, 0x807c847c,
-	0x807aff7a, 0x00000400,
-	0xbf0a717c, 0xbf85ffef,
-	0xbf9c0000, 0xbf8200d9,
-	0xbef4007e, 0x8675ff7f,
-	0x0000ffff, 0x8775ff75,
-	0x00040000, 0xbef60080,
-	0xbef700ff, 0x00807fac,
-	0x866eff7f, 0x08000000,
-	0x8f6e836e, 0x87776e77,
-	0x866eff7f, 0x70000000,
-	0x8f6e816e, 0x87776e77,
-	0x866eff7f, 0x04000000,
-	0xbf84001e, 0xbefe00c1,
-	0xbeff00c1, 0xb8ef4306,
-	0x866fc16f, 0xbf840019,
-	0x8e6f866f, 0x8e6f826f,
-	0xbef6006f, 0xb8f82a05,
-	0x80788178, 0x8e788a78,
-	0xb8ee1605, 0x806e816e,
-	0x8e6e866e, 0x80786e78,
-	0x8078ff78, 0x00000080,
-	0xbef600ff, 0x01000000,
-	0xbefc0080, 0xe0510000,
-	0x781d0000, 0xe0510100,
-	0x781d0000, 0x807cff7c,
-	0x00000200, 0x8078ff78,
-	0x00000200, 0xbf0a6f7c,
-	0xbf85fff6, 0xbef80080,
-	0xbefe00c1, 0xbeff00c1,
-	0xb8ef2a05, 0x806f816f,
-	0x8e6f826f, 0x8e76886f,
-	0xbef600ff, 0x01000000,
-	0xbeee0078, 0x8078ff78,
-	0x00000400, 0xbefc0084,
-	0xbf11087c, 0x806fff6f,
-	0x00008000, 0xe0524000,
-	0x781d0000, 0xe0524100,
-	0x781d0100, 0xe0524200,
-	0x781d0200, 0xe0524300,
-	0x781d0300, 0xbf8c0f70,
-	0x7e000300, 0x7e020301,
-	0x7e040302, 0x7e060303,
-	0x807c847c, 0x8078ff78,
-	0x00000400, 0xbf0a6f7c,
-	0xbf85ffee, 0xbf9c0000,
-	0xe0524000, 0x6e1d0000,
-	0xe0524100, 0x6e1d0100,
-	0xe0524200, 0x6e1d0200,
-	0xe0524300, 0x6e1d0300,
-	0xb8f82a05, 0x80788178,
-	0x8e788a78, 0xb8ee1605,
-	0x806e816e, 0x8e6e866e,
-	0x80786e78, 0x80f8c078,
-	0xb8ef1605, 0x806f816f,
-	0x8e6f846f, 0x8e76826f,
-	0xbef600ff, 0x01000000,
-	0xbefc006f, 0xc031003a,
-	0x00000078, 0x80f8c078,
-	0xbf8cc07f, 0x80fc907c,
-	0xbf800000, 0xbe802d00,
-	0xbe822d02, 0xbe842d04,
-	0xbe862d06, 0xbe882d08,
-	0xbe8a2d0a, 0xbe8c2d0c,
-	0xbe8e2d0e, 0xbf06807c,
-	0xbf84fff0, 0xb8f82a05,
-	0x80788178, 0x8e788a78,
-	0xb8ee1605, 0x806e816e,
-	0x8e6e866e, 0x80786e78,
-	0xbef60084, 0xbef600ff,
-	0x01000000, 0xc0211bfa,
-	0x00000078, 0x80788478,
-	0xc0211b3a, 0x00000078,
-	0x80788478, 0xc0211b7a,
-	0x00000078, 0x80788478,
-	0xc0211eba, 0x00000078,
-	0x80788478, 0xc0211efa,
-	0x00000078, 0x80788478,
-	0xc0211c3a, 0x00000078,
-	0x80788478, 0xc0211c7a,
-	0x00000078, 0x80788478,
-	0xc0211a3a, 0x00000078,
-	0x80788478, 0xc0211a7a,
-	0x00000078, 0x80788478,
-	0xc0211cfa, 0x00000078,
-	0x80788478, 0xbf8cc07f,
-	0x866dff6d, 0x0000ffff,
-	0xbefc006f, 0xbefe007a,
-	0xbeff007b, 0x866f71ff,
-	0x000003ff, 0xb96f4803,
-	0x866f71ff, 0xfffff800,
-	0x8f6f8b6f, 0xb96fa2c3,
-	0xb973f801, 0xb8ee2a05,
-	0x806e816e, 0x8e6e8a6e,
-	0xb8ef1605, 0x806f816f,
-	0x8e6f866f, 0x806e6f6e,
-	0x806e746e, 0x826f8075,
-	0x866fff6f, 0x0000ffff,
-	0xc0071cb7, 0x00000040,
-	0xc00b1d37, 0x00000048,
-	0xc0031e77, 0x00000058,
-	0xc0071eb7, 0x0000005c,
-	0xbf8cc07f, 0x866fff6d,
-	0xf0000000, 0x8f6f9c6f,
-	0x8e6f906f, 0xbeee0080,
-	0x876e6f6e, 0x866fff6d,
-	0x08000000, 0x8f6f9b6f,
-	0x8e6f8f6f, 0x876e6f6e,
-	0x866fff70, 0x00800000,
-	0x8f6f976f, 0xb96ef807,
-	0x86fe7e7e, 0x86ea6a6a,
-	0xb970f802, 0xbf8a0000,
-	0x95806f6c, 0xbf810000,
-};
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 17de4ac..42eba6b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -26,8 +26,7 @@
 #include "kfd_priv.h"
 #include "kfd_device_queue_manager.h"
 #include "kfd_pm4_headers_vi.h"
-#include "cwsr_trap_handler_gfx8.asm"
-#include "cwsr_trap_handler_gfx9.asm"
+#include "cwsr_trap_handler.h"
 #include "kfd_iommu.h"
 
 #define MQD_SIZE_ALIGNED 768
-- 
2.7.4


[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 07/12] drm/amdkfd: Fix CP soft hang on APUs
       [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (5 preceding siblings ...)
  2018-05-01 21:56   ` [PATCH 06/12] drm/amdkfd: Separate trap handler assembly code and its hex values Felix Kuehling
@ 2018-05-01 21:56   ` Felix Kuehling
  2018-05-01 21:56   ` [PATCH 08/12] drm/amdkfd: Fix signal handling performance again Felix Kuehling
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Felix Kuehling @ 2018-05-01 21:56 UTC (permalink / raw)
  To: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Yong Zhao, Felix Kuehling, Jay Cornwall

From: Yong Zhao <yong.zhao@amd.com>

The problem happens on Raven and Carrizo. The context save handler
should not clear the high bits of PC_HI before extracting the bits
of IB_STS.

The bug is not relevant to VEGA10 until we enable demand paging.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h        | 4 ++--
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm | 3 +--
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm | 3 +--
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index a546a21..f68aef0 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -253,7 +253,6 @@ static const uint32_t cwsr_trap_gfx8_hex[] = {
 	0x00000072, 0x80728472,
 	0xc0211b7c, 0x00000072,
 	0x80728472, 0xbf8c007f,
-	0x8671ff71, 0x0000ffff,
 	0xbefc0073, 0xbefe006e,
 	0xbeff006f, 0x867375ff,
 	0x000003ff, 0xb9734803,
@@ -267,6 +266,7 @@ static const uint32_t cwsr_trap_gfx8_hex[] = {
 	0x8e738f73, 0x87767376,
 	0x8673ff74, 0x00800000,
 	0x8f739773, 0xb976f807,
+	0x8671ff71, 0x0000ffff,
 	0x86fe7e7e, 0x86ea6a6a,
 	0xb974f802, 0xbf8a0000,
 	0x95807370, 0xbf810000,
@@ -530,7 +530,6 @@ static const uint32_t cwsr_trap_gfx9_hex[] = {
 	0x00000078, 0x80788478,
 	0xc0211cfa, 0x00000078,
 	0x80788478, 0xbf8cc07f,
-	0x866dff6d, 0x0000ffff,
 	0xbefc006f, 0xbefe007a,
 	0xbeff007b, 0x866f71ff,
 	0x000003ff, 0xb96f4803,
@@ -554,6 +553,7 @@ static const uint32_t cwsr_trap_gfx9_hex[] = {
 	0x8e6f8f6f, 0x876e6f6e,
 	0x866fff70, 0x00800000,
 	0x8f6f976f, 0xb96ef807,
+	0x866dff6d, 0x0000ffff,
 	0x86fe7e7e, 0x86ea6a6a,
 	0xb970f802, 0xbf8a0000,
 	0x95806f6c, 0xbf810000,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
index 658a4c6..a2a04bb 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm
@@ -1015,8 +1015,6 @@ end
 
     s_waitcnt       lgkmcnt(0)                                                                                      //from now on, it is safe to restore STATUS and IB_STS
 
-    s_and_b32 s_restore_pc_hi, s_restore_pc_hi, 0x0000ffff      //pc[47:32]        //Do it here in order not to affect STATUS
-
     //for normal save & restore, the saved PC points to the next inst to execute, no adjustment needs to be made, otherwise:
     if ((EMU_RUN_HACK) && (!EMU_RUN_HACK_RESTORE_NORMAL))
         s_add_u32 s_restore_pc_lo, s_restore_pc_lo, 8            //pc[31:0]+8     //two back-to-back s_trap are used (first for save and second for restore)
@@ -1052,6 +1050,7 @@ end
     s_lshr_b32      s_restore_m0, s_restore_m0, SQ_WAVE_STATUS_INST_ATC_SHIFT
     s_setreg_b32    hwreg(HW_REG_IB_STS),   s_restore_tmp
 
+    s_and_b32 s_restore_pc_hi, s_restore_pc_hi, 0x0000ffff      //pc[47:32]        //Do it here in order not to affect STATUS
     s_and_b64    exec, exec, exec  // Restore STATUS.EXECZ, not writable by s_setreg_b32
     s_and_b64    vcc, vcc, vcc  // Restore STATUS.VCCZ, not writable by s_setreg_b32
     s_setreg_b32    hwreg(HW_REG_STATUS),   s_restore_status     // SCC is included, which is changed by previous salu
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
index 065f55a..998be96 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
@@ -1067,8 +1067,6 @@ end
 
     s_waitcnt	    lgkmcnt(0)											    //from now on, it is safe to restore STATUS and IB_STS
 
-    s_and_b32 s_restore_pc_hi, s_restore_pc_hi, 0x0000ffff	//pc[47:32]	   //Do it here in order not to affect STATUS
-
     //for normal save & restore, the saved PC points to the next inst to execute, no adjustment needs to be made, otherwise:
     if ((EMU_RUN_HACK) && (!EMU_RUN_HACK_RESTORE_NORMAL))
 	s_add_u32 s_restore_pc_lo, s_restore_pc_lo, 8		 //pc[31:0]+8	  //two back-to-back s_trap are used (first for save and second for restore)
@@ -1119,6 +1117,7 @@ end
     s_lshr_b32	    s_restore_m0, s_restore_m0, SQ_WAVE_STATUS_INST_ATC_SHIFT
     s_setreg_b32    hwreg(HW_REG_IB_STS),   s_restore_tmp
 
+    s_and_b32 s_restore_pc_hi, s_restore_pc_hi, 0x0000ffff	//pc[47:32]	   //Do it here in order not to affect STATUS
     s_and_b64	 exec, exec, exec  // Restore STATUS.EXECZ, not writable by s_setreg_b32
     s_and_b64	 vcc, vcc, vcc	// Restore STATUS.VCCZ, not writable by s_setreg_b32
     s_setreg_b32    hwreg(HW_REG_STATUS),   s_restore_status	 // SCC is included, which is changed by previous salu
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 08/12] drm/amdkfd: Fix signal handling performance again
       [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (6 preceding siblings ...)
  2018-05-01 21:56   ` [PATCH 07/12] drm/amdkfd: Fix CP soft hang on APUs Felix Kuehling
@ 2018-05-01 21:56   ` Felix Kuehling
  2018-05-01 21:56   ` [PATCH 09/12] drm/amdkfd: Remove initialization of cp_hqd_ib_control on CIK Felix Kuehling
                     ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Felix Kuehling @ 2018-05-01 21:56 UTC (permalink / raw)
  To: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Felix Kuehling

It turns out that idr_for_each_entry is really slow compared to just
iterating over the slots. Based on measurements the difference is
estimated to be about a factor 64. That means using idr_for_each_entry
is only worth it with very few allocated events.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_events.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index bccf2f7..7862fcf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -496,7 +496,7 @@ void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
 			pr_debug_ratelimited("Partial ID invalid: %u (%u valid bits)\n",
 					     partial_id, valid_id_bits);
 
-		if (p->signal_event_count < KFD_SIGNAL_EVENT_LIMIT/2) {
+		if (p->signal_event_count < KFD_SIGNAL_EVENT_LIMIT/64) {
 			/* With relatively few events, it's faster to
 			 * iterate over the event IDR
 			 */
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 09/12] drm/amdkfd: Remove initialization of cp_hqd_ib_control on CIK
       [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (7 preceding siblings ...)
  2018-05-01 21:56   ` [PATCH 08/12] drm/amdkfd: Fix signal handling performance again Felix Kuehling
@ 2018-05-01 21:56   ` Felix Kuehling
  2018-05-01 21:56   ` [PATCH 10/12] drm/amdkfd: Locking PM mutex while allocating IB buffer Felix Kuehling
                     ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Felix Kuehling @ 2018-05-01 21:56 UTC (permalink / raw)
  To: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Felix Kuehling

The initialization is not necessary. amd-kfd-staging and ROCm
releases have worked without it for two years.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index 2bc49c6..06eaa21 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -79,10 +79,6 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
 	m->cp_mqd_base_addr_lo        = lower_32_bits(addr);
 	m->cp_mqd_base_addr_hi        = upper_32_bits(addr);
 
-	m->cp_hqd_ib_control = DEFAULT_MIN_IB_AVAIL_SIZE | IB_ATC_EN;
-	/* Although WinKFD writes this, I suspect it should not be necessary */
-	m->cp_hqd_ib_control = IB_ATC_EN | DEFAULT_MIN_IB_AVAIL_SIZE;
-
 	m->cp_hqd_quantum = QUANTUM_EN | QUANTUM_SCALE_1MS |
 				QUANTUM_DURATION(10);
 
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 10/12] drm/amdkfd: Locking PM mutex while allocating IB buffer
       [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (8 preceding siblings ...)
  2018-05-01 21:56   ` [PATCH 09/12] drm/amdkfd: Remove initialization of cp_hqd_ib_control on CIK Felix Kuehling
@ 2018-05-01 21:56   ` Felix Kuehling
  2018-05-01 21:56   ` [PATCH 11/12] drm/amdkfd: Remove queue node when destroy queue failed Felix Kuehling
                     ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: Felix Kuehling @ 2018-05-01 21:56 UTC (permalink / raw)
  To: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Ben Goz, Felix Kuehling

From: Ben Goz <ben.goz@amd.com>

Signed-off-by: Ben Goz <ben.goz@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
index 91f0350..c317feb4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -94,12 +94,14 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
 
 	pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
 
+	mutex_lock(&pm->lock);
+
 	retval = kfd_gtt_sa_allocate(pm->dqm->dev, *rl_buffer_size,
 					&pm->ib_buffer_obj);
 
 	if (retval) {
 		pr_err("Failed to allocate runlist IB\n");
-		return retval;
+		goto out;
 	}
 
 	*(void **)rl_buffer = pm->ib_buffer_obj->cpu_ptr;
@@ -107,6 +109,9 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
 
 	memset(*rl_buffer, 0, *rl_buffer_size);
 	pm->allocated = true;
+
+out:
+	mutex_unlock(&pm->lock);
 	return retval;
 }
 
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 11/12] drm/amdkfd: Remove queue node when destroy queue failed
       [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (9 preceding siblings ...)
  2018-05-01 21:56   ` [PATCH 10/12] drm/amdkfd: Locking PM mutex while allocating IB buffer Felix Kuehling
@ 2018-05-01 21:56   ` Felix Kuehling
  2018-05-01 21:56   ` [PATCH 12/12] drm/amdkfd: Add sanity checks in IRQ handlers Felix Kuehling
  2018-05-12 11:13   ` [PATCH 00/12] Assorted KFD fixes Oded Gabbay
  12 siblings, 0 replies; 14+ messages in thread
From: Felix Kuehling @ 2018-05-01 21:56 UTC (permalink / raw)
  To: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Felix Kuehling, Shaoyun Liu

From: Shaoyun Liu <Shaoyun.Liu@amd.com>

HWS may hang in the middle of destroy queue, remove the queue from the
process queue list so it won't be freed again in the future

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 3045aeb..d65ce04 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -241,7 +241,8 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 	}
 
 	if (retval != 0) {
-		pr_err("DQM create queue failed\n");
+		pr_err("Pasid %d DQM create queue %d failed. ret %d\n",
+			pqm->process->pasid, type, retval);
 		goto err_create_queue;
 	}
 
@@ -319,8 +320,11 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
 		dqm = pqn->q->device->dqm;
 		retval = dqm->ops.destroy_queue(dqm, &pdd->qpd, pqn->q);
 		if (retval) {
-			pr_debug("Destroy queue failed, returned %d\n", retval);
-			goto err_destroy_queue;
+			pr_err("Pasid %d destroy queue %d failed, ret %d\n",
+				pqm->process->pasid,
+				pqn->q->properties.queue_id, retval);
+			if (retval != -ETIME)
+				goto err_destroy_queue;
 		}
 		uninit_queue(pqn->q);
 	}
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 12/12] drm/amdkfd: Add sanity checks in IRQ handlers
       [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (10 preceding siblings ...)
  2018-05-01 21:56   ` [PATCH 11/12] drm/amdkfd: Remove queue node when destroy queue failed Felix Kuehling
@ 2018-05-01 21:56   ` Felix Kuehling
  2018-05-12 11:13   ` [PATCH 00/12] Assorted KFD fixes Oded Gabbay
  12 siblings, 0 replies; 14+ messages in thread
From: Felix Kuehling @ 2018-05-01 21:56 UTC (permalink / raw)
  To: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Felix Kuehling

Only accept interrupts from KFD VMIDs. Just checking for a PASID may
not be enough because amdgpu started using PASIDs to map VM faults
to processes.

Warn if an IRQ doesn't have a valid PASID (indicating a firmware bug).

Suggested-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Suggested-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c | 20 +++++++++---
 drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c  | 40 ++++++++++++++----------
 2 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c b/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c
index 3d5ccb3..49df6c7 100644
--- a/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c
+++ b/drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c
@@ -27,18 +27,28 @@
 static bool cik_event_interrupt_isr(struct kfd_dev *dev,
 					const uint32_t *ih_ring_entry)
 {
-	unsigned int pasid;
 	const struct cik_ih_ring_entry *ihre =
 			(const struct cik_ih_ring_entry *)ih_ring_entry;
+	unsigned int vmid, pasid;
+
+	/* Only handle interrupts from KFD VMIDs */
+	vmid  = (ihre->ring_id & 0x0000ff00) >> 8;
+	if (vmid < dev->vm_info.first_vmid_kfd ||
+	    vmid > dev->vm_info.last_vmid_kfd)
+		return 0;
 
+	/* If there is no valid PASID, it's likely a firmware bug */
 	pasid = (ihre->ring_id & 0xffff0000) >> 16;
+	if (WARN_ONCE(pasid == 0, "FW bug: No PASID in KFD interrupt"))
+		return 0;
 
-	/* Do not process in ISR, just request it to be forwarded to WQ. */
-	return (pasid != 0) &&
-		(ihre->source_id == CIK_INTSRC_CP_END_OF_PIPE ||
+	/* Interrupt types we care about: various signals and faults.
+	 * They will be forwarded to a work queue (see below).
+	 */
+	return ihre->source_id == CIK_INTSRC_CP_END_OF_PIPE ||
 		ihre->source_id == CIK_INTSRC_SDMA_TRAP ||
 		ihre->source_id == CIK_INTSRC_SQ_INTERRUPT_MSG ||
-		ihre->source_id == CIK_INTSRC_CP_BAD_OPCODE);
+		ihre->source_id == CIK_INTSRC_CP_BAD_OPCODE;
 }
 
 static void cik_event_interrupt_wq(struct kfd_dev *dev,
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
index 39d4115..37029ba 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c
@@ -29,27 +29,35 @@ static bool event_interrupt_isr_v9(struct kfd_dev *dev,
 					const uint32_t *ih_ring_entry)
 {
 	uint16_t source_id, client_id, pasid, vmid;
+	const uint32_t *data = ih_ring_entry;
 
-	source_id = SOC15_SOURCE_ID_FROM_IH_ENTRY(ih_ring_entry);
-	client_id = SOC15_CLIENT_ID_FROM_IH_ENTRY(ih_ring_entry);
-	pasid = SOC15_PASID_FROM_IH_ENTRY(ih_ring_entry);
+	/* Only handle interrupts from KFD VMIDs */
 	vmid = SOC15_VMID_FROM_IH_ENTRY(ih_ring_entry);
+	if (vmid < dev->vm_info.first_vmid_kfd ||
+	    vmid > dev->vm_info.last_vmid_kfd)
+		return 0;
+
+	/* If there is no valid PASID, it's likely a firmware bug */
+	pasid = SOC15_PASID_FROM_IH_ENTRY(ih_ring_entry);
+	if (WARN_ONCE(pasid == 0, "FW bug: No PASID in KFD interrupt"))
+		return 0;
 
-	if (pasid) {
-		const uint32_t *data = ih_ring_entry;
+	source_id = SOC15_SOURCE_ID_FROM_IH_ENTRY(ih_ring_entry);
+	client_id = SOC15_CLIENT_ID_FROM_IH_ENTRY(ih_ring_entry);
 
-		pr_debug("client id 0x%x, source id %d, pasid 0x%x. raw data:\n",
-			 client_id, source_id, pasid);
-		pr_debug("%8X, %8X, %8X, %8X, %8X, %8X, %8X, %8X.\n",
-			 data[0], data[1], data[2], data[3],
-			 data[4], data[5], data[6], data[7]);
-	}
+	pr_debug("client id 0x%x, source id %d, pasid 0x%x. raw data:\n",
+		 client_id, source_id, pasid);
+	pr_debug("%8X, %8X, %8X, %8X, %8X, %8X, %8X, %8X.\n",
+		 data[0], data[1], data[2], data[3],
+		 data[4], data[5], data[6], data[7]);
 
-	return (pasid != 0) &&
-		(source_id == SOC15_INTSRC_CP_END_OF_PIPE ||
-		 source_id == SOC15_INTSRC_SDMA_TRAP ||
-		 source_id == SOC15_INTSRC_SQ_INTERRUPT_MSG ||
-		 source_id == SOC15_INTSRC_CP_BAD_OPCODE);
+	/* Interrupt types we care about: various signals and faults.
+	 * They will be forwarded to a work queue (see below).
+	 */
+	return source_id == SOC15_INTSRC_CP_END_OF_PIPE ||
+		source_id == SOC15_INTSRC_SDMA_TRAP ||
+		source_id == SOC15_INTSRC_SQ_INTERRUPT_MSG ||
+		source_id == SOC15_INTSRC_CP_BAD_OPCODE;
 }
 
 static void event_interrupt_wq_v9(struct kfd_dev *dev,
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 00/12] Assorted KFD fixes
       [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (11 preceding siblings ...)
  2018-05-01 21:56   ` [PATCH 12/12] drm/amdkfd: Add sanity checks in IRQ handlers Felix Kuehling
@ 2018-05-12 11:13   ` Oded Gabbay
  12 siblings, 0 replies; 14+ messages in thread
From: Oded Gabbay @ 2018-05-12 11:13 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

Thanks Felix!
Series applied to amdkfd-next

Oded

On Wed, May 2, 2018 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> These are some random patches I noticed when comparing amdkfd-next against
> amd-kfd-staging.
>
> Ben Goz (1):
>   drm/amdkfd: Locking PM mutex while allocating IB buffer
>
> Felix Kuehling (4):
>   drm/amdkfd: Remove redundant include of amd-iommu.h
>   drm/amdkfd: Fix signal handling performance again
>   drm/amdkfd: Remove initialization of cp_hqd_ib_control on CIK
>   drm/amdkfd: Add sanity checks in IRQ handlers
>
> Jay Cornwall (2):
>   drm/amdkfd: Reduce priority of context-saving waves before spin-wait
>   drm/amdkfd: Use volatile MTYPE in default/alternate apertures
>
> Oak Zeng (1):
>   drm/amdkfd: Dump HQD of HIQ
>
> Philip Yang (1):
>   drm/amdkfd: use %px to print user space address instead of %p
>
> Shaoyun Liu (1):
>   drm/amdkfd: Remove queue node when destroy queue failed
>
> Yong Zhao (2):
>   drm/amdkfd: Separate trap handler assembly code and its hex values
>   drm/amdkfd: Fix CP soft hang on APUs
>
>  drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c   |  20 +-
>  drivers/gpu/drm/amd/amdkfd/cik_regs.h              |   3 +-
>  drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h     | 560 +++++++++++++++++++++
>  .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx8.asm  | 274 +---------
>  .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm  | 307 +----------
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |   2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            |   6 +-
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  12 +
>  drivers/gpu/drm/amd/amdkfd/kfd_events.c            |   2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c    |  40 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |   4 -
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    |   7 +-
>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  10 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_queue.c             |   8 +-
>  14 files changed, 659 insertions(+), 596 deletions(-)
>  create mode 100644 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
>
> --
> 2.7.4
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-05-12 11:13 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-01 21:56 [PATCH 00/12] Assorted KFD fixes Felix Kuehling
     [not found] ` <1525211772-19683-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-05-01 21:56   ` [PATCH 01/12] drm/amdkfd: Dump HQD of HIQ Felix Kuehling
2018-05-01 21:56   ` [PATCH 02/12] drm/amdkfd: Reduce priority of context-saving waves before spin-wait Felix Kuehling
2018-05-01 21:56   ` [PATCH 03/12] drm/amdkfd: Use volatile MTYPE in default/alternate apertures Felix Kuehling
2018-05-01 21:56   ` [PATCH 04/12] drm/amdkfd: use %px to print user space address instead of %p Felix Kuehling
2018-05-01 21:56   ` [PATCH 05/12] drm/amdkfd: Remove redundant include of amd-iommu.h Felix Kuehling
2018-05-01 21:56   ` [PATCH 06/12] drm/amdkfd: Separate trap handler assembly code and its hex values Felix Kuehling
2018-05-01 21:56   ` [PATCH 07/12] drm/amdkfd: Fix CP soft hang on APUs Felix Kuehling
2018-05-01 21:56   ` [PATCH 08/12] drm/amdkfd: Fix signal handling performance again Felix Kuehling
2018-05-01 21:56   ` [PATCH 09/12] drm/amdkfd: Remove initialization of cp_hqd_ib_control on CIK Felix Kuehling
2018-05-01 21:56   ` [PATCH 10/12] drm/amdkfd: Locking PM mutex while allocating IB buffer Felix Kuehling
2018-05-01 21:56   ` [PATCH 11/12] drm/amdkfd: Remove queue node when destroy queue failed Felix Kuehling
2018-05-01 21:56   ` [PATCH 12/12] drm/amdkfd: Add sanity checks in IRQ handlers Felix Kuehling
2018-05-12 11:13   ` [PATCH 00/12] Assorted KFD fixes Oded Gabbay

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.