From: "Kuehling, Felix" <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
To: "amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org"
	<amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>
Cc: "Kuehling, Felix" <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>,
	"Cornwall, Jay" <Jay.Cornwall-5C7GfCeVMHo@public.gmane.org>
Subject: [PATCH 21/27] drm/amdkfd: Preserve wave state after instruction fetch MEM_VIOL
Date: Sun, 28 Apr 2019 07:44:18 +0000
Message-ID: <20190428074331.30107-22-Felix.Kuehling@amd.com>
In-Reply-To: <20190428074331.30107-1-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>

From: Jay Cornwall <Jay.Cornwall@amd.com>

If an instruction fetch fails, the wave cannot be halted and returned
to the shader without raising MEM_VIOL again. Currently the wave is
terminated when this occurs, which loses information about the cause
of the fault. A debugger needs the faulting wave state to be
context-saved instead.

Poll inside the trap handler until TRAPSTS.SAVECTX indicates context
save is ready. Exit the poll loop and complete the remainder of the
exception handler, then return to the shader. The next instruction
fetch will be from the trap handler and not the faulting PC. Context
save will then deschedule the wave and save its state.
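
For readers less familiar with the trap handler assembly, the poll
described above can be modelled in C. This is only a sketch of the
control flow, not driver code: read_trapsts() is a hypothetical
stand-in for "s_getreg_b32 ttmp2, hwreg(HW_REG_TRAPSTS)" that pretends
the save request arrives on the fourth read, and the mask value 0x400
is taken from the literal that follows the s_and_b32 in the gfx9 hex
array.

```c
#include <stdint.h>

/* Value of SQ_WAVE_TRAPSTS_SAVECTX_MASK as it appears in the hex dump. */
#define TRAPSTS_SAVECTX_MASK 0x00000400u

/* Hypothetical stand-in for reading TRAPSTS: SAVECTX becomes set
 * on the fourth read (poll index 3). Real hardware sets it when a
 * context save has been requested. */
static uint32_t read_trapsts(unsigned poll_index)
{
    return poll_index >= 3 ? TRAPSTS_SAVECTX_MASK : 0u;
}

/* Model of L_WAIT_CTX_SAVE: spin until SAVECTX is set and return
 * the number of reads that were needed. */
static unsigned wait_ctx_save(void)
{
    unsigned polls = 0;

    for (;;) {
        /* s_sleep 0x10 would yield the SIMD here. */
        uint32_t trapsts = read_trapsts(polls);

        polls++;
        /* s_and_b32 sets SCC when the result is non-zero;
         * s_cbranch_scc0 loops while SCC is clear. */
        if (trapsts & TRAPSTS_SAVECTX_MASK)
            break;
    }
    return polls;
}
```

With this stand-in, wait_ctx_save() returns after four reads and the
handler then falls through to L_NOT_ALREADY_HALTED.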

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h        | 10 ++++++----
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm | 10 ++++++++--
 2 files changed, 14 insertions(+), 6 deletions(-)
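
Note for reviewers checking the regenerated hex against the assembly:
the sketch below decodes SOPP words such as 0xbf84fffb. It assumes the
SOPP layout from the public Vega ISA documentation (bits [31:23] =
0x17F, opcode in [22:16], signed SIMM16 in [15:0]) and that a branch
target is the next dword plus SIMM16. The helper names are invented
for illustration and are not part of the driver.

```c
#include <stdint.h>

/* SOPP words carry the fixed pattern 0b101111111 in bits [31:23]. */
static int is_sopp(uint32_t word)
{
    return (word >> 23) == 0x17Fu;
}

/* Opcode field, e.g. 4 = s_cbranch_scc0, 14 = s_sleep. */
static unsigned sopp_op(uint32_t word)
{
    return (word >> 16) & 0x7Fu;
}

/* Sign-extend the 16-bit immediate without relying on
 * implementation-defined narrowing conversions. */
static int32_t sopp_simm16(uint32_t word)
{
    int32_t v = (int32_t)(word & 0xFFFFu);

    return (v ^ 0x8000) - 0x8000;
}

/* Dword index of the branch target for a branch located at
 * dword index idx in the hex array. */
static long sopp_branch_target(long idx, uint32_t word)
{
    return idx + 1 + sopp_simm16(word);
}
```

Counting dwords from the start of the post-patch cwsr_trap_gfx9_hex[],
0xbf84fffb sits at index 21 and decodes to s_cbranch_scc0 with an
offset of -5, which lands on index 17 (0xbf8e0010, s_sleep 0x10), the
L_WAIT_CTX_SAVE label. Likewise 0xbf840005 at index 16 targets index
22, the first instruction of L_NOT_ALREADY_HALTED. The other changed
branch offsets each grow by 4, matching the four dwords the loop adds.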

diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
index ec9a9a99f808..097da0dd3b04 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
@@ -274,15 +274,17 @@ static const uint32_t cwsr_trap_gfx8_hex[] = {
 
 
 static const uint32_t cwsr_trap_gfx9_hex[] = {
-	0xbf820001, 0xbf82015d,
+	0xbf820001, 0xbf820161,
 	0xb8f8f802, 0x89788678,
 	0xb8f1f803, 0x866eff71,
-	0x00000400, 0xbf850037,
+	0x00000400, 0xbf85003b,
 	0x866eff71, 0x00000800,
 	0xbf850003, 0x866eff71,
-	0x00000100, 0xbf840008,
+	0x00000100, 0xbf84000c,
 	0x866eff78, 0x00002000,
-	0xbf840001, 0xbf810000,
+	0xbf840005, 0xbf8e0010,
+	0xb8eef803, 0x866eff6e,
+	0x00000400, 0xbf84fffb,
 	0x8778ff78, 0x00002000,
 	0x80ec886c, 0x82ed806d,
 	0xb8eef807, 0x866fff6e,
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
index 0bb9c577b3a2..6a010c9e55de 100644
--- a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
+++ b/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
@@ -266,10 +266,16 @@ if (!EMU_RUN_HACK)
 
 L_HALT_WAVE:
     // If STATUS.HALT is set then this fault must come from SQC instruction fetch.
-    // We cannot prevent further faults so just terminate the wavefront.
+    // We cannot prevent further faults. Spin wait until context saved.
     s_and_b32       ttmp2, s_save_status, SQ_WAVE_STATUS_HALT_MASK
     s_cbranch_scc0  L_NOT_ALREADY_HALTED
-    s_endpgm
+
+L_WAIT_CTX_SAVE:
+    s_sleep         0x10
+    s_getreg_b32    ttmp2, hwreg(HW_REG_TRAPSTS)
+    s_and_b32       ttmp2, ttmp2, SQ_WAVE_TRAPSTS_SAVECTX_MASK
+    s_cbranch_scc0  L_WAIT_CTX_SAVE
+
 L_NOT_ALREADY_HALTED:
     s_or_b32        s_save_status, s_save_status, SQ_WAVE_STATUS_HALT_MASK
 
-- 
2.17.1
