From: Jonathan Kim <jonathan.kim@amd.com>
To: <amd-gfx@lists.freedesktop.org>, <dri-devel@lists.freedesktop.org>
Cc: Felix.Kuehling@amd.com, Jonathan.kim@amd.com
Subject: [PATCH 28/34] drm/amdkfd: add debug set flags operation
Date: Mon, 27 Mar 2023 14:43:33 -0400
Message-ID: <20230327184339.125016-28-jonathan.kim@amd.com>
In-Reply-To: <20230327184339.125016-1-jonathan.kim@amd.com>

Allow the debugger to set debug trap flags.  For now the only flag
available is Single Memory Operations (KFD_DBG_TRAP_FLAG_SINGLE_MEM_OP).

Some exceptions (memory violations, address watch) are imprecise: the
trap is taken only when the exception interrupt is raised, not at the
faulty instruction, which does not halt.  Trap temporaries 0 & 1 save
the program counter address, so the saved value points to wherever
execution had reached when the interrupt was raised rather than to the
faulty instruction.

Setting the Single Memory Operations flag injects an automatic wait on
every memory operation instruction, which makes imprecise memory
exceptions precise at the cost of performance.  The flag is rejected
with -EACCES if any of the process's debug devices supports only a
global setting of this option.

Return the previously set flags to the debugger as well.

v3: fix var declare spacing and add rewind to failed flag setting.

v3: make precise mem op the only available flag for now.

v2: add gfx11 support.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
---
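Note for readers: the flag handling described in the commit message
(reject up front, hand back the previous flags through the same pointer,
rewind on failure) can be sketched in stand-alone user-space C.  This is
only an illustration under stated assumptions, not the driver code:
struct dev, struct proc, apply_flags(), set_flags() and
FLAG_SINGLE_MEM_OP are hypothetical stand-ins, and the sketch remembers
the index of the failing device for the rewind, which is a choice made
here rather than something taken from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kfd structures. */
#define FLAG_SINGLE_MEM_OP (1u << 0)

struct dev {
	int per_vmid_supported;	/* nonzero: device accepts a per-process setting */
	int fail_apply;		/* nonzero: simulate a failed hardware update */
	uint32_t hw_flags;	/* flags currently "programmed" on the device */
};

struct proc {
	uint32_t dbg_flags;
	size_t n_devs;
	struct dev *devs;
};

/* Pretend to push the flags to one device. */
static int apply_flags(struct dev *d, uint32_t flags)
{
	if (d->fail_apply)
		return -EIO;
	d->hw_flags = flags;
	return 0;
}

/* On return, *flags always holds the flags that were set before the call. */
int set_flags(struct proc *p, uint32_t *flags)
{
	uint32_t prev = p->dbg_flags;
	uint32_t want = *flags & FLAG_SINGLE_MEM_OP;
	size_t i, j;
	int r = 0;

	/* Pass 1: refuse the flag before any state is changed. */
	for (i = 0; want && i < p->n_devs; i++) {
		if (!p->devs[i].per_vmid_supported) {
			*flags = prev;
			return -EACCES;
		}
	}

	p->dbg_flags = want;
	*flags = prev;

	/* Pass 2: apply to each capable device, stopping at the first error. */
	for (i = 0; i < p->n_devs; i++) {
		if (!p->devs[i].per_vmid_supported)
			continue;
		r = apply_flags(&p->devs[i], want);
		if (r)
			break;
	}

	/* Rewind: restore the old flags on every device before the failing one. */
	if (r) {
		p->dbg_flags = prev;
		for (j = 0; j < i; j++) {
			if (p->devs[j].per_vmid_supported)
				apply_flags(&p->devs[j], prev);
		}
	}

	return r;
}
```

A caller passes the wanted flags in *flags and reads the previous flags
back from the same variable, whether or not the call succeeds.
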
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  2 +
 drivers/gpu/drm/amd/amdkfd/kfd_debug.c   | 58 ++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_debug.h   |  1 +
 3 files changed, 61 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 194221ab0f25..da3478b133bd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -3021,6 +3021,8 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v
 				args->clear_node_address_watch.id);
 		break;
 	case KFD_IOC_DBG_TRAP_SET_FLAGS:
+		r = kfd_dbg_trap_set_flags(target, &args->set_flags.flags);
+		break;
 	case KFD_IOC_DBG_TRAP_QUERY_DEBUG_EVENT:
 	case KFD_IOC_DBG_TRAP_QUERY_EXCEPTION_INFO:
 	case KFD_IOC_DBG_TRAP_GET_QUEUE_SNAPSHOT:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
index 4b8b71b1a322..5d3193ae71e3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.c
@@ -23,6 +23,7 @@
 #include "kfd_debug.h"
 #include "kfd_device_queue_manager.h"
 #include <linux/file.h>
+#include <uapi/linux/kfd_ioctl.h>
 
 #define MAX_WATCH_ADDRESSES	4
 
@@ -423,6 +424,59 @@ static void kfd_dbg_clear_process_address_watch(struct kfd_process *target)
 			kfd_dbg_trap_clear_dev_address_watch(target->pdds[i], j);
 }
 
+int kfd_dbg_trap_set_flags(struct kfd_process *target, uint32_t *flags)
+{
+	uint32_t prev_flags = target->dbg_flags;
+	int i, r = 0, rewind_count = 0;
+
+	for (i = 0; i < target->n_pdds; i++) {
+		if (!kfd_dbg_is_per_vmid_supported(target->pdds[i]->dev) &&
+			(*flags & KFD_DBG_TRAP_FLAG_SINGLE_MEM_OP)) {
+			*flags = prev_flags;
+			return -EACCES;
+		}
+	}
+
+	target->dbg_flags = *flags & KFD_DBG_TRAP_FLAG_SINGLE_MEM_OP;
+	*flags = prev_flags;
+	for (i = 0; i < target->n_pdds; i++) {
+		struct kfd_process_device *pdd = target->pdds[i];
+
+		if (!kfd_dbg_is_per_vmid_supported(pdd->dev))
+			continue;
+
+		if (!pdd->dev->shared_resources.enable_mes)
+			r = debug_refresh_runlist(pdd->dev->dqm);
+		else
+			r = kfd_dbg_set_mes_debug_mode(pdd);
+
+		if (r) {
+			target->dbg_flags = prev_flags;
+			break;
+		}
+
+		rewind_count++;
+	}
+
+	/* Rewind flags */
+	if (r) {
+		target->dbg_flags = prev_flags;
+
+		for (i = 0; i < rewind_count; i++) {
+			struct kfd_process_device *pdd = target->pdds[i];
+
+			if (!kfd_dbg_is_per_vmid_supported(pdd->dev))
+				continue;
+
+			if (!pdd->dev->shared_resources.enable_mes)
+				debug_refresh_runlist(pdd->dev->dqm);
+			else
+				kfd_dbg_set_mes_debug_mode(pdd);
+		}
+	}
+
+	return r;
+}
 
 /* kfd_dbg_trap_deactivate:
  *	target: target process
@@ -437,9 +491,13 @@ void kfd_dbg_trap_deactivate(struct kfd_process *target, bool unwind, int unwind
 	int i;
 
 	if (!unwind) {
+		uint32_t flags = 0;
+
 		cancel_work_sync(&target->debug_event_workarea);
 		kfd_dbg_clear_process_address_watch(target);
 		kfd_dbg_trap_set_wave_launch_mode(target, 0);
+
+		kfd_dbg_trap_set_flags(target, &flags);
 	}
 
 	for (i = 0; i < target->n_pdds; i++) {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
index 63c716ce5ab9..782362d82890 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debug.h
@@ -57,6 +57,7 @@ int kfd_dbg_trap_set_dev_address_watch(struct kfd_process_device *pdd,
 					uint32_t watch_address_mask,
 					uint32_t *watch_id,
 					uint32_t watch_mode);
+int kfd_dbg_trap_set_flags(struct kfd_process *target, uint32_t *flags);
 int kfd_dbg_send_exception_to_runtime(struct kfd_process *p,
 					unsigned int dev_id,
 					unsigned int queue_id,
-- 
2.25.1

