* [PATCH 00/14] KFD upstreaming 20171127
@ 2017-11-27 23:29 Felix Kuehling
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Patches 1,2: random fixes
Patches 3,4: New feature: allow HWS to schedule multiple processes concurrently
             and a related fix
Patches 5-7: New feature: debugfs support
Patches 8-14: Simplify process locking and lock dependencies

After these patches I'm ready to start upstreaming dGPU support.

Felix Kuehling (11):
  drm/amdgpu: fix get_max_engine_clock_in_mhz
  drm/amdkfd: map multiple processes to HW scheduler
  drm/amdkfd: Fix oversubscription accounting
  drm/amdgpu: Fix definition of KFD_CIK_SDMA_QUEUE_OFFSET
  drm/amdgpu: Add kfd2kgd APIs for dumping HQDs
  drm/amdkfd: Add debugfs support to KFD
  drm/amdkfd: Get reference to lead_thread task struct
  drm/amdkfd: Make kfd_process reference counted
  drm/amdkfd: Use ref count to prevent kfd_process destruction
  drm/amdkfd: Reduce nesting in kfd_create_process_device_data
  drm/amdkfd: Factor PDD destruction out of kfd_process_wq_release

Philip Yang (1):
  drm/amdkfd: Add crash protection in debugger register path

Yong Zhao (2):
  drm/amdkfd: Return NULL if kfd_lookup_process_by_pasid fails
  drm/amdkfd: Simplify locking during process creation

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |   7 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  |  71 +++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  |  80 ++++++++
 drivers/gpu/drm/amd/amdgpu/cikd.h                  |   2 +-
 drivers/gpu/drm/amd/amdkfd/Makefile                |   2 +
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c           |  75 ++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            |  11 ++
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  71 +++++++
 drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  14 +-
 drivers/gpu/drm/amd/amdkfd/kfd_module.c            |   8 +
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h       |   4 +
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  27 +++
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  25 +++
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    |  57 +++++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  35 ++++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c           | 205 ++++++++++++---------
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  68 +++++++
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  55 ++++++
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h    |  14 ++
 20 files changed, 734 insertions(+), 99 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c

-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* [PATCH 01/14] drm/amdgpu: fix get_max_engine_clock_in_mhz
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-11-27 23:29   ` Felix Kuehling
       [not found]     ` <1511825396-24579-2-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 02/14] drm/amdkfd: Add crash protection in debugger register path Felix Kuehling
                     ` (12 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

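Use the proper powerplay function, amdgpu_dpm_get_sclk(), to query the
engine clock. When running as an SR-IOV virtual function, fall back to
the default sclk. This fixes OpenCL initialization problems.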

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
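Reader's note (not part of the commit): a minimal user-space sketch of
the unit conversion and the clock-source selection this patch performs.
The struct and function names below are hypothetical stand-ins, not
amdgpu symbols; only the "sclk is in units of 10 kHz, so divide by 100
to get MHz" rule and the SR-IOV fallback are taken from the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two clock sources used by the patch. */
struct clock_model {
	bool is_sriov_vf;	/* models amdgpu_sriov_vf(adev) */
	uint32_t default_sclk;	/* models adev->clock.default_sclk, 10 kHz units */
	uint32_t dpm_sclk;	/* models amdgpu_dpm_get_sclk(adev, false), 10 kHz units */
};

/* sclk is expressed in units of 10 kHz; 1 MHz = 100 * 10 kHz. */
static uint32_t max_engine_clock_mhz(const struct clock_model *c)
{
	uint32_t sclk = c->is_sriov_vf ? c->default_sclk : c->dpm_sclk;

	return sclk / 100;
}
```

For example, a value of 100000 (in 10 kHz units) corresponds to
1000 MHz.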
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 5432af3..f7fa767 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -265,6 +265,9 @@ uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
 
-	/* The sclk is in quantas of 10kHz */
-	return adev->pm.dpm.dyn_state.max_clock_voltage_on_ac.sclk / 100;
+	/* the sclk is in quantas of 10kHz */
+	if (amdgpu_sriov_vf(adev))
+		return adev->clock.default_sclk / 100;
+
+	return amdgpu_dpm_get_sclk(adev, false) / 100;
 }
-- 
2.7.4


* [PATCH 02/14] drm/amdkfd: Add crash protection in debugger register path
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 01/14] drm/amdgpu: fix get_max_engine_clock_in_mhz Felix Kuehling
@ 2017-11-27 23:29   ` Felix Kuehling
       [not found]     ` <1511825396-24579-3-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 03/14] drm/amdkfd: map multiple processes to HW scheduler Felix Kuehling
                     ` (11 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Philip Yang, Felix Kuehling

From: Philip Yang <Philip.Yang@amd.com>

After a debugger is registered, pqm_destroy_queue fails because
is_debug is true. In that case the queue should not be removed from
process_queue_list, since the queue count is not reduced.

A test application may call debugger unregister without first
registering a debugger. Add a NULL pointer check to avoid a crash in
this case.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
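Reader's note (not part of the commit): a small user-space model of
the early-exit behaviour described above. All names are hypothetical,
and the error code is chosen only for illustration; the point is that
when the device-level destroy fails, the list entry and the queue
count are both left untouched.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an entry on the process queue list. */
struct queue_node {
	bool on_list;
};

/* Models dqm->ops.destroy_queue(): refuses while a debugger is attached. */
static int model_destroy_queue(bool is_debug, int *queue_count)
{
	if (is_debug)
		return -EACCES;	/* illustrative error code (assumption) */
	(*queue_count)--;
	return 0;
}

/* Models the patched flow: bail out before touching the list on failure. */
static int model_pqm_destroy_queue(struct queue_node *node, bool is_debug,
				   int *queue_count)
{
	int retval = model_destroy_queue(is_debug, queue_count);

	if (retval)
		return retval;	/* node stays on the list, count unchanged */

	node->on_list = false;
	return 0;
}
```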
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c               | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 5 +++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index cc61ec2..62c3d9c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -526,7 +526,7 @@ static int kfd_ioctl_dbg_unregister(struct file *filep,
 	long status;
 
 	dev = kfd_device_by_id(args->gpu_id);
-	if (!dev)
+	if (!dev || !dev->dbgmgr)
 		return -EINVAL;
 
 	if (dev->device_info->asic_family == CHIP_CARRIZO) {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index eeb7726..2c98858 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -313,6 +313,10 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
 	if (pqn->q) {
 		dqm = pqn->q->device->dqm;
 		retval = dqm->ops.destroy_queue(dqm, &pdd->qpd, pqn->q);
+		if (retval) {
+			pr_debug("Destroy queue failed, returned %d\n", retval);
+			goto err_destroy_queue;
+		}
 		uninit_queue(pqn->q);
 	}
 
@@ -324,6 +328,7 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
 	    list_empty(&pdd->qpd.priv_queue_list))
 		dqm->ops.unregister_process(dqm, &pdd->qpd);
 
+err_destroy_queue:
 	return retval;
 }
 
-- 
2.7.4


* [PATCH 03/14] drm/amdkfd: map multiple processes to HW scheduler
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 01/14] drm/amdgpu: fix get_max_engine_clock_in_mhz Felix Kuehling
  2017-11-27 23:29   ` [PATCH 02/14] drm/amdkfd: Add crash protection in debugger register path Felix Kuehling
@ 2017-11-27 23:29   ` Felix Kuehling
       [not found]     ` <1511825396-24579-4-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 04/14] drm/amdkfd: Fix oversubscription accounting Felix Kuehling
                     ` (10 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Yong Zhao, Felix Kuehling, Jay Cornwall

Allow HWS to execute multiple processes on the hardware
concurrently. The number of concurrent processes is limited by
the number of VMIDs allocated to the HWS.

A module parameter can be used to limit this further or to turn
it off altogether (mainly for debugging purposes).

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
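Reader's note (not part of the commit): the two-step arbitration done
by this patch can be sketched in plain C as follows. The function
names are hypothetical; the logic mirrors the range check in
kgd2kfd_device_init() and the min() in pm_create_runlist().

```c
/* Step 1: validate the module parameter against the number of VMIDs
 * available to KFD. Out-of-range values fall back to the VMID count.
 */
static unsigned int clamp_max_proc(int hws_max_conc_proc,
				   unsigned int vmid_num_kfd)
{
	if (hws_max_conc_proc < 0 ||
	    (unsigned int)hws_max_conc_proc > vmid_num_kfd)
		return vmid_num_kfd;

	return (unsigned int)hws_max_conc_proc;
}

/* Step 2: the runlist packet maps the smaller of "processes in the
 * runlist" and "maximum processes per quantum".
 */
static unsigned int concurrent_proc_cnt(unsigned int processes_count,
					unsigned int max_proc_per_quantum)
{
	return processes_count < max_proc_per_quantum ?
		processes_count : max_proc_per_quantum;
}
```

With 8 VMIDs, a parameter value of 20 or -1 is replaced by 8, while 0
is accepted as-is and yields a process count of 0 in the packet.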
 drivers/gpu/drm/amd/amdkfd/kfd_device.c         | 11 +++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_module.c         |  5 +++++
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 30 +++++++++++++++++++++++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h           |  9 ++++++++
 4 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 4f05eac..a8fa33a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -238,6 +238,17 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 	kfd->vm_info.vmid_num_kfd = kfd->vm_info.last_vmid_kfd
 			- kfd->vm_info.first_vmid_kfd + 1;
 
+	/* Verify module parameters regarding mapped process number*/
+	if ((hws_max_conc_proc < 0)
+			|| (hws_max_conc_proc > kfd->vm_info.vmid_num_kfd)) {
+		dev_err(kfd_device,
+			"hws_max_conc_proc %d must be between 0 and %d, use %d instead\n",
+			hws_max_conc_proc, kfd->vm_info.vmid_num_kfd,
+			kfd->vm_info.vmid_num_kfd);
+		kfd->max_proc_per_quantum = kfd->vm_info.vmid_num_kfd;
+	} else
+		kfd->max_proc_per_quantum = hws_max_conc_proc;
+
 	/* calculate max size of mqds needed for queues */
 	size = max_num_of_queues_per_device *
 			kfd->device_info->mqd_size_aligned;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index ee8adf6..4e060c8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -50,6 +50,11 @@ module_param(sched_policy, int, 0444);
 MODULE_PARM_DESC(sched_policy,
 	"Scheduling policy (0 = HWS (Default), 1 = HWS without over-subscription, 2 = Non-HWS (Used for debugging only)");
 
+int hws_max_conc_proc = 8;
+module_param(hws_max_conc_proc, int, 0444);
+MODULE_PARM_DESC(hws_max_conc_proc,
+	"Max # processes HWS can execute concurrently when sched_policy=0 (0 = no concurrency, #VMIDs for KFD = Maximum(default))");
+
 int cwsr_enable = 1;
 module_param(cwsr_enable, int, 0444);
 MODULE_PARM_DESC(cwsr_enable, "CWSR enable (0 = Off, 1 = On (Default))");
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
index 69c147a..0b7092e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -57,13 +57,24 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
 {
 	unsigned int process_count, queue_count;
 	unsigned int map_queue_size;
+	unsigned int max_proc_per_quantum = 1;
+	struct kfd_dev *dev = pm->dqm->dev;
 
 	process_count = pm->dqm->processes_count;
 	queue_count = pm->dqm->queue_count;
 
-	/* check if there is over subscription*/
+	/* check if there is over subscription
+	 * Note: the arbitration between the number of VMIDs and
+	 * hws_max_conc_proc has been done in
+	 * kgd2kfd_device_init().
+	 */
 	*over_subscription = false;
-	if ((process_count > 1) || queue_count > get_queues_num(pm->dqm)) {
+
+	if (dev->max_proc_per_quantum > 1)
+		max_proc_per_quantum = dev->max_proc_per_quantum;
+
+	if ((process_count > max_proc_per_quantum) ||
+	    queue_count > get_queues_num(pm->dqm)) {
 		*over_subscription = true;
 		pr_debug("Over subscribed runlist\n");
 	}
@@ -116,10 +127,24 @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
 			uint64_t ib, size_t ib_size_in_dwords, bool chain)
 {
 	struct pm4_mes_runlist *packet;
+	int concurrent_proc_cnt = 0;
+	struct kfd_dev *kfd = pm->dqm->dev;
 
 	if (WARN_ON(!ib))
 		return -EFAULT;
 
+	/* Determine the number of processes to map together to HW:
+	 * it can not exceed the number of VMIDs available to the
+	 * scheduler, and it is determined by the smaller of the number
+	 * of processes in the runlist and kfd module parameter
+	 * hws_max_conc_proc.
+	 * Note: the arbitration between the number of VMIDs and
+	 * hws_max_conc_proc has been done in
+	 * kgd2kfd_device_init().
+	 */
+	concurrent_proc_cnt = min(pm->dqm->processes_count,
+			kfd->max_proc_per_quantum);
+
 	packet = (struct pm4_mes_runlist *)buffer;
 
 	memset(buffer, 0, sizeof(struct pm4_mes_runlist));
@@ -130,6 +155,7 @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
 	packet->bitfields4.chain = chain ? 1 : 0;
 	packet->bitfields4.offload_polling = 0;
 	packet->bitfields4.valid = 1;
+	packet->bitfields4.process_cnt = concurrent_proc_cnt;
 	packet->ordinal2 = lower_32_bits(ib);
 	packet->bitfields3.ib_base_hi = upper_32_bits(ib);
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index a668764..1edab21 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -88,6 +88,12 @@ extern int max_num_of_queues_per_device;
 /* Kernel module parameter to specify the scheduling policy */
 extern int sched_policy;
 
+/*
+ * Kernel module parameter to specify the maximum process
+ * number per HW scheduler
+ */
+extern int hws_max_conc_proc;
+
 extern int cwsr_enable;
 
 /*
@@ -214,6 +220,9 @@ struct kfd_dev {
 	/* Debug manager */
 	struct kfd_dbgmgr           *dbgmgr;
 
+	/* Maximum process number mapped to HW scheduler */
+	unsigned int max_proc_per_quantum;
+
 	/* CWSR */
 	bool cwsr_enabled;
 	const void *cwsr_isa;
-- 
2.7.4


* [PATCH 04/14] drm/amdkfd: Fix oversubscription accounting
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (2 preceding siblings ...)
  2017-11-27 23:29   ` [PATCH 03/14] drm/amdkfd: map multiple processes to HW scheduler Felix Kuehling
@ 2017-11-27 23:29   ` Felix Kuehling
       [not found]     ` <1511825396-24579-5-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 05/14] drm/amdgpu: Fix definition of KFD_CIK_SDMA_QUEUE_OFFSET Felix Kuehling
                     ` (9 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling, Jay Cornwall

Don't count SDMA queues towards compute HQD oversubscription when
deciding whether to create a chained runlist.

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
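Reader's note (not part of the commit): a stand-alone sketch of the
oversubscription check after this change. The function and parameter
names are hypothetical; compute_hqd_slots stands in for the value
returned by get_queues_num(). SDMA queues are subtracted before the
comparison, so they no longer trigger a chained runlist on their own.

```c
#include <stdbool.h>

static bool is_over_subscribed(unsigned int process_count,
			       unsigned int queue_count,
			       unsigned int sdma_queue_count,
			       unsigned int max_proc_per_quantum,
			       unsigned int compute_hqd_slots)
{
	/* Only compute queues compete for compute HQD slots. */
	unsigned int compute_queue_count = queue_count - sdma_queue_count;

	/* As in pm_calc_rlib_size(): values below 1 are treated as 1. */
	if (max_proc_per_quantum < 1)
		max_proc_per_quantum = 1;

	return process_count > max_proc_per_quantum ||
	       compute_queue_count > compute_hqd_slots;
}
```

For instance, 26 queues of which 2 are SDMA fit into 24 compute slots
and are not oversubscribed, whereas counting all 26 against the 24
slots would have reported oversubscription.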
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
index 0b7092e..c3230b9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -55,13 +55,14 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
 				unsigned int *rlib_size,
 				bool *over_subscription)
 {
-	unsigned int process_count, queue_count;
+	unsigned int process_count, queue_count, compute_queue_count;
 	unsigned int map_queue_size;
 	unsigned int max_proc_per_quantum = 1;
 	struct kfd_dev *dev = pm->dqm->dev;
 
 	process_count = pm->dqm->processes_count;
 	queue_count = pm->dqm->queue_count;
+	compute_queue_count = queue_count - pm->dqm->sdma_queue_count;
 
 	/* check if there is over subscription
 	 * Note: the arbitration between the number of VMIDs and
@@ -74,7 +75,7 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
 		max_proc_per_quantum = dev->max_proc_per_quantum;
 
 	if ((process_count > max_proc_per_quantum) ||
-	    queue_count > get_queues_num(pm->dqm)) {
+	    compute_queue_count > get_queues_num(pm->dqm)) {
 		*over_subscription = true;
 		pr_debug("Over subscribed runlist\n");
 	}
-- 
2.7.4


* [PATCH 05/14] drm/amdgpu: Fix definition of KFD_CIK_SDMA_QUEUE_OFFSET
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (3 preceding siblings ...)
  2017-11-27 23:29   ` [PATCH 04/14] drm/amdkfd: Fix oversubscription accounting Felix Kuehling
@ 2017-11-27 23:29   ` Felix Kuehling
       [not found]     ` <1511825396-24579-6-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 06/14] drm/amdgpu: Add kfd2kgd APIs for dumping HQDs Felix Kuehling
                     ` (8 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Define KFD_CIK_SDMA_QUEUE_OFFSET in units of register indices, not
register addresses.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
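Reader's note (not part of the commit): a sketch of the difference
between a register index and a register address. The index values
below are made up for illustration and are not the real CIK register
offsets; the index-to-address shift by 2 matches how DUMP_REG() in
patch 06 converts an index into an address.

```c
#include <stdint.h>

/* Made-up register indices, for illustration only. */
#define EX_RLC0_RB_CNTL_INDEX	0x3400u
#define EX_RLC1_RB_CNTL_INDEX	0x3480u

/* Per-queue stride expressed as a register index difference. */
#define EX_QUEUE_OFFSET	(EX_RLC1_RB_CNTL_INDEX - EX_RLC0_RB_CNTL_INDEX)

/* Each register is 4 bytes wide, so address = index * 4. */
static uint32_t reg_index_to_byte_addr(uint32_t index)
{
	return index << 2;
}

/* Index of a per-queue register: base index plus queue stride. */
static uint32_t queue_reg_index(uint32_t base_index, uint32_t queue_id)
{
	return base_index + queue_id * EX_QUEUE_OFFSET;
}
```

With these example values the stride is 0x80 as an index and 0x200 as
a byte address. Adding a byte-address stride to a register index would
therefore land on the wrong register.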
 drivers/gpu/drm/amd/amdgpu/cikd.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/cikd.h b/drivers/gpu/drm/amd/amdgpu/cikd.h
index 6a9e38a..cee6e8a 100644
--- a/drivers/gpu/drm/amd/amdgpu/cikd.h
+++ b/drivers/gpu/drm/amd/amdgpu/cikd.h
@@ -562,7 +562,7 @@
 #define	PRIVATE_BASE(x)	((x) << 0) /* scratch */
 #define	SHARED_BASE(x)	((x) << 16) /* LDS */
 
-#define KFD_CIK_SDMA_QUEUE_OFFSET	0x200
+#define KFD_CIK_SDMA_QUEUE_OFFSET (mmSDMA0_RLC1_RB_CNTL - mmSDMA0_RLC0_RB_CNTL)
 
 /* valid for both DEFAULT_MTYPE and APE1_MTYPE */
 enum {
-- 
2.7.4


* [PATCH 06/14] drm/amdgpu: Add kfd2kgd APIs for dumping HQDs
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (4 preceding siblings ...)
  2017-11-27 23:29   ` [PATCH 05/14] drm/amdgpu: Fix definition of KFD_CIK_SDMA_QUEUE_OFFSET Felix Kuehling
@ 2017-11-27 23:29   ` Felix Kuehling
       [not found]     ` <1511825396-24579-7-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 07/14] drm/amdkfd: Add debugfs support to KFD Felix Kuehling
                     ` (7 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

This can be used by KFD for debugging features, such as dumping
HQDs in debugfs.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
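Reader's note (not part of the commit): a user-space sketch of the
calling convention of the new dump hooks, i.e. an array of
address-value pairs that the callee allocates and the caller frees.
Everything here is hypothetical: malloc/free replace kmalloc/kfree,
fake_read_reg() replaces RREG32(), and the register range is invented.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define EX_N_REGS 3

/* Stand-in for a hardware register read. */
static uint32_t fake_read_reg(uint32_t index)
{
	return index * 10u;
}

/* Fills *dump with EX_N_REGS {byte address, value} pairs. */
static int example_dump(uint32_t first_index,
			uint32_t (**dump)[2], uint32_t *n_regs)
{
	uint32_t i = 0, reg;

	*dump = malloc(EX_N_REGS * sizeof(**dump));
	if (*dump == NULL)
		return -ENOMEM;

	for (reg = first_index; reg < first_index + EX_N_REGS; reg++) {
		(*dump)[i][0] = reg << 2;	/* register index -> address */
		(*dump)[i][1] = fake_read_reg(reg);
		i++;
	}

	*n_regs = i;
	return 0;
}
```

A caller declares "uint32_t (*dump)[2];", passes &dump, walks
dump[0..n_regs-1] and releases the array afterwards.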
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 71 ++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 80 +++++++++++++++++++++++
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h   | 14 ++++
 3 files changed, 165 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index 14333af..12feba8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -105,8 +105,14 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
 			uint32_t queue_id, uint32_t __user *wptr,
 			uint32_t wptr_shift, uint32_t wptr_mask,
 			struct mm_struct *mm);
+static int kgd_hqd_dump(struct kgd_dev *kgd,
+			uint32_t pipe_id, uint32_t queue_id,
+			uint32_t (**dump)[2], uint32_t *n_regs);
 static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
 			     uint32_t __user *wptr, struct mm_struct *mm);
+static int kgd_hqd_sdma_dump(struct kgd_dev *kgd,
+			     uint32_t engine_id, uint32_t queue_id,
+			     uint32_t (**dump)[2], uint32_t *n_regs);
 static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
 				uint32_t pipe_id, uint32_t queue_id);
 
@@ -178,6 +184,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.init_interrupts = kgd_init_interrupts,
 	.hqd_load = kgd_hqd_load,
 	.hqd_sdma_load = kgd_hqd_sdma_load,
+	.hqd_dump = kgd_hqd_dump,
+	.hqd_sdma_dump = kgd_hqd_sdma_dump,
 	.hqd_is_occupied = kgd_hqd_is_occupied,
 	.hqd_sdma_is_occupied = kgd_hqd_sdma_is_occupied,
 	.hqd_destroy = kgd_hqd_destroy,
@@ -376,6 +384,42 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
 	return 0;
 }
 
+static int kgd_hqd_dump(struct kgd_dev *kgd,
+			uint32_t pipe_id, uint32_t queue_id,
+			uint32_t (**dump)[2], uint32_t *n_regs)
+{
+	struct amdgpu_device *adev = get_amdgpu_device(kgd);
+	uint32_t i = 0, reg;
+#define HQD_N_REGS (35+4)
+#define DUMP_REG(addr) do {				\
+		if (WARN_ON_ONCE(i >= HQD_N_REGS))	\
+			break;				\
+		(*dump)[i][0] = (addr) << 2;		\
+		(*dump)[i++][1] = RREG32(addr);		\
+	} while (0)
+
+	*dump = kmalloc(HQD_N_REGS*2*sizeof(uint32_t), GFP_KERNEL);
+	if (*dump == NULL)
+		return -ENOMEM;
+
+	acquire_queue(kgd, pipe_id, queue_id);
+
+	DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE0);
+	DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE1);
+	DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE2);
+	DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE3);
+
+	for (reg = mmCP_MQD_BASE_ADDR; reg <= mmCP_MQD_CONTROL; reg++)
+		DUMP_REG(reg);
+
+	release_queue(kgd);
+
+	WARN_ON_ONCE(i != HQD_N_REGS);
+	*n_regs = i;
+
+	return 0;
+}
+
 static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
 			     uint32_t __user *wptr, struct mm_struct *mm)
 {
@@ -440,6 +484,33 @@ static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
 	return 0;
 }
 
+static int kgd_hqd_sdma_dump(struct kgd_dev *kgd,
+			     uint32_t engine_id, uint32_t queue_id,
+			     uint32_t (**dump)[2], uint32_t *n_regs)
+{
+	struct amdgpu_device *adev = get_amdgpu_device(kgd);
+	uint32_t sdma_offset = engine_id * SDMA1_REGISTER_OFFSET +
+		queue_id * KFD_CIK_SDMA_QUEUE_OFFSET;
+	uint32_t i = 0, reg;
+#undef HQD_N_REGS
+#define HQD_N_REGS (19+4)
+
+	*dump = kmalloc(HQD_N_REGS*2*sizeof(uint32_t), GFP_KERNEL);
+	if (*dump == NULL)
+		return -ENOMEM;
+
+	for (reg = mmSDMA0_RLC0_RB_CNTL; reg <= mmSDMA0_RLC0_DOORBELL; reg++)
+		DUMP_REG(sdma_offset + reg);
+	for (reg = mmSDMA0_RLC0_VIRTUAL_ADDR; reg <= mmSDMA0_RLC0_WATERMARK;
+	     reg++)
+		DUMP_REG(sdma_offset + reg);
+
+	WARN_ON_ONCE(i != HQD_N_REGS);
+	*n_regs = i;
+
+	return 0;
+}
+
 static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
 				uint32_t pipe_id, uint32_t queue_id)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 1d989e4..b380495 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -64,8 +64,14 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
 			uint32_t queue_id, uint32_t __user *wptr,
 			uint32_t wptr_shift, uint32_t wptr_mask,
 			struct mm_struct *mm);
+static int kgd_hqd_dump(struct kgd_dev *kgd,
+			uint32_t pipe_id, uint32_t queue_id,
+			uint32_t (**dump)[2], uint32_t *n_regs);
 static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
 			     uint32_t __user *wptr, struct mm_struct *mm);
+static int kgd_hqd_sdma_dump(struct kgd_dev *kgd,
+			     uint32_t engine_id, uint32_t queue_id,
+			     uint32_t (**dump)[2], uint32_t *n_regs);
 static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
 		uint32_t pipe_id, uint32_t queue_id);
 static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd);
@@ -137,6 +143,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.init_interrupts = kgd_init_interrupts,
 	.hqd_load = kgd_hqd_load,
 	.hqd_sdma_load = kgd_hqd_sdma_load,
+	.hqd_dump = kgd_hqd_dump,
+	.hqd_sdma_dump = kgd_hqd_sdma_dump,
 	.hqd_is_occupied = kgd_hqd_is_occupied,
 	.hqd_sdma_is_occupied = kgd_hqd_sdma_is_occupied,
 	.hqd_destroy = kgd_hqd_destroy,
@@ -365,6 +373,42 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
 	return 0;
 }
 
+static int kgd_hqd_dump(struct kgd_dev *kgd,
+			uint32_t pipe_id, uint32_t queue_id,
+			uint32_t (**dump)[2], uint32_t *n_regs)
+{
+	struct amdgpu_device *adev = get_amdgpu_device(kgd);
+	uint32_t i = 0, reg;
+#define HQD_N_REGS (54+4)
+#define DUMP_REG(addr) do {				\
+		if (WARN_ON_ONCE(i >= HQD_N_REGS))	\
+			break;				\
+		(*dump)[i][0] = (addr) << 2;		\
+		(*dump)[i++][1] = RREG32(addr);		\
+	} while (0)
+
+	*dump = kmalloc(HQD_N_REGS*2*sizeof(uint32_t), GFP_KERNEL);
+	if (*dump == NULL)
+		return -ENOMEM;
+
+	acquire_queue(kgd, pipe_id, queue_id);
+
+	DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE0);
+	DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE1);
+	DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE2);
+	DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE3);
+
+	for (reg = mmCP_MQD_BASE_ADDR; reg <= mmCP_HQD_EOP_DONES; reg++)
+		DUMP_REG(reg);
+
+	release_queue(kgd);
+
+	WARN_ON_ONCE(i != HQD_N_REGS);
+	*n_regs = i;
+
+	return 0;
+}
+
 static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
 			     uint32_t __user *wptr, struct mm_struct *mm)
 {
@@ -428,6 +472,42 @@ static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
 	return 0;
 }
 
+static int kgd_hqd_sdma_dump(struct kgd_dev *kgd,
+			     uint32_t engine_id, uint32_t queue_id,
+			     uint32_t (**dump)[2], uint32_t *n_regs)
+{
+	struct amdgpu_device *adev = get_amdgpu_device(kgd);
+	uint32_t sdma_offset = engine_id * SDMA1_REGISTER_OFFSET +
+		queue_id * KFD_VI_SDMA_QUEUE_OFFSET;
+	uint32_t i = 0, reg;
+#undef HQD_N_REGS
+#define HQD_N_REGS (19+4+2+3+7)
+
+	*dump = kmalloc(HQD_N_REGS*2*sizeof(uint32_t), GFP_KERNEL);
+	if (*dump == NULL)
+		return -ENOMEM;
+
+	for (reg = mmSDMA0_RLC0_RB_CNTL; reg <= mmSDMA0_RLC0_DOORBELL; reg++)
+		DUMP_REG(sdma_offset + reg);
+	for (reg = mmSDMA0_RLC0_VIRTUAL_ADDR; reg <= mmSDMA0_RLC0_WATERMARK;
+	     reg++)
+		DUMP_REG(sdma_offset + reg);
+	for (reg = mmSDMA0_RLC0_CSA_ADDR_LO; reg <= mmSDMA0_RLC0_CSA_ADDR_HI;
+	     reg++)
+		DUMP_REG(sdma_offset + reg);
+	for (reg = mmSDMA0_RLC0_IB_SUB_REMAIN; reg <= mmSDMA0_RLC0_DUMMY_REG;
+	     reg++)
+		DUMP_REG(sdma_offset + reg);
+	for (reg = mmSDMA0_RLC0_MIDCMD_DATA0; reg <= mmSDMA0_RLC0_MIDCMD_CNTL;
+	     reg++)
+		DUMP_REG(sdma_offset + reg);
+
+	WARN_ON_ONCE(i != HQD_N_REGS);
+	*n_regs = i;
+
+	return 0;
+}
+
 static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
 				uint32_t pipe_id, uint32_t queue_id)
 {
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index c6d4e64..fe3079a 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -131,6 +131,12 @@ struct tile_config {
  * @hqd_sdma_load: Loads the SDMA mqd structure to a H/W SDMA hqd slot.
  * used only for no HWS mode.
  *
+ * @hqd_dump: Dumps CPC HQD registers to an array of address-value pairs.
+ * Array is allocated with kmalloc, needs to be freed with kfree by caller.
+ *
+ * @hqd_sdma_dump: Dumps SDMA HQD registers to an array of address-value pairs.
+ * Array is allocated with kmalloc, needs to be freed with kfree by caller.
+ *
  * @hqd_is_occupies: Checks if a hqd slot is occupied.
  *
  * @hqd_destroy: Destructs and preempts the queue assigned to that hqd slot.
@@ -187,6 +193,14 @@ struct kfd2kgd_calls {
 	int (*hqd_sdma_load)(struct kgd_dev *kgd, void *mqd,
 			     uint32_t __user *wptr, struct mm_struct *mm);
 
+	int (*hqd_dump)(struct kgd_dev *kgd,
+			uint32_t pipe_id, uint32_t queue_id,
+			uint32_t (**dump)[2], uint32_t *n_regs);
+
+	int (*hqd_sdma_dump)(struct kgd_dev *kgd,
+			     uint32_t engine_id, uint32_t queue_id,
+			     uint32_t (**dump)[2], uint32_t *n_regs);
+
 	bool (*hqd_is_occupied)(struct kgd_dev *kgd, uint64_t queue_address,
 				uint32_t pipe_id, uint32_t queue_id);
 
-- 
2.7.4


* [PATCH 07/14] drm/amdkfd: Add debugfs support to KFD
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (5 preceding siblings ...)
  2017-11-27 23:29   ` [PATCH 06/14] drm/amdgpu: Add kfd2kgd APIs for dumping HQDs Felix Kuehling
@ 2017-11-27 23:29   ` Felix Kuehling
       [not found]     ` <1511825396-24579-8-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 08/14] drm/amdkfd: Get reference to lead_thread task struct Felix Kuehling
                     ` (6 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Yong Zhao, Felix Kuehling

This commit adds several debugfs entries for kfd:

kfd/hqds: dumps all HQDs on all GPUs for KFD-controlled compute and
    SDMA RLC queues

kfd/mqds: dumps all MQDs of all KFD processes on all GPUs

kfd/rls: dumps HWS runlists on all GPUs

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/Makefile                |  2 +
 drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c           | 75 ++++++++++++++++++++++
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 71 ++++++++++++++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  3 +
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h       |  4 ++
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   | 27 ++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    | 25 ++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 24 +++++++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h              | 21 ++++++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c           | 29 +++++++++
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 63 ++++++++++++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c          | 55 ++++++++++++++++
 12 files changed, 399 insertions(+)
 create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c

diff --git a/drivers/gpu/drm/amd/amdkfd/Makefile b/drivers/gpu/drm/amd/amdkfd/Makefile
index b400d56..5263e4d 100644
--- a/drivers/gpu/drm/amd/amdkfd/Makefile
+++ b/drivers/gpu/drm/amd/amdkfd/Makefile
@@ -16,4 +16,6 @@ amdkfd-y	:= kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
 		kfd_interrupt.o kfd_events.o cik_event_interrupt.o \
 		kfd_dbgdev.o kfd_dbgmgr.o
 
+amdkfd-$(CONFIG_DEBUG_FS) += kfd_debugfs.o
+
 obj-$(CONFIG_HSA_AMD)	+= amdkfd.o
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c b/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c
new file mode 100644
index 0000000..4bd6ebf
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c
@@ -0,0 +1,75 @@
+/*
+ * Copyright 2016-2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include <linux/debugfs.h>
+#include "kfd_priv.h"
+
+static struct dentry *debugfs_root;
+
+static int kfd_debugfs_open(struct inode *inode, struct file *file)
+{
+	int (*show)(struct seq_file *, void *) = inode->i_private;
+
+	return single_open(file, show, NULL);
+}
+
+static const struct file_operations kfd_debugfs_fops = {
+	.owner = THIS_MODULE,
+	.open = kfd_debugfs_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
+};
+
+void kfd_debugfs_init(void)
+{
+	struct dentry *ent;
+
+	debugfs_root = debugfs_create_dir("kfd", NULL);
+	if (!debugfs_root || debugfs_root == ERR_PTR(-ENODEV)) {
+		pr_warn("Failed to create kfd debugfs dir\n");
+		return;
+	}
+
+	ent = debugfs_create_file("mqds", S_IFREG | 0444, debugfs_root,
+				  kfd_debugfs_mqds_by_process,
+				  &kfd_debugfs_fops);
+	if (!ent)
+		pr_warn("Failed to create mqds in kfd debugfs\n");
+
+	ent = debugfs_create_file("hqds", S_IFREG | 0444, debugfs_root,
+				  kfd_debugfs_hqds_by_device,
+				  &kfd_debugfs_fops);
+	if (!ent)
+		pr_warn("Failed to create hqds in kfd debugfs\n");
+
+	ent = debugfs_create_file("rls", S_IFREG | 0444, debugfs_root,
+				  kfd_debugfs_rls_by_device,
+				  &kfd_debugfs_fops);
+	if (!ent)
+		pr_warn("Failed to create rls in kfd debugfs\n");
+}
+
+void kfd_debugfs_fini(void)
+{
+	debugfs_remove_recursive(debugfs_root);
+}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 8447810..eef8b98 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1318,3 +1318,74 @@ void device_queue_manager_uninit(struct device_queue_manager *dqm)
 	dqm->ops.uninitialize(dqm);
 	kfree(dqm);
 }
+
+#if defined(CONFIG_DEBUG_FS)
+
+static void seq_reg_dump(struct seq_file *m,
+			 uint32_t (*dump)[2], uint32_t n_regs)
+{
+	uint32_t i, count;
+
+	for (i = 0, count = 0; i < n_regs; i++) {
+		if (count == 0 ||
+		    dump[i-1][0] + sizeof(uint32_t) != dump[i][0]) {
+			seq_printf(m, "%s    %08x: %08x",
+				   i ? "\n" : "",
+				   dump[i][0], dump[i][1]);
+			count = 7;
+		} else {
+			seq_printf(m, " %08x", dump[i][1]);
+			count--;
+		}
+	}
+
+	seq_puts(m, "\n");
+}
+
+int dqm_debugfs_hqds(struct seq_file *m, void *data)
+{
+	struct device_queue_manager *dqm = data;
+	uint32_t (*dump)[2], n_regs;
+	int pipe, queue;
+	int r = 0;
+
+	for (pipe = 0; pipe < get_pipes_per_mec(dqm); pipe++) {
+		int pipe_offset = pipe * get_queues_per_pipe(dqm);
+
+		for (queue = 0; queue < get_queues_per_pipe(dqm); queue++) {
+			if (!test_bit(pipe_offset + queue,
+				      dqm->dev->shared_resources.queue_bitmap))
+				continue;
+
+			r = dqm->dev->kfd2kgd->hqd_dump(
+				dqm->dev->kgd, pipe, queue, &dump, &n_regs);
+			if (r)
+				break;
+
+			seq_printf(m, "  CP Pipe %d, Queue %d\n",
+				  pipe, queue);
+			seq_reg_dump(m, dump, n_regs);
+
+			kfree(dump);
+		}
+	}
+
+	for (pipe = 0; pipe < CIK_SDMA_ENGINE_NUM; pipe++) {
+		for (queue = 0; queue < CIK_SDMA_QUEUES_PER_ENGINE; queue++) {
+			r = dqm->dev->kfd2kgd->hqd_sdma_dump(
+				dqm->dev->kgd, pipe, queue, &dump, &n_regs);
+			if (r)
+				break;
+
+			seq_printf(m, "  SDMA Engine %d, RLC %d\n",
+				  pipe, queue);
+			seq_reg_dump(m, dump, n_regs);
+
+			kfree(dump);
+		}
+	}
+
+	return r;
+}
+
+#endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 4e060c8..f50e494 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -123,6 +123,8 @@ static int __init kfd_module_init(void)
 
 	kfd_process_create_wq();
 
+	kfd_debugfs_init();
+
 	amdkfd_init_completed = 1;
 
 	dev_info(kfd_device, "Initialized module\n");
@@ -139,6 +141,7 @@ static void __exit kfd_module_exit(void)
 {
 	amdkfd_init_completed = 0;
 
+	kfd_debugfs_fini();
 	kfd_process_destroy_wq();
 	kfd_topology_shutdown();
 	kfd_chardev_exit();
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
index 1f3a6ba..8972bcf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
@@ -85,6 +85,10 @@ struct mqd_manager {
 				uint64_t queue_address,	uint32_t pipe_id,
 				uint32_t queue_id);
 
+#if defined(CONFIG_DEBUG_FS)
+	int	(*debugfs_show_mqd)(struct seq_file *m, void *data);
+#endif
+
 	struct mutex	mqd_mutex;
 	struct kfd_dev	*dev;
 };
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index 7aa57ab..f8ef4a0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -365,6 +365,24 @@ static int update_mqd_hiq(struct mqd_manager *mm, void *mqd,
 	return 0;
 }
 
+#if defined(CONFIG_DEBUG_FS)
+
+static int debugfs_show_mqd(struct seq_file *m, void *data)
+{
+	seq_hex_dump(m, "    ", DUMP_PREFIX_OFFSET, 32, 4,
+		     data, sizeof(struct cik_mqd), false);
+	return 0;
+}
+
+static int debugfs_show_mqd_sdma(struct seq_file *m, void *data)
+{
+	seq_hex_dump(m, "    ", DUMP_PREFIX_OFFSET, 32, 4,
+		     data, sizeof(struct cik_sdma_rlc_registers), false);
+	return 0;
+}
+
+#endif
+
 
 struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
 		struct kfd_dev *dev)
@@ -389,6 +407,9 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
 		mqd->update_mqd = update_mqd;
 		mqd->destroy_mqd = destroy_mqd;
 		mqd->is_occupied = is_occupied;
+#if defined(CONFIG_DEBUG_FS)
+		mqd->debugfs_show_mqd = debugfs_show_mqd;
+#endif
 		break;
 	case KFD_MQD_TYPE_HIQ:
 		mqd->init_mqd = init_mqd_hiq;
@@ -397,6 +418,9 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
 		mqd->update_mqd = update_mqd_hiq;
 		mqd->destroy_mqd = destroy_mqd;
 		mqd->is_occupied = is_occupied;
+#if defined(CONFIG_DEBUG_FS)
+		mqd->debugfs_show_mqd = debugfs_show_mqd;
+#endif
 		break;
 	case KFD_MQD_TYPE_SDMA:
 		mqd->init_mqd = init_mqd_sdma;
@@ -405,6 +429,9 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
 		mqd->update_mqd = update_mqd_sdma;
 		mqd->destroy_mqd = destroy_mqd_sdma;
 		mqd->is_occupied = is_occupied_sdma;
+#if defined(CONFIG_DEBUG_FS)
+		mqd->debugfs_show_mqd = debugfs_show_mqd_sdma;
+#endif
 		break;
 	default:
 		kfree(mqd);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index 00e1f1a..971aec0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
@@ -358,7 +358,23 @@ static bool is_occupied_sdma(struct mqd_manager *mm, void *mqd,
 	return mm->dev->kfd2kgd->hqd_sdma_is_occupied(mm->dev->kgd, mqd);
 }
 
+#if defined(CONFIG_DEBUG_FS)
 
+static int debugfs_show_mqd(struct seq_file *m, void *data)
+{
+	seq_hex_dump(m, "    ", DUMP_PREFIX_OFFSET, 32, 4,
+		     data, sizeof(struct vi_mqd), false);
+	return 0;
+}
+
+static int debugfs_show_mqd_sdma(struct seq_file *m, void *data)
+{
+	seq_hex_dump(m, "    ", DUMP_PREFIX_OFFSET, 32, 4,
+		     data, sizeof(struct vi_sdma_mqd), false);
+	return 0;
+}
+
+#endif
 
 struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
 		struct kfd_dev *dev)
@@ -383,6 +399,9 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
 		mqd->update_mqd = update_mqd;
 		mqd->destroy_mqd = destroy_mqd;
 		mqd->is_occupied = is_occupied;
+#if defined(CONFIG_DEBUG_FS)
+		mqd->debugfs_show_mqd = debugfs_show_mqd;
+#endif
 		break;
 	case KFD_MQD_TYPE_HIQ:
 		mqd->init_mqd = init_mqd_hiq;
@@ -391,6 +410,9 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
 		mqd->update_mqd = update_mqd_hiq;
 		mqd->destroy_mqd = destroy_mqd;
 		mqd->is_occupied = is_occupied;
+#if defined(CONFIG_DEBUG_FS)
+		mqd->debugfs_show_mqd = debugfs_show_mqd;
+#endif
 		break;
 	case KFD_MQD_TYPE_SDMA:
 		mqd->init_mqd = init_mqd_sdma;
@@ -399,6 +421,9 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
 		mqd->update_mqd = update_mqd_sdma;
 		mqd->destroy_mqd = destroy_mqd_sdma;
 		mqd->is_occupied = is_occupied_sdma;
+#if defined(CONFIG_DEBUG_FS)
+		mqd->debugfs_show_mqd = debugfs_show_mqd_sdma;
+#endif
 		break;
 	default:
 		kfree(mqd);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
index c3230b9..0ecbd1f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -278,6 +278,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 		return retval;
 
 	*rl_size_bytes = alloc_size_bytes;
+	pm->ib_size_bytes = alloc_size_bytes;
 
 	pr_debug("Building runlist ib process count: %d queues count %d\n",
 		pm->dqm->processes_count, pm->dqm->queue_count);
@@ -591,3 +592,26 @@ void pm_release_ib(struct packet_manager *pm)
 	}
 	mutex_unlock(&pm->lock);
 }
+
+#if defined(CONFIG_DEBUG_FS)
+
+int pm_debugfs_runlist(struct seq_file *m, void *data)
+{
+	struct packet_manager *pm = data;
+
+	mutex_lock(&pm->lock);
+
+	if (!pm->allocated) {
+		seq_puts(m, "  No active runlist\n");
+		goto out;
+	}
+
+	seq_hex_dump(m, "  ", DUMP_PREFIX_OFFSET, 32, 4,
+		     pm->ib_buffer_obj->cpu_ptr, pm->ib_size_bytes, false);
+
+out:
+	mutex_unlock(&pm->lock);
+	return 0;
+}
+
+#endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 1edab21..dca493b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -33,6 +33,7 @@
 #include <linux/kfd_ioctl.h>
 #include <linux/idr.h>
 #include <linux/kfifo.h>
+#include <linux/seq_file.h>
 #include <kgd_kfd_interface.h>
 
 #include "amd_shared.h"
@@ -735,6 +736,7 @@ struct packet_manager {
 	struct mutex lock;
 	bool allocated;
 	struct kfd_mem_obj *ib_buffer_obj;
+	unsigned int ib_size_bytes;
 };
 
 int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm);
@@ -781,4 +783,23 @@ int kfd_event_destroy(struct kfd_process *p, uint32_t event_id);
 
 int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p);
 
+/* Debugfs */
+#if defined(CONFIG_DEBUG_FS)
+
+void kfd_debugfs_init(void);
+void kfd_debugfs_fini(void);
+int kfd_debugfs_mqds_by_process(struct seq_file *m, void *data);
+int pqm_debugfs_mqds(struct seq_file *m, void *data);
+int kfd_debugfs_hqds_by_device(struct seq_file *m, void *data);
+int dqm_debugfs_hqds(struct seq_file *m, void *data);
+int kfd_debugfs_rls_by_device(struct seq_file *m, void *data);
+int pm_debugfs_runlist(struct seq_file *m, void *data);
+
+#else
+
+static inline void kfd_debugfs_init(void) {}
+static inline void kfd_debugfs_fini(void) {}
+
+#endif
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 39f4c19..99c18ee 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -620,3 +620,32 @@ int kfd_reserved_mem_mmap(struct kfd_process *process,
 			       PFN_DOWN(__pa(qpd->cwsr_kaddr)),
 			       KFD_CWSR_TBA_TMA_SIZE, vma->vm_page_prot);
 }
+
+#if defined(CONFIG_DEBUG_FS)
+
+int kfd_debugfs_mqds_by_process(struct seq_file *m, void *data)
+{
+	struct kfd_process *p;
+	unsigned int temp;
+	int r = 0;
+
+	int idx = srcu_read_lock(&kfd_processes_srcu);
+
+	hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
+		seq_printf(m, "Process %d PASID %d:\n",
+			   p->lead_thread->tgid, p->pasid);
+
+		mutex_lock(&p->mutex);
+		r = pqm_debugfs_mqds(m, &p->pqm);
+		mutex_unlock(&p->mutex);
+
+		if (r)
+			break;
+	}
+
+	srcu_read_unlock(&kfd_processes_srcu, idx);
+
+	return r;
+}
+
+#endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 2c98858..2573455 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -370,4 +370,67 @@ struct kernel_queue *pqm_get_kernel_queue(
 	return NULL;
 }
 
+#if defined(CONFIG_DEBUG_FS)
 
+int pqm_debugfs_mqds(struct seq_file *m, void *data)
+{
+	struct process_queue_manager *pqm = data;
+	struct process_queue_node *pqn;
+	struct queue *q;
+	enum KFD_MQD_TYPE mqd_type;
+	struct mqd_manager *mqd_manager;
+	int r = 0;
+
+	list_for_each_entry(pqn, &pqm->queues, process_queue_list) {
+		if (pqn->q) {
+			q = pqn->q;
+			switch (q->properties.type) {
+			case KFD_QUEUE_TYPE_SDMA:
+				seq_printf(m, "  SDMA queue on device %x\n",
+					   q->device->id);
+				mqd_type = KFD_MQD_TYPE_SDMA;
+				break;
+			case KFD_QUEUE_TYPE_COMPUTE:
+				seq_printf(m, "  Compute queue on device %x\n",
+					   q->device->id);
+				mqd_type = KFD_MQD_TYPE_CP;
+				break;
+			default:
+				seq_printf(m,
+				"  Bad user queue type %d on device %x\n",
+					   q->properties.type, q->device->id);
+				continue;
+			}
+			mqd_manager = q->device->dqm->ops.get_mqd_manager(
+				q->device->dqm, mqd_type);
+		} else if (pqn->kq) {
+			q = pqn->kq->queue;
+			mqd_manager = pqn->kq->mqd;
+			switch (q->properties.type) {
+			case KFD_QUEUE_TYPE_DIQ:
+				seq_printf(m, "  DIQ on device %x\n",
+					   pqn->kq->dev->id);
+				mqd_type = KFD_MQD_TYPE_HIQ;
+				break;
+			default:
+				seq_printf(m,
+				"  Bad kernel queue type %d on device %x\n",
+					   q->properties.type,
+					   pqn->kq->dev->id);
+				continue;
+			}
+		} else {
+			seq_printf(m,
+		"  Weird: Queue node with neither kernel nor user queue\n");
+			continue;
+		}
+
+		r = mqd_manager->debugfs_show_mqd(m, q->mqd);
+		if (r != 0)
+			break;
+	}
+
+	return r;
+}
+
+#endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 19ce590..9d03a56 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -32,6 +32,7 @@
 #include "kfd_priv.h"
 #include "kfd_crat.h"
 #include "kfd_topology.h"
+#include "kfd_device_queue_manager.h"
 
 static struct list_head topology_device_list;
 static int topology_crat_parsed;
@@ -1233,3 +1234,57 @@ struct kfd_dev *kfd_topology_enum_kfd_devices(uint8_t idx)
 	return device;
 
 }
+
+#if defined(CONFIG_DEBUG_FS)
+
+int kfd_debugfs_hqds_by_device(struct seq_file *m, void *data)
+{
+	struct kfd_topology_device *dev;
+	unsigned int i = 0;
+	int r = 0;
+
+	down_read(&topology_lock);
+
+	list_for_each_entry(dev, &topology_device_list, list) {
+		if (!dev->gpu) {
+			i++;
+			continue;
+		}
+
+		seq_printf(m, "Node %u, gpu_id %x:\n", i++, dev->gpu->id);
+		r = dqm_debugfs_hqds(m, dev->gpu->dqm);
+		if (r)
+			break;
+	}
+
+	up_read(&topology_lock);
+
+	return r;
+}
+
+int kfd_debugfs_rls_by_device(struct seq_file *m, void *data)
+{
+	struct kfd_topology_device *dev;
+	unsigned int i = 0;
+	int r = 0;
+
+	down_read(&topology_lock);
+
+	list_for_each_entry(dev, &topology_device_list, list) {
+		if (!dev->gpu) {
+			i++;
+			continue;
+		}
+
+		seq_printf(m, "Node %u, gpu_id %x:\n", i++, dev->gpu->id);
+		r = pm_debugfs_runlist(m, &dev->gpu->dqm->packets);
+		if (r)
+			break;
+	}
+
+	up_read(&topology_lock);
+
+	return r;
+}
+
+#endif
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 08/14] drm/amdkfd: Get reference to lead_thread task struct
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (6 preceding siblings ...)
  2017-11-27 23:29   ` [PATCH 07/14] drm/amdkfd: Add debugfs support to KFD Felix Kuehling
@ 2017-11-27 23:29   ` Felix Kuehling
       [not found]     ` <1511825396-24579-9-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 09/14] drm/amdkfd: Make kfd_process reference counted Felix Kuehling
                     ` (5 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Increment the kfd_process.lead_thread's reference counter to make
it safe to dereference. This is needed for getting a safe reference
to the process' mm_struct.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 99c18ee..660d8bc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -24,6 +24,7 @@
 #include <linux/log2.h>
 #include <linux/sched.h>
 #include <linux/sched/mm.h>
+#include <linux/sched/task.h>
 #include <linux/slab.h>
 #include <linux/amd-iommu.h>
 #include <linux/notifier.h>
@@ -191,6 +192,8 @@ static void kfd_process_wq_release(struct work_struct *work)
 
 	mutex_destroy(&p->mutex);
 
+	put_task_struct(p->lead_thread);
+
 	kfree(p);
 
 	kfree(work);
@@ -342,6 +345,7 @@ static struct kfd_process *create_process(const struct task_struct *thread)
 			(uintptr_t)process->mm);
 
 	process->lead_thread = thread->group_leader;
+	get_task_struct(process->lead_thread);
 
 	INIT_LIST_HEAD(&process->per_device_data);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 09/14] drm/amdkfd: Make kfd_process reference counted
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (7 preceding siblings ...)
  2017-11-27 23:29   ` [PATCH 08/14] drm/amdkfd: Get reference to lead_thread task struct Felix Kuehling
@ 2017-11-27 23:29   ` Felix Kuehling
       [not found]     ` <1511825396-24579-10-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 10/14] drm/amdkfd: Use ref count to prevent kfd_process destruction Felix Kuehling
                     ` (4 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

This will be used to eliminate the use of the process lock for
preventing concurrent process destruction. This will simplify lock
dependencies between KFD and KGD.

This also simplifies the process destruction in a few ways:
* Don't allocate work struct dynamically
* Remove unnecessary hack that increments mm reference counter
* Remove unnecessary process locking during destruction

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  4 +++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 58 ++++++++++++--------------------
 2 files changed, 26 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index dca493b..248e4f5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -34,6 +34,7 @@
 #include <linux/idr.h>
 #include <linux/kfifo.h>
 #include <linux/seq_file.h>
+#include <linux/kref.h>
 #include <kgd_kfd_interface.h>
 
 #include "amd_shared.h"
@@ -537,6 +538,9 @@ struct kfd_process {
 	 */
 	void *mm;
 
+	struct kref ref;
+	struct work_struct release_work;
+
 	struct mutex mutex;
 
 	/*
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 660d8bc..e02e8a2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -48,11 +48,6 @@ DEFINE_STATIC_SRCU(kfd_processes_srcu);
 
 static struct workqueue_struct *kfd_process_wq;
 
-struct kfd_process_release_work {
-	struct work_struct kfd_work;
-	struct kfd_process *p;
-};
-
 static struct kfd_process *find_process(const struct task_struct *thread);
 static struct kfd_process *create_process(const struct task_struct *thread);
 static int kfd_process_init_cwsr(struct kfd_process *p, struct file *filep);
@@ -151,21 +146,20 @@ static struct kfd_process *find_process(const struct task_struct *thread)
 	return p;
 }
 
+/* No process locking is needed in this function, because the process
+ * is not findable any more. We must assume that no other thread is
+ * using it any more, otherwise we couldn't safely free the process
+ * structure in the end.
+ */
 static void kfd_process_wq_release(struct work_struct *work)
 {
-	struct kfd_process_release_work *my_work;
+	struct kfd_process *p = container_of(work, struct kfd_process,
+					     release_work);
 	struct kfd_process_device *pdd, *temp;
-	struct kfd_process *p;
-
-	my_work = (struct kfd_process_release_work *) work;
-
-	p = my_work->p;
 
 	pr_debug("Releasing process (pasid %d) in workqueue\n",
 			p->pasid);
 
-	mutex_lock(&p->mutex);
-
 	list_for_each_entry_safe(pdd, temp, &p->per_device_data,
 							per_device_list) {
 		pr_debug("Releasing pdd (topology id %d) for process (pasid %d) in workqueue\n",
@@ -188,33 +182,26 @@ static void kfd_process_wq_release(struct work_struct *work)
 	kfd_pasid_free(p->pasid);
 	kfd_free_process_doorbells(p);
 
-	mutex_unlock(&p->mutex);
-
 	mutex_destroy(&p->mutex);
 
 	put_task_struct(p->lead_thread);
 
 	kfree(p);
-
-	kfree(work);
 }
 
-static void kfd_process_destroy_delayed(struct rcu_head *rcu)
+static void kfd_process_ref_release(struct kref *ref)
 {
-	struct kfd_process_release_work *work;
-	struct kfd_process *p;
-
-	p = container_of(rcu, struct kfd_process, rcu);
+	struct kfd_process *p = container_of(ref, struct kfd_process, ref);
 
-	mmdrop(p->mm);
+	INIT_WORK(&p->release_work, kfd_process_wq_release);
+	queue_work(kfd_process_wq, &p->release_work);
+}
 
-	work = kmalloc(sizeof(struct kfd_process_release_work), GFP_ATOMIC);
+static void kfd_process_destroy_delayed(struct rcu_head *rcu)
+{
+	struct kfd_process *p = container_of(rcu, struct kfd_process, rcu);
 
-	if (work) {
-		INIT_WORK((struct work_struct *) work, kfd_process_wq_release);
-		work->p = p;
-		queue_work(kfd_process_wq, (struct work_struct *) work);
-	}
+	kref_put(&p->ref, kfd_process_ref_release);
 }
 
 static void kfd_process_notifier_release(struct mmu_notifier *mn,
@@ -258,15 +245,12 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
 	kfd_process_dequeue_from_all_devices(p);
 	pqm_uninit(&p->pqm);
 
+	/* Indicate to other users that MM is no longer valid */
+	p->mm = NULL;
+
 	mutex_unlock(&p->mutex);
 
-	/*
-	 * Because we drop mm_count inside kfd_process_destroy_delayed
-	 * and because the mmu_notifier_unregister function also drop
-	 * mm_count we need to take an extra count here.
-	 */
-	mmgrab(p->mm);
-	mmu_notifier_unregister_no_release(&p->mmu_notifier, p->mm);
+	mmu_notifier_unregister_no_release(&p->mmu_notifier, mm);
 	mmu_notifier_call_srcu(&p->rcu, &kfd_process_destroy_delayed);
 }
 
@@ -331,6 +315,8 @@ static struct kfd_process *create_process(const struct task_struct *thread)
 	if (kfd_alloc_process_doorbells(process) < 0)
 		goto err_alloc_doorbells;
 
+	kref_init(&process->ref);
+
 	mutex_init(&process->mutex);
 
 	process->mm = thread->mm;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 10/14] drm/amdkfd: Use ref count to prevent kfd_process destruction
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (8 preceding siblings ...)
  2017-11-27 23:29   ` [PATCH 09/14] drm/amdkfd: Make kfd_process reference counted Felix Kuehling
@ 2017-11-27 23:29   ` Felix Kuehling
       [not found]     ` <1511825396-24579-11-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 11/14] drm/amdkfd: Return NULL if kfd_lookup_process_by_pasid fails Felix Kuehling
                     ` (3 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Use a reference counter instead of a lock to prevent process
destruction while functions running out of process context are using
the kfd_process structure. In many cases these functions don't need
the structure to be locked. In the few cases that really do need the
process lock, take it explicitly.

This helps simplify lock dependencies between the process lock and
other locks, particularly amdgpu and mm_struct locks. This will be
important when amdgpu calls back to amdkfd for memory evictions.
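
A simplified userspace sketch of the resulting calling convention, under the
assumption that every successful lookup must be balanced by exactly one
unref on every exit path. The table, the has_mm flag and the function names
are hypothetical; a plain int stands in for the kref and no real locking is
modelled.

```c
#include <assert.h>
#include <stddef.h>

struct proc {
	unsigned int pasid;
	int refcount;	/* stand-in for kfd_process.ref */
	int has_mm;	/* stand-in for "get_task_mm() succeeded" */
	int events;	/* number of events signalled so far */
};

/* Each entry starts with one reference held by the table itself. */
static struct proc table[] = {
	{ 1, 1, 1, 0 },
	{ 2, 1, 0, 0 },	/* models a process that is already exiting */
};

/* On success the caller owns one extra reference. */
static struct proc *lookup_by_pasid(unsigned int pasid)
{
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (table[i].pasid == pasid) {
			table[i].refcount++;
			return &table[i];
		}
	}
	return NULL;
}

static void unref(struct proc *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/* Returns 0 if an event was signalled, -1 otherwise. */
static int signal_event(unsigned int pasid)
{
	struct proc *p = lookup_by_pasid(pasid);

	if (!p)
		return -1;	/* nothing was taken, nothing to drop */

	if (!p->has_mm) {
		unref(p);	/* early exit still drops the reference */
		return -1;
	}

	p->events++;		/* a narrower lock would be taken here */
	unref(p);
	return 0;
}
```

The reference only keeps the structure alive; code that also needs mutual
exclusion still has to take the relevant lock around the access.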

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_events.c  | 14 +++++++-------
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  1 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 +++++++++++++---
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index cb92d4b..93aae5c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -441,7 +441,7 @@ void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
 	/*
 	 * Because we are called from arbitrary context (workqueue) as opposed
 	 * to process context, kfd_process could attempt to exit while we are
-	 * running so the lookup function returns a locked process.
+	 * running so the lookup function increments the process ref count.
 	 */
 	struct kfd_process *p = kfd_lookup_process_by_pasid(pasid);
 
@@ -493,7 +493,7 @@ void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
 	}
 
 	mutex_unlock(&p->event_mutex);
-	mutex_unlock(&p->mutex);
+	kfd_unref_process(p);
 }
 
 static struct kfd_event_waiter *alloc_event_waiters(uint32_t num_events)
@@ -847,7 +847,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
 	/*
 	 * Because we are called from arbitrary context (workqueue) as opposed
 	 * to process context, kfd_process could attempt to exit while we are
-	 * running so the lookup function returns a locked process.
+	 * running so the lookup function increments the process ref count.
 	 */
 	struct kfd_process *p = kfd_lookup_process_by_pasid(pasid);
 	struct mm_struct *mm;
@@ -860,7 +860,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
 	 */
 	mm = get_task_mm(p->lead_thread);
 	if (!mm) {
-		mutex_unlock(&p->mutex);
+		kfd_unref_process(p);
 		return; /* Process is exiting */
 	}
 
@@ -903,7 +903,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
 			&memory_exception_data);
 
 	mutex_unlock(&p->event_mutex);
-	mutex_unlock(&p->mutex);
+	kfd_unref_process(p);
 }
 
 void kfd_signal_hw_exception_event(unsigned int pasid)
@@ -911,7 +911,7 @@ void kfd_signal_hw_exception_event(unsigned int pasid)
 	/*
 	 * Because we are called from arbitrary context (workqueue) as opposed
 	 * to process context, kfd_process could attempt to exit while we are
-	 * running so the lookup function returns a locked process.
+	 * running so the lookup function increments the process ref count.
 	 */
 	struct kfd_process *p = kfd_lookup_process_by_pasid(pasid);
 
@@ -924,5 +924,5 @@ void kfd_signal_hw_exception_event(unsigned int pasid)
 	lookup_events_by_type_and_signal(p, KFD_EVENT_TYPE_HW_EXCEPTION, NULL);
 
 	mutex_unlock(&p->event_mutex);
-	mutex_unlock(&p->mutex);
+	kfd_unref_process(p);
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 248e4f5..0c96a6b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -606,6 +606,7 @@ void kfd_process_destroy_wq(void);
 struct kfd_process *kfd_create_process(struct file *filep);
 struct kfd_process *kfd_get_process(const struct task_struct *);
 struct kfd_process *kfd_lookup_process_by_pasid(unsigned int pasid);
+void kfd_unref_process(struct kfd_process *p);
 
 struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
 						struct kfd_process *p);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index e02e8a2..509f987 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -49,6 +49,7 @@ DEFINE_STATIC_SRCU(kfd_processes_srcu);
 static struct workqueue_struct *kfd_process_wq;
 
 static struct kfd_process *find_process(const struct task_struct *thread);
+static void kfd_process_ref_release(struct kref *ref);
 static struct kfd_process *create_process(const struct task_struct *thread);
 static int kfd_process_init_cwsr(struct kfd_process *p, struct file *filep);
 
@@ -146,6 +147,11 @@ static struct kfd_process *find_process(const struct task_struct *thread)
 	return p;
 }
 
+void kfd_unref_process(struct kfd_process *p)
+{
+	kref_put(&p->ref, kfd_process_ref_release);
+}
+
 /* No process locking is needed in this function, because the process
  * is not findable any more. We must assume that no other thread is
  * using it any more, otherwise we couldn't safely free the process
@@ -201,7 +207,7 @@ static void kfd_process_destroy_delayed(struct rcu_head *rcu)
 {
 	struct kfd_process *p = container_of(rcu, struct kfd_process, rcu);
 
-	kref_put(&p->ref, kfd_process_ref_release);
+	kfd_unref_process(p);
 }
 
 static void kfd_process_notifier_release(struct mmu_notifier *mn,
@@ -525,6 +531,8 @@ void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int pasid)
 
 	mutex_unlock(kfd_get_dbgmgr_mutex());
 
+	mutex_lock(&p->mutex);
+
 	pdd = kfd_get_process_device_data(dev, p);
 	if (pdd)
 		/* For GPU relying on IOMMU, we need to dequeue here
@@ -533,6 +541,8 @@ void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int pasid)
 		kfd_process_dequeue_from_device(pdd);
 
 	mutex_unlock(&p->mutex);
+
+	kfd_unref_process(p);
 }
 
 struct kfd_process_device *kfd_get_first_process_device_data(
@@ -557,7 +567,7 @@ bool kfd_has_process_device_data(struct kfd_process *p)
 	return !(list_empty(&p->per_device_data));
 }
 
-/* This returns with process->mutex locked. */
+/* This increments the process->ref counter. */
 struct kfd_process *kfd_lookup_process_by_pasid(unsigned int pasid)
 {
 	struct kfd_process *p;
@@ -567,7 +577,7 @@ struct kfd_process *kfd_lookup_process_by_pasid(unsigned int pasid)
 
 	hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
 		if (p->pasid == pasid) {
-			mutex_lock(&p->mutex);
+			kref_get(&p->ref);
 			break;
 		}
 	}
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* [PATCH 11/14] drm/amdkfd: Return NULL if kfd_lookup_process_by_pasid fails
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (9 preceding siblings ...)
  2017-11-27 23:29   ` [PATCH 10/14] drm/amdkfd: Use ref count to prevent kfd_process destruction Felix Kuehling
@ 2017-11-27 23:29   ` Felix Kuehling
       [not found]     ` <1511825396-24579-12-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 12/14] drm/amdkfd: Reduce nesting in kfd_create_process_device_data Felix Kuehling
                     ` (2 subsequent siblings)
  13 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Yong Zhao, Felix Kuehling

From: Yong Zhao <yong.zhao@amd.com>

If no matching process is found, return NULL instead of a pointer
to the last process in the kfd_processes_table.
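
A minimal userspace sketch of the same pattern, assuming a plain array
in place of kfd_processes_table and a simple counter in place of the
kref (struct proc_entry and lookup_by_pasid are hypothetical names used
only for illustration): the loop cursor is kept separate from the
return value, so a search that finds nothing can only return NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kfd_process; only the fields used here. */
struct proc_entry {
	unsigned int pasid;
	int refcount;
};

/*
 * Return the entry whose pasid matches, with its reference count
 * incremented, or NULL if no entry matches. The loop cursor is never
 * returned directly, so a failed search cannot hand a stale pointer
 * back to the caller.
 */
static struct proc_entry *lookup_by_pasid(struct proc_entry *table,
					  size_t count, unsigned int pasid)
{
	struct proc_entry *ret = NULL;
	size_t i;

	assert(table != NULL || count == 0);

	for (i = 0; i < count; i++) {
		struct proc_entry *p = &table[i];

		if (p->pasid == pasid) {
			p->refcount++;	/* stands in for kref_get() */
			ret = p;
			break;
		}
	}

	return ret;
}
```

Callers of such a lookup have to check the result for NULL before
using it and drop the reference once they are done.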

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 509f987..93f9019 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -570,7 +570,7 @@ bool kfd_has_process_device_data(struct kfd_process *p)
 /* This increments the process->ref counter. */
 struct kfd_process *kfd_lookup_process_by_pasid(unsigned int pasid)
 {
-	struct kfd_process *p;
+	struct kfd_process *p, *ret_p = NULL;
 	unsigned int temp;
 
 	int idx = srcu_read_lock(&kfd_processes_srcu);
@@ -578,13 +578,14 @@ struct kfd_process *kfd_lookup_process_by_pasid(unsigned int pasid)
 	hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
 		if (p->pasid == pasid) {
 			kref_get(&p->ref);
+			ret_p = p;
 			break;
 		}
 	}
 
 	srcu_read_unlock(&kfd_processes_srcu, idx);
 
-	return p;
+	return ret_p;
 }
 
 int kfd_reserved_mem_mmap(struct kfd_process *process,
-- 
2.7.4


* [PATCH 12/14] drm/amdkfd: Reduce nesting in kfd_create_process_device_data
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (10 preceding siblings ...)
  2017-11-27 23:29   ` [PATCH 11/14] drm/amdkfd: Return NULL if kfd_lookup_process_by_pasid fails Felix Kuehling
@ 2017-11-27 23:29   ` Felix Kuehling
       [not found]     ` <1511825396-24579-13-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 13/14] drm/amdkfd: Factor PDD destruction out of kfd_process_wq_release Felix Kuehling
  2017-11-27 23:29   ` [PATCH 14/14] drm/amdkfd: Simplify locking during process creation Felix Kuehling
  13 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 93f9019..88fc822 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -390,17 +390,18 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
 	struct kfd_process_device *pdd = NULL;
 
 	pdd = kzalloc(sizeof(*pdd), GFP_KERNEL);
-	if (pdd != NULL) {
-		pdd->dev = dev;
-		INIT_LIST_HEAD(&pdd->qpd.queues_list);
-		INIT_LIST_HEAD(&pdd->qpd.priv_queue_list);
-		pdd->qpd.dqm = dev->dqm;
-		pdd->qpd.pqm = &p->pqm;
-		pdd->process = p;
-		pdd->bound = PDD_UNBOUND;
-		pdd->already_dequeued = false;
-		list_add(&pdd->per_device_list, &p->per_device_data);
-	}
+	if (!pdd)
+		return NULL;
+
+	pdd->dev = dev;
+	INIT_LIST_HEAD(&pdd->qpd.queues_list);
+	INIT_LIST_HEAD(&pdd->qpd.priv_queue_list);
+	pdd->qpd.dqm = dev->dqm;
+	pdd->qpd.pqm = &p->pqm;
+	pdd->process = p;
+	pdd->bound = PDD_UNBOUND;
+	pdd->already_dequeued = false;
+	list_add(&pdd->per_device_list, &p->per_device_data);
 
 	return pdd;
 }
-- 
2.7.4


* [PATCH 13/14] drm/amdkfd: Factor PDD destruction out of kfd_process_wq_release
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (11 preceding siblings ...)
  2017-11-27 23:29   ` [PATCH 12/14] drm/amdkfd: Reduce nesting in kfd_create_process_device_data Felix Kuehling
@ 2017-11-27 23:29   ` Felix Kuehling
       [not found]     ` <1511825396-24579-14-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-11-27 23:29   ` [PATCH 14/14] drm/amdkfd: Simplify locking during process creation Felix Kuehling
  13 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 40 +++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 88fc822..096710c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -152,28 +152,15 @@ void kfd_unref_process(struct kfd_process *p)
 	kref_put(&p->ref, kfd_process_ref_release);
 }
 
-/* No process locking is needed in this function, because the process
- * is not findable any more. We must assume that no other thread is
- * using it any more, otherwise we couldn't safely free the process
- * structure in the end.
- */
-static void kfd_process_wq_release(struct work_struct *work)
+static void kfd_process_destroy_pdds(struct kfd_process *p)
 {
-	struct kfd_process *p = container_of(work, struct kfd_process,
-					     release_work);
 	struct kfd_process_device *pdd, *temp;
 
-	pr_debug("Releasing process (pasid %d) in workqueue\n",
-			p->pasid);
-
 	list_for_each_entry_safe(pdd, temp, &p->per_device_data,
-							per_device_list) {
-		pr_debug("Releasing pdd (topology id %d) for process (pasid %d) in workqueue\n",
+				 per_device_list) {
+		pr_debug("Releasing pdd (topology id %d) for process (pasid %d)\n",
 				pdd->dev->id, p->pasid);
 
-		if (pdd->bound == PDD_BOUND)
-			amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
-
 		list_del(&pdd->per_device_list);
 
 		if (pdd->qpd.cwsr_kaddr)
@@ -182,6 +169,27 @@ static void kfd_process_wq_release(struct work_struct *work)
 
 		kfree(pdd);
 	}
+}
+
+/* No process locking is needed in this function, because the process
+ * is not findable any more. We must assume that no other thread is
+ * using it any more, otherwise we couldn't safely free the process
+ * structure in the end.
+ */
+static void kfd_process_wq_release(struct work_struct *work)
+{
+	struct kfd_process *p = container_of(work, struct kfd_process,
+					     release_work);
+	struct kfd_process_device *pdd;
+
+	pr_debug("Releasing process (pasid %d) in workqueue\n", p->pasid);
+
+	list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
+		if (pdd->bound == PDD_BOUND)
+			amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
+	}
+
+	kfd_process_destroy_pdds(p);
 
 	kfd_event_free_process(p);
 
-- 
2.7.4


* [PATCH 14/14] drm/amdkfd: Simplify locking during process creation
       [not found] ` <1511825396-24579-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (12 preceding siblings ...)
  2017-11-27 23:29   ` [PATCH 13/14] drm/amdkfd: Factor PDD destruction out of kfd_process_wq_release Felix Kuehling
@ 2017-11-27 23:29   ` Felix Kuehling
       [not found]     ` <1511825396-24579-15-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  13 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-11-27 23:29 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Yong Zhao, Felix Kuehling

From: Yong Zhao <yong.zhao@amd.com>

Also fixes error handling if kfd_process_init_cwsr fails.
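
The error handling follows the usual goto-unwind style: each setup step
has a label that undoes it, and a failure jumps to the label for the
last step that succeeded. A minimal userspace sketch of that style,
assuming malloc/free stand in for the real setup and teardown calls
(struct ctx, create_ctx, destroy_ctx and init_cwsr_stub are
hypothetical names, not driver functions):

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical object with two resources acquired in order. */
struct ctx {
	void *queues;
	void *pdds;
};

/* Hypothetical last setup step that can be forced to fail. */
static int init_cwsr_stub(int fail)
{
	return fail ? -ENOMEM : 0;
}

static void destroy_ctx(struct ctx *c)
{
	free(c->pdds);
	free(c->queues);
	free(c);
}

/*
 * Build a ctx step by step. On failure, unwind only the steps that
 * already succeeded, in reverse order, and report the error code.
 */
static struct ctx *create_ctx(int fail_last_step, int *err_out)
{
	struct ctx *c;
	int err = -ENOMEM;

	assert(err_out != NULL);

	c = calloc(1, sizeof(*c));
	if (!c)
		goto err_alloc;

	c->queues = malloc(16);
	if (!c->queues)
		goto err_queues;

	c->pdds = malloc(16);
	if (!c->pdds)
		goto err_pdds;

	err = init_cwsr_stub(fail_last_step);
	if (err)
		goto err_init_cwsr;

	*err_out = 0;
	return c;

err_init_cwsr:
	free(c->pdds);
err_pdds:
	free(c->queues);
err_queues:
	free(c);
err_alloc:
	*err_out = err;
	return NULL;
}
```

Because the labels fall through, adding a new final step only needs one
new label placed above the existing ones.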

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 46 +++++++++++++++-----------------
 1 file changed, 21 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 096710c..a22fb071 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -50,7 +50,8 @@ static struct workqueue_struct *kfd_process_wq;
 
 static struct kfd_process *find_process(const struct task_struct *thread);
 static void kfd_process_ref_release(struct kref *ref);
-static struct kfd_process *create_process(const struct task_struct *thread);
+static struct kfd_process *create_process(const struct task_struct *thread,
+					struct file *filep);
 static int kfd_process_init_cwsr(struct kfd_process *p, struct file *filep);
 
 
@@ -80,9 +81,6 @@ struct kfd_process *kfd_create_process(struct file *filep)
 	if (thread->group_leader->mm != thread->mm)
 		return ERR_PTR(-EINVAL);
 
-	/* Take mmap_sem because we call __mmu_notifier_register inside */
-	down_write(&thread->mm->mmap_sem);
-
 	/*
 	 * take kfd processes mutex before starting of process creation
 	 * so there won't be a case where two threads of the same process
@@ -94,16 +92,11 @@ struct kfd_process *kfd_create_process(struct file *filep)
 	process = find_process(thread);
 	if (process)
 		pr_debug("Process already found\n");
-
-	if (!process)
-		process = create_process(thread);
+	else
+		process = create_process(thread, filep);
 
 	mutex_unlock(&kfd_processes_mutex);
 
-	up_write(&thread->mm->mmap_sem);
-
-	kfd_process_init_cwsr(process, filep);
-
 	return process;
 }
 
@@ -274,15 +267,12 @@ static const struct mmu_notifier_ops kfd_process_mmu_notifier_ops = {
 
 static int kfd_process_init_cwsr(struct kfd_process *p, struct file *filep)
 {
-	int err = 0;
 	unsigned long  offset;
-	struct kfd_process_device *temp, *pdd = NULL;
+	struct kfd_process_device *pdd = NULL;
 	struct kfd_dev *dev = NULL;
 	struct qcm_process_device *qpd = NULL;
 
-	mutex_lock(&p->mutex);
-	list_for_each_entry_safe(pdd, temp, &p->per_device_data,
-				per_device_list) {
+	list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
 		dev = pdd->dev;
 		qpd = &pdd->qpd;
 		if (!dev->cwsr_enabled || qpd->cwsr_kaddr)
@@ -293,12 +283,12 @@ static int kfd_process_init_cwsr(struct kfd_process *p, struct file *filep)
 			MAP_SHARED, offset);
 
 		if (IS_ERR_VALUE(qpd->tba_addr)) {
-			pr_err("Failure to set tba address. error -%d.\n",
-				(int)qpd->tba_addr);
-			err = qpd->tba_addr;
+			int err = qpd->tba_addr;
+
+			pr_err("Failure to set tba address. error %d.\n", err);
 			qpd->tba_addr = 0;
 			qpd->cwsr_kaddr = NULL;
-			goto out;
+			return err;
 		}
 
 		memcpy(qpd->cwsr_kaddr, dev->cwsr_isa, dev->cwsr_isa_size);
@@ -307,12 +297,12 @@ static int kfd_process_init_cwsr(struct kfd_process *p, struct file *filep)
 		pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n",
 			qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr);
 	}
-out:
-	mutex_unlock(&p->mutex);
-	return err;
+
+	return 0;
 }
 
-static struct kfd_process *create_process(const struct task_struct *thread)
+static struct kfd_process *create_process(const struct task_struct *thread,
+					struct file *filep)
 {
 	struct kfd_process *process;
 	int err = -ENOMEM;
@@ -337,7 +327,7 @@ static struct kfd_process *create_process(const struct task_struct *thread)
 
 	/* register notifier */
 	process->mmu_notifier.ops = &kfd_process_mmu_notifier_ops;
-	err = __mmu_notifier_register(&process->mmu_notifier, process->mm);
+	err = mmu_notifier_register(&process->mmu_notifier, process->mm);
 	if (err)
 		goto err_mmu_notifier;
 
@@ -361,8 +351,14 @@ static struct kfd_process *create_process(const struct task_struct *thread)
 	if (err != 0)
 		goto err_init_apertures;
 
+	err = kfd_process_init_cwsr(process, filep);
+	if (err)
+		goto err_init_cwsr;
+
 	return process;
 
+err_init_cwsr:
+	kfd_process_destroy_pdds(process);
 err_init_apertures:
 	pqm_uninit(&process->pqm);
 err_process_pqm_init:
-- 
2.7.4


* Re: [PATCH 10/14] drm/amdkfd: Use ref count to prevent kfd_process destruction
       [not found]     ` <1511825396-24579-11-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-11-28  9:52       ` Christian König
       [not found]         ` <7fb6a8a7-5616-95d6-c2c9-3b69a75a3613-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Christian König @ 2017-11-28  9:52 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w

On 28.11.2017 at 00:29, Felix Kuehling wrote:
> Use a reference counter instead of a lock to prevent process
> destruction while functions running out of process context are using
> the kfd_process structure. In many cases these functions don't need
> the structure to be locked. In the few cases that really do need the
> process lock, take it explicitly.
>
> This helps simplify lock dependencies between the process lock and
> other locks, particularly amdgpu and mm_struct locks. This will be
> important when amdgpu calls back to amdkfd for memory evictions.

Actually that is not only an optimization or cleanup, but a rather 
important bug fix.

Using a mutex as protection to prevent object deletion is illegal
because mutex_unlock() can access the mutex object even after it is
unlocked.

See also this LWN article: https://lwn.net/Articles/575460/

If you have other use cases like this in the KFD, they should be
fixed as well.
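
A minimal userspace sketch of the reference-counting pattern, assuming
C11 atomics stand in for the kernel's kref (struct obj, obj_create,
obj_get and obj_put are hypothetical names for illustration only). The
object is freed by whoever drops the last reference, and nothing
touches it after that point, which a mutex embedded in the object
cannot guarantee.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reference-counted object. */
struct obj {
	atomic_int ref;
	int payload;
};

/* Allocate an object holding one initial reference. */
static struct obj *obj_create(int payload)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (!o)
		return NULL;

	atomic_init(&o->ref, 1);
	o->payload = payload;
	return o;
}

/* Take an extra reference; the caller must already hold one. */
static void obj_get(struct obj *o)
{
	assert(atomic_load(&o->ref) > 0);
	atomic_fetch_add(&o->ref, 1);
}

/*
 * Drop a reference. Returns true if this was the last one and the
 * object has been freed; the caller must not touch it afterwards.
 */
static bool obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->ref, 1) == 1) {
		free(o);
		return true;
	}
	return false;
}
```

A lookup takes a reference with obj_get() while the object is still
findable, and the user calls obj_put() when done; any lock inside the
object is then only taken while a reference is held.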

> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>

Acked-by: Christian König <christian.koenig@amd.com>

Regards,
Christian.

> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_events.c  | 14 +++++++-------
>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  1 +
>   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 +++++++++++++---
>   3 files changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> index cb92d4b..93aae5c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> @@ -441,7 +441,7 @@ void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
>   	/*
>   	 * Because we are called from arbitrary context (workqueue) as opposed
>   	 * to process context, kfd_process could attempt to exit while we are
> -	 * running so the lookup function returns a locked process.
> +	 * running so the lookup function increments the process ref count.
>   	 */
>   	struct kfd_process *p = kfd_lookup_process_by_pasid(pasid);
>   
> @@ -493,7 +493,7 @@ void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
>   	}
>   
>   	mutex_unlock(&p->event_mutex);
> -	mutex_unlock(&p->mutex);
> +	kfd_unref_process(p);
>   }
>   
>   static struct kfd_event_waiter *alloc_event_waiters(uint32_t num_events)
> @@ -847,7 +847,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>   	/*
>   	 * Because we are called from arbitrary context (workqueue) as opposed
>   	 * to process context, kfd_process could attempt to exit while we are
> -	 * running so the lookup function returns a locked process.
> +	 * running so the lookup function increments the process ref count.
>   	 */
>   	struct kfd_process *p = kfd_lookup_process_by_pasid(pasid);
>   	struct mm_struct *mm;
> @@ -860,7 +860,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>   	 */
>   	mm = get_task_mm(p->lead_thread);
>   	if (!mm) {
> -		mutex_unlock(&p->mutex);
> +		kfd_unref_process(p);
>   		return; /* Process is exiting */
>   	}
>   
> @@ -903,7 +903,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>   			&memory_exception_data);
>   
>   	mutex_unlock(&p->event_mutex);
> -	mutex_unlock(&p->mutex);
> +	kfd_unref_process(p);
>   }
>   
>   void kfd_signal_hw_exception_event(unsigned int pasid)
> @@ -911,7 +911,7 @@ void kfd_signal_hw_exception_event(unsigned int pasid)
>   	/*
>   	 * Because we are called from arbitrary context (workqueue) as opposed
>   	 * to process context, kfd_process could attempt to exit while we are
> -	 * running so the lookup function returns a locked process.
> +	 * running so the lookup function increments the process ref count.
>   	 */
>   	struct kfd_process *p = kfd_lookup_process_by_pasid(pasid);
>   
> @@ -924,5 +924,5 @@ void kfd_signal_hw_exception_event(unsigned int pasid)
>   	lookup_events_by_type_and_signal(p, KFD_EVENT_TYPE_HW_EXCEPTION, NULL);
>   
>   	mutex_unlock(&p->event_mutex);
> -	mutex_unlock(&p->mutex);
> +	kfd_unref_process(p);
>   }
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index 248e4f5..0c96a6b 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -606,6 +606,7 @@ void kfd_process_destroy_wq(void);
>   struct kfd_process *kfd_create_process(struct file *filep);
>   struct kfd_process *kfd_get_process(const struct task_struct *);
>   struct kfd_process *kfd_lookup_process_by_pasid(unsigned int pasid);
> +void kfd_unref_process(struct kfd_process *p);
>   
>   struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
>   						struct kfd_process *p);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index e02e8a2..509f987 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -49,6 +49,7 @@ DEFINE_STATIC_SRCU(kfd_processes_srcu);
>   static struct workqueue_struct *kfd_process_wq;
>   
>   static struct kfd_process *find_process(const struct task_struct *thread);
> +static void kfd_process_ref_release(struct kref *ref);
>   static struct kfd_process *create_process(const struct task_struct *thread);
>   static int kfd_process_init_cwsr(struct kfd_process *p, struct file *filep);
>   
> @@ -146,6 +147,11 @@ static struct kfd_process *find_process(const struct task_struct *thread)
>   	return p;
>   }
>   
> +void kfd_unref_process(struct kfd_process *p)
> +{
> +	kref_put(&p->ref, kfd_process_ref_release);
> +}
> +
>   /* No process locking is needed in this function, because the process
>    * is not findable any more. We must assume that no other thread is
>    * using it any more, otherwise we couldn't safely free the process
> @@ -201,7 +207,7 @@ static void kfd_process_destroy_delayed(struct rcu_head *rcu)
>   {
>   	struct kfd_process *p = container_of(rcu, struct kfd_process, rcu);
>   
> -	kref_put(&p->ref, kfd_process_ref_release);
> +	kfd_unref_process(p);
>   }
>   
>   static void kfd_process_notifier_release(struct mmu_notifier *mn,
> @@ -525,6 +531,8 @@ void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int pasid)
>   
>   	mutex_unlock(kfd_get_dbgmgr_mutex());
>   
> +	mutex_lock(&p->mutex);
> +
>   	pdd = kfd_get_process_device_data(dev, p);
>   	if (pdd)
>   		/* For GPU relying on IOMMU, we need to dequeue here
> @@ -533,6 +541,8 @@ void kfd_process_iommu_unbind_callback(struct kfd_dev *dev, unsigned int pasid)
>   		kfd_process_dequeue_from_device(pdd);
>   
>   	mutex_unlock(&p->mutex);
> +
> +	kfd_unref_process(p);
>   }
>   
>   struct kfd_process_device *kfd_get_first_process_device_data(
> @@ -557,7 +567,7 @@ bool kfd_has_process_device_data(struct kfd_process *p)
>   	return !(list_empty(&p->per_device_data));
>   }
>   
> -/* This returns with process->mutex locked. */
> +/* This increments the process->ref counter. */
>   struct kfd_process *kfd_lookup_process_by_pasid(unsigned int pasid)
>   {
>   	struct kfd_process *p;
> @@ -567,7 +577,7 @@ struct kfd_process *kfd_lookup_process_by_pasid(unsigned int pasid)
>   
>   	hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
>   		if (p->pasid == pasid) {
> -			mutex_lock(&p->mutex);
> +			kref_get(&p->ref);
>   			break;
>   		}
>   	}


* Re: [PATCH 01/14] drm/amdgpu: fix get_max_engine_clock_in_mhz
       [not found]     ` <1511825396-24579-2-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-11-30 16:03       ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2017-11-30 16:03 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Use proper powerplay function. This fixes OpenCL initialization
> problems.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 5432af3..f7fa767 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -265,6 +265,9 @@ uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd)
>  {
>         struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
>
> -       /* The sclk is in quantas of 10kHz */
> -       return adev->pm.dpm.dyn_state.max_clock_voltage_on_ac.sclk / 100;
> +       /* the sclk is in quantas of 10kHz */
> +       if (amdgpu_sriov_vf(adev))
> +               return adev->clock.default_sclk / 100;
> +
> +       return amdgpu_dpm_get_sclk(adev, false) / 100;
>  }
> --
> 2.7.4
>

This patch is:
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>

* Re: [PATCH 02/14] drm/amdkfd: Add crash protection in debugger register path
       [not found]     ` <1511825396-24579-3-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-11-30 16:14       ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2017-11-30 16:14 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Philip Yang, amd-gfx list

On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Philip Yang <Philip.Yang@amd.com>
>
> After the debugger is registered, pqm_destroy_queue fails because is_debug
> is true. In that case the queue should not be removed from
> process_queue_list, since the queue count is not reduced.
>
> A test application can call debugger unregister without first registering
> the debugger. Add a NULL pointer check to avoid a crash in this case.
>
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c               | 2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 5 +++++
>  2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index cc61ec2..62c3d9c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -526,7 +526,7 @@ static int kfd_ioctl_dbg_unregister(struct file *filep,
>         long status;
>
>         dev = kfd_device_by_id(args->gpu_id);
> -       if (!dev)
> +       if (!dev || !dev->dbgmgr)
>                 return -EINVAL;
>
>         if (dev->device_info->asic_family == CHIP_CARRIZO) {
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> index eeb7726..2c98858 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> @@ -313,6 +313,10 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
>         if (pqn->q) {
>                 dqm = pqn->q->device->dqm;
>                 retval = dqm->ops.destroy_queue(dqm, &pdd->qpd, pqn->q);
> +               if (retval) {
> +                       pr_debug("Destroy queue failed, returned %d\n", retval);
> +                       goto err_destroy_queue;
> +               }
>                 uninit_queue(pqn->q);
>         }
>
> @@ -324,6 +328,7 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
>             list_empty(&pdd->qpd.priv_queue_list))
>                 dqm->ops.unregister_process(dqm, &pdd->qpd);
>
> +err_destroy_queue:
>         return retval;
>  }
>
> --
> 2.7.4
>
This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>

* Re: [PATCH 10/14] drm/amdkfd: Use ref count to prevent kfd_process destruction
       [not found]         ` <7fb6a8a7-5616-95d6-c2c9-3b69a75a3613-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-12-01 21:17           ` Felix Kuehling
       [not found]             ` <ab143937-aca3-9f3e-b6f4-4d354fde3c05-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-12-01 21:17 UTC (permalink / raw)
  To: christian.koenig-5C7GfCeVMHo,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w


On 2017-11-28 04:52 AM, Christian König wrote:
> On 28.11.2017 at 00:29, Felix Kuehling wrote:
>> Use a reference counter instead of a lock to prevent process
>> destruction while functions running out of process context are using
>> the kfd_process structure. In many cases these functions don't need
>> the structure to be locked. In the few cases that really do need the
>> process lock, take it explicitly.
>>
>> This helps simplify lock dependencies between the process lock and
>> other locks, particularly amdgpu and mm_struct locks. This will be
>> important when amdgpu calls back to amdkfd for memory evictions.
>
> Actually that is not only an optimization or cleanup, but a rather
> important bug fix.
>
> Using a mutex as protection to prevent object deletion is illegal
> because mutex_unlock() can access the mutex object even after it is
> unlocked.
>
> See also this LWN article: https://lwn.net/Articles/575460/
>
> If you have other use cases like this in the KFD, they should be
> fixed as well.

I'm not aware of other misuses of mutexes in KFD.

The article sounded like this was likely to get fixed in the mutex
implementation itself, rather than by hoping to track down all incorrect
uses of mutexes. Quote:
>
> As of this writing, no patches have been posted. It would be
> surprising, though, if a fix for this particular problem did not
> surface by the time the 3.14 merge window opens. Locking problems are
> hard enough to deal with when the locking primitives have simple and
> easily understood behavior; having subtle traps built into that layer
> of the kernel is a recipe for a lot of long-term pain.
>
I haven't found such a fix. That said, in the discussion under that
article some argued that the example would be broken even with a
spinlock. So maybe there is no such general fix.

Regards,
  Felix


>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>
> Acked-by: Christian König <christian.koenig@amd.com>
>
> Regards,
> Christian.
>
>> ---
>>   drivers/gpu/drm/amd/amdkfd/kfd_events.c  | 14 +++++++-------
>>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  1 +
>>   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 +++++++++++++---
>>   3 files changed, 21 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>> index cb92d4b..93aae5c 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>> @@ -441,7 +441,7 @@ void kfd_signal_event_interrupt(unsigned int
>> pasid, uint32_t partial_id,
>>       /*
>>        * Because we are called from arbitrary context (workqueue) as
>> opposed
>>        * to process context, kfd_process could attempt to exit while
>> we are
>> -     * running so the lookup function returns a locked process.
>> +     * running so the lookup function increments the process ref count.
>>        */
>>       struct kfd_process *p = kfd_lookup_process_by_pasid(pasid);
>>   @@ -493,7 +493,7 @@ void kfd_signal_event_interrupt(unsigned int
>> pasid, uint32_t partial_id,
>>       }
>>         mutex_unlock(&p->event_mutex);
>> -    mutex_unlock(&p->mutex);
>> +    kfd_unref_process(p);
>>   }
>>     static struct kfd_event_waiter *alloc_event_waiters(uint32_t
>> num_events)
>> @@ -847,7 +847,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev,
>> unsigned int pasid,
>>       /*
>>        * Because we are called from arbitrary context (workqueue) as
>> opposed
>>        * to process context, kfd_process could attempt to exit while
>> we are
>> -     * running so the lookup function returns a locked process.
>> +     * running so the lookup function increments the process ref count.
>>        */
>>       struct kfd_process *p = kfd_lookup_process_by_pasid(pasid);
>>       struct mm_struct *mm;
>> @@ -860,7 +860,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev,
>> unsigned int pasid,
>>        */
>>       mm = get_task_mm(p->lead_thread);
>>       if (!mm) {
>> -        mutex_unlock(&p->mutex);
>> +        kfd_unref_process(p);
>>           return; /* Process is exiting */
>>       }
>>   @@ -903,7 +903,7 @@ void kfd_signal_iommu_event(struct kfd_dev
>> *dev, unsigned int pasid,
>>               &memory_exception_data);
>>         mutex_unlock(&p->event_mutex);
>> -    mutex_unlock(&p->mutex);
>> +    kfd_unref_process(p);
>>   }
>>     void kfd_signal_hw_exception_event(unsigned int pasid)
>> @@ -911,7 +911,7 @@ void kfd_signal_hw_exception_event(unsigned int
>> pasid)
>>       /*
>>        * Because we are called from arbitrary context (workqueue) as
>> opposed
>>        * to process context, kfd_process could attempt to exit while
>> we are
>> -     * running so the lookup function returns a locked process.
>> +     * running so the lookup function increments the process ref count.
>>        */
>>       struct kfd_process *p = kfd_lookup_process_by_pasid(pasid);
>>   @@ -924,5 +924,5 @@ void kfd_signal_hw_exception_event(unsigned int
>> pasid)
>>       lookup_events_by_type_and_signal(p,
>> KFD_EVENT_TYPE_HW_EXCEPTION, NULL);
>>         mutex_unlock(&p->event_mutex);
>> -    mutex_unlock(&p->mutex);
>> +    kfd_unref_process(p);
>>   }
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> index 248e4f5..0c96a6b 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> @@ -606,6 +606,7 @@ void kfd_process_destroy_wq(void);
>>   struct kfd_process *kfd_create_process(struct file *filep);
>>   struct kfd_process *kfd_get_process(const struct task_struct *);
>>   struct kfd_process *kfd_lookup_process_by_pasid(unsigned int pasid);
>> +void kfd_unref_process(struct kfd_process *p);
>>     struct kfd_process_device *kfd_bind_process_to_device(struct
>> kfd_dev *dev,
>>                           struct kfd_process *p);
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> index e02e8a2..509f987 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> @@ -49,6 +49,7 @@ DEFINE_STATIC_SRCU(kfd_processes_srcu);
>>   static struct workqueue_struct *kfd_process_wq;
>>     static struct kfd_process *find_process(const struct task_struct
>> *thread);
>> +static void kfd_process_ref_release(struct kref *ref);
>>   static struct kfd_process *create_process(const struct task_struct
>> *thread);
>>   static int kfd_process_init_cwsr(struct kfd_process *p, struct file
>> *filep);
>>   @@ -146,6 +147,11 @@ static struct kfd_process *find_process(const
>> struct task_struct *thread)
>>       return p;
>>   }
>>   +void kfd_unref_process(struct kfd_process *p)
>> +{
>> +    kref_put(&p->ref, kfd_process_ref_release);
>> +}
>> +
>>   /* No process locking is needed in this function, because the process
>>    * is not findable any more. We must assume that no other thread is
>>    * using it any more, otherwise we couldn't safely free the process
>> @@ -201,7 +207,7 @@ static void kfd_process_destroy_delayed(struct
>> rcu_head *rcu)
>>   {
>>       struct kfd_process *p = container_of(rcu, struct kfd_process,
>> rcu);
>>   -    kref_put(&p->ref, kfd_process_ref_release);
>> +    kfd_unref_process(p);
>>   }
>>     static void kfd_process_notifier_release(struct mmu_notifier *mn,
>> @@ -525,6 +531,8 @@ void kfd_process_iommu_unbind_callback(struct
>> kfd_dev *dev, unsigned int pasid)
>>         mutex_unlock(kfd_get_dbgmgr_mutex());
>>   +    mutex_lock(&p->mutex);
>> +
>>       pdd = kfd_get_process_device_data(dev, p);
>>       if (pdd)
>>           /* For GPU relying on IOMMU, we need to dequeue here
>> @@ -533,6 +541,8 @@ void kfd_process_iommu_unbind_callback(struct
>> kfd_dev *dev, unsigned int pasid)
>>           kfd_process_dequeue_from_device(pdd);
>>         mutex_unlock(&p->mutex);
>> +
>> +    kfd_unref_process(p);
>>   }
>>     struct kfd_process_device *kfd_get_first_process_device_data(
>> @@ -557,7 +567,7 @@ bool kfd_has_process_device_data(struct
>> kfd_process *p)
>>       return !(list_empty(&p->per_device_data));
>>   }
>>   -/* This returns with process->mutex locked. */
>> +/* This increments the process->ref counter. */
>>   struct kfd_process *kfd_lookup_process_by_pasid(unsigned int pasid)
>>   {
>>       struct kfd_process *p;
>> @@ -567,7 +577,7 @@ struct kfd_process
>> *kfd_lookup_process_by_pasid(unsigned int pasid)
>>         hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
>>           if (p->pasid == pasid) {
>> -            mutex_lock(&p->mutex);
>> +            kref_get(&p->ref);
>>               break;
>>           }
>>       }
>
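
For readers following the locking change quoted above, here is a minimal userspace sketch of the pattern: the lookup takes a reference instead of returning a locked object, and each caller drops that reference when done. All names (example_process, example_lookup_by_pasid, example_unref) are hypothetical stand-ins, C11 atomics stand in for the kernel's kref, and the plain array scan stands in for the SRCU-protected hash walk, so this is only an illustration of the idea, not the driver code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kfd_process. */
struct example_process {
	unsigned int pasid;
	atomic_int ref;     /* stand-in for struct kref */
	bool released;      /* set where the release callback would run */
};

/* Drop one reference; the last put triggers the (simulated) release. */
void example_unref(struct example_process *p)
{
	if (atomic_fetch_sub(&p->ref, 1) == 1)
		p->released = true;
}

/*
 * Find a process by PASID and take a reference on it.
 * Returns NULL if no entry matches. The caller must balance a
 * successful lookup with example_unref(). Unlike the kernel code,
 * nothing here protects the table itself against concurrent changes.
 */
struct example_process *example_lookup_by_pasid(struct example_process *table,
						size_t n, unsigned int pasid)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (table[i].pasid == pasid) {
			atomic_fetch_add(&table[i].ref, 1);
			return &table[i];
		}
	}
	return NULL;
}
```

The point of the design is that a caller in workqueue context can keep the object alive without holding the process mutex, and then take only the narrower locks it needs, such as the event mutex in the hunks above.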

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 03/14] drm/amdkfd: map multiple processes to HW scheduler
       [not found]     ` <1511825396-24579-4-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-05  8:04       ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2017-12-05  8:04 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Yong Zhao, Jay Cornwall, amd-gfx list

On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Allow HWS to execute multiple processes on the hardware
> concurrently. The number of concurrent processes is limited by
> the number of VMIDs allocated to the HWS.
>
> A module parameter can be used to limit this further or turn
> it off altogether (mainly for debugging purposes).
>
> Signed-off-by: Yong Zhao <yong.zhao@amd.com>
> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c         | 11 +++++++++
>  drivers/gpu/drm/amd/amdkfd/kfd_module.c         |  5 +++++
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 30 +++++++++++++++++++++++--
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h           |  9 ++++++++
>  4 files changed, 53 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 4f05eac..a8fa33a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -238,6 +238,17 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>         kfd->vm_info.vmid_num_kfd = kfd->vm_info.last_vmid_kfd
>                         - kfd->vm_info.first_vmid_kfd + 1;
>
> +       /* Verify module parameters regarding mapped process number*/
> +       if ((hws_max_conc_proc < 0)
> +                       || (hws_max_conc_proc > kfd->vm_info.vmid_num_kfd)) {
> +               dev_err(kfd_device,
> +                       "hws_max_conc_proc %d must be between 0 and %d, use %d instead\n",
> +                       hws_max_conc_proc, kfd->vm_info.vmid_num_kfd,
> +                       kfd->vm_info.vmid_num_kfd);
> +               kfd->max_proc_per_quantum = kfd->vm_info.vmid_num_kfd;
> +       } else
> +               kfd->max_proc_per_quantum = hws_max_conc_proc;
> +
>         /* calculate max size of mqds needed for queues */
>         size = max_num_of_queues_per_device *
>                         kfd->device_info->mqd_size_aligned;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> index ee8adf6..4e060c8 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> @@ -50,6 +50,11 @@ module_param(sched_policy, int, 0444);
>  MODULE_PARM_DESC(sched_policy,
>         "Scheduling policy (0 = HWS (Default), 1 = HWS without over-subscription, 2 = Non-HWS (Used for debugging only)");
>
> +int hws_max_conc_proc = 8;
> +module_param(hws_max_conc_proc, int, 0444);
> +MODULE_PARM_DESC(hws_max_conc_proc,
> +       "Max # processes HWS can execute concurrently when sched_policy=0 (0 = no concurrency, #VMIDs for KFD = Maximum(default))");
> +
>  int cwsr_enable = 1;
>  module_param(cwsr_enable, int, 0444);
>  MODULE_PARM_DESC(cwsr_enable, "CWSR enable (0 = Off, 1 = On (Default))");
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> index 69c147a..0b7092e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> @@ -57,13 +57,24 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>  {
>         unsigned int process_count, queue_count;
>         unsigned int map_queue_size;
> +       unsigned int max_proc_per_quantum = 1;
> +       struct kfd_dev *dev = pm->dqm->dev;
>
>         process_count = pm->dqm->processes_count;
>         queue_count = pm->dqm->queue_count;
>
> -       /* check if there is over subscription*/
> +       /* check if there is over subscription
> +        * Note: the arbitration between the number of VMIDs and
> +        * hws_max_conc_proc has been done in
> +        * kgd2kfd_device_init().
> +        */
>         *over_subscription = false;
> -       if ((process_count > 1) || queue_count > get_queues_num(pm->dqm)) {
> +
> +       if (dev->max_proc_per_quantum > 1)
> +               max_proc_per_quantum = dev->max_proc_per_quantum;
> +
> +       if ((process_count > max_proc_per_quantum) ||
> +           queue_count > get_queues_num(pm->dqm)) {
>                 *over_subscription = true;
>                 pr_debug("Over subscribed runlist\n");
>         }
> @@ -116,10 +127,24 @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
>                         uint64_t ib, size_t ib_size_in_dwords, bool chain)
>  {
>         struct pm4_mes_runlist *packet;
> +       int concurrent_proc_cnt = 0;
> +       struct kfd_dev *kfd = pm->dqm->dev;
>
>         if (WARN_ON(!ib))
>                 return -EFAULT;
>
> +       /* Determine the number of processes to map together to HW:
> +        * it can not exceed the number of VMIDs available to the
> +        * scheduler, and it is determined by the smaller of the number
> +        * of processes in the runlist and kfd module parameter
> +        * hws_max_conc_proc.
> +        * Note: the arbitration between the number of VMIDs and
> +        * hws_max_conc_proc has been done in
> +        * kgd2kfd_device_init().
> +        */
> +       concurrent_proc_cnt = min(pm->dqm->processes_count,
> +                       kfd->max_proc_per_quantum);
> +
>         packet = (struct pm4_mes_runlist *)buffer;
>
>         memset(buffer, 0, sizeof(struct pm4_mes_runlist));
> @@ -130,6 +155,7 @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
>         packet->bitfields4.chain = chain ? 1 : 0;
>         packet->bitfields4.offload_polling = 0;
>         packet->bitfields4.valid = 1;
> +       packet->bitfields4.process_cnt = concurrent_proc_cnt;
>         packet->ordinal2 = lower_32_bits(ib);
>         packet->bitfields3.ib_base_hi = upper_32_bits(ib);
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index a668764..1edab21 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -88,6 +88,12 @@ extern int max_num_of_queues_per_device;
>  /* Kernel module parameter to specify the scheduling policy */
>  extern int sched_policy;
>
> +/*
> + * Kernel module parameter to specify the maximum process
> + * number per HW scheduler
> + */
> +extern int hws_max_conc_proc;
> +
>  extern int cwsr_enable;
>
>  /*
> @@ -214,6 +220,9 @@ struct kfd_dev {
>         /* Debug manager */
>         struct kfd_dbgmgr           *dbgmgr;
>
> +       /* Maximum process number mapped to HW scheduler */
> +       unsigned int max_proc_per_quantum;
> +
>         /* CWSR */
>         bool cwsr_enabled;
>         const void *cwsr_isa;
> --
> 2.7.4
>

This patch is:
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
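
As a rough illustration of the arbitration described in the patch, the sketch below mirrors the two steps in plain C: an out-of-range parameter falls back to the number of VMIDs, and the per-runlist process count is the smaller of the runlist's process count and that limit. The function names are hypothetical and the code is a standalone approximation, not the driver implementation.

```c
/*
 * Hypothetical stand-in for the check in kgd2kfd_device_init():
 * a negative value, or one larger than the number of VMIDs given
 * to KFD, is replaced by the number of VMIDs.
 */
unsigned int example_clamp_max_proc(int hws_max_conc_proc,
				    unsigned int vmid_num_kfd)
{
	if (hws_max_conc_proc < 0 ||
	    (unsigned int)hws_max_conc_proc > vmid_num_kfd)
		return vmid_num_kfd;
	return (unsigned int)hws_max_conc_proc;
}

/*
 * Hypothetical stand-in for the min() in pm_create_runlist():
 * never map more processes than are in the runlist, and never
 * more than the validated limit.
 */
unsigned int example_concurrent_proc_cnt(unsigned int processes_count,
					 unsigned int max_proc_per_quantum)
{
	return processes_count < max_proc_per_quantum ?
		processes_count : max_proc_per_quantum;
}
```

For example, with 8 VMIDs a parameter value of 16 would be replaced by 8, and a runlist holding 3 processes would then map 3 of them concurrently.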

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 04/14] drm/amdkfd: Fix oversubscription accounting
       [not found]     ` <1511825396-24579-5-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-05  8:10       ` Oded Gabbay
       [not found]         ` <CAFCwf11eM8pYmBOHdD1o4NVDj9nesJwp3Ny9dGukzstM5iP=Ag-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Oded Gabbay @ 2017-12-05  8:10 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Jay Cornwall, amd-gfx list

On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Don't count SDMA queues towards compute HQD oversubscription when
> deciding whether to create a chained runlist.
>
> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> index 0b7092e..c3230b9 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> @@ -55,13 +55,14 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>                                 unsigned int *rlib_size,
>                                 bool *over_subscription)
>  {
> -       unsigned int process_count, queue_count;
> +       unsigned int process_count, queue_count, compute_queue_count;
>         unsigned int map_queue_size;
>         unsigned int max_proc_per_quantum = 1;
>         struct kfd_dev *dev = pm->dqm->dev;
>
>         process_count = pm->dqm->processes_count;
>         queue_count = pm->dqm->queue_count;
> +       compute_queue_count = queue_count - pm->dqm->sdma_queue_count;
>
>         /* check if there is over subscription
>          * Note: the arbitration between the number of VMIDs and
> @@ -74,7 +75,7 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>                 max_proc_per_quantum = dev->max_proc_per_quantum;
>
>         if ((process_count > max_proc_per_quantum) ||
> -           queue_count > get_queues_num(pm->dqm)) {
> +           compute_queue_count > get_queues_num(pm->dqm)) {
>                 *over_subscription = true;
>                 pr_debug("Over subscribed runlist\n");
>         }
> --
> 2.7.4
>
Don't you need to update this line as well? (I'm less familiar with the
runlist, so just asking.)

*rlib_size = process_count * sizeof(struct pm4_mes_map_process) +
queue_count * map_queue_size;

Oded
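
To make the question concrete, here is a small standalone sketch of the two quantities involved. It assumes, purely for illustration, that the oversubscription check uses only compute queues while the runlist size still uses the total queue count, as in the line quoted above; whether that line should change is exactly what is being asked, and the sketch does not answer it. The struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the counters kept by the queue manager. */
struct example_counts {
	unsigned int process_count;
	unsigned int queue_count;       /* compute + SDMA queues */
	unsigned int sdma_queue_count;
};

/* Oversubscription check as in the patch: SDMA queues are excluded. */
bool example_over_subscribed(const struct example_counts *c,
			     unsigned int max_proc_per_quantum,
			     unsigned int compute_hqd_slots)
{
	unsigned int compute_queue_count =
		c->queue_count - c->sdma_queue_count;
	/* A limit of 0 or 1 both mean one process at a time. */
	unsigned int max_proc =
		max_proc_per_quantum > 1 ? max_proc_per_quantum : 1;

	return c->process_count > max_proc ||
	       compute_queue_count > compute_hqd_slots;
}

/* Runlist size as in the quoted line: every queue counts. */
size_t example_rlib_size(const struct example_counts *c,
			 size_t map_process_size, size_t map_queue_size)
{
	return c->process_count * map_process_size +
	       c->queue_count * map_queue_size;
}
```

With 2 processes, 26 queues of which 2 are SDMA, and 24 compute HQD slots, the check above reports no oversubscription, whereas a check on the total queue count would.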

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 05/14] drm/amdgpu: Fix definition of KFD_CIK_SDMA_QUEUE_OFFSET
       [not found]     ` <1511825396-24579-6-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-05  8:15       ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2017-12-05  8:15 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> This counts the queue offset in register index, not register address.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/cikd.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/cikd.h b/drivers/gpu/drm/amd/amdgpu/cikd.h
> index 6a9e38a..cee6e8a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/cikd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/cikd.h
> @@ -562,7 +562,7 @@
>  #define        PRIVATE_BASE(x) ((x) << 0) /* scratch */
>  #define        SHARED_BASE(x)  ((x) << 16) /* LDS */
>
> -#define KFD_CIK_SDMA_QUEUE_OFFSET      0x200
> +#define KFD_CIK_SDMA_QUEUE_OFFSET (mmSDMA0_RLC1_RB_CNTL - mmSDMA0_RLC0_RB_CNTL)
>
>  /* valid for both DEFAULT_MTYPE and APE1_MTYPE */
>  enum {
> --
> 2.7.4
>

This patch is:
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
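
The difference between a register index and a register byte address can be sketched as follows. The two register values below are illustrative placeholders, not taken from the register headers; they are chosen so that the per-queue stride is 0x80 dwords, which corresponds to 0x200 bytes. The index-to-address shift by 2 is the same conversion the DUMP_REG macro in patch 06 applies.

```c
#include <stdint.h>

/* Illustrative dword register indices (hypothetical values). */
#define EXAMPLE_RLC0_RB_CNTL 0x3480u
#define EXAMPLE_RLC1_RB_CNTL 0x3500u

/* Per-queue stride expressed in register indices (dwords). */
#define EXAMPLE_QUEUE_OFFSET (EXAMPLE_RLC1_RB_CNTL - EXAMPLE_RLC0_RB_CNTL)

/* A register index addresses 4-byte registers, so bytes = index << 2. */
uint32_t example_index_to_byte_addr(uint32_t index)
{
	return index << 2;
}

/* Index of a given register for queue N, using an index-based stride. */
uint32_t example_queue_reg_index(uint32_t base_index, uint32_t queue_id)
{
	return base_index + queue_id * EXAMPLE_QUEUE_OFFSET;
}
```

Under these assumed values, adding a byte-based stride of 0x200 to a dword index would skip four times too far, which is the kind of mismatch the commit message describes.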

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 06/14] drm/amdgpu: Add kfd2kgd APIs for dumping HQDs
       [not found]     ` <1511825396-24579-7-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-05  8:23       ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2017-12-05  8:23 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> This can be used by KFD for debugging features, such as dumping
> HQDs in debugfs.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 71 ++++++++++++++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 80 +++++++++++++++++++++++
>  drivers/gpu/drm/amd/include/kgd_kfd_interface.h   | 14 ++++
>  3 files changed, 165 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> index 14333af..12feba8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> @@ -105,8 +105,14 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
>                         uint32_t queue_id, uint32_t __user *wptr,
>                         uint32_t wptr_shift, uint32_t wptr_mask,
>                         struct mm_struct *mm);
> +static int kgd_hqd_dump(struct kgd_dev *kgd,
> +                       uint32_t pipe_id, uint32_t queue_id,
> +                       uint32_t (**dump)[2], uint32_t *n_regs);
>  static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
>                              uint32_t __user *wptr, struct mm_struct *mm);
> +static int kgd_hqd_sdma_dump(struct kgd_dev *kgd,
> +                            uint32_t engine_id, uint32_t queue_id,
> +                            uint32_t (**dump)[2], uint32_t *n_regs);
>  static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
>                                 uint32_t pipe_id, uint32_t queue_id);
>
> @@ -178,6 +184,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>         .init_interrupts = kgd_init_interrupts,
>         .hqd_load = kgd_hqd_load,
>         .hqd_sdma_load = kgd_hqd_sdma_load,
> +       .hqd_dump = kgd_hqd_dump,
> +       .hqd_sdma_dump = kgd_hqd_sdma_dump,
>         .hqd_is_occupied = kgd_hqd_is_occupied,
>         .hqd_sdma_is_occupied = kgd_hqd_sdma_is_occupied,
>         .hqd_destroy = kgd_hqd_destroy,
> @@ -376,6 +384,42 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
>         return 0;
>  }
>
> +static int kgd_hqd_dump(struct kgd_dev *kgd,
> +                       uint32_t pipe_id, uint32_t queue_id,
> +                       uint32_t (**dump)[2], uint32_t *n_regs)
> +{
> +       struct amdgpu_device *adev = get_amdgpu_device(kgd);
> +       uint32_t i = 0, reg;
> +#define HQD_N_REGS (35+4)
> +#define DUMP_REG(addr) do {                            \
> +               if (WARN_ON_ONCE(i >= HQD_N_REGS))      \
> +                       break;                          \
> +               (*dump)[i][0] = (addr) << 2;            \
> +               (*dump)[i++][1] = RREG32(addr);         \
> +       } while (0)
> +
> +       *dump = kmalloc(HQD_N_REGS*2*sizeof(uint32_t), GFP_KERNEL);
> +       if (*dump == NULL)
> +               return -ENOMEM;
> +
> +       acquire_queue(kgd, pipe_id, queue_id);
> +
> +       DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE0);
> +       DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE1);
> +       DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE2);
> +       DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE3);
> +
> +       for (reg = mmCP_MQD_BASE_ADDR; reg <= mmCP_MQD_CONTROL; reg++)
> +               DUMP_REG(reg);
> +
> +       release_queue(kgd);
> +
> +       WARN_ON_ONCE(i != HQD_N_REGS);
> +       *n_regs = i;
> +
> +       return 0;
> +}
> +
>  static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
>                              uint32_t __user *wptr, struct mm_struct *mm)
>  {
> @@ -440,6 +484,33 @@ static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
>         return 0;
>  }
>
> +static int kgd_hqd_sdma_dump(struct kgd_dev *kgd,
> +                            uint32_t engine_id, uint32_t queue_id,
> +                            uint32_t (**dump)[2], uint32_t *n_regs)
> +{
> +       struct amdgpu_device *adev = get_amdgpu_device(kgd);
> +       uint32_t sdma_offset = engine_id * SDMA1_REGISTER_OFFSET +
> +               queue_id * KFD_CIK_SDMA_QUEUE_OFFSET;
> +       uint32_t i = 0, reg;
> +#undef HQD_N_REGS
> +#define HQD_N_REGS (19+4)
> +
> +       *dump = kmalloc(HQD_N_REGS*2*sizeof(uint32_t), GFP_KERNEL);
> +       if (*dump == NULL)
> +               return -ENOMEM;
> +
> +       for (reg = mmSDMA0_RLC0_RB_CNTL; reg <= mmSDMA0_RLC0_DOORBELL; reg++)
> +               DUMP_REG(sdma_offset + reg);
> +       for (reg = mmSDMA0_RLC0_VIRTUAL_ADDR; reg <= mmSDMA0_RLC0_WATERMARK;
> +            reg++)
> +               DUMP_REG(sdma_offset + reg);
> +
> +       WARN_ON_ONCE(i != HQD_N_REGS);
> +       *n_regs = i;
> +
> +       return 0;
> +}
> +
>  static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
>                                 uint32_t pipe_id, uint32_t queue_id)
>  {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> index 1d989e4..b380495 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> @@ -64,8 +64,14 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
>                         uint32_t queue_id, uint32_t __user *wptr,
>                         uint32_t wptr_shift, uint32_t wptr_mask,
>                         struct mm_struct *mm);
> +static int kgd_hqd_dump(struct kgd_dev *kgd,
> +                       uint32_t pipe_id, uint32_t queue_id,
> +                       uint32_t (**dump)[2], uint32_t *n_regs);
>  static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
>                              uint32_t __user *wptr, struct mm_struct *mm);
> +static int kgd_hqd_sdma_dump(struct kgd_dev *kgd,
> +                            uint32_t engine_id, uint32_t queue_id,
> +                            uint32_t (**dump)[2], uint32_t *n_regs);
>  static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
>                 uint32_t pipe_id, uint32_t queue_id);
>  static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd);
> @@ -137,6 +143,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>         .init_interrupts = kgd_init_interrupts,
>         .hqd_load = kgd_hqd_load,
>         .hqd_sdma_load = kgd_hqd_sdma_load,
> +       .hqd_dump = kgd_hqd_dump,
> +       .hqd_sdma_dump = kgd_hqd_sdma_dump,
>         .hqd_is_occupied = kgd_hqd_is_occupied,
>         .hqd_sdma_is_occupied = kgd_hqd_sdma_is_occupied,
>         .hqd_destroy = kgd_hqd_destroy,
> @@ -365,6 +373,42 @@ static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
>         return 0;
>  }
>
> +static int kgd_hqd_dump(struct kgd_dev *kgd,
> +                       uint32_t pipe_id, uint32_t queue_id,
> +                       uint32_t (**dump)[2], uint32_t *n_regs)
> +{
> +       struct amdgpu_device *adev = get_amdgpu_device(kgd);
> +       uint32_t i = 0, reg;
> +#define HQD_N_REGS (54+4)
> +#define DUMP_REG(addr) do {                            \
> +               if (WARN_ON_ONCE(i >= HQD_N_REGS))      \
> +                       break;                          \
> +               (*dump)[i][0] = (addr) << 2;            \
> +               (*dump)[i++][1] = RREG32(addr);         \
> +       } while (0)
> +
> +       *dump = kmalloc(HQD_N_REGS*2*sizeof(uint32_t), GFP_KERNEL);
> +       if (*dump == NULL)
> +               return -ENOMEM;
> +
> +       acquire_queue(kgd, pipe_id, queue_id);
> +
> +       DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE0);
> +       DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE1);
> +       DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE2);
> +       DUMP_REG(mmCOMPUTE_STATIC_THREAD_MGMT_SE3);
> +
> +       for (reg = mmCP_MQD_BASE_ADDR; reg <= mmCP_HQD_EOP_DONES; reg++)
> +               DUMP_REG(reg);
> +
> +       release_queue(kgd);
> +
> +       WARN_ON_ONCE(i != HQD_N_REGS);
> +       *n_regs = i;
> +
> +       return 0;
> +}
> +
>  static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
>                              uint32_t __user *wptr, struct mm_struct *mm)
>  {
> @@ -428,6 +472,42 @@ static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd,
>         return 0;
>  }
>
> +static int kgd_hqd_sdma_dump(struct kgd_dev *kgd,
> +                            uint32_t engine_id, uint32_t queue_id,
> +                            uint32_t (**dump)[2], uint32_t *n_regs)
> +{
> +       struct amdgpu_device *adev = get_amdgpu_device(kgd);
> +       uint32_t sdma_offset = engine_id * SDMA1_REGISTER_OFFSET +
> +               queue_id * KFD_VI_SDMA_QUEUE_OFFSET;
> +       uint32_t i = 0, reg;
> +#undef HQD_N_REGS
> +#define HQD_N_REGS (19+4+2+3+7)
> +
> +       *dump = kmalloc(HQD_N_REGS*2*sizeof(uint32_t), GFP_KERNEL);
> +       if (*dump == NULL)
> +               return -ENOMEM;
> +
> +       for (reg = mmSDMA0_RLC0_RB_CNTL; reg <= mmSDMA0_RLC0_DOORBELL; reg++)
> +               DUMP_REG(sdma_offset + reg);
> +       for (reg = mmSDMA0_RLC0_VIRTUAL_ADDR; reg <= mmSDMA0_RLC0_WATERMARK;
> +            reg++)
> +               DUMP_REG(sdma_offset + reg);
> +       for (reg = mmSDMA0_RLC0_CSA_ADDR_LO; reg <= mmSDMA0_RLC0_CSA_ADDR_HI;
> +            reg++)
> +               DUMP_REG(sdma_offset + reg);
> +       for (reg = mmSDMA0_RLC0_IB_SUB_REMAIN; reg <= mmSDMA0_RLC0_DUMMY_REG;
> +            reg++)
> +               DUMP_REG(sdma_offset + reg);
> +       for (reg = mmSDMA0_RLC0_MIDCMD_DATA0; reg <= mmSDMA0_RLC0_MIDCMD_CNTL;
> +            reg++)
> +               DUMP_REG(sdma_offset + reg);
> +
> +       WARN_ON_ONCE(i != HQD_N_REGS);
> +       *n_regs = i;
> +
> +       return 0;
> +}
> +
>  static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
>                                 uint32_t pipe_id, uint32_t queue_id)
>  {
> diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> index c6d4e64..fe3079a 100644
> --- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> +++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> @@ -131,6 +131,12 @@ struct tile_config {
>   * @hqd_sdma_load: Loads the SDMA mqd structure to a H/W SDMA hqd slot.
>   * used only for no HWS mode.
>   *
> + * @hqd_dump: Dumps CPC HQD registers to an array of address-value pairs.
> + * Array is allocated with kmalloc, needs to be freed with kfree by caller.
> + *
> + * @hqd_sdma_dump: Dumps SDMA HQD registers to an array of address-value pairs.
> + * Array is allocated with kmalloc, needs to be freed with kfree by caller.
> + *
>   * @hqd_is_occupies: Checks if a hqd slot is occupied.
>   *
>   * @hqd_destroy: Destructs and preempts the queue assigned to that hqd slot.
> @@ -187,6 +193,14 @@ struct kfd2kgd_calls {
>         int (*hqd_sdma_load)(struct kgd_dev *kgd, void *mqd,
>                              uint32_t __user *wptr, struct mm_struct *mm);
>
> +       int (*hqd_dump)(struct kgd_dev *kgd,
> +                       uint32_t pipe_id, uint32_t queue_id,
> +                       uint32_t (**dump)[2], uint32_t *n_regs);
> +
> +       int (*hqd_sdma_dump)(struct kgd_dev *kgd,
> +                            uint32_t engine_id, uint32_t queue_id,
> +                            uint32_t (**dump)[2], uint32_t *n_regs);
> +
>         bool (*hqd_is_occupied)(struct kgd_dev *kgd, uint64_t queue_address,
>                                 uint32_t pipe_id, uint32_t queue_id);
>
> --
> 2.7.4
>

This patch is:
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
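
The calling convention of the new dump hooks (callee allocates an array of address-value pairs, caller frees it) can be sketched in userspace as below. This is an approximation under stated assumptions: malloc/free stand in for kmalloc/kfree, a small constant table (example_fake_regs) stands in for RREG32 register reads, and all names are hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define EXAMPLE_REG_COUNT 8u
#define EXAMPLE_N_REGS    4u

/* Hypothetical register file standing in for RREG32() reads. */
const uint32_t example_fake_regs[EXAMPLE_REG_COUNT] = {
	0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17
};

/*
 * Dump EXAMPLE_N_REGS consecutive registers starting at first_reg.
 * On success, *dump points to an allocated array of {byte address,
 * value} pairs and *n_regs holds its length. The caller owns the
 * array and must free() it.
 */
int example_hqd_dump(uint32_t first_reg,
		     uint32_t (**dump)[2], uint32_t *n_regs)
{
	uint32_t i = 0, reg;

	if (first_reg > EXAMPLE_REG_COUNT - EXAMPLE_N_REGS)
		return -EINVAL;

	*dump = malloc(EXAMPLE_N_REGS * sizeof(**dump));
	if (*dump == NULL)
		return -ENOMEM;

	for (reg = first_reg; reg < first_reg + EXAMPLE_N_REGS; reg++) {
		(*dump)[i][0] = reg << 2;               /* index -> byte address */
		(*dump)[i++][1] = example_fake_regs[reg];
	}

	*n_regs = i;
	return 0;
}
```

A caller would declare `uint32_t (*dump)[2]` and `uint32_t n_regs`, pass their addresses, iterate over `dump[0..n_regs-1]`, and then free the array, which is the ownership rule stated in the kgd_kfd_interface.h comment.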

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 07/14] drm/amdkfd: Add debugfs support to KFD
       [not found]     ` <1511825396-24579-8-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-05  8:27       ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2017-12-05  8:27 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Yong Zhao, amd-gfx list

On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> This commit adds several debugfs entries for kfd:
>
> kfd/hqds: dumps all HQDs on all GPUs for KFD-controlled compute and
>     SDMA RLC queues
>
> kfd/mqds: dumps all MQDs of all KFD processes on all GPUs
>
> kfd/rls: dumps HWS runlists on all GPUs
>
> Signed-off-by: Yong Zhao <yong.zhao@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/Makefile                |  2 +
>  drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c           | 75 ++++++++++++++++++++++
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 71 ++++++++++++++++++++
>  drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  3 +
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h       |  4 ++
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   | 27 ++++++++
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    | 25 ++++++++
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 24 +++++++
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h              | 21 ++++++
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c           | 29 +++++++++
>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 63 ++++++++++++++++++
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          | 55 ++++++++++++++++
>  12 files changed, 399 insertions(+)
>  create mode 100644 drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/Makefile b/drivers/gpu/drm/amd/amdkfd/Makefile
> index b400d56..5263e4d 100644
> --- a/drivers/gpu/drm/amd/amdkfd/Makefile
> +++ b/drivers/gpu/drm/amd/amdkfd/Makefile
> @@ -16,4 +16,6 @@ amdkfd-y      := kfd_module.o kfd_device.o kfd_chardev.o kfd_topology.o \
>                 kfd_interrupt.o kfd_events.o cik_event_interrupt.o \
>                 kfd_dbgdev.o kfd_dbgmgr.o
>
> +amdkfd-$(CONFIG_DEBUG_FS) += kfd_debugfs.o
> +
>  obj-$(CONFIG_HSA_AMD)  += amdkfd.o
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c b/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c
> new file mode 100644
> index 0000000..4bd6ebf
> --- /dev/null
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c
> @@ -0,0 +1,75 @@
> +/*
> + * Copyright 2016-2017 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + */
> +
> +#include <linux/debugfs.h>
> +#include "kfd_priv.h"
> +
> +static struct dentry *debugfs_root;
> +
> +static int kfd_debugfs_open(struct inode *inode, struct file *file)
> +{
> +       int (*show)(struct seq_file *, void *) = inode->i_private;
> +
> +       return single_open(file, show, NULL);
> +}
> +
> +static const struct file_operations kfd_debugfs_fops = {
> +       .owner = THIS_MODULE,
> +       .open = kfd_debugfs_open,
> +       .read = seq_read,
> +       .llseek = seq_lseek,
> +       .release = single_release,
> +};
> +
> +void kfd_debugfs_init(void)
> +{
> +       struct dentry *ent;
> +
> +       debugfs_root = debugfs_create_dir("kfd", NULL);
> +       if (!debugfs_root || debugfs_root == ERR_PTR(-ENODEV)) {
> +               pr_warn("Failed to create kfd debugfs dir\n");
> +               return;
> +       }
> +
> +       ent = debugfs_create_file("mqds", S_IFREG | 0444, debugfs_root,
> +                                 kfd_debugfs_mqds_by_process,
> +                                 &kfd_debugfs_fops);
> +       if (!ent)
> +               pr_warn("Failed to create mqds in kfd debugfs\n");
> +
> +       ent = debugfs_create_file("hqds", S_IFREG | 0444, debugfs_root,
> +                                 kfd_debugfs_hqds_by_device,
> +                                 &kfd_debugfs_fops);
> +       if (!ent)
> +               pr_warn("Failed to create hqds in kfd debugfs\n");
> +
> +       ent = debugfs_create_file("rls", S_IFREG | 0444, debugfs_root,
> +                                 kfd_debugfs_rls_by_device,
> +                                 &kfd_debugfs_fops);
> +       if (!ent)
> +               pr_warn("Failed to create rls in kfd debugfs\n");
> +}
> +
> +void kfd_debugfs_fini(void)
> +{
> +       debugfs_remove_recursive(debugfs_root);
> +}
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 8447810..eef8b98 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -1318,3 +1318,74 @@ void device_queue_manager_uninit(struct device_queue_manager *dqm)
>         dqm->ops.uninitialize(dqm);
>         kfree(dqm);
>  }
> +
> +#if defined(CONFIG_DEBUG_FS)
> +
> +static void seq_reg_dump(struct seq_file *m,
> +                        uint32_t (*dump)[2], uint32_t n_regs)
> +{
> +       uint32_t i, count;
> +
> +       for (i = 0, count = 0; i < n_regs; i++) {
> +               if (count == 0 ||
> +                   dump[i-1][0] + sizeof(uint32_t) != dump[i][0]) {
> +                       seq_printf(m, "%s    %08x: %08x",
> +                                  i ? "\n" : "",
> +                                  dump[i][0], dump[i][1]);
> +                       count = 7;
> +               } else {
> +                       seq_printf(m, " %08x", dump[i][1]);
> +                       count--;
> +               }
> +       }
> +
> +       seq_puts(m, "\n");
> +}
> +
> +int dqm_debugfs_hqds(struct seq_file *m, void *data)
> +{
> +       struct device_queue_manager *dqm = data;
> +       uint32_t (*dump)[2], n_regs;
> +       int pipe, queue;
> +       int r = 0;
> +
> +       for (pipe = 0; pipe < get_pipes_per_mec(dqm); pipe++) {
> +               int pipe_offset = pipe * get_queues_per_pipe(dqm);
> +
> +               for (queue = 0; queue < get_queues_per_pipe(dqm); queue++) {
> +                       if (!test_bit(pipe_offset + queue,
> +                                     dqm->dev->shared_resources.queue_bitmap))
> +                               continue;
> +
> +                       r = dqm->dev->kfd2kgd->hqd_dump(
> +                               dqm->dev->kgd, pipe, queue, &dump, &n_regs);
> +                       if (r)
> +                               break;
> +
> +                       seq_printf(m, "  CP Pipe %d, Queue %d\n",
> +                                 pipe, queue);
> +                       seq_reg_dump(m, dump, n_regs);
> +
> +                       kfree(dump);
> +               }
> +       }
> +
> +       for (pipe = 0; pipe < CIK_SDMA_ENGINE_NUM; pipe++) {
> +               for (queue = 0; queue < CIK_SDMA_QUEUES_PER_ENGINE; queue++) {
> +                       r = dqm->dev->kfd2kgd->hqd_sdma_dump(
> +                               dqm->dev->kgd, pipe, queue, &dump, &n_regs);
> +                       if (r)
> +                               break;
> +
> +                       seq_printf(m, "  SDMA Engine %d, RLC %d\n",
> +                                 pipe, queue);
> +                       seq_reg_dump(m, dump, n_regs);
> +
> +                       kfree(dump);
> +               }
> +       }
> +
> +       return r;
> +}
> +
> +#endif
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> index 4e060c8..f50e494 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> @@ -123,6 +123,8 @@ static int __init kfd_module_init(void)
>
>         kfd_process_create_wq();
>
> +       kfd_debugfs_init();
> +
>         amdkfd_init_completed = 1;
>
>         dev_info(kfd_device, "Initialized module\n");
> @@ -139,6 +141,7 @@ static void __exit kfd_module_exit(void)
>  {
>         amdkfd_init_completed = 0;
>
> +       kfd_debugfs_fini();
>         kfd_process_destroy_wq();
>         kfd_topology_shutdown();
>         kfd_chardev_exit();
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
> index 1f3a6ba..8972bcf 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
> @@ -85,6 +85,10 @@ struct mqd_manager {
>                                 uint64_t queue_address, uint32_t pipe_id,
>                                 uint32_t queue_id);
>
> +#if defined(CONFIG_DEBUG_FS)
> +       int     (*debugfs_show_mqd)(struct seq_file *m, void *data);
> +#endif
> +
>         struct mutex    mqd_mutex;
>         struct kfd_dev  *dev;
>  };
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> index 7aa57ab..f8ef4a0 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> @@ -365,6 +365,24 @@ static int update_mqd_hiq(struct mqd_manager *mm, void *mqd,
>         return 0;
>  }
>
> +#if defined(CONFIG_DEBUG_FS)
> +
> +static int debugfs_show_mqd(struct seq_file *m, void *data)
> +{
> +       seq_hex_dump(m, "    ", DUMP_PREFIX_OFFSET, 32, 4,
> +                    data, sizeof(struct cik_mqd), false);
> +       return 0;
> +}
> +
> +static int debugfs_show_mqd_sdma(struct seq_file *m, void *data)
> +{
> +       seq_hex_dump(m, "    ", DUMP_PREFIX_OFFSET, 32, 4,
> +                    data, sizeof(struct cik_sdma_rlc_registers), false);
> +       return 0;
> +}
> +
> +#endif
> +
>
>  struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
>                 struct kfd_dev *dev)
> @@ -389,6 +407,9 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
>                 mqd->update_mqd = update_mqd;
>                 mqd->destroy_mqd = destroy_mqd;
>                 mqd->is_occupied = is_occupied;
> +#if defined(CONFIG_DEBUG_FS)
> +               mqd->debugfs_show_mqd = debugfs_show_mqd;
> +#endif
>                 break;
>         case KFD_MQD_TYPE_HIQ:
>                 mqd->init_mqd = init_mqd_hiq;
> @@ -397,6 +418,9 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
>                 mqd->update_mqd = update_mqd_hiq;
>                 mqd->destroy_mqd = destroy_mqd;
>                 mqd->is_occupied = is_occupied;
> +#if defined(CONFIG_DEBUG_FS)
> +               mqd->debugfs_show_mqd = debugfs_show_mqd;
> +#endif
>                 break;
>         case KFD_MQD_TYPE_SDMA:
>                 mqd->init_mqd = init_mqd_sdma;
> @@ -405,6 +429,9 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
>                 mqd->update_mqd = update_mqd_sdma;
>                 mqd->destroy_mqd = destroy_mqd_sdma;
>                 mqd->is_occupied = is_occupied_sdma;
> +#if defined(CONFIG_DEBUG_FS)
> +               mqd->debugfs_show_mqd = debugfs_show_mqd_sdma;
> +#endif
>                 break;
>         default:
>                 kfree(mqd);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> index 00e1f1a..971aec0 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> @@ -358,7 +358,23 @@ static bool is_occupied_sdma(struct mqd_manager *mm, void *mqd,
>         return mm->dev->kfd2kgd->hqd_sdma_is_occupied(mm->dev->kgd, mqd);
>  }
>
> +#if defined(CONFIG_DEBUG_FS)
>
> +static int debugfs_show_mqd(struct seq_file *m, void *data)
> +{
> +       seq_hex_dump(m, "    ", DUMP_PREFIX_OFFSET, 32, 4,
> +                    data, sizeof(struct vi_mqd), false);
> +       return 0;
> +}
> +
> +static int debugfs_show_mqd_sdma(struct seq_file *m, void *data)
> +{
> +       seq_hex_dump(m, "    ", DUMP_PREFIX_OFFSET, 32, 4,
> +                    data, sizeof(struct vi_sdma_mqd), false);
> +       return 0;
> +}
> +
> +#endif
>
>  struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
>                 struct kfd_dev *dev)
> @@ -383,6 +399,9 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
>                 mqd->update_mqd = update_mqd;
>                 mqd->destroy_mqd = destroy_mqd;
>                 mqd->is_occupied = is_occupied;
> +#if defined(CONFIG_DEBUG_FS)
> +               mqd->debugfs_show_mqd = debugfs_show_mqd;
> +#endif
>                 break;
>         case KFD_MQD_TYPE_HIQ:
>                 mqd->init_mqd = init_mqd_hiq;
> @@ -391,6 +410,9 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
>                 mqd->update_mqd = update_mqd_hiq;
>                 mqd->destroy_mqd = destroy_mqd;
>                 mqd->is_occupied = is_occupied;
> +#if defined(CONFIG_DEBUG_FS)
> +               mqd->debugfs_show_mqd = debugfs_show_mqd;
> +#endif
>                 break;
>         case KFD_MQD_TYPE_SDMA:
>                 mqd->init_mqd = init_mqd_sdma;
> @@ -399,6 +421,9 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
>                 mqd->update_mqd = update_mqd_sdma;
>                 mqd->destroy_mqd = destroy_mqd_sdma;
>                 mqd->is_occupied = is_occupied_sdma;
> +#if defined(CONFIG_DEBUG_FS)
> +               mqd->debugfs_show_mqd = debugfs_show_mqd_sdma;
> +#endif
>                 break;
>         default:
>                 kfree(mqd);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> index c3230b9..0ecbd1f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> @@ -278,6 +278,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                 return retval;
>
>         *rl_size_bytes = alloc_size_bytes;
> +       pm->ib_size_bytes = alloc_size_bytes;
>
>         pr_debug("Building runlist ib process count: %d queues count %d\n",
>                 pm->dqm->processes_count, pm->dqm->queue_count);
> @@ -591,3 +592,26 @@ void pm_release_ib(struct packet_manager *pm)
>         }
>         mutex_unlock(&pm->lock);
>  }
> +
> +#if defined(CONFIG_DEBUG_FS)
> +
> +int pm_debugfs_runlist(struct seq_file *m, void *data)
> +{
> +       struct packet_manager *pm = data;
> +
> +       mutex_lock(&pm->lock);
> +
> +       if (!pm->allocated) {
> +               seq_puts(m, "  No active runlist\n");
> +               goto out;
> +       }
> +
> +       seq_hex_dump(m, "  ", DUMP_PREFIX_OFFSET, 32, 4,
> +                    pm->ib_buffer_obj->cpu_ptr, pm->ib_size_bytes, false);
> +
> +out:
> +       mutex_unlock(&pm->lock);
> +       return 0;
> +}
> +
> +#endif
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index 1edab21..dca493b 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -33,6 +33,7 @@
>  #include <linux/kfd_ioctl.h>
>  #include <linux/idr.h>
>  #include <linux/kfifo.h>
> +#include <linux/seq_file.h>
>  #include <kgd_kfd_interface.h>
>
>  #include "amd_shared.h"
> @@ -735,6 +736,7 @@ struct packet_manager {
>         struct mutex lock;
>         bool allocated;
>         struct kfd_mem_obj *ib_buffer_obj;
> +       unsigned int ib_size_bytes;
>  };
>
>  int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm);
> @@ -781,4 +783,23 @@ int kfd_event_destroy(struct kfd_process *p, uint32_t event_id);
>
>  int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p);
>
> +/* Debugfs */
> +#if defined(CONFIG_DEBUG_FS)
> +
> +void kfd_debugfs_init(void);
> +void kfd_debugfs_fini(void);
> +int kfd_debugfs_mqds_by_process(struct seq_file *m, void *data);
> +int pqm_debugfs_mqds(struct seq_file *m, void *data);
> +int kfd_debugfs_hqds_by_device(struct seq_file *m, void *data);
> +int dqm_debugfs_hqds(struct seq_file *m, void *data);
> +int kfd_debugfs_rls_by_device(struct seq_file *m, void *data);
> +int pm_debugfs_runlist(struct seq_file *m, void *data);
> +
> +#else
> +
> +static inline void kfd_debugfs_init(void) {}
> +static inline void kfd_debugfs_fini(void) {}
> +
> +#endif
> +
>  #endif
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 39f4c19..99c18ee 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -620,3 +620,32 @@ int kfd_reserved_mem_mmap(struct kfd_process *process,
>                                PFN_DOWN(__pa(qpd->cwsr_kaddr)),
>                                KFD_CWSR_TBA_TMA_SIZE, vma->vm_page_prot);
>  }
> +
> +#if defined(CONFIG_DEBUG_FS)
> +
> +int kfd_debugfs_mqds_by_process(struct seq_file *m, void *data)
> +{
> +       struct kfd_process *p;
> +       unsigned int temp;
> +       int r = 0;
> +
> +       int idx = srcu_read_lock(&kfd_processes_srcu);
> +
> +       hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
> +               seq_printf(m, "Process %d PASID %d:\n",
> +                          p->lead_thread->tgid, p->pasid);
> +
> +               mutex_lock(&p->mutex);
> +               r = pqm_debugfs_mqds(m, &p->pqm);
> +               mutex_unlock(&p->mutex);
> +
> +               if (r)
> +                       break;
> +       }
> +
> +       srcu_read_unlock(&kfd_processes_srcu, idx);
> +
> +       return r;
> +}
> +
> +#endif
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> index 2c98858..2573455 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> @@ -370,4 +370,67 @@ struct kernel_queue *pqm_get_kernel_queue(
>         return NULL;
>  }
>
> +#if defined(CONFIG_DEBUG_FS)
>
> +int pqm_debugfs_mqds(struct seq_file *m, void *data)
> +{
> +       struct process_queue_manager *pqm = data;
> +       struct process_queue_node *pqn;
> +       struct queue *q;
> +       enum KFD_MQD_TYPE mqd_type;
> +       struct mqd_manager *mqd_manager;
> +       int r = 0;
> +
> +       list_for_each_entry(pqn, &pqm->queues, process_queue_list) {
> +               if (pqn->q) {
> +                       q = pqn->q;
> +                       switch (q->properties.type) {
> +                       case KFD_QUEUE_TYPE_SDMA:
> +                               seq_printf(m, "  SDMA queue on device %x\n",
> +                                          q->device->id);
> +                               mqd_type = KFD_MQD_TYPE_SDMA;
> +                               break;
> +                       case KFD_QUEUE_TYPE_COMPUTE:
> +                               seq_printf(m, "  Compute queue on device %x\n",
> +                                          q->device->id);
> +                               mqd_type = KFD_MQD_TYPE_CP;
> +                               break;
> +                       default:
> +                               seq_printf(m,
> +                               "  Bad user queue type %d on device %x\n",
> +                                          q->properties.type, q->device->id);
> +                               continue;
> +                       }
> +                       mqd_manager = q->device->dqm->ops.get_mqd_manager(
> +                               q->device->dqm, mqd_type);
> +               } else if (pqn->kq) {
> +                       q = pqn->kq->queue;
> +                       mqd_manager = pqn->kq->mqd;
> +                       switch (q->properties.type) {
> +                       case KFD_QUEUE_TYPE_DIQ:
> +                               seq_printf(m, "  DIQ on device %x\n",
> +                                          pqn->kq->dev->id);
> +                               mqd_type = KFD_MQD_TYPE_HIQ;
> +                               break;
> +                       default:
> +                               seq_printf(m,
> +                               "  Bad kernel queue type %d on device %x\n",
> +                                          q->properties.type,
> +                                          pqn->kq->dev->id);
> +                               continue;
> +                       }
> +               } else {
> +                       seq_printf(m,
> +               "  Weird: Queue node with neither kernel nor user queue\n");
> +                       continue;
> +               }
> +
> +               r = mqd_manager->debugfs_show_mqd(m, q->mqd);
> +               if (r != 0)
> +                       break;
> +       }
> +
> +       return r;
> +}
> +
> +#endif
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 19ce590..9d03a56 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -32,6 +32,7 @@
>  #include "kfd_priv.h"
>  #include "kfd_crat.h"
>  #include "kfd_topology.h"
> +#include "kfd_device_queue_manager.h"
>
>  static struct list_head topology_device_list;
>  static int topology_crat_parsed;
> @@ -1233,3 +1234,57 @@ struct kfd_dev *kfd_topology_enum_kfd_devices(uint8_t idx)
>         return device;
>
>  }
> +
> +#if defined(CONFIG_DEBUG_FS)
> +
> +int kfd_debugfs_hqds_by_device(struct seq_file *m, void *data)
> +{
> +       struct kfd_topology_device *dev;
> +       unsigned int i = 0;
> +       int r = 0;
> +
> +       down_read(&topology_lock);
> +
> +       list_for_each_entry(dev, &topology_device_list, list) {
> +               if (!dev->gpu) {
> +                       i++;
> +                       continue;
> +               }
> +
> +               seq_printf(m, "Node %u, gpu_id %x:\n", i++, dev->gpu->id);
> +               r = dqm_debugfs_hqds(m, dev->gpu->dqm);
> +               if (r)
> +                       break;
> +       }
> +
> +       up_read(&topology_lock);
> +
> +       return r;
> +}
> +
> +int kfd_debugfs_rls_by_device(struct seq_file *m, void *data)
> +{
> +       struct kfd_topology_device *dev;
> +       unsigned int i = 0;
> +       int r = 0;
> +
> +       down_read(&topology_lock);
> +
> +       list_for_each_entry(dev, &topology_device_list, list) {
> +               if (!dev->gpu) {
> +                       i++;
> +                       continue;
> +               }
> +
> +               seq_printf(m, "Node %u, gpu_id %x:\n", i++, dev->gpu->id);
> +               r = pm_debugfs_runlist(m, &dev->gpu->dqm->packets);
> +               if (r)
> +                       break;
> +       }
> +
> +       up_read(&topology_lock);
> +
> +       return r;
> +}
> +
> +#endif
> --
> 2.7.4
>
This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: [PATCH 08/14] drm/amdkfd: Get reference to lead_thread task struct
       [not found]     ` <1511825396-24579-9-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-05  8:43       ` Oded Gabbay
       [not found]         ` <CAFCwf12ELNgXW7bL+zb2F3v1XWn914maeRP3e3M+3U15B521wg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Oded Gabbay @ 2017-12-05  8:43 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Increment the kfd_process.lead_thread's reference counter to make
> it safe to dereference. This is needed for getting a safe reference
> to the process' mm_struct.

I don't object to this patch, but I thought we don't dereference the
process' mm_struct...

From kfd_priv.h:

/*
* Opaque pointer to mm_struct. We don't hold a reference to
* it so it should never be dereferenced from here. This is
* only used for looking up processes by their mm.
*/
void *mm;


>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 99c18ee..660d8bc 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -24,6 +24,7 @@
>  #include <linux/log2.h>
>  #include <linux/sched.h>
>  #include <linux/sched/mm.h>
> +#include <linux/sched/task.h>
>  #include <linux/slab.h>
>  #include <linux/amd-iommu.h>
>  #include <linux/notifier.h>
> @@ -191,6 +192,8 @@ static void kfd_process_wq_release(struct work_struct *work)
>
>         mutex_destroy(&p->mutex);
>
> +       put_task_struct(p->lead_thread);
> +
>         kfree(p);
>
>         kfree(work);
> @@ -342,6 +345,7 @@ static struct kfd_process *create_process(const struct task_struct *thread)
>                         (uintptr_t)process->mm);
>
>         process->lead_thread = thread->group_leader;
> +       get_task_struct(process->lead_thread);
>
>         INIT_LIST_HEAD(&process->per_device_data);
>
> --
> 2.7.4
>

* Re: [PATCH 08/14] drm/amdkfd: Get reference to lead_thread task struct
       [not found]         ` <CAFCwf12ELNgXW7bL+zb2F3v1XWn914maeRP3e3M+3U15B521wg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-12-05 19:14           ` Felix Kuehling
       [not found]             ` <e23118b1-eb9b-6a2b-e937-09747b4a9aac-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-12-05 19:14 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list

On 2017-12-05 03:43 AM, Oded Gabbay wrote:
> On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>> Increment the kfd_process.lead_thread's reference counter to make
>> it safe to dereference. This is needed for getting a safe reference
>> to the process' mm_struct.
> I don't object to this patch, but I thought we don't dereference the
> process' mm_struct...

Correct. Instead we get the mm by calling get_task_mm(p->lead_thread).
But for that we need to properly hold a reference to the task. Otherwise
p->lead_thread may be pointing to a task that doesn't exist any more. I
missed that part when I first ported that change.
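
To make the lifetime argument concrete, here is a minimal userspace sketch of
the same pattern. It is not the kernel code: struct task, struct proc and the
helpers task_new/task_get/task_put/proc_init/proc_release are hypothetical
stand-ins for task_struct, kfd_process, get_task_struct() and
put_task_struct(), and the plain int counter ignores the atomicity the real
reference count needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-in for task_struct: only the reference count matters here. */
struct task {
	int usage;
	int tgid;
};

/* Stand-in for kfd_process. */
struct proc {
	struct task *lead_thread;
};

static struct task *task_new(int tgid)
{
	struct task *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->usage = 1;	/* reference owned by the creator */
	t->tgid = tgid;
	return t;
}

/* Plays the role of get_task_struct(). */
static void task_get(struct task *t)
{
	assert(t->usage > 0);
	t->usage++;
}

/* Plays the role of put_task_struct(); returns true if the object was freed. */
static bool task_put(struct task *t)
{
	assert(t->usage > 0);
	if (--t->usage > 0)
		return false;
	free(t);
	return true;
}

/* Like create_process(): store the pointer and pin what it points to. */
static void proc_init(struct proc *p, struct task *leader)
{
	p->lead_thread = leader;
	task_get(p->lead_thread);
}

/* Like kfd_process_wq_release(): drop the reference taken in proc_init(). */
static bool proc_release(struct proc *p)
{
	bool freed = task_put(p->lead_thread);

	p->lead_thread = NULL;
	return freed;
}
```

If the creator drops its own reference after proc_init(), the object stays
alive until proc_release() runs, so p->lead_thread can still be handed to a
lookup such as get_task_mm() in between. Without the task_get() in
proc_init(), the creator's task_put() would free the object and leave
p->lead_thread dangling.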

Regards,
  Felix

>
> From kfd_priv.h:
>
> /*
> * Opaque pointer to mm_struct. We don't hold a reference to
> * it so it should never be dereferenced from here. This is
> * only used for looking up processes by their mm.
> */
> void *mm;
>
>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> index 99c18ee..660d8bc 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> @@ -24,6 +24,7 @@
>>  #include <linux/log2.h>
>>  #include <linux/sched.h>
>>  #include <linux/sched/mm.h>
>> +#include <linux/sched/task.h>
>>  #include <linux/slab.h>
>>  #include <linux/amd-iommu.h>
>>  #include <linux/notifier.h>
>> @@ -191,6 +192,8 @@ static void kfd_process_wq_release(struct work_struct *work)
>>
>>         mutex_destroy(&p->mutex);
>>
>> +       put_task_struct(p->lead_thread);
>> +
>>         kfree(p);
>>
>>         kfree(work);
>> @@ -342,6 +345,7 @@ static struct kfd_process *create_process(const struct task_struct *thread)
>>                         (uintptr_t)process->mm);
>>
>>         process->lead_thread = thread->group_leader;
>> +       get_task_struct(process->lead_thread);
>>
>>         INIT_LIST_HEAD(&process->per_device_data);
>>
>> --
>> 2.7.4
>>


* Re: [PATCH 04/14] drm/amdkfd: Fix oversubscription accounting
       [not found]         ` <CAFCwf11eM8pYmBOHdD1o4NVDj9nesJwp3Ny9dGukzstM5iP=Ag-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-12-05 19:27           ` Felix Kuehling
       [not found]             ` <9f625c72-a44e-6560-1d2b-6d998ad0e2e2-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 35+ messages in thread
From: Felix Kuehling @ 2017-12-05 19:27 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: Jay Cornwall, amd-gfx list

On 2017-12-05 03:10 AM, Oded Gabbay wrote:
> On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>> Don't count SDMA queues towards compute HQD oversubscription when
>> deciding whether to create a chained runlist.
>>
>> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> index 0b7092e..c3230b9 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> @@ -55,13 +55,14 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>>                                 unsigned int *rlib_size,
>>                                 bool *over_subscription)
>>  {
>> -       unsigned int process_count, queue_count;
>> +       unsigned int process_count, queue_count, compute_queue_count;
>>         unsigned int map_queue_size;
>>         unsigned int max_proc_per_quantum = 1;
>>         struct kfd_dev *dev = pm->dqm->dev;
>>
>>         process_count = pm->dqm->processes_count;
>>         queue_count = pm->dqm->queue_count;
>> +       compute_queue_count = queue_count - pm->dqm->sdma_queue_count;
>>
>>         /* check if there is over subscription
>>          * Note: the arbitration between the number of VMIDs and
>> @@ -74,7 +75,7 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>>                 max_proc_per_quantum = dev->max_proc_per_quantum;
>>
>>         if ((process_count > max_proc_per_quantum) ||
>> -           queue_count > get_queues_num(pm->dqm)) {
>> +           compute_queue_count > get_queues_num(pm->dqm)) {
>>                 *over_subscription = true;
>>                 pr_debug("Over subscribed runlist\n");
>>         }
>> --
>> 2.7.4
>>
> Don't you need to update this line as well (I'm less familiar with the
> runlist so just asking) ?
>
> *rlib_size = process_count * sizeof(struct pm4_mes_map_process) +
> queue_count * map_queue_size;

No. This change doesn't directly affect the runlist size. It deals with
HW resource limitations and whether the HWS needs to handle compute
queue oversubscription. SDMA queues don't count against the limited
number of HQDs for compute queues. So we should not count them for
determining compute queue oversubscription. But the SDMA queues are
still part of the runlist IB, so rlib_size doesn't change.

rlib_size is affected indirectly, though, because the code just below this
adjusts the runlist size for the oversubscription case:

        /*
         * Increase the allocation size in case we need a chained run list
         * when over subscription
         */
        if (*over_subscription)
                *rlib_size += sizeof(struct pm4_mes_runlist);
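
The accounting described above can be sketched in userspace as follows. This
is only an illustration under stated assumptions: the packet sizes, the HQD
limit and the names struct rl_counts and calc_rlib_size are placeholders
invented for the example, not the driver's values or API.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder sizes and limits (assumptions for this sketch only). */
#define MAP_PROCESS_SIZE     60u /* stands in for sizeof(struct pm4_mes_map_process) */
#define MAP_QUEUE_SIZE       28u /* stands in for map_queue_size */
#define RUNLIST_PKT_SIZE     16u /* stands in for sizeof(struct pm4_mes_runlist) */
#define MAX_COMPUTE_HQDS     24u /* stands in for get_queues_num(pm->dqm) */
#define MAX_PROC_PER_QUANTUM  1u /* stands in for max_proc_per_quantum */

struct rl_counts {
	unsigned int processes;
	unsigned int queues;      /* all queues: compute plus SDMA */
	unsigned int sdma_queues;
};

static unsigned int calc_rlib_size(const struct rl_counts *c,
				   bool *over_subscription)
{
	unsigned int compute_queues;
	unsigned int size;

	assert(c->sdma_queues <= c->queues);

	/* Only compute queues compete for the limited compute HQDs. */
	compute_queues = c->queues - c->sdma_queues;

	*over_subscription = c->processes > MAX_PROC_PER_QUANTUM ||
			     compute_queues > MAX_COMPUTE_HQDS;

	/* SDMA queues are still mapped in the runlist IB, so the base
	 * size keeps using the total queue count. */
	size = c->processes * MAP_PROCESS_SIZE + c->queues * MAP_QUEUE_SIZE;

	/* Room for the chained runlist packet when oversubscribed. */
	if (*over_subscription)
		size += RUNLIST_PKT_SIZE;

	return size;
}
```

With these placeholder numbers, 26 queues of which 4 are SDMA leave 22 compute
queues, which fit into 24 HQDs. Comparing all 26 against the limit, as the
check did before this patch, would have flagged oversubscription and added the
chained-runlist packet to the size.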

Regards,
  Felix

>
> Oded


* Re: [PATCH 04/14] drm/amdkfd: Fix oversubscription accounting
       [not found]             ` <9f625c72-a44e-6560-1d2b-6d998ad0e2e2-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-10  9:00               ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2017-12-10  9:00 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Jay Cornwall, amd-gfx list

On Tue, Dec 5, 2017 at 9:27 PM, Felix Kuehling <felix.kuehling@amd.com> wrote:
> On 2017-12-05 03:10 AM, Oded Gabbay wrote:
>> On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>>> Don't count SDMA queues towards compute HQD oversubscription when
>>> deciding whether to create a chained runlist.
>>>
>>> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 5 +++--
>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>> index 0b7092e..c3230b9 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>> @@ -55,13 +55,14 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>>>                                 unsigned int *rlib_size,
>>>                                 bool *over_subscription)
>>>  {
>>> -       unsigned int process_count, queue_count;
>>> +       unsigned int process_count, queue_count, compute_queue_count;
>>>         unsigned int map_queue_size;
>>>         unsigned int max_proc_per_quantum = 1;
>>>         struct kfd_dev *dev = pm->dqm->dev;
>>>
>>>         process_count = pm->dqm->processes_count;
>>>         queue_count = pm->dqm->queue_count;
>>> +       compute_queue_count = queue_count - pm->dqm->sdma_queue_count;
>>>
>>>         /* check if there is over subscription
>>>          * Note: the arbitration between the number of VMIDs and
>>> @@ -74,7 +75,7 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>>>                 max_proc_per_quantum = dev->max_proc_per_quantum;
>>>
>>>         if ((process_count > max_proc_per_quantum) ||
>>> -           queue_count > get_queues_num(pm->dqm)) {
>>> +           compute_queue_count > get_queues_num(pm->dqm)) {
>>>                 *over_subscription = true;
>>>                 pr_debug("Over subscribed runlist\n");
>>>         }
>>> --
>>> 2.7.4
>>>
>> Don't you need to update this line as well (I'm less familiar with the
>> runlist so just asking) ?
>>
>> *rlib_size = process_count * sizeof(struct pm4_mes_map_process) +
>> queue_count * map_queue_size;
>
> No. This change doesn't directly affect the runlist size. It deals with
> HW resource limitations and whether the HWS needs to handle compute
> queue oversubscription. SDMA queues don't count against the limited
> number of HQDs for compute queues. So we should not count them for
> determining compute queue oversubscription. But the SDMA queues are
> still part of the runlist IB, so rlib_size doesn't change.
>
> rlib_size is affected indirectly, though, because the code just below this
> adjusts the runlist size for the oversubscription case:
>
>         /*
>          * Increase the allocation size in case we need a chained run list
>          * when over subscription
>          */
>         if (*over_subscription)
>                 *rlib_size += sizeof(struct pm4_mes_runlist);
>
> Regards,
>   Felix
>
>>
>> Oded
>
ok, thanks for the explanation.


This patch is:
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
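
Editor's sketch of the accounting discussed above, as a standalone
userspace program fragment rather than the driver code. The struct
dqm_counts type, the max_compute_hqds field (standing in for what
get_queues_num() returns) and the three packet sizes are assumptions
made up for illustration; the real sizes come from the pm4_mes_*
structures. It shows that SDMA queues are left out of the
oversubscription test but still contribute to the runlist size.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet sizes in bytes, for illustration only. */
#define MAP_PROCESS_SIZE 60u
#define MAP_QUEUE_SIZE   28u
#define RUNLIST_SIZE     16u

/* Hypothetical stand-in for the queue manager fields used here. */
struct dqm_counts {
	unsigned int processes_count;
	unsigned int queue_count;          /* compute + SDMA queues */
	unsigned int sdma_queue_count;
	unsigned int max_compute_hqds;     /* assumed get_queues_num() result */
	unsigned int max_proc_per_quantum;
};

static size_t calc_rlib_size(const struct dqm_counts *c,
			     bool *over_subscription)
{
	/* SDMA queues do not consume compute HQDs, so exclude them. */
	unsigned int compute_queue_count =
		c->queue_count - c->sdma_queue_count;
	size_t rlib_size;

	*over_subscription =
		c->processes_count > c->max_proc_per_quantum ||
		compute_queue_count > c->max_compute_hqds;

	/* Every queue, SDMA included, still gets a map-queue packet. */
	rlib_size = (size_t)c->processes_count * MAP_PROCESS_SIZE +
		    (size_t)c->queue_count * MAP_QUEUE_SIZE;

	/* Leave room for the chaining runlist packet when oversubscribed. */
	if (*over_subscription)
		rlib_size += RUNLIST_SIZE;

	return rlib_size;
}
```

With 24 compute HQDs, 26 queues of which 2 are SDMA fit without
oversubscription, while the old check on queue_count would have
flagged them.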
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 08/14] drm/amdkfd: Get reference to lead_thread task struct
       [not found]             ` <e23118b1-eb9b-6a2b-e937-09747b4a9aac-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-10  9:04               ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2017-12-10  9:04 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Tue, Dec 5, 2017 at 9:14 PM, Felix Kuehling <felix.kuehling@amd.com> wrote:
> On 2017-12-05 03:43 AM, Oded Gabbay wrote:
>> On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>>> Increment the kfd_process.lead_thread's reference counter to make
>>> it safe to dereference. This is needed for getting a safe reference
>>> to the process' mm_struct.
>> I don't object to this patch, but I thought we don't dereference the
>> process' mm_struct...
>
> Correct. Instead we get the mm by calling get_task_mm(p->lead_thread).
> But for that we need to properly hold a reference to the task. Otherwise
> p->lead_thread may be pointing to a task that doesn't exist any more. I
> missed that part when I first ported that change.
>
> Regards,
>   Felix
Yes, agreed (I missed that too).

This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>


>
>>
>> From kfd_priv.h:
>>
>> /*
>> * Opaque pointer to mm_struct. We don't hold a reference to
>> * it so it should never be dereferenced from here. This is
>> * only used for looking up processes by their mm.
>> */
>> void *mm;
>>
>>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 4 ++++
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> index 99c18ee..660d8bc 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> @@ -24,6 +24,7 @@
>>>  #include <linux/log2.h>
>>>  #include <linux/sched.h>
>>>  #include <linux/sched/mm.h>
>>> +#include <linux/sched/task.h>
>>>  #include <linux/slab.h>
>>>  #include <linux/amd-iommu.h>
>>>  #include <linux/notifier.h>
>>> @@ -191,6 +192,8 @@ static void kfd_process_wq_release(struct work_struct *work)
>>>
>>>         mutex_destroy(&p->mutex);
>>>
>>> +       put_task_struct(p->lead_thread);
>>> +
>>>         kfree(p);
>>>
>>>         kfree(work);
>>> @@ -342,6 +345,7 @@ static struct kfd_process *create_process(const struct task_struct *thread)
>>>                         (uintptr_t)process->mm);
>>>
>>>         process->lead_thread = thread->group_leader;
>>> +       get_task_struct(process->lead_thread);
>>>
>>>         INIT_LIST_HEAD(&process->per_device_data);
>>>
>>> --
>>> 2.7.4
>>>
>

* Re: [PATCH 10/14] drm/amdkfd: Use ref count to prevent kfd_process destruction
       [not found]             ` <ab143937-aca3-9f3e-b6f4-4d354fde3c05-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-10  9:59               ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2017-12-10  9:59 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Christian König, amd-gfx list

On Fri, Dec 1, 2017 at 11:17 PM, Felix Kuehling <felix.kuehling@amd.com> wrote:
>
> On 2017-11-28 04:52 AM, Christian König wrote:
>> Am 28.11.2017 um 00:29 schrieb Felix Kuehling:
>>> Use a reference counter instead of a lock to prevent process
>>> destruction while functions running out of process context are using
>>> the kfd_process structure. In many cases these functions don't need
>>> the structure to be locked. In the few cases that really do need the
>>> process lock, take it explicitly.
>>>
>>> This helps simplify lock dependencies between the process lock and
>>> other locks, particularly amdgpu and mm_struct locks. This will be
>>> important when amdgpu calls back to amdkfd for memory evictions.
>>
>> Actually that is not only an optimization or cleanup, but a rather
>> important bug fix.
>>
>> Using a mutex as protection to prevent object deletion is illegal
>> because mutex_unlock() can access the mutex object even after it is
>> unlocked.
>>
>> See this LWN article as well https://lwn.net/Articles/575460/.
>>
>> If you have other use cases like this in the KFD, they should be
>> fixed as well.
>
> I'm not aware of other misuses of mutexes in KFD.
>
> The article made it sound as if this was likely to be fixed in the mutex
> implementation itself, rather than by tracking down every incorrect use
> of mutexes. Quote:
>>
>> As of this writing, no patches have been posted. It would be
>> surprising, though, if a fix for this particular problem did not
>> surface by the time the 3.14 merge window opens. Locking problems are
>> hard enough to deal with when the locking primitives have simple and
>> easily understood behavior; having subtle traps built into that layer
>> of the kernel is a recipe for a lot of long-term pain.
>>
> I haven't found such a fix. That said, in the discussion under that
> article some argued that the example would be broken even with a
> spinlock. So maybe there is no such general fix.
>
> Regards,
>   Felix
>
>
>>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>
>> Acked-by: Christian König <christian.koenig@amd.com>
>>
>> Regards,
>> Christian.
>>
>>> ---
>>>   drivers/gpu/drm/amd/amdkfd/kfd_events.c  | 14 +++++++-------
>>>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  1 +
>>>   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 +++++++++++++---
>>>   3 files changed, 21 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> index cb92d4b..93aae5c 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>>> @@ -441,7 +441,7 @@ void kfd_signal_event_interrupt(unsigned int
>>> pasid, uint32_t partial_id,
>>>       /*
>>>        * Because we are called from arbitrary context (workqueue) as
>>> opposed
>>>        * to process context, kfd_process could attempt to exit while
>>> we are
>>> -     * running so the lookup function returns a locked process.
>>> +     * running so the lookup function increments the process ref count.
>>>        */
>>>       struct kfd_process *p = kfd_lookup_process_by_pasid(pasid);
>>>   @@ -493,7 +493,7 @@ void kfd_signal_event_interrupt(unsigned int
>>> pasid, uint32_t partial_id,
>>>       }
>>>         mutex_unlock(&p->event_mutex);
>>> -    mutex_unlock(&p->mutex);
>>> +    kfd_unref_process(p);
>>>   }
>>>     static struct kfd_event_waiter *alloc_event_waiters(uint32_t
>>> num_events)
>>> @@ -847,7 +847,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev,
>>> unsigned int pasid,
>>>       /*
>>>        * Because we are called from arbitrary context (workqueue) as
>>> opposed
>>>        * to process context, kfd_process could attempt to exit while
>>> we are
>>> -     * running so the lookup function returns a locked process.
>>> +     * running so the lookup function increments the process ref count.
>>>        */
>>>       struct kfd_process *p = kfd_lookup_process_by_pasid(pasid);
>>>       struct mm_struct *mm;
>>> @@ -860,7 +860,7 @@ void kfd_signal_iommu_event(struct kfd_dev *dev,
>>> unsigned int pasid,
>>>        */
>>>       mm = get_task_mm(p->lead_thread);
>>>       if (!mm) {
>>> -        mutex_unlock(&p->mutex);
>>> +        kfd_unref_process(p);
>>>           return; /* Process is exiting */
>>>       }
>>>   @@ -903,7 +903,7 @@ void kfd_signal_iommu_event(struct kfd_dev
>>> *dev, unsigned int pasid,
>>>               &memory_exception_data);
>>>         mutex_unlock(&p->event_mutex);
>>> -    mutex_unlock(&p->mutex);
>>> +    kfd_unref_process(p);
>>>   }
>>>     void kfd_signal_hw_exception_event(unsigned int pasid)
>>> @@ -911,7 +911,7 @@ void kfd_signal_hw_exception_event(unsigned int
>>> pasid)
>>>       /*
>>>        * Because we are called from arbitrary context (workqueue) as
>>> opposed
>>>        * to process context, kfd_process could attempt to exit while
>>> we are
>>> -     * running so the lookup function returns a locked process.
>>> +     * running so the lookup function increments the process ref count.
>>>        */
>>>       struct kfd_process *p = kfd_lookup_process_by_pasid(pasid);
>>>   @@ -924,5 +924,5 @@ void kfd_signal_hw_exception_event(unsigned int
>>> pasid)
>>>       lookup_events_by_type_and_signal(p,
>>> KFD_EVENT_TYPE_HW_EXCEPTION, NULL);
>>>         mutex_unlock(&p->event_mutex);
>>> -    mutex_unlock(&p->mutex);
>>> +    kfd_unref_process(p);
>>>   }
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> index 248e4f5..0c96a6b 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>>> @@ -606,6 +606,7 @@ void kfd_process_destroy_wq(void);
>>>   struct kfd_process *kfd_create_process(struct file *filep);
>>>   struct kfd_process *kfd_get_process(const struct task_struct *);
>>>   struct kfd_process *kfd_lookup_process_by_pasid(unsigned int pasid);
>>> +void kfd_unref_process(struct kfd_process *p);
>>>     struct kfd_process_device *kfd_bind_process_to_device(struct
>>> kfd_dev *dev,
>>>                           struct kfd_process *p);
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> index e02e8a2..509f987 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> @@ -49,6 +49,7 @@ DEFINE_STATIC_SRCU(kfd_processes_srcu);
>>>   static struct workqueue_struct *kfd_process_wq;
>>>     static struct kfd_process *find_process(const struct task_struct
>>> *thread);
>>> +static void kfd_process_ref_release(struct kref *ref);
>>>   static struct kfd_process *create_process(const struct task_struct
>>> *thread);
>>>   static int kfd_process_init_cwsr(struct kfd_process *p, struct file
>>> *filep);
>>>   @@ -146,6 +147,11 @@ static struct kfd_process *find_process(const
>>> struct task_struct *thread)
>>>       return p;
>>>   }
>>>   +void kfd_unref_process(struct kfd_process *p)
>>> +{
>>> +    kref_put(&p->ref, kfd_process_ref_release);
>>> +}
>>> +
>>>   /* No process locking is needed in this function, because the process
>>>    * is not findable any more. We must assume that no other thread is
>>>    * using it any more, otherwise we couldn't safely free the process
>>> @@ -201,7 +207,7 @@ static void kfd_process_destroy_delayed(struct
>>> rcu_head *rcu)
>>>   {
>>>       struct kfd_process *p = container_of(rcu, struct kfd_process,
>>> rcu);
>>>   -    kref_put(&p->ref, kfd_process_ref_release);
>>> +    kfd_unref_process(p);
>>>   }
>>>     static void kfd_process_notifier_release(struct mmu_notifier *mn,
>>> @@ -525,6 +531,8 @@ void kfd_process_iommu_unbind_callback(struct
>>> kfd_dev *dev, unsigned int pasid)
>>>         mutex_unlock(kfd_get_dbgmgr_mutex());
>>>   +    mutex_lock(&p->mutex);
>>> +
>>>       pdd = kfd_get_process_device_data(dev, p);
>>>       if (pdd)
>>>           /* For GPU relying on IOMMU, we need to dequeue here
>>> @@ -533,6 +541,8 @@ void kfd_process_iommu_unbind_callback(struct
>>> kfd_dev *dev, unsigned int pasid)
>>>           kfd_process_dequeue_from_device(pdd);
>>>         mutex_unlock(&p->mutex);
>>> +
>>> +    kfd_unref_process(p);
>>>   }
>>>     struct kfd_process_device *kfd_get_first_process_device_data(
>>> @@ -557,7 +567,7 @@ bool kfd_has_process_device_data(struct
>>> kfd_process *p)
>>>       return !(list_empty(&p->per_device_data));
>>>   }
>>>   -/* This returns with process->mutex locked. */
>>> +/* This increments the process->ref counter. */
>>>   struct kfd_process *kfd_lookup_process_by_pasid(unsigned int pasid)
>>>   {
>>>       struct kfd_process *p;
>>> @@ -567,7 +577,7 @@ struct kfd_process
>>> *kfd_lookup_process_by_pasid(unsigned int pasid)
>>>         hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
>>>           if (p->pasid == pasid) {
>>> -            mutex_lock(&p->mutex);
>>> +            kref_get(&p->ref);
>>>               break;
>>>           }
>>>       }
>>
>
This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
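
Editor's sketch of the lifetime rule this patch relies on, written as
userspace C11 rather than kernel code. The names demo_proc,
demo_proc_get and demo_proc_put are hypothetical; an atomic_int plays
the role of struct kref, and the released counter exists only so the
behaviour can be observed. The object is freed by whoever drops the
last reference, so an event handler that took a reference during
lookup can keep using the structure after the owner has let go.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kfd_process. */
struct demo_proc {
	atomic_int ref;      /* plays the role of struct kref */
	unsigned int pasid;
};

static int released;     /* demonstration only: counts frees */

static struct demo_proc *demo_proc_create(unsigned int pasid)
{
	struct demo_proc *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	atomic_init(&p->ref, 1);   /* creator holds the first reference */
	p->pasid = pasid;
	return p;
}

/* Analogous to kref_get(): caller must already know p is alive. */
static struct demo_proc *demo_proc_get(struct demo_proc *p)
{
	atomic_fetch_add(&p->ref, 1);
	return p;
}

/* Analogous to kfd_unref_process(): the last put frees the object. */
static void demo_proc_put(struct demo_proc *p)
{
	if (atomic_fetch_sub(&p->ref, 1) == 1) {
		released++;
		free(p);
	}
}
```

Unlike the mutex-based scheme, nothing touches the object after the
final put, which is the property Christian's comment is about.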

* Re: [PATCH 09/14] drm/amdkfd: Make kfd_process reference counted
       [not found]     ` <1511825396-24579-10-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-10 10:00       ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2017-12-10 10:00 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> This will be used to eliminate the use of the process lock for
> preventing concurrent process destruction. This will simplify lock
> dependencies between KFD and KGD.
>
> This also simplifies the process destruction in a few ways:
> * Don't allocate work struct dynamically
> * Remove unnecessary hack that increments mm reference counter
> * Remove unnecessary process locking during destruction
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  4 +++
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 58 ++++++++++++--------------------
>  2 files changed, 26 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index dca493b..248e4f5 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -34,6 +34,7 @@
>  #include <linux/idr.h>
>  #include <linux/kfifo.h>
>  #include <linux/seq_file.h>
> +#include <linux/kref.h>
>  #include <kgd_kfd_interface.h>
>
>  #include "amd_shared.h"
> @@ -537,6 +538,9 @@ struct kfd_process {
>          */
>         void *mm;
>
> +       struct kref ref;
> +       struct work_struct release_work;
> +
>         struct mutex mutex;
>
>         /*
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 660d8bc..e02e8a2 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -48,11 +48,6 @@ DEFINE_STATIC_SRCU(kfd_processes_srcu);
>
>  static struct workqueue_struct *kfd_process_wq;
>
> -struct kfd_process_release_work {
> -       struct work_struct kfd_work;
> -       struct kfd_process *p;
> -};
> -
>  static struct kfd_process *find_process(const struct task_struct *thread);
>  static struct kfd_process *create_process(const struct task_struct *thread);
>  static int kfd_process_init_cwsr(struct kfd_process *p, struct file *filep);
> @@ -151,21 +146,20 @@ static struct kfd_process *find_process(const struct task_struct *thread)
>         return p;
>  }
>
> +/* No process locking is needed in this function, because the process
> + * is not findable any more. We must assume that no other thread is
> + * using it any more, otherwise we couldn't safely free the process
> + * structure in the end.
> + */
>  static void kfd_process_wq_release(struct work_struct *work)
>  {
> -       struct kfd_process_release_work *my_work;
> +       struct kfd_process *p = container_of(work, struct kfd_process,
> +                                            release_work);
>         struct kfd_process_device *pdd, *temp;
> -       struct kfd_process *p;
> -
> -       my_work = (struct kfd_process_release_work *) work;
> -
> -       p = my_work->p;
>
>         pr_debug("Releasing process (pasid %d) in workqueue\n",
>                         p->pasid);
>
> -       mutex_lock(&p->mutex);
> -
>         list_for_each_entry_safe(pdd, temp, &p->per_device_data,
>                                                         per_device_list) {
>                 pr_debug("Releasing pdd (topology id %d) for process (pasid %d) in workqueue\n",
> @@ -188,33 +182,26 @@ static void kfd_process_wq_release(struct work_struct *work)
>         kfd_pasid_free(p->pasid);
>         kfd_free_process_doorbells(p);
>
> -       mutex_unlock(&p->mutex);
> -
>         mutex_destroy(&p->mutex);
>
>         put_task_struct(p->lead_thread);
>
>         kfree(p);
> -
> -       kfree(work);
>  }
>
> -static void kfd_process_destroy_delayed(struct rcu_head *rcu)
> +static void kfd_process_ref_release(struct kref *ref)
>  {
> -       struct kfd_process_release_work *work;
> -       struct kfd_process *p;
> -
> -       p = container_of(rcu, struct kfd_process, rcu);
> +       struct kfd_process *p = container_of(ref, struct kfd_process, ref);
>
> -       mmdrop(p->mm);
> +       INIT_WORK(&p->release_work, kfd_process_wq_release);
> +       queue_work(kfd_process_wq, &p->release_work);
> +}
>
> -       work = kmalloc(sizeof(struct kfd_process_release_work), GFP_ATOMIC);
> +static void kfd_process_destroy_delayed(struct rcu_head *rcu)
> +{
> +       struct kfd_process *p = container_of(rcu, struct kfd_process, rcu);
>
> -       if (work) {
> -               INIT_WORK((struct work_struct *) work, kfd_process_wq_release);
> -               work->p = p;
> -               queue_work(kfd_process_wq, (struct work_struct *) work);
> -       }
> +       kref_put(&p->ref, kfd_process_ref_release);
>  }
>
>  static void kfd_process_notifier_release(struct mmu_notifier *mn,
> @@ -258,15 +245,12 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
>         kfd_process_dequeue_from_all_devices(p);
>         pqm_uninit(&p->pqm);
>
> +       /* Indicate to other users that MM is no longer valid */
> +       p->mm = NULL;
> +
>         mutex_unlock(&p->mutex);
>
> -       /*
> -        * Because we drop mm_count inside kfd_process_destroy_delayed
> -        * and because the mmu_notifier_unregister function also drop
> -        * mm_count we need to take an extra count here.
> -        */
> -       mmgrab(p->mm);
> -       mmu_notifier_unregister_no_release(&p->mmu_notifier, p->mm);
> +       mmu_notifier_unregister_no_release(&p->mmu_notifier, mm);
>         mmu_notifier_call_srcu(&p->rcu, &kfd_process_destroy_delayed);
>  }
>
> @@ -331,6 +315,8 @@ static struct kfd_process *create_process(const struct task_struct *thread)
>         if (kfd_alloc_process_doorbells(process) < 0)
>                 goto err_alloc_doorbells;
>
> +       kref_init(&process->ref);
> +
>         mutex_init(&process->mutex);
>
>         process->mm = thread->mm;
> --
> 2.7.4
>
This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
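
Editor's sketch of the embedded work item pattern the patch switches
to, as userspace C. The demo_work and demo_process types and
demo_run_work() are hypothetical stand-ins for struct work_struct,
struct kfd_process and the workqueue; container_of is re-defined
locally in simplified form. Because the work item lives inside the
process structure, the release path needs no separate allocation that
could fail, and the handler recovers the owner from the member
pointer.

```c
#include <stddef.h>

/* Simplified userspace version of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct work_struct. */
struct demo_work {
	void (*fn)(struct demo_work *w);
};

/* Hypothetical stand-in for struct kfd_process. */
struct demo_process {
	unsigned int pasid;
	struct demo_work release_work;   /* embedded, not kmalloc'ed */
};

static unsigned int last_released_pasid;   /* demonstration only */

static void demo_process_release(struct demo_work *w)
{
	/* Recover the enclosing process from the embedded member. */
	struct demo_process *p =
		container_of(w, struct demo_process, release_work);

	last_released_pasid = p->pasid;
}

/* Stand-in for the workqueue invoking a queued work item. */
static void demo_run_work(struct demo_work *w)
{
	w->fn(w);
}
```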

* Re: [PATCH 11/14] drm/amdkfd: Return NULL if kfd_lookup_process_by_pasid fails
       [not found]     ` <1511825396-24579-12-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-10 10:12       ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2017-12-10 10:12 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Yong Zhao, amd-gfx list

On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Yong Zhao <yong.zhao@amd.com>
>
> If no matching process is found, return NULL instead of a pointer
> to the last process in the kfd_processes_table.
>
> Signed-off-by: Yong Zhao <yong.zhao@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 509f987..93f9019 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -570,7 +570,7 @@ bool kfd_has_process_device_data(struct kfd_process *p)
>  /* This increments the process->ref counter. */
>  struct kfd_process *kfd_lookup_process_by_pasid(unsigned int pasid)
>  {
> -       struct kfd_process *p;
> +       struct kfd_process *p, *ret_p = NULL;
>         unsigned int temp;
>
>         int idx = srcu_read_lock(&kfd_processes_srcu);
> @@ -578,13 +578,14 @@ struct kfd_process *kfd_lookup_process_by_pasid(unsigned int pasid)
>         hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
>                 if (p->pasid == pasid) {
>                         kref_get(&p->ref);
> +                       ret_p = p;
>                         break;
>                 }
>         }
>
>         srcu_read_unlock(&kfd_processes_srcu, idx);
>
> -       return p;
> +       return ret_p;
>  }
>
>  int kfd_reserved_mem_mmap(struct kfd_process *process,
> --
> 2.7.4
>
This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
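
Editor's sketch of the lookup fix, as userspace C over a plain array
instead of the RCU hash table. The demo_entry type and
demo_lookup_by_pasid() are hypothetical. The point is the same as in
the patch: the loop cursor is never returned; a separate result
pointer starts as NULL and is set only on a match, so a miss cannot
hand back whatever entry the cursor happened to end on.

```c
#include <stddef.h>

/* Hypothetical stand-in for an entry in kfd_processes_table. */
struct demo_entry {
	unsigned int pasid;
	int ref;   /* plain counter standing in for struct kref */
};

static struct demo_entry *demo_lookup_by_pasid(struct demo_entry *table,
					       size_t n,
					       unsigned int pasid)
{
	struct demo_entry *ret = NULL;   /* result, distinct from the cursor */
	size_t i;

	for (i = 0; i < n; i++) {
		struct demo_entry *p = &table[i];

		if (p->pasid == pasid) {
			p->ref++;    /* take a reference only on a match */
			ret = p;
			break;
		}
	}

	return ret;   /* NULL when nothing matched */
}
```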

* Re: [PATCH 12/14] drm/amdkfd: Reduce nesting in kfd_create_process_device_data
       [not found]     ` <1511825396-24579-13-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-10 10:13       ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2017-12-10 10:13 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 23 ++++++++++++-----------
>  1 file changed, 12 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 93f9019..88fc822 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -390,17 +390,18 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
>         struct kfd_process_device *pdd = NULL;
>
>         pdd = kzalloc(sizeof(*pdd), GFP_KERNEL);
> -       if (pdd != NULL) {
> -               pdd->dev = dev;
> -               INIT_LIST_HEAD(&pdd->qpd.queues_list);
> -               INIT_LIST_HEAD(&pdd->qpd.priv_queue_list);
> -               pdd->qpd.dqm = dev->dqm;
> -               pdd->qpd.pqm = &p->pqm;
> -               pdd->process = p;
> -               pdd->bound = PDD_UNBOUND;
> -               pdd->already_dequeued = false;
> -               list_add(&pdd->per_device_list, &p->per_device_data);
> -       }
> +       if (!pdd)
> +               return NULL;
> +
> +       pdd->dev = dev;
> +       INIT_LIST_HEAD(&pdd->qpd.queues_list);
> +       INIT_LIST_HEAD(&pdd->qpd.priv_queue_list);
> +       pdd->qpd.dqm = dev->dqm;
> +       pdd->qpd.pqm = &p->pqm;
> +       pdd->process = p;
> +       pdd->bound = PDD_UNBOUND;
> +       pdd->already_dequeued = false;
> +       list_add(&pdd->per_device_list, &p->per_device_data);
>
>         return pdd;
>  }
> --
> 2.7.4
>
This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>

* Re: [PATCH 13/14] drm/amdkfd: Factor PDD destruction out of kfd_process_wq_release
       [not found]     ` <1511825396-24579-14-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-10 10:15       ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2017-12-10 10:15 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 40 +++++++++++++++++++-------------
>  1 file changed, 24 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 88fc822..096710c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -152,28 +152,15 @@ void kfd_unref_process(struct kfd_process *p)
>         kref_put(&p->ref, kfd_process_ref_release);
>  }
>
> -/* No process locking is needed in this function, because the process
> - * is not findable any more. We must assume that no other thread is
> - * using it any more, otherwise we couldn't safely free the process
> - * structure in the end.
> - */
> -static void kfd_process_wq_release(struct work_struct *work)
> +static void kfd_process_destroy_pdds(struct kfd_process *p)
>  {
> -       struct kfd_process *p = container_of(work, struct kfd_process,
> -                                            release_work);
>         struct kfd_process_device *pdd, *temp;
>
> -       pr_debug("Releasing process (pasid %d) in workqueue\n",
> -                       p->pasid);
> -
>         list_for_each_entry_safe(pdd, temp, &p->per_device_data,
> -                                                       per_device_list) {
> -               pr_debug("Releasing pdd (topology id %d) for process (pasid %d) in workqueue\n",
> +                                per_device_list) {
> +               pr_debug("Releasing pdd (topology id %d) for process (pasid %d)\n",
>                                 pdd->dev->id, p->pasid);
>
> -               if (pdd->bound == PDD_BOUND)
> -                       amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
> -
>                 list_del(&pdd->per_device_list);
>
>                 if (pdd->qpd.cwsr_kaddr)
> @@ -182,6 +169,27 @@ static void kfd_process_wq_release(struct work_struct *work)
>
>                 kfree(pdd);
>         }
> +}
> +
> +/* No process locking is needed in this function, because the process
> + * is not findable any more. We must assume that no other thread is
> + * using it any more, otherwise we couldn't safely free the process
> + * structure in the end.
> + */
> +static void kfd_process_wq_release(struct work_struct *work)
> +{
> +       struct kfd_process *p = container_of(work, struct kfd_process,
> +                                            release_work);
> +       struct kfd_process_device *pdd;
> +
> +       pr_debug("Releasing process (pasid %d) in workqueue\n", p->pasid);
> +
> +       list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
> +               if (pdd->bound == PDD_BOUND)
> +                       amd_iommu_unbind_pasid(pdd->dev->pdev, p->pasid);
> +       }
> +
> +       kfd_process_destroy_pdds(p);
>
>         kfd_event_free_process(p);
>
> --
> 2.7.4
>
This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>

* Re: [PATCH 14/14] drm/amdkfd: Simplify locking during process creation
       [not found]     ` <1511825396-24579-15-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-12-10 10:22       ` Oded Gabbay
  0 siblings, 0 replies; 35+ messages in thread
From: Oded Gabbay @ 2017-12-10 10:22 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Yong Zhao, amd-gfx list

On Tue, Nov 28, 2017 at 1:29 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Yong Zhao <yong.zhao@amd.com>
>
> Also fixes error handling if kfd_process_init_cwsr fails.
>
> Signed-off-by: Yong Zhao <yong.zhao@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 46 +++++++++++++++-----------------
>  1 file changed, 21 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 096710c..a22fb071 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -50,7 +50,8 @@ static struct workqueue_struct *kfd_process_wq;
>
>  static struct kfd_process *find_process(const struct task_struct *thread);
>  static void kfd_process_ref_release(struct kref *ref);
> -static struct kfd_process *create_process(const struct task_struct *thread);
> +static struct kfd_process *create_process(const struct task_struct *thread,
> +                                       struct file *filep);
>  static int kfd_process_init_cwsr(struct kfd_process *p, struct file *filep);
>
>
> @@ -80,9 +81,6 @@ struct kfd_process *kfd_create_process(struct file *filep)
>         if (thread->group_leader->mm != thread->mm)
>                 return ERR_PTR(-EINVAL);
>
> -       /* Take mmap_sem because we call __mmu_notifier_register inside */
> -       down_write(&thread->mm->mmap_sem);
> -
>         /*
>          * take kfd processes mutex before starting of process creation
>          * so there won't be a case where two threads of the same process
> @@ -94,16 +92,11 @@ struct kfd_process *kfd_create_process(struct file *filep)
>         process = find_process(thread);
>         if (process)
>                 pr_debug("Process already found\n");
> -
> -       if (!process)
> -               process = create_process(thread);
> +       else
> +               process = create_process(thread, filep);
>
>         mutex_unlock(&kfd_processes_mutex);
>
> -       up_write(&thread->mm->mmap_sem);
> -
> -       kfd_process_init_cwsr(process, filep);
> -
>         return process;
>  }
>
> @@ -274,15 +267,12 @@ static const struct mmu_notifier_ops kfd_process_mmu_notifier_ops = {
>
>  static int kfd_process_init_cwsr(struct kfd_process *p, struct file *filep)
>  {
> -       int err = 0;
>         unsigned long  offset;
> -       struct kfd_process_device *temp, *pdd = NULL;
> +       struct kfd_process_device *pdd = NULL;
>         struct kfd_dev *dev = NULL;
>         struct qcm_process_device *qpd = NULL;
>
> -       mutex_lock(&p->mutex);
> -       list_for_each_entry_safe(pdd, temp, &p->per_device_data,
> -                               per_device_list) {
> +       list_for_each_entry(pdd, &p->per_device_data, per_device_list) {
>                 dev = pdd->dev;
>                 qpd = &pdd->qpd;
>                 if (!dev->cwsr_enabled || qpd->cwsr_kaddr)
> @@ -293,12 +283,12 @@ static int kfd_process_init_cwsr(struct kfd_process *p, struct file *filep)
>                         MAP_SHARED, offset);
>
>                 if (IS_ERR_VALUE(qpd->tba_addr)) {
> -                       pr_err("Failure to set tba address. error -%d.\n",
> -                               (int)qpd->tba_addr);
> -                       err = qpd->tba_addr;
> +                       int err = qpd->tba_addr;
> +
> +                       pr_err("Failure to set tba address. error %d.\n", err);
>                         qpd->tba_addr = 0;
>                         qpd->cwsr_kaddr = NULL;
> -                       goto out;
> +                       return err;
>                 }
>
>                 memcpy(qpd->cwsr_kaddr, dev->cwsr_isa, dev->cwsr_isa_size);
> @@ -307,12 +297,12 @@ static int kfd_process_init_cwsr(struct kfd_process *p, struct file *filep)
>                 pr_debug("set tba :0x%llx, tma:0x%llx, cwsr_kaddr:%p for pqm.\n",
>                         qpd->tba_addr, qpd->tma_addr, qpd->cwsr_kaddr);
>         }
> -out:
> -       mutex_unlock(&p->mutex);
> -       return err;
> +
> +       return 0;
>  }
>
> -static struct kfd_process *create_process(const struct task_struct *thread)
> +static struct kfd_process *create_process(const struct task_struct *thread,
> +                                       struct file *filep)
>  {
>         struct kfd_process *process;
>         int err = -ENOMEM;
> @@ -337,7 +327,7 @@ static struct kfd_process *create_process(const struct task_struct *thread)
>
>         /* register notifier */
>         process->mmu_notifier.ops = &kfd_process_mmu_notifier_ops;
> -       err = __mmu_notifier_register(&process->mmu_notifier, process->mm);
> +       err = mmu_notifier_register(&process->mmu_notifier, process->mm);
>         if (err)
>                 goto err_mmu_notifier;
>
> @@ -361,8 +351,14 @@ static struct kfd_process *create_process(const struct task_struct *thread)
>         if (err != 0)
>                 goto err_init_apertures;
>
> +       err = kfd_process_init_cwsr(process, filep);
> +       if (err)
> +               goto err_init_cwsr;
> +
>         return process;
>
> +err_init_cwsr:
> +       kfd_process_destroy_pdds(process);
>  err_init_apertures:
>         pqm_uninit(&process->pqm);
>  err_process_pqm_init:
> --
> 2.7.4
>
This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>


end of thread, other threads:[~2017-12-10 10:22 UTC | newest]

Thread overview: 35+ messages
2017-11-27 23:29 [PATCH 00/14] KFD upstreaming 20171127 Felix Kuehling
2017-11-27 23:29   ` [PATCH 01/14] drm/amdgpu: fix get_max_engine_clock_in_mhz Felix Kuehling
2017-11-30 16:03       ` Oded Gabbay
2017-11-27 23:29   ` [PATCH 02/14] drm/amdkfd: Add crash protection in debugger register path Felix Kuehling
2017-11-30 16:14       ` Oded Gabbay
2017-11-27 23:29   ` [PATCH 03/14] drm/amdkfd: map multiple processes to HW scheduler Felix Kuehling
2017-12-05  8:04       ` Oded Gabbay
2017-11-27 23:29   ` [PATCH 04/14] drm/amdkfd: Fix oversubscription accounting Felix Kuehling
2017-12-05  8:10       ` Oded Gabbay
2017-12-05 19:27           ` Felix Kuehling
2017-12-10  9:00               ` Oded Gabbay
2017-11-27 23:29   ` [PATCH 05/14] drm/amdgpu: Fix definition of KFD_CIK_SDMA_QUEUE_OFFSET Felix Kuehling
2017-12-05  8:15       ` Oded Gabbay
2017-11-27 23:29   ` [PATCH 06/14] drm/amdgpu: Add kfd2kgd APIs for dumping HQDs Felix Kuehling
2017-12-05  8:23       ` Oded Gabbay
2017-11-27 23:29   ` [PATCH 07/14] drm/amdkfd: Add debugfs support to KFD Felix Kuehling
2017-12-05  8:27       ` Oded Gabbay
2017-11-27 23:29   ` [PATCH 08/14] drm/amdkfd: Get reference to lead_thread task struct Felix Kuehling
2017-12-05  8:43       ` Oded Gabbay
2017-12-05 19:14           ` Felix Kuehling
2017-12-10  9:04               ` Oded Gabbay
2017-11-27 23:29   ` [PATCH 09/14] drm/amdkfd: Make kfd_process reference counted Felix Kuehling
2017-12-10 10:00       ` Oded Gabbay
2017-11-27 23:29   ` [PATCH 10/14] drm/amdkfd: Use ref count to prevent kfd_process destruction Felix Kuehling
2017-11-28  9:52       ` Christian König
2017-12-01 21:17           ` Felix Kuehling
2017-12-10  9:59               ` Oded Gabbay
2017-11-27 23:29   ` [PATCH 11/14] drm/amdkfd: Return NULL if kfd_lookup_process_by_pasid fails Felix Kuehling
2017-12-10 10:12       ` Oded Gabbay
2017-11-27 23:29   ` [PATCH 12/14] drm/amdkfd: Reduce nesting in kfd_create_process_device_data Felix Kuehling
2017-12-10 10:13       ` Oded Gabbay
2017-11-27 23:29   ` [PATCH 13/14] drm/amdkfd: Factor PDD destruction out of kfd_process_wq_release Felix Kuehling
2017-12-10 10:15       ` Oded Gabbay
2017-11-27 23:29   ` [PATCH 14/14] drm/amdkfd: Simplify locking during process creation Felix Kuehling
2017-12-10 10:22       ` Oded Gabbay
