* [PATCH 00/19] KFD fixes and cleanups
@ 2017-08-11 21:56 Felix Kuehling
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

This is the first round of changes preparing to upstream the KFD work
done internally at AMD over the last two years. A big part of it is
coding style and messaging cleanup. I have tried to avoid gratuitous
formatting changes. All coding style changes should have a
justification based on the Linux style guide.

The last few patches (15-19) enable running pieces of the current ROCm
user mode stack (with minor Thunk fixes for backward compatibility) on
this soon-to-be-upstream kernel on CZ. At this time I can run some
KFDTest unit tests, which are currently not open source. I'm trying to
find more substantial tests that use a real compute API as a baseline
for testing further KFD upstreaming patches.

This patch series is freshly rebased on amd-staging-4.12.

Felix Kuehling (11):
  drm/amdkfd: Fix typo in dbgdev_wave_reset_wavefronts
  drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t)
  drm/amdkfd: Fix allocated_queues bitmap initialization
  drm/amdkfd: Remove BUG_ONs for NULL pointer arguments
  drm/amdkfd: Fix doorbell initialization and finalization
  drm/amdkfd: Allocate gtt_sa_bitmap in long units
  drm/amdkfd: Handle remaining BUG_ONs more gracefully
  drm/amdkfd: Update PM4 packet headers
  drm/amdgpu: Remove hard-coded assumptions about compute pipes
  drm/amdgpu: Disable GFX PG on CZ
  drm/amd: Update MEC HQD loading code for KFD

Jay Cornwall (1):
  drm/amdkfd: Clamp EOP queue size correctly on Gfx8

Kent Russell (5):
  drm/amdkfd: Clean up KFD style errors and warnings
  drm/amdkfd: Consolidate and clean up log commands
  drm/amdkfd: Change x==NULL/false references to !x
  drm/amdkfd: Fix goto usage
  drm/amdkfd: Remove usage of alloc(sizeof(struct...

Yair Shachar (1):
  drm/amdkfd: Fix double Mutex lock order

Yong Zhao (1):
  drm/amdkfd: Add more error printing to help bringup

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h         |  16 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  | 156 +++++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  | 185 ++++++++++--
 drivers/gpu/drm/amd/amdgpu/vi.c                    |   3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           | 107 +++----
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            | 102 +++----
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  21 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h            |  27 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 122 ++++----
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 313 ++++++++-----------
 .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |   6 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |   6 +-
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          |  40 +--
 drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  33 +--
 drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c       |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c         |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |  63 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  10 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h       |   3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  62 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  46 +--
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 301 +++++++------------
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c             |   7 +-
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h       | 330 +++------------------
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h    | 140 ++++++++-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  31 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  25 +-
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  71 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_queue.c             |  12 +-
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  46 +--
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h    |  11 +-
 drivers/gpu/drm/radeon/radeon_kfd.c                |  12 +-
 33 files changed, 1054 insertions(+), 1261 deletions(-)

-- 
2.7.4


* [PATCH 01/19] drm/amdkfd: Fix double Mutex lock order
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Yair Shachar

From: Yair Shachar <yair.shachar@amd.com>

Signed-off-by: Yair Shachar <yair.shachar@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 6316aad..2a45718e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -451,8 +451,8 @@ static int kfd_ioctl_dbg_register(struct file *filep,
 		return -EINVAL;
 	}
 
-	mutex_lock(kfd_get_dbgmgr_mutex());
 	mutex_lock(&p->mutex);
+	mutex_lock(kfd_get_dbgmgr_mutex());
 
 	/*
 	 * make sure that we have pdd, if this the first queue created for
@@ -460,8 +460,8 @@ static int kfd_ioctl_dbg_register(struct file *filep,
 	 */
 	pdd = kfd_bind_process_to_device(dev, p);
 	if (IS_ERR(pdd)) {
-		mutex_unlock(&p->mutex);
 		mutex_unlock(kfd_get_dbgmgr_mutex());
+		mutex_unlock(&p->mutex);
 		return PTR_ERR(pdd);
 	}
 
@@ -480,8 +480,8 @@ static int kfd_ioctl_dbg_register(struct file *filep,
 		status = -EINVAL;
 	}
 
-	mutex_unlock(&p->mutex);
 	mutex_unlock(kfd_get_dbgmgr_mutex());
+	mutex_unlock(&p->mutex);
 
 	return status;
 }
-- 
2.7.4


* [PATCH 02/19] drm/amdkfd: Fix typo in dbgdev_wave_reset_wavefronts
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index d5e19b5..8b14a4e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -823,7 +823,7 @@ int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p)
 	for (vmid = first_vmid_to_scan; vmid <= last_vmid_to_scan; vmid++) {
 		if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_valid
 				(dev->kgd, vmid)) {
-			if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_valid
+			if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_pasid
 					(dev->kgd, vmid) == p->pasid) {
 				pr_debug("Killing wave fronts of vmid %d and pasid %d\n",
 						vmid, p->pasid);
-- 
2.7.4


* [PATCH 03/19] drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t)
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

kfd2kgd->address_watch_get_offset already returns dword register
offsets, so the extra divide-by-sizeof(uint32_t) is incorrect.
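
As a minimal stand-alone illustration of the unit mismatch (the offset
value below is made up for this example, not taken from the register
headers):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* hypothetical dword offset, as returned by
	 * address_watch_get_offset(); it is already counted in
	 * 32-bit words
	 */
	uint32_t aw_reg_add_dword = 0x3280;

	/* the removed code divided it again, silently shrinking a
	 * dword offset to a quarter of its value
	 */
	uint32_t bogus = aw_reg_add_dword / sizeof(uint32_t);

	printf("dword offset 0x%X, after bogus divide 0x%X\n",
	       (unsigned int)aw_reg_add_dword, (unsigned int)bogus);
	return 0;
}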

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index 8b14a4e..faa0790 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -442,8 +442,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 					i,
 					ADDRESS_WATCH_REG_CNTL);
 
-		aw_reg_add_dword /= sizeof(uint32_t);
-
 		packets_vec[0].bitfields2.reg_offset =
 					aw_reg_add_dword - AMD_CONFIG_REG_BASE;
 
@@ -455,8 +453,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 					i,
 					ADDRESS_WATCH_REG_ADDR_HI);
 
-		aw_reg_add_dword /= sizeof(uint32_t);
-
 		packets_vec[1].bitfields2.reg_offset =
 					aw_reg_add_dword - AMD_CONFIG_REG_BASE;
 		packets_vec[1].reg_data[0] = addrHi.u32All;
@@ -467,8 +463,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 					i,
 					ADDRESS_WATCH_REG_ADDR_LO);
 
-		aw_reg_add_dword /= sizeof(uint32_t);
-
 		packets_vec[2].bitfields2.reg_offset =
 				aw_reg_add_dword - AMD_CONFIG_REG_BASE;
 		packets_vec[2].reg_data[0] = addrLo.u32All;
@@ -485,8 +479,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 					i,
 					ADDRESS_WATCH_REG_CNTL);
 
-		aw_reg_add_dword /= sizeof(uint32_t);
-
 		packets_vec[3].bitfields2.reg_offset =
 					aw_reg_add_dword - AMD_CONFIG_REG_BASE;
 		packets_vec[3].reg_data[0] = cntl.u32All;
-- 
2.7.4


* [PATCH 04/19] drm/amdkfd: Fix allocated_queues bitmap initialization
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Use shared_resources.queue_bitmap to determine the queues available
for KFD in each pipe.
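
A minimal user-space sketch of the same per-pipe mapping (pipe/queue
counts and bitmap contents below are made up for illustration):

#include <stdint.h>
#include <stdio.h>

#define PIPES_PER_MEC	4	/* illustrative, not the real HW config */
#define QUEUES_PER_PIPE	8

int main(void)
{
	/* stand-in for shared_resources.queue_bitmap: one bit per
	 * (pipe, queue) slot, laid out pipe-major
	 */
	uint32_t queue_bitmap = 0x0000ff0f;
	uint32_t allocated_queues[PIPES_PER_MEC] = { 0 };
	int pipe, queue;

	for (pipe = 0; pipe < PIPES_PER_MEC; pipe++) {
		int pipe_offset = pipe * QUEUES_PER_PIPE;

		for (queue = 0; queue < QUEUES_PER_PIPE; queue++)
			if (queue_bitmap & (1u << (pipe_offset + queue)))
				allocated_queues[pipe] |= 1u << queue;
	}

	for (pipe = 0; pipe < PIPES_PER_MEC; pipe++)
		printf("pipe %d: queue mask 0x%02X\n",
		       pipe, (unsigned int)allocated_queues[pipe]);
	return 0;
}

With the bitmap above, pipe 0 ends up with mask 0x0F (queues 0-3),
pipe 1 with 0xFF (all eight queues), and pipes 2 and 3 with no queues.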

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 42de22b..9d2796b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -513,7 +513,7 @@ static int init_scheduler(struct device_queue_manager *dqm)
 
 static int initialize_nocpsch(struct device_queue_manager *dqm)
 {
-	int i;
+	int pipe, queue;
 
 	BUG_ON(!dqm);
 
@@ -531,8 +531,14 @@ static int initialize_nocpsch(struct device_queue_manager *dqm)
 		return -ENOMEM;
 	}
 
-	for (i = 0; i < get_pipes_per_mec(dqm); i++)
-		dqm->allocated_queues[i] = (1 << get_queues_per_pipe(dqm)) - 1;
+	for (pipe = 0; pipe < get_pipes_per_mec(dqm); pipe++) {
+		int pipe_offset = pipe * get_queues_per_pipe(dqm);
+
+		for (queue = 0; queue < get_queues_per_pipe(dqm); queue++)
+			if (test_bit(pipe_offset + queue,
+				     dqm->dev->shared_resources.queue_bitmap))
+				dqm->allocated_queues[pipe] |= 1 << queue;
+	}
 
 	dqm->vmid_bitmap = (1 << VMID_PER_DEVICE) - 1;
 	dqm->sdma_bitmap = (1 << CIK_SDMA_QUEUES) - 1;
-- 
2.7.4


* [PATCH 05/19] drm/amdkfd: Clean up KFD style errors and warnings
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling, Kent Russell

From: Kent Russell <kent.russell@amd.com>

Running checkpatch.pl -f <file> showed a number of style issues. This
patch addresses as many of them as possible. Some long lines have been
left alone for readability, but attempts have been made to minimize
them.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  | 24 +++++++------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  | 16 ++++++-------
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |  6 +++--
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            |  7 +++---
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h            | 27 +++++++++++-----------
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            |  5 ++--
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  8 ++++---
 drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  5 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c         |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  3 +--
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    |  5 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c             |  3 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h       | 16 ++++++-------
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h    | 10 ++++----
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h              | 23 ++++++++++--------
 drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  6 +++--
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  4 ++--
 19 files changed, 91 insertions(+), 86 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 3949736..342dc3e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -28,14 +28,14 @@
 #include <linux/module.h>
 
 const struct kgd2kfd_calls *kgd2kfd;
-bool (*kgd2kfd_init_p)(unsigned, const struct kgd2kfd_calls**);
+bool (*kgd2kfd_init_p)(unsigned int, const struct kgd2kfd_calls**);
 
 int amdgpu_amdkfd_init(void)
 {
 	int ret;
 
 #if defined(CONFIG_HSA_AMD_MODULE)
-	int (*kgd2kfd_init_p)(unsigned, const struct kgd2kfd_calls**);
+	int (*kgd2kfd_init_p)(unsigned int, const struct kgd2kfd_calls**);
 
 	kgd2kfd_init_p = symbol_request(kgd2kfd_init);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index 5254562..5936222 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -565,43 +565,35 @@ static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type)
 
 	switch (type) {
 	case KGD_ENGINE_PFP:
-		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.pfp_fw->data;
+		hdr = (const union amdgpu_firmware_header *)adev->gfx.pfp_fw->data;
 		break;
 
 	case KGD_ENGINE_ME:
-		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.me_fw->data;
+		hdr = (const union amdgpu_firmware_header *)adev->gfx.me_fw->data;
 		break;
 
 	case KGD_ENGINE_CE:
-		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.ce_fw->data;
+		hdr = (const union amdgpu_firmware_header *)adev->gfx.ce_fw->data;
 		break;
 
 	case KGD_ENGINE_MEC1:
-		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.mec_fw->data;
+		hdr = (const union amdgpu_firmware_header *)adev->gfx.mec_fw->data;
 		break;
 
 	case KGD_ENGINE_MEC2:
-		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.mec2_fw->data;
+		hdr = (const union amdgpu_firmware_header *)adev->gfx.mec2_fw->data;
 		break;
 
 	case KGD_ENGINE_RLC:
-		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.rlc_fw->data;
+		hdr = (const union amdgpu_firmware_header *)adev->gfx.rlc_fw->data;
 		break;
 
 	case KGD_ENGINE_SDMA1:
-		hdr = (const union amdgpu_firmware_header *)
-							adev->sdma.instance[0].fw->data;
+		hdr = (const union amdgpu_firmware_header *)adev->sdma.instance[0].fw->data;
 		break;
 
 	case KGD_ENGINE_SDMA2:
-		hdr = (const union amdgpu_firmware_header *)
-							adev->sdma.instance[1].fw->data;
+		hdr = (const union amdgpu_firmware_header *)adev->sdma.instance[1].fw->data;
 		break;
 
 	default:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 133d066..90271f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -454,42 +454,42 @@ static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type)
 	switch (type) {
 	case KGD_ENGINE_PFP:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.pfp_fw->data;
+						adev->gfx.pfp_fw->data;
 		break;
 
 	case KGD_ENGINE_ME:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.me_fw->data;
+						adev->gfx.me_fw->data;
 		break;
 
 	case KGD_ENGINE_CE:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.ce_fw->data;
+						adev->gfx.ce_fw->data;
 		break;
 
 	case KGD_ENGINE_MEC1:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.mec_fw->data;
+						adev->gfx.mec_fw->data;
 		break;
 
 	case KGD_ENGINE_MEC2:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.mec2_fw->data;
+						adev->gfx.mec2_fw->data;
 		break;
 
 	case KGD_ENGINE_RLC:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.rlc_fw->data;
+						adev->gfx.rlc_fw->data;
 		break;
 
 	case KGD_ENGINE_SDMA1:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->sdma.instance[0].fw->data;
+						adev->sdma.instance[0].fw->data;
 		break;
 
 	case KGD_ENGINE_SDMA2:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->sdma.instance[1].fw->data;
+						adev->sdma.instance[1].fw->data;
 		break;
 
 	default:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 2a45718e..98f4dbf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -782,7 +782,8 @@ static int kfd_ioctl_get_process_apertures(struct file *filp,
 				"scratch_limit %llX\n", pdd->scratch_limit);
 
 			args->num_of_nodes++;
-		} while ((pdd = kfd_get_next_process_device_data(p, pdd)) != NULL &&
+		} while ((pdd = kfd_get_next_process_device_data(p, pdd)) !=
+				NULL &&
 				(args->num_of_nodes < NUM_OF_SUPPORTED_GPUS));
 	}
 
@@ -848,7 +849,8 @@ static int kfd_ioctl_wait_events(struct file *filp, struct kfd_process *p,
 }
 
 #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \
-	[_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, .cmd_drv = 0, .name = #ioctl}
+	[_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \
+			    .cmd_drv = 0, .name = #ioctl}
 
 /** Ioctl table */
 static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index faa0790..a7548a5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -313,7 +313,7 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
 		return -EINVAL;
 	}
 
-	for (i = 0 ; i < adw_info->num_watch_points ; i++) {
+	for (i = 0; i < adw_info->num_watch_points; i++) {
 		dbgdev_address_watch_set_registers(adw_info, &addrHi, &addrLo,
 						&cntl, i, pdd->qpd.vmid);
 
@@ -623,7 +623,7 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
 		return status;
 	}
 
-	/* we do not control the VMID in DIQ,so reset it to a known value */
+	/* we do not control the VMID in DIQ, so reset it to a known value */
 	reg_sq_cmd.bits.vm_id = 0;
 
 	pr_debug("\t\t %30s\n", "* * * * * * * * * * * * * * * * * *");
@@ -810,7 +810,8 @@ int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p)
 
 	/* Scan all registers in the range ATC_VMID8_PASID_MAPPING ..
 	 * ATC_VMID15_PASID_MAPPING
-	 * to check which VMID the current process is mapped to. */
+	 * to check which VMID the current process is mapped to.
+	 */
 
 	for (vmid = first_vmid_to_scan; vmid <= last_vmid_to_scan; vmid++) {
 		if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_valid
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h
index 257a745..a04a1fe 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h
@@ -30,13 +30,11 @@
 #pragma pack(push, 4)
 
 enum HSA_DBG_WAVEOP {
-	HSA_DBG_WAVEOP_HALT = 1,	/* Halts a wavefront		*/
-	HSA_DBG_WAVEOP_RESUME = 2,	/* Resumes a wavefront		*/
-	HSA_DBG_WAVEOP_KILL = 3,	/* Kills a wavefront		*/
-	HSA_DBG_WAVEOP_DEBUG = 4,	/* Causes wavefront to enter
-						debug mode		*/
-	HSA_DBG_WAVEOP_TRAP = 5,	/* Causes wavefront to take
-						a trap			*/
+	HSA_DBG_WAVEOP_HALT = 1,   /* Halts a wavefront */
+	HSA_DBG_WAVEOP_RESUME = 2, /* Resumes a wavefront */
+	HSA_DBG_WAVEOP_KILL = 3,   /* Kills a wavefront */
+	HSA_DBG_WAVEOP_DEBUG = 4,  /* Causes wavefront to enter dbg mode */
+	HSA_DBG_WAVEOP_TRAP = 5,   /* Causes wavefront to take a trap */
 	HSA_DBG_NUM_WAVEOP = 5,
 	HSA_DBG_MAX_WAVEOP = 0xFFFFFFFF
 };
@@ -81,15 +79,13 @@ struct HsaDbgWaveMsgAMDGen2 {
 			uint32_t UserData:8;	/* user data */
 			uint32_t ShaderArray:1;	/* Shader array */
 			uint32_t Priv:1;	/* Privileged */
-			uint32_t Reserved0:4;	/* This field is reserved,
-						   should be 0 */
+			uint32_t Reserved0:4;	/* Reserved, should be 0 */
 			uint32_t WaveId:4;	/* wave id */
 			uint32_t SIMD:2;	/* SIMD id */
 			uint32_t HSACU:4;	/* Compute unit */
 			uint32_t ShaderEngine:2;/* Shader engine */
 			uint32_t MessageType:2;	/* see HSA_DBG_WAVEMSG_TYPE */
-			uint32_t Reserved1:4;	/* This field is reserved,
-						   should be 0 */
+			uint32_t Reserved1:4;	/* Reserved, should be 0 */
 		} ui32;
 		uint32_t Value;
 	};
@@ -121,20 +117,23 @@ struct HsaDbgWaveMessage {
  * in the user mode instruction stream. The OS scheduler event is typically
  * associated and signaled by an interrupt issued by the GPU, but other HSA
  * system interrupt conditions from other HW (e.g. IOMMUv2) may be surfaced
- * by the KFD by this mechanism, too. */
+ * by the KFD by this mechanism, too.
+ */
 
 /* these are the new definitions for events */
 enum HSA_EVENTTYPE {
 	HSA_EVENTTYPE_SIGNAL = 0,	/* user-mode generated GPU signal */
 	HSA_EVENTTYPE_NODECHANGE = 1,	/* HSA node change (attach/detach) */
 	HSA_EVENTTYPE_DEVICESTATECHANGE = 2,	/* HSA device state change
-						   (start/stop) */
+						 * (start/stop)
+						 */
 	HSA_EVENTTYPE_HW_EXCEPTION = 3,	/* GPU shader exception event */
 	HSA_EVENTTYPE_SYSTEM_EVENT = 4,	/* GPU SYSCALL with parameter info */
 	HSA_EVENTTYPE_DEBUG_EVENT = 5,	/* GPU signal for debugging */
 	HSA_EVENTTYPE_PROFILE_EVENT = 6,/* GPU signal for profiling */
 	HSA_EVENTTYPE_QUEUE_EVENT = 7,	/* GPU signal queue idle state
-					   (EOP pm4) */
+					 * (EOP pm4)
+					 */
 	/* ...  */
 	HSA_EVENTTYPE_MAXID,
 	HSA_EVENTTYPE_TYPE_SIZE = 0xFFFFFFFF
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 3f95f7c..1f50325 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -155,12 +155,13 @@ static bool device_iommu_pasid_init(struct kfd_dev *kfd)
 		dev_err(kfd_device, "error required iommu flags ats(%i), pri(%i), pasid(%i)\n",
 		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP) != 0,
 		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP) != 0,
-		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP) != 0);
+		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP)
+									!= 0);
 		return false;
 	}
 
 	pasid_limit = min_t(unsigned int,
-			(unsigned int)1 << kfd->device_info->max_pasid_bits,
+			(unsigned int)(1 << kfd->device_info->max_pasid_bits),
 			iommu_info.max_pasids);
 	/*
 	 * last pasid is used for kernel queues doorbells
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 9d2796b..3b850da 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -216,7 +216,8 @@ static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q)
 
 	set = false;
 
-	for (pipe = dqm->next_pipe_to_allocate, i = 0; i < get_pipes_per_mec(dqm);
+	for (pipe = dqm->next_pipe_to_allocate, i = 0;
+			i < get_pipes_per_mec(dqm);
 			pipe = ((pipe + 1) % get_pipes_per_mec(dqm)), ++i) {
 
 		if (!is_pipe_enabled(dqm, 0, pipe))
@@ -669,7 +670,8 @@ static int set_sched_resources(struct device_queue_manager *dqm)
 
 		/* This situation may be hit in the future if a new HW
 		 * generation exposes more than 64 queues. If so, the
-		 * definition of res.queue_mask needs updating */
+		 * definition of res.queue_mask needs updating
+		 */
 		if (WARN_ON(i >= (sizeof(res.queue_mask)*8))) {
 			pr_err("Invalid queue enabled by amdgpu: %d\n", i);
 			break;
@@ -890,7 +892,7 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 	}
 
 	if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
-			dqm->sdma_queue_count++;
+		dqm->sdma_queue_count++;
 	/*
 	 * Unconditionally increment this counter, regardless of the queue's
 	 * type or whether the queue is active.
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index d1ce83d..d8b9b3c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -194,7 +194,8 @@ static void release_event_notification_slot(struct signal_page *page,
 	page->free_slots++;
 
 	/* We don't free signal pages, they are retained by the process
-	 * and reused until it exits. */
+	 * and reused until it exits.
+	 */
 }
 
 static struct signal_page *lookup_signal_page_by_index(struct kfd_process *p,
@@ -584,7 +585,7 @@ void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
 		 * search faster.
 		 */
 		struct signal_page *page;
-		unsigned i;
+		unsigned int i;
 
 		list_for_each_entry(page, &p->signal_event_pages, event_pages)
 			for (i = 0; i < SLOTS_PER_PAGE; i++)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
index 7f134aa..70b3a99c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
@@ -179,7 +179,7 @@ static void interrupt_wq(struct work_struct *work)
 bool interrupt_is_wanted(struct kfd_dev *dev, const uint32_t *ih_ring_entry)
 {
 	/* integer and bitwise OR so there is no boolean short-circuiting */
-	unsigned wanted = 0;
+	unsigned int wanted = 0;
 
 	wanted |= dev->device_info->event_interrupt_class->interrupt_isr(dev,
 								ih_ring_entry);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 850a562..af5bfc1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -61,7 +61,8 @@ MODULE_PARM_DESC(send_sigterm,
 
 static int amdkfd_init_completed;
 
-int kgd2kfd_init(unsigned interface_version, const struct kgd2kfd_calls **g2f)
+int kgd2kfd_init(unsigned int interface_version,
+		const struct kgd2kfd_calls **g2f)
 {
 	if (!amdkfd_init_completed)
 		return -EPROBE_DEFER;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index 6acc431..ac59229 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -193,9 +193,8 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
 
 	m->cp_hqd_vmid = q->vmid;
 
-	if (q->format == KFD_QUEUE_FORMAT_AQL) {
+	if (q->format == KFD_QUEUE_FORMAT_AQL)
 		m->cp_hqd_pq_control |= NO_UPDATE_RPTR;
-	}
 
 	m->cp_hqd_active = 0;
 	q->is_active = false;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
index 7131998..99c11a4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -458,7 +458,7 @@ int pm_send_set_resources(struct packet_manager *pm,
 	mutex_lock(&pm->lock);
 	pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
 					sizeof(*packet) / sizeof(uint32_t),
-			(unsigned int **)&packet);
+					(unsigned int **)&packet);
 	if (packet == NULL) {
 		mutex_unlock(&pm->lock);
 		pr_err("kfd: failed to allocate buffer on kernel queue\n");
@@ -530,8 +530,7 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
 fail_acquire_packet_buffer:
 	mutex_unlock(&pm->lock);
 fail_create_runlist_ib:
-	if (pm->allocated)
-		pm_release_ib(pm);
+	pm_release_ib(pm);
 	return retval;
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
index 6cfe7f1..b3f7d43 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
@@ -32,7 +32,8 @@ int kfd_pasid_init(void)
 {
 	pasid_limit = KFD_MAX_NUM_OF_PROCESSES;
 
-	pasid_bitmap = kcalloc(BITS_TO_LONGS(pasid_limit), sizeof(long), GFP_KERNEL);
+	pasid_bitmap = kcalloc(BITS_TO_LONGS(pasid_limit), sizeof(long),
+				GFP_KERNEL);
 	if (!pasid_bitmap)
 		return -ENOMEM;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
index 5b393f3..97e5442 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
@@ -28,14 +28,14 @@
 #define PM4_MES_HEADER_DEFINED
 union PM4_MES_TYPE_3_HEADER {
 	struct {
-		uint32_t reserved1:8;	/* < reserved */
-		uint32_t opcode:8;	/* < IT opcode */
-		uint32_t count:14;	/* < number of DWORDs - 1
-					 * in the information body.
-					 */
-		uint32_t type:2;	/* < packet identifier.
-					 * It should be 3 for type 3 packets
-					 */
+		/* reserved */
+		uint32_t reserved1:8;
+		/* IT opcode */
+		uint32_t opcode:8;
+		/* number of DWORDs - 1 in the information body */
+		uint32_t count:14;
+		/* packet identifier. It should be 3 for type 3 packets */
+		uint32_t type:2;
 	};
 	uint32_t u32all;
 };
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
index 08c72192..c4eda6f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
@@ -30,10 +30,12 @@ union PM4_MES_TYPE_3_HEADER {
 	struct {
 		uint32_t reserved1 : 8; /* < reserved */
 		uint32_t opcode    : 8; /* < IT opcode */
-		uint32_t count     : 14;/* < number of DWORDs - 1 in the
-		information body. */
-		uint32_t type      : 2; /* < packet identifier.
-					It should be 3 for type 3 packets */
+		uint32_t count     : 14;/* < Number of DWORDS - 1 in the
+					 *   information body
+					 */
+		uint32_t type      : 2; /* < packet identifier
+					 *   It should be 3 for type 3 packets
+					 */
 	};
 	uint32_t u32All;
 };
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 4750cab..469b7ea 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -294,13 +294,13 @@ enum kfd_queue_format {
  * @write_ptr: Defines the number of dwords written to the ring buffer.
  *
  * @doorbell_ptr: This field aim is to notify the H/W of new packet written to
- * the queue ring buffer. This field should be similar to write_ptr and the user
- * should update this field after he updated the write_ptr.
+ * the queue ring buffer. This field should be similar to write_ptr and the
+ * user should update this field after he updated the write_ptr.
  *
  * @doorbell_off: The doorbell offset in the doorbell pci-bar.
  *
- * @is_interop: Defines if this is a interop queue. Interop queue means that the
- * queue can access both graphics and compute resources.
+ * @is_interop: Defines if this is a interop queue. Interop queue means that
+ * the queue can access both graphics and compute resources.
  *
  * @is_active: Defines if the queue is active or not.
  *
@@ -352,9 +352,10 @@ struct queue_properties {
  * @properties: The queue properties.
  *
  * @mec: Used only in no cp scheduling mode and identifies to micro engine id
- * that the queue should be execute on.
+ *	 that the queue should be execute on.
  *
- * @pipe: Used only in no cp scheduling mode and identifies the queue's pipe id.
+ * @pipe: Used only in no cp scheduling mode and identifies the queue's pipe
+ *	  id.
  *
  * @queue: Used only in no cp scheduliong mode and identifies the queue's slot.
  *
@@ -520,8 +521,8 @@ struct kfd_process {
 	struct mutex event_mutex;
 	/* All events in process hashed by ID, linked on kfd_event.events. */
 	DECLARE_HASHTABLE(events, 4);
-	struct list_head signal_event_pages;	/* struct slot_page_header.
-								event_pages */
+	/* struct slot_page_header.event_pages */
+	struct list_head signal_event_pages;
 	u32 next_nonsignal_event_id;
 	size_t signal_event_count;
 };
@@ -559,8 +560,10 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
 							struct kfd_process *p);
 
 /* Process device data iterator */
-struct kfd_process_device *kfd_get_first_process_device_data(struct kfd_process *p);
-struct kfd_process_device *kfd_get_next_process_device_data(struct kfd_process *p,
+struct kfd_process_device *kfd_get_first_process_device_data(
+							struct kfd_process *p);
+struct kfd_process_device *kfd_get_next_process_device_data(
+						struct kfd_process *p,
 						struct kfd_process_device *pdd);
 bool kfd_has_process_device_data(struct kfd_process *p);
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 035bbc9..a4e4a2d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -449,14 +449,16 @@ void kfd_unbind_process_from_device(struct kfd_dev *dev, unsigned int pasid)
 	mutex_unlock(&p->mutex);
 }
 
-struct kfd_process_device *kfd_get_first_process_device_data(struct kfd_process *p)
+struct kfd_process_device *kfd_get_first_process_device_data(
+						struct kfd_process *p)
 {
 	return list_first_entry(&p->per_device_data,
 				struct kfd_process_device,
 				per_device_list);
 }
 
-struct kfd_process_device *kfd_get_next_process_device_data(struct kfd_process *p,
+struct kfd_process_device *kfd_get_next_process_device_data(
+						struct kfd_process *p,
 						struct kfd_process_device *pdd)
 {
 	if (list_is_last(&pdd->per_device_list, &p->per_device_data))
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 1e50647..0200dae 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -1170,8 +1170,8 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
 		 * GPU vBIOS
 		 */
 
-		/*
-		 * Update the SYSFS tree, since we added another topology device
+		/* Update the SYSFS tree, since we added another topology
+		 * device
 		 */
 		if (kfd_topology_update_sysfs() < 0)
 			kfd_topology_release_sysfs();
-- 
2.7.4


* [PATCH 06/19] drm/amdkfd: Consolidate and clean up log commands
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Yong Zhao, Felix Kuehling, Kent Russell

From: Kent Russell <kent.russell@amd.com>

Consolidate log commands: replace dev_info(NULL, "Error...") with the
more accurate pr_err, and drop the module name (visible via dynamic
debugging with +m) and the function name (visible with +f) from the
messages. We also don't need debug messages saying which function
we're in; developers can add those when needed.

Don't print the vendor and device ID in error messages. They are
typically the same for all GPUs in a multi-GPU system, so they add no
value to the message.

Lastly, remove parentheses around %d, %i and 0x%llX.
According to kernel.org:
"Printing numbers in parentheses (%d) adds no value and should be
avoided."
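
A condensed before/after, taken from hunks in this patch (kernel
logging macros, not a stand-alone program):

	/* Before: hard-coded "kfd:"/"amdkfd:" prefixes, numbers in
	 * parentheses, and vendor/device IDs that are the same for
	 * every GPU in the system:
	 */
	pr_err("amdkfd: didn't found vmid for pasid (%d)\n", p->pasid);
	dev_err(kfd_device,
		"Could not allocate %d bytes for device (%x:%x)\n",
		size, kfd->pdev->vendor, kfd->pdev->device);

	/* After: plain messages; module and function names can come
	 * from dynamic debugging (+m/+f) when needed:
	 */
	pr_err("Didn't find vmid for pasid %d\n", p->pasid);
	dev_err(kfd_device, "Could not allocate %d bytes\n", size);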

Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           | 64 ++++++++---------
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            | 38 +++++-----
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 51 ++++++--------
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 81 +++++++---------------
 .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          | 21 +++---
 drivers/gpu/drm/amd/amdkfd/kfd_events.c            | 22 +++---
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      | 16 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   | 10 ---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  8 +--
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 34 ++++-----
 drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  4 +-
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 27 +++-----
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  6 +-
 17 files changed, 158 insertions(+), 236 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 98f4dbf..6244958 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -142,12 +142,12 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
 				struct kfd_ioctl_create_queue_args *args)
 {
 	if (args->queue_percentage > KFD_MAX_QUEUE_PERCENTAGE) {
-		pr_err("kfd: queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
+		pr_err("Queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
 		return -EINVAL;
 	}
 
 	if (args->queue_priority > KFD_MAX_QUEUE_PRIORITY) {
-		pr_err("kfd: queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
+		pr_err("Queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
 		return -EINVAL;
 	}
 
@@ -155,26 +155,26 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
 		(!access_ok(VERIFY_WRITE,
 			(const void __user *) args->ring_base_address,
 			sizeof(uint64_t)))) {
-		pr_err("kfd: can't access ring base address\n");
+		pr_err("Can't access ring base address\n");
 		return -EFAULT;
 	}
 
 	if (!is_power_of_2(args->ring_size) && (args->ring_size != 0)) {
-		pr_err("kfd: ring size must be a power of 2 or 0\n");
+		pr_err("Ring size must be a power of 2 or 0\n");
 		return -EINVAL;
 	}
 
 	if (!access_ok(VERIFY_WRITE,
 			(const void __user *) args->read_pointer_address,
 			sizeof(uint32_t))) {
-		pr_err("kfd: can't access read pointer\n");
+		pr_err("Can't access read pointer\n");
 		return -EFAULT;
 	}
 
 	if (!access_ok(VERIFY_WRITE,
 			(const void __user *) args->write_pointer_address,
 			sizeof(uint32_t))) {
-		pr_err("kfd: can't access write pointer\n");
+		pr_err("Can't access write pointer\n");
 		return -EFAULT;
 	}
 
@@ -182,7 +182,7 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
 		!access_ok(VERIFY_WRITE,
 			(const void __user *) args->eop_buffer_address,
 			sizeof(uint32_t))) {
-		pr_debug("kfd: can't access eop buffer");
+		pr_debug("Can't access eop buffer");
 		return -EFAULT;
 	}
 
@@ -190,7 +190,7 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
 		!access_ok(VERIFY_WRITE,
 			(const void __user *) args->ctx_save_restore_address,
 			sizeof(uint32_t))) {
-		pr_debug("kfd: can't access ctx save restore buffer");
+		pr_debug("Can't access ctx save restore buffer");
 		return -EFAULT;
 	}
 
@@ -219,27 +219,27 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
 	else
 		q_properties->format = KFD_QUEUE_FORMAT_PM4;
 
-	pr_debug("Queue Percentage (%d, %d)\n",
+	pr_debug("Queue Percentage: %d, %d\n",
 			q_properties->queue_percent, args->queue_percentage);
 
-	pr_debug("Queue Priority (%d, %d)\n",
+	pr_debug("Queue Priority: %d, %d\n",
 			q_properties->priority, args->queue_priority);
 
-	pr_debug("Queue Address (0x%llX, 0x%llX)\n",
+	pr_debug("Queue Address: 0x%llX, 0x%llX\n",
 			q_properties->queue_address, args->ring_base_address);
 
-	pr_debug("Queue Size (0x%llX, %u)\n",
+	pr_debug("Queue Size: 0x%llX, %u\n",
 			q_properties->queue_size, args->ring_size);
 
-	pr_debug("Queue r/w Pointers (0x%llX, 0x%llX)\n",
-			(uint64_t) q_properties->read_ptr,
-			(uint64_t) q_properties->write_ptr);
+	pr_debug("Queue r/w Pointers: %p, %p\n",
+			q_properties->read_ptr,
+			q_properties->write_ptr);
 
-	pr_debug("Queue Format (%d)\n", q_properties->format);
+	pr_debug("Queue Format: %d\n", q_properties->format);
 
-	pr_debug("Queue EOP (0x%llX)\n", q_properties->eop_ring_buffer_address);
+	pr_debug("Queue EOP: 0x%llX\n", q_properties->eop_ring_buffer_address);
 
-	pr_debug("Queue CTX save arex (0x%llX)\n",
+	pr_debug("Queue CTX save area: 0x%llX\n",
 			q_properties->ctx_save_restore_area_address);
 
 	return 0;
@@ -257,16 +257,16 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
 
 	memset(&q_properties, 0, sizeof(struct queue_properties));
 
-	pr_debug("kfd: creating queue ioctl\n");
+	pr_debug("Creating queue ioctl\n");
 
 	err = set_queue_properties_from_user(&q_properties, args);
 	if (err)
 		return err;
 
-	pr_debug("kfd: looking for gpu id 0x%x\n", args->gpu_id);
+	pr_debug("Looking for gpu id 0x%x\n", args->gpu_id);
 	dev = kfd_device_by_id(args->gpu_id);
 	if (dev == NULL) {
-		pr_debug("kfd: gpu id 0x%x was not found\n", args->gpu_id);
+		pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
 		return -EINVAL;
 	}
 
@@ -278,7 +278,7 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
 		goto err_bind_process;
 	}
 
-	pr_debug("kfd: creating queue for PASID %d on GPU 0x%x\n",
+	pr_debug("Creating queue for PASID %d on gpu 0x%x\n",
 			p->pasid,
 			dev->id);
 
@@ -296,15 +296,15 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
 
 	mutex_unlock(&p->mutex);
 
-	pr_debug("kfd: queue id %d was created successfully\n", args->queue_id);
+	pr_debug("Queue id %d was created successfully\n", args->queue_id);
 
-	pr_debug("ring buffer address == 0x%016llX\n",
+	pr_debug("Ring buffer address == 0x%016llX\n",
 			args->ring_base_address);
 
-	pr_debug("read ptr address    == 0x%016llX\n",
+	pr_debug("Read ptr address    == 0x%016llX\n",
 			args->read_pointer_address);
 
-	pr_debug("write ptr address   == 0x%016llX\n",
+	pr_debug("Write ptr address   == 0x%016llX\n",
 			args->write_pointer_address);
 
 	return 0;
@@ -321,7 +321,7 @@ static int kfd_ioctl_destroy_queue(struct file *filp, struct kfd_process *p,
 	int retval;
 	struct kfd_ioctl_destroy_queue_args *args = data;
 
-	pr_debug("kfd: destroying queue id %d for PASID %d\n",
+	pr_debug("Destroying queue id %d for pasid %d\n",
 				args->queue_id,
 				p->pasid);
 
@@ -341,12 +341,12 @@ static int kfd_ioctl_update_queue(struct file *filp, struct kfd_process *p,
 	struct queue_properties properties;
 
 	if (args->queue_percentage > KFD_MAX_QUEUE_PERCENTAGE) {
-		pr_err("kfd: queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
+		pr_err("Queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
 		return -EINVAL;
 	}
 
 	if (args->queue_priority > KFD_MAX_QUEUE_PRIORITY) {
-		pr_err("kfd: queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
+		pr_err("Queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
 		return -EINVAL;
 	}
 
@@ -354,12 +354,12 @@ static int kfd_ioctl_update_queue(struct file *filp, struct kfd_process *p,
 		(!access_ok(VERIFY_WRITE,
 			(const void __user *) args->ring_base_address,
 			sizeof(uint64_t)))) {
-		pr_err("kfd: can't access ring base address\n");
+		pr_err("Can't access ring base address\n");
 		return -EFAULT;
 	}
 
 	if (!is_power_of_2(args->ring_size) && (args->ring_size != 0)) {
-		pr_err("kfd: ring size must be a power of 2 or 0\n");
+		pr_err("Ring size must be a power of 2 or 0\n");
 		return -EINVAL;
 	}
 
@@ -368,7 +368,7 @@ static int kfd_ioctl_update_queue(struct file *filp, struct kfd_process *p,
 	properties.queue_percent = args->queue_percentage;
 	properties.priority = args->queue_priority;
 
-	pr_debug("kfd: updating queue id %d for PASID %d\n",
+	pr_debug("Updating queue id %d for pasid %d\n",
 			args->queue_id, p->pasid);
 
 	mutex_lock(&p->mutex);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index a7548a5..bf8ee19 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -78,7 +78,7 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
 				pq_packets_size_in_bytes / sizeof(uint32_t),
 				&ib_packet_buff);
 	if (status != 0) {
-		pr_err("amdkfd: acquire_packet_buffer failed\n");
+		pr_err("acquire_packet_buffer failed\n");
 		return status;
 	}
 
@@ -116,7 +116,7 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
 					&mem_obj);
 
 	if (status != 0) {
-		pr_err("amdkfd: Failed to allocate GART memory\n");
+		pr_err("Failed to allocate GART memory\n");
 		kq->ops.rollback_packet(kq);
 		return status;
 	}
@@ -194,7 +194,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
 				&qid);
 
 	if (status) {
-		pr_err("amdkfd: Failed to create DIQ\n");
+		pr_err("Failed to create DIQ\n");
 		return status;
 	}
 
@@ -203,7 +203,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
 	kq = pqm_get_kernel_queue(dbgdev->pqm, qid);
 
 	if (kq == NULL) {
-		pr_err("amdkfd: Error getting DIQ\n");
+		pr_err("Error getting DIQ\n");
 		pqm_destroy_queue(dbgdev->pqm, qid);
 		return -EFAULT;
 	}
@@ -279,7 +279,7 @@ static void dbgdev_address_watch_set_registers(
 }
 
 static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
-					struct dbg_address_watch_info *adw_info)
+				      struct dbg_address_watch_info *adw_info)
 {
 	union TCP_WATCH_ADDR_H_BITS addrHi;
 	union TCP_WATCH_ADDR_L_BITS addrLo;
@@ -293,7 +293,7 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
 	pdd = kfd_get_process_device_data(dbgdev->dev,
 					adw_info->process);
 	if (!pdd) {
-		pr_err("amdkfd: Failed to get pdd for wave control no DIQ\n");
+		pr_err("Failed to get pdd for wave control no DIQ\n");
 		return -EFAULT;
 	}
 
@@ -303,13 +303,13 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
 
 	if ((adw_info->num_watch_points > MAX_WATCH_ADDRESSES) ||
 			(adw_info->num_watch_points == 0)) {
-		pr_err("amdkfd: num_watch_points is invalid\n");
+		pr_err("num_watch_points is invalid\n");
 		return -EINVAL;
 	}
 
 	if ((adw_info->watch_mode == NULL) ||
 		(adw_info->watch_address == NULL)) {
-		pr_err("amdkfd: adw_info fields are not valid\n");
+		pr_err("adw_info fields are not valid\n");
 		return -EINVAL;
 	}
 
@@ -348,7 +348,7 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
 }
 
 static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
-					struct dbg_address_watch_info *adw_info)
+				    struct dbg_address_watch_info *adw_info)
 {
 	struct pm4__set_config_reg *packets_vec;
 	union TCP_WATCH_ADDR_H_BITS addrHi;
@@ -371,20 +371,20 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 
 	if ((adw_info->num_watch_points > MAX_WATCH_ADDRESSES) ||
 			(adw_info->num_watch_points == 0)) {
-		pr_err("amdkfd: num_watch_points is invalid\n");
+		pr_err("num_watch_points is invalid\n");
 		return -EINVAL;
 	}
 
 	if ((NULL == adw_info->watch_mode) ||
 			(NULL == adw_info->watch_address)) {
-		pr_err("amdkfd: adw_info fields are not valid\n");
+		pr_err("adw_info fields are not valid\n");
 		return -EINVAL;
 	}
 
 	status = kfd_gtt_sa_allocate(dbgdev->dev, ib_size, &mem_obj);
 
 	if (status != 0) {
-		pr_err("amdkfd: Failed to allocate GART memory\n");
+		pr_err("Failed to allocate GART memory\n");
 		return status;
 	}
 
@@ -491,7 +491,7 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 					ib_size);
 
 		if (status != 0) {
-			pr_err("amdkfd: Failed to submit IB to DIQ\n");
+			pr_err("Failed to submit IB to DIQ\n");
 			break;
 		}
 	}
@@ -619,7 +619,7 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
 	status = dbgdev_wave_control_set_registers(wac_info, &reg_sq_cmd,
 							&reg_gfx_index);
 	if (status) {
-		pr_err("amdkfd: Failed to set wave control registers\n");
+		pr_err("Failed to set wave control registers\n");
 		return status;
 	}
 
@@ -659,7 +659,7 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
 	status = kfd_gtt_sa_allocate(dbgdev->dev, ib_size, &mem_obj);
 
 	if (status != 0) {
-		pr_err("amdkfd: Failed to allocate GART memory\n");
+		pr_err("Failed to allocate GART memory\n");
 		return status;
 	}
 
@@ -712,7 +712,7 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
 			ib_size);
 
 	if (status != 0)
-		pr_err("amdkfd: Failed to submit IB to DIQ\n");
+		pr_err("Failed to submit IB to DIQ\n");
 
 	kfd_gtt_sa_free(dbgdev->dev, mem_obj);
 
@@ -735,13 +735,13 @@ static int dbgdev_wave_control_nodiq(struct kfd_dbgdev *dbgdev,
 	pdd = kfd_get_process_device_data(dbgdev->dev, wac_info->process);
 
 	if (!pdd) {
-		pr_err("amdkfd: Failed to get pdd for wave control no DIQ\n");
+		pr_err("Failed to get pdd for wave control no DIQ\n");
 		return -EFAULT;
 	}
 	status = dbgdev_wave_control_set_registers(wac_info, &reg_sq_cmd,
 							&reg_gfx_index);
 	if (status) {
-		pr_err("amdkfd: Failed to set wave control registers\n");
+		pr_err("Failed to set wave control registers\n");
 		return status;
 	}
 
@@ -826,7 +826,7 @@ int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p)
 	}
 
 	if (vmid > last_vmid_to_scan) {
-		pr_err("amdkfd: didn't found vmid for pasid (%d)\n", p->pasid);
+		pr_err("Didn't find vmid for pasid %d\n", p->pasid);
 		return -EFAULT;
 	}
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
index 56d6763..7225789 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
@@ -71,7 +71,7 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
 
 	new_buff = kfd_alloc_struct(new_buff);
 	if (!new_buff) {
-		pr_err("amdkfd: Failed to allocate dbgmgr instance\n");
+		pr_err("Failed to allocate dbgmgr instance\n");
 		return false;
 	}
 
@@ -79,7 +79,7 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
 	new_buff->dev = pdev;
 	new_buff->dbgdev = kfd_alloc_struct(new_buff->dbgdev);
 	if (!new_buff->dbgdev) {
-		pr_err("amdkfd: Failed to allocate dbgdev instance\n");
+		pr_err("Failed to allocate dbgdev instance\n");
 		kfree(new_buff);
 		return false;
 	}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 1f50325..87df8bf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -152,7 +152,7 @@ static bool device_iommu_pasid_init(struct kfd_dev *kfd)
 	}
 
 	if ((iommu_info.flags & required_iommu_flags) != required_iommu_flags) {
-		dev_err(kfd_device, "error required iommu flags ats(%i), pri(%i), pasid(%i)\n",
+		dev_err(kfd_device, "error required iommu flags ats %i, pri %i, pasid %i\n",
 		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP) != 0,
 		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP) != 0,
 		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP)
@@ -248,42 +248,33 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 	if (kfd->kfd2kgd->init_gtt_mem_allocation(
 			kfd->kgd, size, &kfd->gtt_mem,
 			&kfd->gtt_start_gpu_addr, &kfd->gtt_start_cpu_ptr)){
-		dev_err(kfd_device,
-			"Could not allocate %d bytes for device (%x:%x)\n",
-			size, kfd->pdev->vendor, kfd->pdev->device);
+		dev_err(kfd_device, "Could not allocate %d bytes\n", size);
 		goto out;
 	}
 
-	dev_info(kfd_device,
-		"Allocated %d bytes on gart for device(%x:%x)\n",
-		size, kfd->pdev->vendor, kfd->pdev->device);
+	dev_info(kfd_device, "Allocated %d bytes on gart\n", size);
 
 	/* Initialize GTT sa with 512 byte chunk size */
 	if (kfd_gtt_sa_init(kfd, size, 512) != 0) {
-		dev_err(kfd_device,
-			"Error initializing gtt sub-allocator\n");
+		dev_err(kfd_device, "Error initializing gtt sub-allocator\n");
 		goto kfd_gtt_sa_init_error;
 	}
 
 	kfd_doorbell_init(kfd);
 
 	if (kfd_topology_add_device(kfd) != 0) {
-		dev_err(kfd_device,
-			"Error adding device (%x:%x) to topology\n",
-			kfd->pdev->vendor, kfd->pdev->device);
+		dev_err(kfd_device, "Error adding device to topology\n");
 		goto kfd_topology_add_device_error;
 	}
 
 	if (kfd_interrupt_init(kfd)) {
-		dev_err(kfd_device,
-			"Error initializing interrupts for device (%x:%x)\n",
-			kfd->pdev->vendor, kfd->pdev->device);
+		dev_err(kfd_device, "Error initializing interrupts\n");
 		goto kfd_interrupt_error;
 	}
 
 	if (!device_iommu_pasid_init(kfd)) {
 		dev_err(kfd_device,
-			"Error initializing iommuv2 for device (%x:%x)\n",
+			"Error initializing iommuv2 for device %x:%x\n",
 			kfd->pdev->vendor, kfd->pdev->device);
 		goto device_iommu_pasid_error;
 	}
@@ -293,15 +284,13 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 
 	kfd->dqm = device_queue_manager_init(kfd);
 	if (!kfd->dqm) {
-		dev_err(kfd_device,
-			"Error initializing queue manager for device (%x:%x)\n",
-			kfd->pdev->vendor, kfd->pdev->device);
+		dev_err(kfd_device, "Error initializing queue manager\n");
 		goto device_queue_manager_error;
 	}
 
 	if (kfd->dqm->ops.start(kfd->dqm) != 0) {
 		dev_err(kfd_device,
-			"Error starting queuen manager for device (%x:%x)\n",
+			"Error starting queue manager for device %x:%x\n",
 			kfd->pdev->vendor, kfd->pdev->device);
 		goto dqm_start_error;
 	}
@@ -309,10 +298,10 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 	kfd->dbgmgr = NULL;
 
 	kfd->init_complete = true;
-	dev_info(kfd_device, "added device (%x:%x)\n", kfd->pdev->vendor,
+	dev_info(kfd_device, "added device %x:%x\n", kfd->pdev->vendor,
 		 kfd->pdev->device);
 
-	pr_debug("kfd: Starting kfd with the following scheduling policy %d\n",
+	pr_debug("Starting kfd with the following scheduling policy %d\n",
 		sched_policy);
 
 	goto out;
@@ -330,7 +319,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 kfd_gtt_sa_init_error:
 	kfd->kfd2kgd->free_gtt_mem(kfd->kgd, kfd->gtt_mem);
 	dev_err(kfd_device,
-		"device (%x:%x) NOT added due to errors\n",
+		"device %x:%x NOT added due to errors\n",
 		kfd->pdev->vendor, kfd->pdev->device);
 out:
 	return kfd->init_complete;
@@ -422,7 +411,7 @@ static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
 	if (!kfd->gtt_sa_bitmap)
 		return -ENOMEM;
 
-	pr_debug("kfd: gtt_sa_num_of_chunks = %d, gtt_sa_bitmap = %p\n",
+	pr_debug("gtt_sa_num_of_chunks = %d, gtt_sa_bitmap = %p\n",
 			kfd->gtt_sa_num_of_chunks, kfd->gtt_sa_bitmap);
 
 	mutex_init(&kfd->gtt_sa_lock);
@@ -468,7 +457,7 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
 	if ((*mem_obj) == NULL)
 		return -ENOMEM;
 
-	pr_debug("kfd: allocated mem_obj = %p for size = %d\n", *mem_obj, size);
+	pr_debug("Allocated mem_obj = %p for size = %d\n", *mem_obj, size);
 
 	start_search = 0;
 
@@ -480,7 +469,7 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
 					kfd->gtt_sa_num_of_chunks,
 					start_search);
 
-	pr_debug("kfd: found = %d\n", found);
+	pr_debug("Found = %d\n", found);
 
 	/* If there wasn't any free chunk, bail out */
 	if (found == kfd->gtt_sa_num_of_chunks)
@@ -498,12 +487,12 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
 					found,
 					kfd->gtt_sa_chunk_size);
 
-	pr_debug("kfd: gpu_addr = %p, cpu_addr = %p\n",
+	pr_debug("gpu_addr = %p, cpu_addr = %p\n",
 			(uint64_t *) (*mem_obj)->gpu_addr, (*mem_obj)->cpu_ptr);
 
 	/* If we need only one chunk, mark it as allocated and get out */
 	if (size <= kfd->gtt_sa_chunk_size) {
-		pr_debug("kfd: single bit\n");
+		pr_debug("Single bit\n");
 		set_bit(found, kfd->gtt_sa_bitmap);
 		goto kfd_gtt_out;
 	}
@@ -538,7 +527,7 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
 
 	} while (cur_size > 0);
 
-	pr_debug("kfd: range_start = %d, range_end = %d\n",
+	pr_debug("range_start = %d, range_end = %d\n",
 		(*mem_obj)->range_start, (*mem_obj)->range_end);
 
 	/* Mark the chunks as allocated */
@@ -552,7 +541,7 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
 	return 0;
 
 kfd_gtt_no_free_chunk:
-	pr_debug("kfd: allocation failed with mem_obj = %p\n", mem_obj);
+	pr_debug("Allocation failed with mem_obj = %p\n", mem_obj);
 	mutex_unlock(&kfd->gtt_sa_lock);
 	kfree(mem_obj);
 	return -ENOMEM;
@@ -568,7 +557,7 @@ int kfd_gtt_sa_free(struct kfd_dev *kfd, struct kfd_mem_obj *mem_obj)
 	if (!mem_obj)
 		return 0;
 
-	pr_debug("kfd: free mem_obj = %p, range_start = %d, range_end = %d\n",
+	pr_debug("Free mem_obj = %p, range_start = %d, range_end = %d\n",
 			mem_obj, mem_obj->range_start, mem_obj->range_end);
 
 	mutex_lock(&kfd->gtt_sa_lock);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 3b850da..8b147e4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -121,7 +121,7 @@ static int allocate_vmid(struct device_queue_manager *dqm,
 
 	/* Kaveri kfd vmid's starts from vmid 8 */
 	allocated_vmid = bit + KFD_VMID_START_OFFSET;
-	pr_debug("kfd: vmid allocation %d\n", allocated_vmid);
+	pr_debug("vmid allocation %d\n", allocated_vmid);
 	qpd->vmid = allocated_vmid;
 	q->properties.vmid = allocated_vmid;
 
@@ -154,13 +154,12 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
 
 	BUG_ON(!dqm || !q || !qpd || !allocated_vmid);
 
-	pr_debug("kfd: In func %s\n", __func__);
 	print_queue(q);
 
 	mutex_lock(&dqm->lock);
 
 	if (dqm->total_queue_count >= max_num_of_queues_per_device) {
-		pr_warn("amdkfd: Can't create new usermode queue because %d queues were already created\n",
+		pr_warn("Can't create new usermode queue because %d queues were already created\n",
 				dqm->total_queue_count);
 		mutex_unlock(&dqm->lock);
 		return -EPERM;
@@ -240,8 +239,7 @@ static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q)
 	if (!set)
 		return -EBUSY;
 
-	pr_debug("kfd: DQM %s hqd slot - pipe (%d) queue(%d)\n",
-				__func__, q->pipe, q->queue);
+	pr_debug("hqd slot - pipe %d, queue %d\n", q->pipe, q->queue);
 	/* horizontal hqd allocation */
 	dqm->next_pipe_to_allocate = (pipe + 1) % get_pipes_per_mec(dqm);
 
@@ -278,9 +276,8 @@ static int create_compute_queue_nocpsch(struct device_queue_manager *dqm,
 		return retval;
 	}
 
-	pr_debug("kfd: loading mqd to hqd on pipe (%d) queue (%d)\n",
-			q->pipe,
-			q->queue);
+	pr_debug("Loading mqd to hqd on pipe %d, queue %d\n",
+			q->pipe, q->queue);
 
 	retval = mqd->load_mqd(mqd, q->mqd, q->pipe,
 			q->queue, (uint32_t __user *) q->properties.write_ptr);
@@ -304,8 +301,6 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
 
 	retval = 0;
 
-	pr_debug("kfd: In Func %s\n", __func__);
-
 	mutex_lock(&dqm->lock);
 
 	if (q->properties.type == KFD_QUEUE_TYPE_COMPUTE) {
@@ -324,7 +319,7 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
 		dqm->sdma_queue_count--;
 		deallocate_sdma_queue(dqm, q->sdma_id);
 	} else {
-		pr_debug("q->properties.type is invalid (%d)\n",
+		pr_debug("q->properties.type %d is invalid\n",
 				q->properties.type);
 		retval = -EINVAL;
 		goto out;
@@ -403,13 +398,13 @@ static struct mqd_manager *get_mqd_manager_nocpsch(
 
 	BUG_ON(!dqm || type >= KFD_MQD_TYPE_MAX);
 
-	pr_debug("kfd: In func %s mqd type %d\n", __func__, type);
+	pr_debug("mqd type %d\n", type);
 
 	mqd = dqm->mqds[type];
 	if (!mqd) {
 		mqd = mqd_manager_init(type, dqm->dev);
 		if (mqd == NULL)
-			pr_err("kfd: mqd manager is NULL");
+			pr_err("mqd manager is NULL");
 		dqm->mqds[type] = mqd;
 	}
 
@@ -424,8 +419,6 @@ static int register_process_nocpsch(struct device_queue_manager *dqm,
 
 	BUG_ON(!dqm || !qpd);
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	n = kzalloc(sizeof(struct device_process_node), GFP_KERNEL);
 	if (!n)
 		return -ENOMEM;
@@ -452,8 +445,6 @@ static int unregister_process_nocpsch(struct device_queue_manager *dqm,
 
 	BUG_ON(!dqm || !qpd);
 
-	pr_debug("In func %s\n", __func__);
-
 	pr_debug("qpd->queues_list is %s\n",
 			list_empty(&qpd->queues_list) ? "empty" : "not empty");
 
@@ -501,25 +492,13 @@ static void init_interrupts(struct device_queue_manager *dqm)
 			dqm->dev->kfd2kgd->init_interrupts(dqm->dev->kgd, i);
 }
 
-static int init_scheduler(struct device_queue_manager *dqm)
-{
-	int retval = 0;
-
-	BUG_ON(!dqm);
-
-	pr_debug("kfd: In %s\n", __func__);
-
-	return retval;
-}
-
 static int initialize_nocpsch(struct device_queue_manager *dqm)
 {
 	int pipe, queue;
 
 	BUG_ON(!dqm);
 
-	pr_debug("kfd: In func %s num of pipes: %d\n",
-			__func__, get_pipes_per_mec(dqm));
+	pr_debug("num of pipes: %d\n", get_pipes_per_mec(dqm));
 
 	mutex_init(&dqm->lock);
 	INIT_LIST_HEAD(&dqm->queues);
@@ -544,7 +523,6 @@ static int initialize_nocpsch(struct device_queue_manager *dqm)
 	dqm->vmid_bitmap = (1 << VMID_PER_DEVICE) - 1;
 	dqm->sdma_bitmap = (1 << CIK_SDMA_QUEUES) - 1;
 
-	init_scheduler(dqm);
 	return 0;
 }
 
@@ -617,9 +595,9 @@ static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
 	q->properties.sdma_queue_id = q->sdma_id % CIK_SDMA_QUEUES_PER_ENGINE;
 	q->properties.sdma_engine_id = q->sdma_id / CIK_SDMA_ENGINE_NUM;
 
-	pr_debug("kfd: sdma id is:    %d\n", q->sdma_id);
-	pr_debug("     sdma queue id: %d\n", q->properties.sdma_queue_id);
-	pr_debug("     sdma engine id: %d\n", q->properties.sdma_engine_id);
+	pr_debug("SDMA id is:    %d\n", q->sdma_id);
+	pr_debug("SDMA queue id: %d\n", q->properties.sdma_queue_id);
+	pr_debug("SDMA engine id: %d\n", q->properties.sdma_engine_id);
 
 	dqm->ops_asic_specific.init_sdma_vm(dqm, q, qpd);
 	retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
@@ -651,8 +629,6 @@ static int set_sched_resources(struct device_queue_manager *dqm)
 
 	BUG_ON(!dqm);
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	res.vmid_mask = (1 << VMID_PER_DEVICE) - 1;
 	res.vmid_mask <<= KFD_VMID_START_OFFSET;
 
@@ -682,9 +658,9 @@ static int set_sched_resources(struct device_queue_manager *dqm)
 	res.gws_mask = res.oac_mask = res.gds_heap_base =
 						res.gds_heap_size = 0;
 
-	pr_debug("kfd: scheduling resources:\n"
-			"      vmid mask: 0x%8X\n"
-			"      queue mask: 0x%8llX\n",
+	pr_debug("Scheduling resources:\n"
+			"vmid mask: 0x%8X\n"
+			"queue mask: 0x%8llX\n",
 			res.vmid_mask, res.queue_mask);
 
 	return pm_send_set_resources(&dqm->packets, &res);
@@ -696,8 +672,7 @@ static int initialize_cpsch(struct device_queue_manager *dqm)
 
 	BUG_ON(!dqm);
 
-	pr_debug("kfd: In func %s num of pipes: %d\n",
-			__func__, get_pipes_per_mec(dqm));
+	pr_debug("num of pipes: %d\n", get_pipes_per_mec(dqm));
 
 	mutex_init(&dqm->lock);
 	INIT_LIST_HEAD(&dqm->queues);
@@ -732,7 +707,7 @@ static int start_cpsch(struct device_queue_manager *dqm)
 	if (retval != 0)
 		goto fail_set_sched_resources;
 
-	pr_debug("kfd: allocating fence memory\n");
+	pr_debug("Allocating fence memory\n");
 
 	/* allocate fence memory on the gart */
 	retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(*dqm->fence_addr),
@@ -786,11 +761,9 @@ static int create_kernel_queue_cpsch(struct device_queue_manager *dqm,
 {
 	BUG_ON(!dqm || !kq || !qpd);
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	mutex_lock(&dqm->lock);
 	if (dqm->total_queue_count >= max_num_of_queues_per_device) {
-		pr_warn("amdkfd: Can't create new kernel queue because %d queues were already created\n",
+		pr_warn("Can't create new kernel queue because %d queues were already created\n",
 				dqm->total_queue_count);
 		mutex_unlock(&dqm->lock);
 		return -EPERM;
@@ -819,8 +792,6 @@ static void destroy_kernel_queue_cpsch(struct device_queue_manager *dqm,
 {
 	BUG_ON(!dqm || !kq);
 
-	pr_debug("kfd: In %s\n", __func__);
-
 	mutex_lock(&dqm->lock);
 	/* here we actually preempt the DIQ */
 	destroy_queues_cpsch(dqm, true, false);
@@ -862,7 +833,7 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 	mutex_lock(&dqm->lock);
 
 	if (dqm->total_queue_count >= max_num_of_queues_per_device) {
-		pr_warn("amdkfd: Can't create new usermode queue because %d queues were already created\n",
+		pr_warn("Can't create new usermode queue because %d queues were already created\n",
 				dqm->total_queue_count);
 		retval = -EPERM;
 		goto out;
@@ -916,7 +887,7 @@ int amdkfd_fence_wait_timeout(unsigned int *fence_addr,
 
 	while (*fence_addr != fence_value) {
 		if (time_after(jiffies, timeout)) {
-			pr_err("kfd: qcm fence wait loop timeout expired\n");
+			pr_err("qcm fence wait loop timeout expired\n");
 			return -ETIME;
 		}
 		schedule();
@@ -949,7 +920,7 @@ static int destroy_queues_cpsch(struct device_queue_manager *dqm,
 	if (!dqm->active_runlist)
 		goto out;
 
-	pr_debug("kfd: Before destroying queues, sdma queue count is : %u\n",
+	pr_debug("Before destroying queues, sdma queue count is : %u\n",
 		dqm->sdma_queue_count);
 
 	if (dqm->sdma_queue_count > 0) {
@@ -998,7 +969,7 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm, bool lock)
 
 	retval = destroy_queues_cpsch(dqm, false, false);
 	if (retval != 0) {
-		pr_err("kfd: the cp might be in an unrecoverable state due to an unsuccessful queues preemption");
+		pr_err("The cp might be in an unrecoverable state due to an unsuccessful queues preemption");
 		goto out;
 	}
 
@@ -1014,7 +985,7 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm, bool lock)
 
 	retval = pm_send_runlist(&dqm->packets, &dqm->queues);
 	if (retval != 0) {
-		pr_err("kfd: failed to execute runlist");
+		pr_err("failed to execute runlist");
 		goto out;
 	}
 	dqm->active_runlist = true;
@@ -1106,8 +1077,6 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
 {
 	bool retval;
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	mutex_lock(&dqm->lock);
 
 	if (alternate_aperture_size == 0) {
@@ -1152,7 +1121,7 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
 	if ((sched_policy == KFD_SCHED_POLICY_NO_HWS) && (qpd->vmid != 0))
 		program_sh_mem_settings(dqm, qpd);
 
-	pr_debug("kfd: sh_mem_config: 0x%x, ape1_base: 0x%x, ape1_limit: 0x%x\n",
+	pr_debug("sh_mem_config: 0x%x, ape1_base: 0x%x, ape1_limit: 0x%x\n",
 		qpd->sh_mem_config, qpd->sh_mem_ape1_base,
 		qpd->sh_mem_ape1_limit);
 
@@ -1170,7 +1139,7 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
 
 	BUG_ON(!dev);
 
-	pr_debug("kfd: loading device queue manager\n");
+	pr_debug("Loading device queue manager\n");
 
 	dqm = kzalloc(sizeof(struct device_queue_manager), GFP_KERNEL);
 	if (!dqm)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
index 48dc056..a263e2a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
@@ -127,7 +127,7 @@ static int register_process_cik(struct device_queue_manager *dqm,
 		qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
 	}
 
-	pr_debug("kfd: is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
+	pr_debug("is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
 		qpd->pqm->process->is_32bit_user_mode, temp, qpd->sh_mem_bases);
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
index 7e9cae9..8c45c86 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
@@ -139,7 +139,7 @@ static int register_process_vi(struct device_queue_manager *dqm,
 			SH_MEM_CONFIG__ADDRESS_MODE__SHIFT;
 	}
 
-	pr_debug("kfd: is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
+	pr_debug("is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
 		qpd->pqm->process->is_32bit_user_mode, temp, qpd->sh_mem_bases);
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index 453c5d6..ca21538 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -97,23 +97,23 @@ void kfd_doorbell_init(struct kfd_dev *kfd)
 
 	BUG_ON(!kfd->doorbell_kernel_ptr);
 
-	pr_debug("kfd: doorbell initialization:\n");
-	pr_debug("kfd: doorbell base           == 0x%08lX\n",
+	pr_debug("Doorbell initialization:\n");
+	pr_debug("doorbell base           == 0x%08lX\n",
 			(uintptr_t)kfd->doorbell_base);
 
-	pr_debug("kfd: doorbell_id_offset      == 0x%08lX\n",
+	pr_debug("doorbell_id_offset      == 0x%08lX\n",
 			kfd->doorbell_id_offset);
 
-	pr_debug("kfd: doorbell_process_limit  == 0x%08lX\n",
+	pr_debug("doorbell_process_limit  == 0x%08lX\n",
 			doorbell_process_limit);
 
-	pr_debug("kfd: doorbell_kernel_offset  == 0x%08lX\n",
+	pr_debug("doorbell_kernel_offset  == 0x%08lX\n",
 			(uintptr_t)kfd->doorbell_base);
 
-	pr_debug("kfd: doorbell aperture size  == 0x%08lX\n",
+	pr_debug("doorbell aperture size  == 0x%08lX\n",
 			kfd->shared_resources.doorbell_aperture_size);
 
-	pr_debug("kfd: doorbell kernel address == 0x%08lX\n",
+	pr_debug("doorbell kernel address == 0x%08lX\n",
 			(uintptr_t)kfd->doorbell_kernel_ptr);
 }
 
@@ -142,12 +142,11 @@ int kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma)
 
 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
-	pr_debug("kfd: mapping doorbell page in %s\n"
+	pr_debug("Mapping doorbell page\n"
 		 "     target user address == 0x%08llX\n"
 		 "     physical address    == 0x%08llX\n"
 		 "     vm_flags            == 0x%04lX\n"
 		 "     size                == 0x%04lX\n",
-		 __func__,
 		 (unsigned long long) vma->vm_start, address, vma->vm_flags,
 		 doorbell_process_allocation());
 
@@ -185,7 +184,7 @@ u32 __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
 	*doorbell_off = KERNEL_DOORBELL_PASID * (doorbell_process_allocation() /
 							sizeof(u32)) + inx;
 
-	pr_debug("kfd: get kernel queue doorbell\n"
+	pr_debug("Get kernel queue doorbell\n"
 			 "     doorbell offset   == 0x%08X\n"
 			 "     kernel address    == 0x%08lX\n",
 		*doorbell_off, (uintptr_t)(kfd->doorbell_kernel_ptr + inx));
@@ -210,7 +209,7 @@ inline void write_kernel_doorbell(u32 __iomem *db, u32 value)
 {
 	if (db) {
 		writel(value, db);
-		pr_debug("writing %d to doorbell address 0x%p\n", value, db);
+		pr_debug("Writing %d to doorbell address 0x%p\n", value, db);
 	}
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index d8b9b3c..abdaf95 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -110,7 +110,7 @@ static bool allocate_free_slot(struct kfd_process *process,
 			*out_page = page;
 			*out_slot_index = slot;
 
-			pr_debug("allocated event signal slot in page %p, slot %d\n",
+			pr_debug("Allocated event signal slot in page %p, slot %d\n",
 					page, slot);
 
 			return true;
@@ -155,9 +155,9 @@ static bool allocate_signal_page(struct file *devkfd, struct kfd_process *p)
 						   struct signal_page,
 						   event_pages)->page_index + 1;
 
-	pr_debug("allocated new event signal page at %p, for process %p\n",
+	pr_debug("Allocated new event signal page at %p, for process %p\n",
 			page, p);
-	pr_debug("page index is %d\n", page->page_index);
+	pr_debug("Page index is %d\n", page->page_index);
 
 	list_add(&page->event_pages, &p->signal_event_pages);
 
@@ -292,13 +292,13 @@ static int create_signal_event(struct file *devkfd,
 				struct kfd_event *ev)
 {
 	if (p->signal_event_count == KFD_SIGNAL_EVENT_LIMIT) {
-		pr_warn("amdkfd: Signal event wasn't created because limit was reached\n");
+		pr_warn("Signal event wasn't created because limit was reached\n");
 		return -ENOMEM;
 	}
 
 	if (!allocate_event_notification_slot(devkfd, p, &ev->signal_page,
 						&ev->signal_slot_index)) {
-		pr_warn("amdkfd: Signal event wasn't created because out of kernel memory\n");
+		pr_warn("Signal event wasn't created because out of kernel memory\n");
 		return -ENOMEM;
 	}
 
@@ -310,11 +310,7 @@ static int create_signal_event(struct file *devkfd,
 	ev->event_id = make_signal_event_id(ev->signal_page,
 						ev->signal_slot_index);
 
-	pr_debug("signal event number %zu created with id %d, address %p\n",
-			p->signal_event_count, ev->event_id,
-			ev->user_signal_address);
-
-	pr_debug("signal event number %zu created with id %d, address %p\n",
+	pr_debug("Signal event number %zu created with id %d, address %p\n",
 			p->signal_event_count, ev->event_id,
 			ev->user_signal_address);
 
@@ -817,7 +813,7 @@ int kfd_event_mmap(struct kfd_process *p, struct vm_area_struct *vma)
 	/* check required size is logical */
 	if (get_order(KFD_SIGNAL_EVENT_LIMIT * 8) !=
 			get_order(vma->vm_end - vma->vm_start)) {
-		pr_err("amdkfd: event page mmap requested illegal size\n");
+		pr_err("Event page mmap requested illegal size\n");
 		return -EINVAL;
 	}
 
@@ -826,7 +822,7 @@ int kfd_event_mmap(struct kfd_process *p, struct vm_area_struct *vma)
 	page = lookup_signal_page_by_index(p, page_index);
 	if (!page) {
 		/* Probably KFD bug, but mmap is user-accessible. */
-		pr_debug("signal page could not be found for page_index %u\n",
+		pr_debug("Signal page could not be found for page_index %u\n",
 				page_index);
 		return -EINVAL;
 	}
@@ -837,7 +833,7 @@ int kfd_event_mmap(struct kfd_process *p, struct vm_area_struct *vma)
 	vma->vm_flags |= VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE
 		       | VM_DONTDUMP | VM_PFNMAP;
 
-	pr_debug("mapping signal page\n");
+	pr_debug("Mapping signal page\n");
 	pr_debug("     start user address  == 0x%08lx\n", vma->vm_start);
 	pr_debug("     end user address    == 0x%08lx\n", vma->vm_end);
 	pr_debug("     pfn                 == 0x%016lX\n", pfn);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
index d135cd0..f89d366 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
@@ -44,8 +44,8 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 	BUG_ON(!kq || !dev);
 	BUG_ON(type != KFD_QUEUE_TYPE_DIQ && type != KFD_QUEUE_TYPE_HIQ);
 
-	pr_debug("amdkfd: In func %s initializing queue type %d size %d\n",
-			__func__, KFD_QUEUE_TYPE_HIQ, queue_size);
+	pr_debug("Initializing queue type %d size %d\n", KFD_QUEUE_TYPE_HIQ,
+			queue_size);
 
 	memset(&prop, 0, sizeof(prop));
 	memset(&nop, 0, sizeof(nop));
@@ -73,13 +73,13 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 	prop.doorbell_ptr = kfd_get_kernel_doorbell(dev, &prop.doorbell_off);
 
 	if (prop.doorbell_ptr == NULL) {
-		pr_err("amdkfd: error init doorbell");
+		pr_err("Failed to initialize doorbell");
 		goto err_get_kernel_doorbell;
 	}
 
 	retval = kfd_gtt_sa_allocate(dev, queue_size, &kq->pq);
 	if (retval != 0) {
-		pr_err("amdkfd: error init pq queues size (%d)\n", queue_size);
+		pr_err("Failed to init pq queues size %d\n", queue_size);
 		goto err_pq_allocate_vidmem;
 	}
 
@@ -139,7 +139,7 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 
 	/* assign HIQ to HQD */
 	if (type == KFD_QUEUE_TYPE_HIQ) {
-		pr_debug("assigning hiq to hqd\n");
+		pr_debug("Assigning hiq to hqd\n");
 		kq->queue->pipe = KFD_CIK_HIQ_PIPE;
 		kq->queue->queue = KFD_CIK_HIQ_QUEUE;
 		kq->mqd->load_mqd(kq->mqd, kq->queue->mqd, kq->queue->pipe,
@@ -304,7 +304,7 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
 	}
 
 	if (!kq->ops.initialize(kq, dev, type, KFD_KERNEL_QUEUE_SIZE)) {
-		pr_err("amdkfd: failed to init kernel queue\n");
+		pr_err("Failed to init kernel queue\n");
 		kfree(kq);
 		return NULL;
 	}
@@ -327,7 +327,7 @@ static __attribute__((unused)) void test_kq(struct kfd_dev *dev)
 
 	BUG_ON(!dev);
 
-	pr_err("amdkfd: starting kernel queue test\n");
+	pr_err("Starting kernel queue test\n");
 
 	kq = kernel_queue_init(dev, KFD_QUEUE_TYPE_HIQ);
 	BUG_ON(!kq);
@@ -338,7 +338,7 @@ static __attribute__((unused)) void test_kq(struct kfd_dev *dev)
 		buffer[i] = kq->nop_packet;
 	kq->ops.submit_packet(kq);
 
-	pr_err("amdkfd: ending kernel queue test\n");
+	pr_err("Ending kernel queue test\n");
 }
 
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index af5bfc1..819a442 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -91,7 +91,7 @@ static int __init kfd_module_init(void)
 	/* Verify module parameters */
 	if ((sched_policy < KFD_SCHED_POLICY_HWS) ||
 		(sched_policy > KFD_SCHED_POLICY_NO_HWS)) {
-		pr_err("kfd: sched_policy has invalid value\n");
+		pr_err("sched_policy has invalid value\n");
 		return -1;
 	}
 
@@ -99,7 +99,7 @@ static int __init kfd_module_init(void)
 	if ((max_num_of_queues_per_device < 1) ||
 		(max_num_of_queues_per_device >
 			KFD_MAX_NUM_OF_QUEUES_PER_DEVICE)) {
-		pr_err("kfd: max_num_of_queues_per_device must be between 1 to KFD_MAX_NUM_OF_QUEUES_PER_DEVICE\n");
+		pr_err("max_num_of_queues_per_device must be between 1 to KFD_MAX_NUM_OF_QUEUES_PER_DEVICE\n");
 		return -1;
 	}
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index ac59229..27fd930 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -46,8 +46,6 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
 
 	BUG_ON(!mm || !q || !mqd);
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	retval = kfd_gtt_sa_allocate(mm->dev, sizeof(struct cik_mqd),
 					mqd_mem_obj);
 
@@ -172,8 +170,6 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
 
 	BUG_ON(!mm || !q || !mqd);
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	m = get_mqd(mqd);
 	m->cp_hqd_pq_control = DEFAULT_RPTR_BLOCK_SIZE |
 				DEFAULT_MIN_AVAIL_SIZE | PQ_ATC_EN;
@@ -302,8 +298,6 @@ static int init_mqd_hiq(struct mqd_manager *mm, void **mqd,
 
 	BUG_ON(!mm || !q || !mqd || !mqd_mem_obj);
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	retval = kfd_gtt_sa_allocate(mm->dev, sizeof(struct cik_mqd),
 					mqd_mem_obj);
 
@@ -360,8 +354,6 @@ static int update_mqd_hiq(struct mqd_manager *mm, void *mqd,
 
 	BUG_ON(!mm || !q || !mqd);
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	m = get_mqd(mqd);
 	m->cp_hqd_pq_control = DEFAULT_RPTR_BLOCK_SIZE |
 				DEFAULT_MIN_AVAIL_SIZE |
@@ -414,8 +406,6 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
 	BUG_ON(!dev);
 	BUG_ON(type >= KFD_MQD_TYPE_MAX);
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	mqd = kzalloc(sizeof(struct mqd_manager), GFP_KERNEL);
 	if (!mqd)
 		return NULL;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index a9b9882..5dc30f5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
@@ -108,8 +108,6 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
 
 	BUG_ON(!mm || !q || !mqd);
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	m = get_mqd(mqd);
 
 	m->cp_hqd_pq_control = 5 << CP_HQD_PQ_CONTROL__RPTR_BLOCK_SIZE__SHIFT |
@@ -117,7 +115,7 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
 			mtype << CP_HQD_PQ_CONTROL__MTYPE__SHIFT;
 	m->cp_hqd_pq_control |=
 			ffs(q->queue_size / sizeof(unsigned int)) - 1 - 1;
-	pr_debug("kfd: cp_hqd_pq_control 0x%x\n", m->cp_hqd_pq_control);
+	pr_debug("cp_hqd_pq_control 0x%x\n", m->cp_hqd_pq_control);
 
 	m->cp_hqd_pq_base_lo = lower_32_bits((uint64_t)q->queue_address >> 8);
 	m->cp_hqd_pq_base_hi = upper_32_bits((uint64_t)q->queue_address >> 8);
@@ -129,7 +127,7 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
 		1 << CP_HQD_PQ_DOORBELL_CONTROL__DOORBELL_EN__SHIFT |
 		q->doorbell_off <<
 			CP_HQD_PQ_DOORBELL_CONTROL__DOORBELL_OFFSET__SHIFT;
-	pr_debug("kfd: cp_hqd_pq_doorbell_control 0x%x\n",
+	pr_debug("cp_hqd_pq_doorbell_control 0x%x\n",
 			m->cp_hqd_pq_doorbell_control);
 
 	m->cp_hqd_eop_control = atc_bit << CP_HQD_EOP_CONTROL__EOP_ATC__SHIFT |
@@ -241,8 +239,6 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
 	BUG_ON(!dev);
 	BUG_ON(type >= KFD_MQD_TYPE_MAX);
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	mqd = kzalloc(sizeof(struct mqd_manager), GFP_KERNEL);
 	if (!mqd)
 		return NULL;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
index 99c11a4..31d7d46 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -67,7 +67,7 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
 	*over_subscription = false;
 	if ((process_count > 1) || queue_count > get_queues_num(pm->dqm)) {
 		*over_subscription = true;
-		pr_debug("kfd: over subscribed runlist\n");
+		pr_debug("Over subscribed runlist\n");
 	}
 
 	map_queue_size =
@@ -85,7 +85,7 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
 	if (*over_subscription)
 		*rlib_size += sizeof(struct pm4_runlist);
 
-	pr_debug("kfd: runlist ib size %d\n", *rlib_size);
+	pr_debug("runlist ib size %d\n", *rlib_size);
 }
 
 static int pm_allocate_runlist_ib(struct packet_manager *pm,
@@ -106,7 +106,7 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
 					&pm->ib_buffer_obj);
 
 	if (retval != 0) {
-		pr_err("kfd: failed to allocate runlist IB\n");
+		pr_err("Failed to allocate runlist IB\n");
 		return retval;
 	}
 
@@ -152,8 +152,6 @@ static int pm_create_map_process(struct packet_manager *pm, uint32_t *buffer,
 
 	packet = (struct pm4_map_process *)buffer;
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	memset(buffer, 0, sizeof(struct pm4_map_process));
 
 	packet->header.u32all = build_pm4_header(IT_MAP_PROCESS,
@@ -189,8 +187,6 @@ static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
 
 	BUG_ON(!pm || !buffer || !q);
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	packet = (struct pm4_mes_map_queues *)buffer;
 	memset(buffer, 0, sizeof(struct pm4_map_queues));
 
@@ -223,8 +219,7 @@ static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
 		use_static = false; /* no static queues under SDMA */
 		break;
 	default:
-		pr_err("kfd: in %s queue type %d\n", __func__,
-				q->properties.type);
+		pr_err("queue type %d\n", q->properties.type);
 		BUG();
 		break;
 	}
@@ -254,8 +249,6 @@ static int pm_create_map_queue(struct packet_manager *pm, uint32_t *buffer,
 
 	BUG_ON(!pm || !buffer || !q);
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	packet = (struct pm4_map_queues *)buffer;
 	memset(buffer, 0, sizeof(struct pm4_map_queues));
 
@@ -333,8 +326,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 
 	*rl_size_bytes = alloc_size_bytes;
 
-	pr_debug("kfd: In func %s\n", __func__);
-	pr_debug("kfd: building runlist ib process count: %d queues count %d\n",
+	pr_debug("Building runlist ib process count: %d queues count %d\n",
 		pm->dqm->processes_count, pm->dqm->queue_count);
 
 	/* build the run list ib packet */
@@ -342,7 +334,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 		qpd = cur->qpd;
 		/* build map process packet */
 		if (proccesses_mapped >= pm->dqm->processes_count) {
-			pr_debug("kfd: not enough space left in runlist IB\n");
+			pr_debug("Not enough space left in runlist IB\n");
 			pm_release_ib(pm);
 			return -ENOMEM;
 		}
@@ -359,7 +351,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 			if (!kq->queue->properties.is_active)
 				continue;
 
-			pr_debug("kfd: static_queue, mapping kernel q %d, is debug status %d\n",
+			pr_debug("static_queue, mapping kernel q %d, is debug status %d\n",
 				kq->queue->queue, qpd->is_debug);
 
 			if (pm->dqm->dev->device_info->asic_family ==
@@ -385,7 +377,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 			if (!q->properties.is_active)
 				continue;
 
-			pr_debug("kfd: static_queue, mapping user queue %d, is debug status %d\n",
+			pr_debug("static_queue, mapping user queue %d, is debug status %d\n",
 				q->queue, qpd->is_debug);
 
 			if (pm->dqm->dev->device_info->asic_family ==
@@ -409,7 +401,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 		}
 	}
 
-	pr_debug("kfd: finished map process and queues to runlist\n");
+	pr_debug("Finished map process and queues to runlist\n");
 
 	if (is_over_subscription)
 		pm_create_runlist(pm, &rl_buffer[rl_wptr], *rl_gpu_addr,
@@ -453,15 +445,13 @@ int pm_send_set_resources(struct packet_manager *pm,
 
 	BUG_ON(!pm || !res);
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	mutex_lock(&pm->lock);
 	pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
 					sizeof(*packet) / sizeof(uint32_t),
 					(unsigned int **)&packet);
 	if (packet == NULL) {
 		mutex_unlock(&pm->lock);
-		pr_err("kfd: failed to allocate buffer on kernel queue\n");
+		pr_err("Failed to allocate buffer on kernel queue\n");
 		return -ENOMEM;
 	}
 
@@ -504,7 +494,7 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
 	if (retval != 0)
 		goto fail_create_runlist_ib;
 
-	pr_debug("kfd: runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
+	pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
 
 	packet_size_dwords = sizeof(struct pm4_runlist) / sizeof(uint32_t);
 	mutex_lock(&pm->lock);
@@ -595,7 +585,7 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 
 	packet = (struct pm4_unmap_queues *)buffer;
 	memset(buffer, 0, sizeof(struct pm4_unmap_queues));
-	pr_debug("kfd: static_queue: unmapping queues: mode is %d , reset is %d , type is %d\n",
+	pr_debug("static_queue: unmapping queues: mode is %d , reset is %d , type is %d\n",
 		mode, reset, type);
 	packet->header.u32all = build_pm4_header(IT_UNMAP_QUEUES,
 					sizeof(struct pm4_unmap_queues));
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index a4e4a2d..86032bd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -101,7 +101,7 @@ struct kfd_process *kfd_create_process(const struct task_struct *thread)
 	/* A prior open of /dev/kfd could have already created the process. */
 	process = find_process(thread);
 	if (process)
-		pr_debug("kfd: process already found\n");
+		pr_debug("Process already found\n");
 
 	if (!process)
 		process = create_process(thread);
@@ -250,7 +250,7 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
 			kfd_dbgmgr_destroy(pdd->dev->dbgmgr);
 
 		if (pdd->reset_wavefronts) {
-			pr_warn("amdkfd: Resetting all wave fronts\n");
+			pr_warn("Resetting all wave fronts\n");
 			dbgdev_wave_reset_wavefronts(pdd->dev, p);
 			pdd->reset_wavefronts = false;
 		}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 32cdf2b..9482a5a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -51,15 +51,13 @@ static int find_available_queue_slot(struct process_queue_manager *pqm,
 
 	BUG_ON(!pqm || !qid);
 
-	pr_debug("kfd: in %s\n", __func__);
-
 	found = find_first_zero_bit(pqm->queue_slot_bitmap,
 			KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
 
-	pr_debug("kfd: the new slot id %lu\n", found);
+	pr_debug("The new slot id %lu\n", found);
 
 	if (found >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS) {
-		pr_info("amdkfd: Can not open more queues for process with pasid %d\n",
+		pr_info("Cannot open more queues for process with pasid %d\n",
 				pqm->process->pasid);
 		return -ENOMEM;
 	}
@@ -92,8 +90,6 @@ void pqm_uninit(struct process_queue_manager *pqm)
 
 	BUG_ON(!pqm);
 
-	pr_debug("In func %s\n", __func__);
-
 	list_for_each_entry_safe(pqn, next, &pqm->queues, process_queue_list) {
 		retval = pqm_destroy_queue(
 				pqm,
@@ -102,7 +98,7 @@ void pqm_uninit(struct process_queue_manager *pqm)
 					pqn->kq->queue->properties.queue_id);
 
 		if (retval != 0) {
-			pr_err("kfd: failed to destroy queue\n");
+			pr_err("failed to destroy queue\n");
 			return;
 		}
 	}
@@ -136,7 +132,7 @@ static int create_cp_queue(struct process_queue_manager *pqm,
 	(*q)->device = dev;
 	(*q)->process = pqm->process;
 
-	pr_debug("kfd: PQM After init queue");
+	pr_debug("PQM After init queue");
 
 	return retval;
 
@@ -210,7 +206,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 		if ((sched_policy == KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION) &&
 		((dev->dqm->processes_count >= VMID_PER_DEVICE) ||
 		(dev->dqm->queue_count >= get_queues_num(dev->dqm)))) {
-			pr_err("kfd: over-subscription is not allowed in radeon_kfd.sched_policy == 1\n");
+			pr_err("Over-subscription is not allowed in radeon_kfd.sched_policy == 1\n");
 			retval = -EPERM;
 			goto err_create_queue;
 		}
@@ -243,17 +239,17 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 	}
 
 	if (retval != 0) {
-		pr_debug("Error dqm create queue\n");
+		pr_err("DQM create queue failed\n");
 		goto err_create_queue;
 	}
 
-	pr_debug("kfd: PQM After DQM create queue\n");
+	pr_debug("PQM After DQM create queue\n");
 
 	list_add(&pqn->process_queue_list, &pqm->queues);
 
 	if (q) {
 		*properties = q->properties;
-		pr_debug("kfd: PQM done creating queue\n");
+		pr_debug("PQM done creating queue\n");
 		print_queue_properties(properties);
 	}
 
@@ -282,11 +278,9 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
 	BUG_ON(!pqm);
 	retval = 0;
 
-	pr_debug("kfd: In Func %s\n", __func__);
-
 	pqn = get_queue_by_qid(pqm, qid);
 	if (pqn == NULL) {
-		pr_err("kfd: queue id does not match any known queue\n");
+		pr_err("Queue id does not match any known queue\n");
 		return -EINVAL;
 	}
 
@@ -339,8 +333,7 @@ int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid,
 
 	pqn = get_queue_by_qid(pqm, qid);
 	if (!pqn) {
-		pr_debug("amdkfd: No queue %d exists for update operation\n",
-				qid);
+		pr_debug("No queue %d exists for update operation\n", qid);
 		return -EFAULT;
 	}
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 0200dae..72d566a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -666,7 +666,7 @@ static ssize_t node_show(struct kobject *kobj, struct attribute *attr,
 			dev->node_props.simd_count);
 
 	if (dev->mem_bank_count < dev->node_props.mem_banks_count) {
-		pr_info_once("kfd: mem_banks_count truncated from %d to %d\n",
+		pr_info_once("mem_banks_count truncated from %d to %d\n",
 				dev->node_props.mem_banks_count,
 				dev->mem_bank_count);
 		sysfs_show_32bit_prop(buffer, "mem_banks_count",
@@ -1147,7 +1147,7 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
 
 	gpu_id = kfd_generate_gpu_id(gpu);
 
-	pr_debug("kfd: Adding new GPU (ID: 0x%x) to topology\n", gpu_id);
+	pr_debug("Adding new GPU (ID: 0x%x) to topology\n", gpu_id);
 
 	down_write(&topology_lock);
 	/*
@@ -1190,7 +1190,7 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
 
 	if (dev->gpu->device_info->asic_family == CHIP_CARRIZO) {
 		dev->node_props.capability |= HSA_CAP_DOORBELL_PACKET_TYPE;
-		pr_info("amdkfd: adding doorbell packet type capability\n");
+		pr_info("Adding doorbell packet type capability\n");
 	}
 
 	res = 0;
-- 
2.7.4


* [PATCH 07/19] drm/amdkfd: Change x==NULL/false references to !x
       [not found] ` <1502488589-30272-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (5 preceding siblings ...)
  2017-08-11 21:56   ` [PATCH 06/19] drm/amdkfd: Consolidate and clean up log commands Felix Kuehling
@ 2017-08-11 21:56   ` Felix Kuehling
       [not found]     ` <1502488589-30272-8-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-11 21:56   ` [PATCH 08/19] drm/amdkfd: Fix goto usage Felix Kuehling
                     ` (12 subsequent siblings)
  19 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling, Kent Russell

From: Kent Russell <kent.russell@amd.com>

Upstream prefers the !x notation over x == NULL or x == false. Along
those lines, change the x == true and x != NULL references as well.
Also make the existing !x references consistent by dropping the
redundant parentheses around them, for readability.
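
As a minimal illustration of the conversions (the check() helper below
is hypothetical and not part of the KFD code, shown only for the style):

    #include <stdbool.h>

    /* Hypothetical helper; each test shows the preferred form. */
    int check(const void *dev, int retval, bool ready)
    {
            if (!dev)       /* was: dev == NULL */
                    return -1;
            if (retval)     /* was: retval != 0 */
                    return retval;
            if (!ready)     /* was: ready == false, without the extra () */
                    return -2;
            return 0;
    }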

Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           | 22 +++++-----
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            | 20 ++++-----
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 10 ++---
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 50 +++++++++++-----------
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  6 +--
 drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c       |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |  6 +--
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 26 +++++------
 drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  6 +--
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  6 +--
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  6 +--
 15 files changed, 85 insertions(+), 87 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 6244958..c22401e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -265,7 +265,7 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
 
 	pr_debug("Looking for gpu id 0x%x\n", args->gpu_id);
 	dev = kfd_device_by_id(args->gpu_id);
-	if (dev == NULL) {
+	if (!dev) {
 		pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
 		return -EINVAL;
 	}
@@ -400,7 +400,7 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
 	}
 
 	dev = kfd_device_by_id(args->gpu_id);
-	if (dev == NULL)
+	if (!dev)
 		return -EINVAL;
 
 	mutex_lock(&p->mutex);
@@ -443,7 +443,7 @@ static int kfd_ioctl_dbg_register(struct file *filep,
 	long status = 0;
 
 	dev = kfd_device_by_id(args->gpu_id);
-	if (dev == NULL)
+	if (!dev)
 		return -EINVAL;
 
 	if (dev->device_info->asic_family == CHIP_CARRIZO) {
@@ -465,7 +465,7 @@ static int kfd_ioctl_dbg_register(struct file *filep,
 		return PTR_ERR(pdd);
 	}
 
-	if (dev->dbgmgr == NULL) {
+	if (!dev->dbgmgr) {
 		/* In case of a legal call, we have no dbgmgr yet */
 		create_ok = kfd_dbgmgr_create(&dbgmgr_ptr, dev);
 		if (create_ok) {
@@ -494,7 +494,7 @@ static int kfd_ioctl_dbg_unregister(struct file *filep,
 	long status;
 
 	dev = kfd_device_by_id(args->gpu_id);
-	if (dev == NULL)
+	if (!dev)
 		return -EINVAL;
 
 	if (dev->device_info->asic_family == CHIP_CARRIZO) {
@@ -505,7 +505,7 @@ static int kfd_ioctl_dbg_unregister(struct file *filep,
 	mutex_lock(kfd_get_dbgmgr_mutex());
 
 	status = kfd_dbgmgr_unregister(dev->dbgmgr, p);
-	if (status == 0) {
+	if (!status) {
 		kfd_dbgmgr_destroy(dev->dbgmgr);
 		dev->dbgmgr = NULL;
 	}
@@ -539,7 +539,7 @@ static int kfd_ioctl_dbg_address_watch(struct file *filep,
 	memset((void *) &aw_info, 0, sizeof(struct dbg_address_watch_info));
 
 	dev = kfd_device_by_id(args->gpu_id);
-	if (dev == NULL)
+	if (!dev)
 		return -EINVAL;
 
 	if (dev->device_info->asic_family == CHIP_CARRIZO) {
@@ -646,7 +646,7 @@ static int kfd_ioctl_dbg_wave_control(struct file *filep,
 				sizeof(wac_info.trapId);
 
 	dev = kfd_device_by_id(args->gpu_id);
-	if (dev == NULL)
+	if (!dev)
 		return -EINVAL;
 
 	if (dev->device_info->asic_family == CHIP_CARRIZO) {
@@ -782,9 +782,9 @@ static int kfd_ioctl_get_process_apertures(struct file *filp,
 				"scratch_limit %llX\n", pdd->scratch_limit);
 
 			args->num_of_nodes++;
-		} while ((pdd = kfd_get_next_process_device_data(p, pdd)) !=
-				NULL &&
-				(args->num_of_nodes < NUM_OF_SUPPORTED_GPUS));
+
+			pdd = kfd_get_next_process_device_data(p, pdd);
+		} while (pdd && (args->num_of_nodes < NUM_OF_SUPPORTED_GPUS));
 	}
 
 	mutex_unlock(&p->mutex);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index bf8ee19..0ef9136 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -77,7 +77,7 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
 	status = kq->ops.acquire_packet_buffer(kq,
 				pq_packets_size_in_bytes / sizeof(uint32_t),
 				&ib_packet_buff);
-	if (status != 0) {
+	if (status) {
 		pr_err("acquire_packet_buffer failed\n");
 		return status;
 	}
@@ -115,7 +115,7 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
 	status = kfd_gtt_sa_allocate(dbgdev->dev, sizeof(uint64_t),
 					&mem_obj);
 
-	if (status != 0) {
+	if (status) {
 		pr_err("Failed to allocate GART memory\n");
 		kq->ops.rollback_packet(kq);
 		return status;
@@ -202,7 +202,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
 
 	kq = pqm_get_kernel_queue(dbgdev->pqm, qid);
 
-	if (kq == NULL) {
+	if (!kq) {
 		pr_err("Error getting DIQ\n");
 		pqm_destroy_queue(dbgdev->pqm, qid);
 		return -EFAULT;
@@ -252,7 +252,7 @@ static void dbgdev_address_watch_set_registers(
 	addrLo->u32All = 0;
 	cntl->u32All = 0;
 
-	if (adw_info->watch_mask != NULL)
+	if (adw_info->watch_mask)
 		cntl->bitfields.mask =
 			(uint32_t) (adw_info->watch_mask[index] &
 					ADDRESS_WATCH_REG_CNTL_DEFAULT_MASK);
@@ -307,8 +307,7 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
 		return -EINVAL;
 	}
 
-	if ((adw_info->watch_mode == NULL) ||
-		(adw_info->watch_address == NULL)) {
+	if (!adw_info->watch_mode || !adw_info->watch_address) {
 		pr_err("adw_info fields are not valid\n");
 		return -EINVAL;
 	}
@@ -375,15 +374,14 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 		return -EINVAL;
 	}
 
-	if ((NULL == adw_info->watch_mode) ||
-			(NULL == adw_info->watch_address)) {
+	if (!adw_info->watch_mode || !adw_info->watch_address) {
 		pr_err("adw_info fields are not valid\n");
 		return -EINVAL;
 	}
 
 	status = kfd_gtt_sa_allocate(dbgdev->dev, ib_size, &mem_obj);
 
-	if (status != 0) {
+	if (status) {
 		pr_err("Failed to allocate GART memory\n");
 		return status;
 	}
@@ -490,7 +488,7 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 					packet_buff_uint,
 					ib_size);
 
-		if (status != 0) {
+		if (status) {
 			pr_err("Failed to submit IB to DIQ\n");
 			break;
 		}
@@ -711,7 +709,7 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
 			packet_buff_uint,
 			ib_size);
 
-	if (status != 0)
+	if (status)
 		pr_err("Failed to submit IB to DIQ\n");
 
 	kfd_gtt_sa_free(dbgdev->dev, mem_obj);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
index 7225789..210bdc1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
@@ -55,7 +55,7 @@ static void kfd_dbgmgr_uninitialize(struct kfd_dbgmgr *pmgr)
 
 void kfd_dbgmgr_destroy(struct kfd_dbgmgr *pmgr)
 {
-	if (pmgr != NULL) {
+	if (pmgr) {
 		kfd_dbgmgr_uninitialize(pmgr);
 		kfree(pmgr);
 	}
@@ -66,7 +66,7 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
 	enum DBGDEV_TYPE type = DBGDEV_TYPE_DIQ;
 	struct kfd_dbgmgr *new_buff;
 
-	BUG_ON(pdev == NULL);
+	BUG_ON(!pdev);
 	BUG_ON(!pdev->init_complete);
 
 	new_buff = kfd_alloc_struct(new_buff);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 87df8bf..d962342 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -98,7 +98,7 @@ static const struct kfd_device_info *lookup_device_info(unsigned short did)
 
 	for (i = 0; i < ARRAY_SIZE(supported_devices); i++) {
 		if (supported_devices[i].did == did) {
-			BUG_ON(supported_devices[i].device_info == NULL);
+			BUG_ON(!supported_devices[i].device_info);
 			return supported_devices[i].device_info;
 		}
 	}
@@ -212,7 +212,7 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
 			flags);
 
 	dev = kfd_device_by_pci_dev(pdev);
-	BUG_ON(dev == NULL);
+	BUG_ON(!dev);
 
 	kfd_signal_iommu_event(dev, pasid, address,
 			flags & PPR_FAULT_WRITE, flags & PPR_FAULT_EXEC);
@@ -262,7 +262,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 
 	kfd_doorbell_init(kfd);
 
-	if (kfd_topology_add_device(kfd) != 0) {
+	if (kfd_topology_add_device(kfd)) {
 		dev_err(kfd_device, "Error adding device to topology\n");
 		goto kfd_topology_add_device_error;
 	}
@@ -288,7 +288,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 		goto device_queue_manager_error;
 	}
 
-	if (kfd->dqm->ops.start(kfd->dqm) != 0) {
+	if (kfd->dqm->ops.start(kfd->dqm)) {
 		dev_err(kfd_device,
 			"Error starting queue manager for device %x:%x\n",
 			kfd->pdev->vendor, kfd->pdev->device);
@@ -341,7 +341,7 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
 
 void kgd2kfd_suspend(struct kfd_dev *kfd)
 {
-	BUG_ON(kfd == NULL);
+	BUG_ON(!kfd);
 
 	if (kfd->init_complete) {
 		kfd->dqm->ops.stop(kfd->dqm);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 8b147e4..df93531 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -167,7 +167,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
 
 	if (list_empty(&qpd->queues_list)) {
 		retval = allocate_vmid(dqm, qpd, q);
-		if (retval != 0) {
+		if (retval) {
 			mutex_unlock(&dqm->lock);
 			return retval;
 		}
@@ -180,7 +180,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
 	if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
 		retval = create_sdma_queue_nocpsch(dqm, q, qpd);
 
-	if (retval != 0) {
+	if (retval) {
 		if (list_empty(&qpd->queues_list)) {
 			deallocate_vmid(dqm, qpd, q);
 			*allocated_vmid = 0;
@@ -262,16 +262,16 @@ static int create_compute_queue_nocpsch(struct device_queue_manager *dqm,
 	BUG_ON(!dqm || !q || !qpd);
 
 	mqd = dqm->ops.get_mqd_manager(dqm, KFD_MQD_TYPE_COMPUTE);
-	if (mqd == NULL)
+	if (!mqd)
 		return -ENOMEM;
 
 	retval = allocate_hqd(dqm, q);
-	if (retval != 0)
+	if (retval)
 		return retval;
 
 	retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
 				&q->gart_mqd_addr, &q->properties);
-	if (retval != 0) {
+	if (retval) {
 		deallocate_hqd(dqm, q);
 		return retval;
 	}
@@ -281,7 +281,7 @@ static int create_compute_queue_nocpsch(struct device_queue_manager *dqm,
 
 	retval = mqd->load_mqd(mqd, q->mqd, q->pipe,
 			q->queue, (uint32_t __user *) q->properties.write_ptr);
-	if (retval != 0) {
+	if (retval) {
 		deallocate_hqd(dqm, q);
 		mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
 		return retval;
@@ -330,7 +330,7 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
 				QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS,
 				q->pipe, q->queue);
 
-	if (retval != 0)
+	if (retval)
 		goto out;
 
 	mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
@@ -365,7 +365,7 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
 	mutex_lock(&dqm->lock);
 	mqd = dqm->ops.get_mqd_manager(dqm,
 			get_mqd_type_from_queue_type(q->properties.type));
-	if (mqd == NULL) {
+	if (!mqd) {
 		mutex_unlock(&dqm->lock);
 		return -ENOMEM;
 	}
@@ -381,7 +381,7 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
 	retval = mqd->update_mqd(mqd, q->mqd, &q->properties);
 	if ((q->properties.is_active) && (!prev_active))
 		dqm->queue_count++;
-	else if ((!q->properties.is_active) && (prev_active))
+	else if (!q->properties.is_active && prev_active)
 		dqm->queue_count--;
 
 	if (sched_policy != KFD_SCHED_POLICY_NO_HWS)
@@ -403,7 +403,7 @@ static struct mqd_manager *get_mqd_manager_nocpsch(
 	mqd = dqm->mqds[type];
 	if (!mqd) {
 		mqd = mqd_manager_init(type, dqm->dev);
-		if (mqd == NULL)
+		if (!mqd)
 			pr_err("mqd manager is NULL");
 		dqm->mqds[type] = mqd;
 	}
@@ -485,7 +485,7 @@ static void init_interrupts(struct device_queue_manager *dqm)
 {
 	unsigned int i;
 
-	BUG_ON(dqm == NULL);
+	BUG_ON(!dqm);
 
 	for (i = 0 ; i < get_pipes_per_mec(dqm) ; i++)
 		if (is_pipe_enabled(dqm, 0, i))
@@ -589,7 +589,7 @@ static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
 		return -ENOMEM;
 
 	retval = allocate_sdma_queue(dqm, &q->sdma_id);
-	if (retval != 0)
+	if (retval)
 		return retval;
 
 	q->properties.sdma_queue_id = q->sdma_id % CIK_SDMA_QUEUES_PER_ENGINE;
@@ -602,14 +602,14 @@ static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
 	dqm->ops_asic_specific.init_sdma_vm(dqm, q, qpd);
 	retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
 				&q->gart_mqd_addr, &q->properties);
-	if (retval != 0) {
+	if (retval) {
 		deallocate_sdma_queue(dqm, q->sdma_id);
 		return retval;
 	}
 
 	retval = mqd->load_mqd(mqd, q->mqd, 0,
 				0, NULL);
-	if (retval != 0) {
+	if (retval) {
 		deallocate_sdma_queue(dqm, q->sdma_id);
 		mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
 		return retval;
@@ -680,7 +680,7 @@ static int initialize_cpsch(struct device_queue_manager *dqm)
 	dqm->sdma_queue_count = 0;
 	dqm->active_runlist = false;
 	retval = dqm->ops_asic_specific.initialize(dqm);
-	if (retval != 0)
+	if (retval)
 		goto fail_init_pipelines;
 
 	return 0;
@@ -700,11 +700,11 @@ static int start_cpsch(struct device_queue_manager *dqm)
 	retval = 0;
 
 	retval = pm_init(&dqm->packets, dqm);
-	if (retval != 0)
+	if (retval)
 		goto fail_packet_manager_init;
 
 	retval = set_sched_resources(dqm);
-	if (retval != 0)
+	if (retval)
 		goto fail_set_sched_resources;
 
 	pr_debug("Allocating fence memory\n");
@@ -713,7 +713,7 @@ static int start_cpsch(struct device_queue_manager *dqm)
 	retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(*dqm->fence_addr),
 					&dqm->fence_mem);
 
-	if (retval != 0)
+	if (retval)
 		goto fail_allocate_vidmem;
 
 	dqm->fence_addr = dqm->fence_mem->cpu_ptr;
@@ -845,7 +845,7 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 	mqd = dqm->ops.get_mqd_manager(dqm,
 			get_mqd_type_from_queue_type(q->properties.type));
 
-	if (mqd == NULL) {
+	if (!mqd) {
 		mutex_unlock(&dqm->lock);
 		return -ENOMEM;
 	}
@@ -853,7 +853,7 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 	dqm->ops_asic_specific.init_sdma_vm(dqm, q, qpd);
 	retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
 				&q->gart_mqd_addr, &q->properties);
-	if (retval != 0)
+	if (retval)
 		goto out;
 
 	list_add(&q->list, &qpd->queues_list);
@@ -934,7 +934,7 @@ static int destroy_queues_cpsch(struct device_queue_manager *dqm,
 
 	retval = pm_send_unmap_queue(&dqm->packets, KFD_QUEUE_TYPE_COMPUTE,
 			preempt_type, 0, false, 0);
-	if (retval != 0)
+	if (retval)
 		goto out;
 
 	*dqm->fence_addr = KFD_FENCE_INIT;
@@ -943,7 +943,7 @@ static int destroy_queues_cpsch(struct device_queue_manager *dqm,
 	/* should be timed out */
 	retval = amdkfd_fence_wait_timeout(dqm->fence_addr, KFD_FENCE_COMPLETED,
 				QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS);
-	if (retval != 0) {
+	if (retval) {
 		pdd = kfd_get_process_device_data(dqm->dev,
 				kfd_get_process(current));
 		pdd->reset_wavefronts = true;
@@ -968,7 +968,7 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm, bool lock)
 		mutex_lock(&dqm->lock);
 
 	retval = destroy_queues_cpsch(dqm, false, false);
-	if (retval != 0) {
+	if (retval) {
 		pr_err("The cp might be in an unrecoverable state due to an unsuccessful queues preemption");
 		goto out;
 	}
@@ -984,7 +984,7 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm, bool lock)
 	}
 
 	retval = pm_send_runlist(&dqm->packets, &dqm->queues);
-	if (retval != 0) {
+	if (retval) {
 		pr_err("failed to execute runlist");
 		goto out;
 	}
@@ -1193,7 +1193,7 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
 		break;
 	}
 
-	if (dqm->ops.initialize(dqm) != 0) {
+	if (dqm->ops.initialize(dqm)) {
 		kfree(dqm);
 		return NULL;
 	}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index ca21538..48018a3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -131,7 +131,7 @@ int kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma)
 
 	/* Find kfd device according to gpu id */
 	dev = kfd_device_by_id(vma->vm_pgoff);
-	if (dev == NULL)
+	if (!dev)
 		return -EINVAL;
 
 	/* Calculate physical address of doorbell */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index abdaf95..5979158 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -247,7 +247,7 @@ static u32 make_nonsignal_event_id(struct kfd_process *p)
 
 	for (id = p->next_nonsignal_event_id;
 		id < KFD_LAST_NONSIGNAL_EVENT_ID &&
-		lookup_event_by_id(p, id) != NULL;
+		lookup_event_by_id(p, id);
 		id++)
 		;
 
@@ -266,7 +266,7 @@ static u32 make_nonsignal_event_id(struct kfd_process *p)
 
 	for (id = KFD_FIRST_NONSIGNAL_EVENT_ID;
 		id < KFD_LAST_NONSIGNAL_EVENT_ID &&
-		lookup_event_by_id(p, id) != NULL;
+		lookup_event_by_id(p, id);
 		id++)
 		;
 
@@ -342,7 +342,7 @@ void kfd_event_init_process(struct kfd_process *p)
 
 static void destroy_event(struct kfd_process *p, struct kfd_event *ev)
 {
-	if (ev->signal_page != NULL) {
+	if (ev->signal_page) {
 		release_event_notification_slot(ev->signal_page,
 						ev->signal_slot_index);
 		p->signal_event_count--;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
index 2b65510..c59384b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
@@ -304,7 +304,7 @@ int kfd_init_apertures(struct kfd_process *process)
 		id < NUM_OF_SUPPORTED_GPUS) {
 
 		pdd = kfd_create_process_device_data(dev, process);
-		if (pdd == NULL) {
+		if (!pdd) {
 			pr_err("Failed to create process device data\n");
 			return -1;
 		}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
index f89d366..8844798 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
@@ -67,12 +67,12 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 		break;
 	}
 
-	if (kq->mqd == NULL)
+	if (!kq->mqd)
 		return false;
 
 	prop.doorbell_ptr = kfd_get_kernel_doorbell(dev, &prop.doorbell_off);
 
-	if (prop.doorbell_ptr == NULL) {
+	if (!prop.doorbell_ptr) {
 		pr_err("Failed to initialize doorbell");
 		goto err_get_kernel_doorbell;
 	}
@@ -87,7 +87,7 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 	kq->pq_gpu_addr = kq->pq->gpu_addr;
 
 	retval = kq->ops_asic_specific.initialize(kq, dev, type, queue_size);
-	if (retval == false)
+	if (!retval)
 		goto err_eop_allocate_vidmem;
 
 	retval = kfd_gtt_sa_allocate(dev, sizeof(*kq->rptr_kernel),
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index 27fd930..9908227 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -99,7 +99,7 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
 		m->cp_hqd_iq_rptr = AQL_ENABLE;
 
 	*mqd = m;
-	if (gart_addr != NULL)
+	if (gart_addr)
 		*gart_addr = addr;
 	retval = mm->update_mqd(mm, m, q);
 
@@ -127,7 +127,7 @@ static int init_mqd_sdma(struct mqd_manager *mm, void **mqd,
 	memset(m, 0, sizeof(struct cik_sdma_rlc_registers));
 
 	*mqd = m;
-	if (gart_addr != NULL)
+	if (gart_addr)
 		*gart_addr = (*mqd_mem_obj)->gpu_addr;
 
 	retval = mm->update_mqd(mm, m, q);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index 5dc30f5..5ba3b40 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
@@ -85,7 +85,7 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
 		m->cp_hqd_iq_rptr = 1;
 
 	*mqd = m;
-	if (gart_addr != NULL)
+	if (gart_addr)
 		*gart_addr = addr;
 	retval = mm->update_mqd(mm, m, q);
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
index 31d7d46..f3b8cc8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -98,14 +98,14 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
 
 	BUG_ON(!pm);
 	BUG_ON(pm->allocated);
-	BUG_ON(is_over_subscription == NULL);
+	BUG_ON(!is_over_subscription);
 
 	pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
 
 	retval = kfd_gtt_sa_allocate(pm->dqm->dev, *rl_buffer_size,
 					&pm->ib_buffer_obj);
 
-	if (retval != 0) {
+	if (retval) {
 		pr_err("Failed to allocate runlist IB\n");
 		return retval;
 	}
@@ -321,7 +321,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 
 	retval = pm_allocate_runlist_ib(pm, &rl_buffer, rl_gpu_addr,
 				&alloc_size_bytes, &is_over_subscription);
-	if (retval != 0)
+	if (retval)
 		return retval;
 
 	*rl_size_bytes = alloc_size_bytes;
@@ -340,7 +340,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 		}
 
 		retval = pm_create_map_process(pm, &rl_buffer[rl_wptr], qpd);
-		if (retval != 0)
+		if (retval)
 			return retval;
 
 		proccesses_mapped++;
@@ -365,7 +365,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 						&rl_buffer[rl_wptr],
 						kq->queue,
 						qpd->is_debug);
-			if (retval != 0)
+			if (retval)
 				return retval;
 
 			inc_wptr(&rl_wptr,
@@ -392,7 +392,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 						q,
 						qpd->is_debug);
 
-			if (retval != 0)
+			if (retval)
 				return retval;
 
 			inc_wptr(&rl_wptr,
@@ -421,7 +421,7 @@ int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm)
 	pm->dqm = dqm;
 	mutex_init(&pm->lock);
 	pm->priv_queue = kernel_queue_init(dqm->dev, KFD_QUEUE_TYPE_HIQ);
-	if (pm->priv_queue == NULL) {
+	if (!pm->priv_queue) {
 		mutex_destroy(&pm->lock);
 		return -ENOMEM;
 	}
@@ -449,7 +449,7 @@ int pm_send_set_resources(struct packet_manager *pm,
 	pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
 					sizeof(*packet) / sizeof(uint32_t),
 					(unsigned int **)&packet);
-	if (packet == NULL) {
+	if (!packet) {
 		mutex_unlock(&pm->lock);
 		pr_err("Failed to allocate buffer on kernel queue\n");
 		return -ENOMEM;
@@ -491,7 +491,7 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
 
 	retval = pm_create_runlist_ib(pm, dqm_queues, &rl_gpu_ib_addr,
 					&rl_ib_size);
-	if (retval != 0)
+	if (retval)
 		goto fail_create_runlist_ib;
 
 	pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
@@ -501,12 +501,12 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
 
 	retval = pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
 					packet_size_dwords, &rl_buffer);
-	if (retval != 0)
+	if (retval)
 		goto fail_acquire_packet_buffer;
 
 	retval = pm_create_runlist(pm, rl_buffer, rl_gpu_ib_addr,
 					rl_ib_size / sizeof(uint32_t), false);
-	if (retval != 0)
+	if (retval)
 		goto fail_create_runlist;
 
 	pm->priv_queue->ops.submit_packet(pm->priv_queue);
@@ -537,7 +537,7 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
 			pm->priv_queue,
 			sizeof(struct pm4_query_status) / sizeof(uint32_t),
 			(unsigned int **)&packet);
-	if (retval != 0)
+	if (retval)
 		goto fail_acquire_packet_buffer;
 
 	packet->header.u32all = build_pm4_header(IT_QUERY_STATUS,
@@ -580,7 +580,7 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 			pm->priv_queue,
 			sizeof(struct pm4_unmap_queues) / sizeof(uint32_t),
 			&buffer);
-	if (retval != 0)
+	if (retval)
 		goto err_acquire_packet_buffer;
 
 	packet = (struct pm4_unmap_queues *)buffer;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 86032bd..d877cda 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -81,7 +81,7 @@ struct kfd_process *kfd_create_process(const struct task_struct *thread)
 
 	BUG_ON(!kfd_process_wq);
 
-	if (thread->mm == NULL)
+	if (!thread->mm)
 		return ERR_PTR(-EINVAL);
 
 	/* Only the pthreads threading model is supported. */
@@ -117,7 +117,7 @@ struct kfd_process *kfd_get_process(const struct task_struct *thread)
 {
 	struct kfd_process *process;
 
-	if (thread->mm == NULL)
+	if (!thread->mm)
 		return ERR_PTR(-EINVAL);
 
 	/* Only the pthreads threading model is supported. */
@@ -407,7 +407,7 @@ void kfd_unbind_process_from_device(struct kfd_dev *dev, unsigned int pasid)
 	struct kfd_process *p;
 	struct kfd_process_device *pdd;
 
-	BUG_ON(dev == NULL);
+	BUG_ON(!dev);
 
 	/*
 	 * Look for the process that matches the pasid. If there is no such
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 9482a5a..d4f8bae 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -76,7 +76,7 @@ int pqm_init(struct process_queue_manager *pqm, struct kfd_process *p)
 	pqm->queue_slot_bitmap =
 			kzalloc(DIV_ROUND_UP(KFD_MAX_NUM_OF_QUEUES_PER_PROCESS,
 					BITS_PER_BYTE), GFP_KERNEL);
-	if (pqm->queue_slot_bitmap == NULL)
+	if (!pqm->queue_slot_bitmap)
 		return -ENOMEM;
 	pqm->process = p;
 
@@ -223,7 +223,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 		break;
 	case KFD_QUEUE_TYPE_DIQ:
 		kq = kernel_queue_init(dev, KFD_QUEUE_TYPE_DIQ);
-		if (kq == NULL) {
+		if (!kq) {
 			retval = -ENOMEM;
 			goto err_create_queue;
 		}
@@ -279,7 +279,7 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
 	retval = 0;
 
 	pqn = get_queue_by_qid(pqm, qid);
-	if (pqn == NULL) {
+	if (!pqn) {
 		pr_err("Queue id does not match any known queue\n");
 		return -EINVAL;
 	}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 72d566a..113c1ce 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -416,7 +416,7 @@ static struct kfd_topology_device *kfd_create_topology_device(void)
 	struct kfd_topology_device *dev;
 
 	dev = kfd_alloc_struct(dev);
-	if (dev == NULL) {
+	if (!dev) {
 		pr_err("No memory to allocate a topology device");
 		return NULL;
 	}
@@ -957,7 +957,7 @@ static int kfd_topology_update_sysfs(void)
 	int ret;
 
 	pr_info("Creating topology SYSFS entries\n");
-	if (sys_props.kobj_topology == NULL) {
+	if (!sys_props.kobj_topology) {
 		sys_props.kobj_topology =
 				kfd_alloc_struct(sys_props.kobj_topology);
 		if (!sys_props.kobj_topology)
@@ -1120,7 +1120,7 @@ static struct kfd_topology_device *kfd_assign_gpu(struct kfd_dev *gpu)
 	BUG_ON(!gpu);
 
 	list_for_each_entry(dev, &topology_device_list, list)
-		if (dev->gpu == NULL && dev->node_props.simd_count > 0) {
+		if (!dev->gpu && (dev->node_props.simd_count > 0)) {
 			dev->gpu = gpu;
 			out_dev = dev;
 			break;
-- 
2.7.4


* [PATCH 08/19] drm/amdkfd: Fix goto usage
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling, Kent Russell

From: Kent Russell <kent.russell@amd.com>

Remove gotos that do not share any common cleanup, and use gotos
instead of repeating the cleanup code at each exit point.

According to kernel.org: "The goto statement comes in handy when a
function exits from multiple locations and some common work such as
cleanup has to be done. If there is no cleanup needed then just return
directly."

Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |  15 +--
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 102 ++++++++++-----------
 drivers/gpu/drm/amd/amdkfd/kfd_module.c            |   3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    |  14 +--
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  14 +--
 5 files changed, 65 insertions(+), 83 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index c22401e..7d78119 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -460,9 +460,8 @@ static int kfd_ioctl_dbg_register(struct file *filep,
 	 */
 	pdd = kfd_bind_process_to_device(dev, p);
 	if (IS_ERR(pdd)) {
-		mutex_unlock(kfd_get_dbgmgr_mutex());
-		mutex_unlock(&p->mutex);
-		return PTR_ERR(pdd);
+		status = PTR_ERR(pdd);
+		goto out;
 	}
 
 	if (!dev->dbgmgr) {
@@ -480,6 +479,7 @@ static int kfd_ioctl_dbg_register(struct file *filep,
 		status = -EINVAL;
 	}
 
+out:
 	mutex_unlock(kfd_get_dbgmgr_mutex());
 	mutex_unlock(&p->mutex);
 
@@ -580,8 +580,8 @@ static int kfd_ioctl_dbg_address_watch(struct file *filep,
 	args_idx += sizeof(aw_info.watch_address) * aw_info.num_watch_points;
 
 	if (args_idx >= args->buf_size_in_bytes - sizeof(*args)) {
-		kfree(args_buff);
-		return -EINVAL;
+		status = -EINVAL;
+		goto out;
 	}
 
 	watch_mask_value = (uint64_t) args_buff[args_idx];
@@ -604,8 +604,8 @@ static int kfd_ioctl_dbg_address_watch(struct file *filep,
 	}
 
 	if (args_idx >= args->buf_size_in_bytes - sizeof(args)) {
-		kfree(args_buff);
-		return -EINVAL;
+		status = -EINVAL;
+		goto out;
 	}
 
 	/* Currently HSA Event is not supported for DBG */
@@ -617,6 +617,7 @@ static int kfd_ioctl_dbg_address_watch(struct file *filep,
 
 	mutex_unlock(kfd_get_dbgmgr_mutex());
 
+out:
 	kfree(args_buff);
 
 	return status;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index df93531..2003a7e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -150,7 +150,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
 				struct qcm_process_device *qpd,
 				int *allocated_vmid)
 {
-	int retval;
+	int retval = 0;
 
 	BUG_ON(!dqm || !q || !qpd || !allocated_vmid);
 
@@ -161,23 +161,21 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
 	if (dqm->total_queue_count >= max_num_of_queues_per_device) {
 		pr_warn("Can't create new usermode queue because %d queues were already created\n",
 				dqm->total_queue_count);
-		mutex_unlock(&dqm->lock);
-		return -EPERM;
+		retval = -EPERM;
+		goto out_unlock;
 	}
 
 	if (list_empty(&qpd->queues_list)) {
 		retval = allocate_vmid(dqm, qpd, q);
-		if (retval) {
-			mutex_unlock(&dqm->lock);
-			return retval;
-		}
+		if (retval)
+			goto out_unlock;
 	}
 	*allocated_vmid = qpd->vmid;
 	q->properties.vmid = qpd->vmid;
 
 	if (q->properties.type == KFD_QUEUE_TYPE_COMPUTE)
 		retval = create_compute_queue_nocpsch(dqm, q, qpd);
-	if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
+	else if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
 		retval = create_sdma_queue_nocpsch(dqm, q, qpd);
 
 	if (retval) {
@@ -185,8 +183,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
 			deallocate_vmid(dqm, qpd, q);
 			*allocated_vmid = 0;
 		}
-		mutex_unlock(&dqm->lock);
-		return retval;
+		goto out_unlock;
 	}
 
 	list_add(&q->list, &qpd->queues_list);
@@ -204,8 +201,9 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
 	pr_debug("Total of %d queues are accountable so far\n",
 			dqm->total_queue_count);
 
+out_unlock:
 	mutex_unlock(&dqm->lock);
-	return 0;
+	return retval;
 }
 
 static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q)
@@ -271,23 +269,25 @@ static int create_compute_queue_nocpsch(struct device_queue_manager *dqm,
 
 	retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
 				&q->gart_mqd_addr, &q->properties);
-	if (retval) {
-		deallocate_hqd(dqm, q);
-		return retval;
-	}
+	if (retval)
+		goto out_deallocate_hqd;
 
 	pr_debug("Loading mqd to hqd on pipe %d, queue %d\n",
 			q->pipe, q->queue);
 
 	retval = mqd->load_mqd(mqd, q->mqd, q->pipe,
 			q->queue, (uint32_t __user *) q->properties.write_ptr);
-	if (retval) {
-		deallocate_hqd(dqm, q);
-		mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
-		return retval;
-	}
+	if (retval)
+		goto out_uninit_mqd;
 
 	return 0;
+
+out_uninit_mqd:
+	mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
+out_deallocate_hqd:
+	deallocate_hqd(dqm, q);
+
+	return retval;
 }
 
 static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
@@ -366,8 +366,8 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
 	mqd = dqm->ops.get_mqd_manager(dqm,
 			get_mqd_type_from_queue_type(q->properties.type));
 	if (!mqd) {
-		mutex_unlock(&dqm->lock);
-		return -ENOMEM;
+		retval = -ENOMEM;
+		goto out_unlock;
 	}
 
 	if (q->properties.is_active)
@@ -387,6 +387,7 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
 	if (sched_policy != KFD_SCHED_POLICY_NO_HWS)
 		retval = execute_queues_cpsch(dqm, false);
 
+out_unlock:
 	mutex_unlock(&dqm->lock);
 	return retval;
 }
@@ -500,16 +501,15 @@ static int initialize_nocpsch(struct device_queue_manager *dqm)
 
 	pr_debug("num of pipes: %d\n", get_pipes_per_mec(dqm));
 
+	dqm->allocated_queues = kcalloc(get_pipes_per_mec(dqm),
+					sizeof(unsigned int), GFP_KERNEL);
+	if (!dqm->allocated_queues)
+		return -ENOMEM;
+
 	mutex_init(&dqm->lock);
 	INIT_LIST_HEAD(&dqm->queues);
 	dqm->queue_count = dqm->next_pipe_to_allocate = 0;
 	dqm->sdma_queue_count = 0;
-	dqm->allocated_queues = kcalloc(get_pipes_per_mec(dqm),
-					sizeof(unsigned int), GFP_KERNEL);
-	if (!dqm->allocated_queues) {
-		mutex_destroy(&dqm->lock);
-		return -ENOMEM;
-	}
 
 	for (pipe = 0; pipe < get_pipes_per_mec(dqm); pipe++) {
 		int pipe_offset = pipe * get_queues_per_pipe(dqm);
@@ -602,20 +602,22 @@ static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
 	dqm->ops_asic_specific.init_sdma_vm(dqm, q, qpd);
 	retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
 				&q->gart_mqd_addr, &q->properties);
-	if (retval) {
-		deallocate_sdma_queue(dqm, q->sdma_id);
-		return retval;
-	}
+	if (retval)
+		goto out_deallocate_sdma_queue;
 
 	retval = mqd->load_mqd(mqd, q->mqd, 0,
 				0, NULL);
-	if (retval) {
-		deallocate_sdma_queue(dqm, q->sdma_id);
-		mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
-		return retval;
-	}
+	if (retval)
+		goto out_uninit_mqd;
 
 	return 0;
+
+out_uninit_mqd:
+	mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
+out_deallocate_sdma_queue:
+	deallocate_sdma_queue(dqm, q->sdma_id);
+
+	return retval;
 }
 
 /*
@@ -681,12 +683,8 @@ static int initialize_cpsch(struct device_queue_manager *dqm)
 	dqm->active_runlist = false;
 	retval = dqm->ops_asic_specific.initialize(dqm);
 	if (retval)
-		goto fail_init_pipelines;
-
-	return 0;
+		mutex_destroy(&dqm->lock);
 
-fail_init_pipelines:
-	mutex_destroy(&dqm->lock);
 	return retval;
 }
 
@@ -846,8 +844,8 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 			get_mqd_type_from_queue_type(q->properties.type));
 
 	if (!mqd) {
-		mutex_unlock(&dqm->lock);
-		return -ENOMEM;
+		retval = -ENOMEM;
+		goto out;
 	}
 
 	dqm->ops_asic_specific.init_sdma_vm(dqm, q, qpd);
@@ -1097,14 +1095,11 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
 		uint64_t base = (uintptr_t)alternate_aperture_base;
 		uint64_t limit = base + alternate_aperture_size - 1;
 
-		if (limit <= base)
-			goto out;
-
-		if ((base & APE1_FIXED_BITS_MASK) != 0)
-			goto out;
-
-		if ((limit & APE1_FIXED_BITS_MASK) != APE1_LIMIT_ALIGNMENT)
+		if (limit <= base || (base & APE1_FIXED_BITS_MASK) != 0 ||
+		   (limit & APE1_FIXED_BITS_MASK) != APE1_LIMIT_ALIGNMENT) {
+			retval = false;
 			goto out;
+		}
 
 		qpd->sh_mem_ape1_base = base >> 16;
 		qpd->sh_mem_ape1_limit = limit >> 16;
@@ -1125,12 +1120,9 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
 		qpd->sh_mem_config, qpd->sh_mem_ape1_base,
 		qpd->sh_mem_ape1_limit);
 
-	mutex_unlock(&dqm->lock);
-	return retval;
-
 out:
 	mutex_unlock(&dqm->lock);
-	return false;
+	return retval;
 }
 
 struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 819a442..0d73bea 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -105,7 +105,7 @@ static int __init kfd_module_init(void)
 
 	err = kfd_pasid_init();
 	if (err < 0)
-		goto err_pasid;
+		return err;
 
 	err = kfd_chardev_init();
 	if (err < 0)
@@ -127,7 +127,6 @@ static int __init kfd_module_init(void)
 	kfd_chardev_exit();
 err_ioctl:
 	kfd_pasid_exit();
-err_pasid:
 	return err;
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
index f3b8cc8..c4030b3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -442,6 +442,7 @@ int pm_send_set_resources(struct packet_manager *pm,
 				struct scheduling_resources *res)
 {
 	struct pm4_set_resources *packet;
+	int retval = 0;
 
 	BUG_ON(!pm || !res);
 
@@ -450,9 +451,9 @@ int pm_send_set_resources(struct packet_manager *pm,
 					sizeof(*packet) / sizeof(uint32_t),
 					(unsigned int **)&packet);
 	if (!packet) {
-		mutex_unlock(&pm->lock);
 		pr_err("Failed to allocate buffer on kernel queue\n");
-		return -ENOMEM;
+		retval = -ENOMEM;
+		goto out;
 	}
 
 	memset(packet, 0, sizeof(struct pm4_set_resources));
@@ -475,9 +476,10 @@ int pm_send_set_resources(struct packet_manager *pm,
 
 	pm->priv_queue->ops.submit_packet(pm->priv_queue);
 
+out:
 	mutex_unlock(&pm->lock);
 
-	return 0;
+	return retval;
 }
 
 int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
@@ -555,9 +557,6 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
 	packet->data_lo = lower_32_bits((uint64_t)fence_value);
 
 	pm->priv_queue->ops.submit_packet(pm->priv_queue);
-	mutex_unlock(&pm->lock);
-
-	return 0;
 
 fail_acquire_packet_buffer:
 	mutex_unlock(&pm->lock);
@@ -639,9 +638,6 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 
 	pm->priv_queue->ops.submit_packet(pm->priv_queue);
 
-	mutex_unlock(&pm->lock);
-	return 0;
-
 err_acquire_packet_buffer:
 	mutex_unlock(&pm->lock);
 	return retval;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index d4f8bae..8432f5f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -35,9 +35,8 @@ static inline struct process_queue_node *get_queue_by_qid(
 	BUG_ON(!pqm);
 
 	list_for_each_entry(pqn, &pqm->queues, process_queue_list) {
-		if (pqn->q && pqn->q->properties.queue_id == qid)
-			return pqn;
-		if (pqn->kq && pqn->kq->queue->properties.queue_id == qid)
+		if ((pqn->q && pqn->q->properties.queue_id == qid) ||
+		    (pqn->kq && pqn->kq->queue->properties.queue_id == qid))
 			return pqn;
 	}
 
@@ -113,8 +112,6 @@ static int create_cp_queue(struct process_queue_manager *pqm,
 {
 	int retval;
 
-	retval = 0;
-
 	/* Doorbell initialized in user space*/
 	q_properties->doorbell_ptr = NULL;
 
@@ -127,7 +124,7 @@ static int create_cp_queue(struct process_queue_manager *pqm,
 
 	retval = init_queue(q, q_properties);
 	if (retval != 0)
-		goto err_init_queue;
+		return retval;
 
 	(*q)->device = dev;
 	(*q)->process = pqm->process;
@@ -135,9 +132,6 @@ static int create_cp_queue(struct process_queue_manager *pqm,
 	pr_debug("PQM After init queue");
 
 	return retval;
-
-err_init_queue:
-	return retval;
 }
 
 int pqm_create_queue(struct process_queue_manager *pqm,
@@ -181,7 +175,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 		list_for_each_entry(cur, &pdd->qpd.queues_list, list)
 			num_queues++;
 		if (num_queues >= dev->device_info->max_no_of_hqd/2)
-			return (-ENOSPC);
+			return -ENOSPC;
 	}
 
 	retval = find_available_queue_slot(pqm, qid);
-- 
2.7.4


* [PATCH 09/19] drm/amdkfd: Remove usage of alloc(sizeof(struct...
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling, Kent Russell

From: Kent Russell <kent.russell@amd.com>

See https://kernel.org/doc/html/latest/process/coding-style.html
under "14) Allocating Memory" for the rationale behind replacing the
x = alloc(sizeof(struct ...)) style with x = alloc(sizeof(*x)).
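
As a minimal illustration of the idiom, with a hypothetical struct foo:

	struct foo *f;	/* hypothetical type, for illustration only */

	/* preferred: the size follows the variable even if its type changes */
	f = kzalloc(sizeof(*f), GFP_KERNEL);

	/* style being removed: repeats the type name and can go stale */
	f = kzalloc(sizeof(struct foo), GFP_KERNEL);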

Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  4 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c          |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c       |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c        |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_queue.c                 | 10 +++++-----
 6 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 2003a7e..68fe6ed 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -420,7 +420,7 @@ static int register_process_nocpsch(struct device_queue_manager *dqm,
 
 	BUG_ON(!dqm || !qpd);
 
-	n = kzalloc(sizeof(struct device_process_node), GFP_KERNEL);
+	n = kzalloc(sizeof(*n), GFP_KERNEL);
 	if (!n)
 		return -ENOMEM;
 
@@ -1133,7 +1133,7 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
 
 	pr_debug("Loading device queue manager\n");
 
-	dqm = kzalloc(sizeof(struct device_queue_manager), GFP_KERNEL);
+	dqm = kzalloc(sizeof(*dqm), GFP_KERNEL);
 	if (!dqm)
 		return NULL;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
index 8844798..47e2e8a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
@@ -283,7 +283,7 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
 
 	BUG_ON(!dev);
 
-	kq = kzalloc(sizeof(struct kernel_queue), GFP_KERNEL);
+	kq = kzalloc(sizeof(*kq), GFP_KERNEL);
 	if (!kq)
 		return NULL;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index 9908227..dca4fc7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -406,7 +406,7 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
 	BUG_ON(!dev);
 	BUG_ON(type >= KFD_MQD_TYPE_MAX);
 
-	mqd = kzalloc(sizeof(struct mqd_manager), GFP_KERNEL);
+	mqd = kzalloc(sizeof(*mqd), GFP_KERNEL);
 	if (!mqd)
 		return NULL;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index 5ba3b40..aaaa87a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
@@ -239,7 +239,7 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
 	BUG_ON(!dev);
 	BUG_ON(type >= KFD_MQD_TYPE_MAX);
 
-	mqd = kzalloc(sizeof(struct mqd_manager), GFP_KERNEL);
+	mqd = kzalloc(sizeof(*mqd), GFP_KERNEL);
 	if (!mqd)
 		return NULL;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 8432f5f..1d056a6 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -187,7 +187,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 		dev->dqm->ops.register_process(dev->dqm, &pdd->qpd);
 	}
 
-	pqn = kzalloc(sizeof(struct process_queue_node), GFP_KERNEL);
+	pqn = kzalloc(sizeof(*pqn), GFP_KERNEL);
 	if (!pqn) {
 		retval = -ENOMEM;
 		goto err_allocate_pqn;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
index 0ab1970..5ad9f6f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
@@ -65,17 +65,17 @@ void print_queue(struct queue *q)
 
 int init_queue(struct queue **q, const struct queue_properties *properties)
 {
-	struct queue *tmp;
+	struct queue *tmp_q;
 
 	BUG_ON(!q);
 
-	tmp = kzalloc(sizeof(struct queue), GFP_KERNEL);
-	if (!tmp)
+	tmp_q = kzalloc(sizeof(*tmp_q), GFP_KERNEL);
+	if (!tmp_q)
 		return -ENOMEM;
 
-	memcpy(&tmp->properties, properties, sizeof(struct queue_properties));
+	memcpy(&tmp_q->properties, properties, sizeof(*properties));
 
-	*q = tmp;
+	*q = tmp_q;
 	return 0;
 }
 
-- 
2.7.4


* [PATCH 10/19] drm/amdkfd: Remove BUG_ONs for NULL pointer arguments
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Remove BUG_ONs that check for NULL pointer arguments that are
dereferenced in the same function. Dereferencing the NULL pointer
will generate a BUG anyway, so the explicit check is redundant and
adds unnecessary overhead.
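
A minimal illustration, modelled on the first hunk below but using a
hypothetical function name:

/* Illustrative only; mirrors the dbgdev hunk below. */
static void foo_watch_disable(struct kfd_dev *dev)
{
	BUG_ON(!dev);	/* removed: a NULL dev faults on the next line anyway */
	dev->kfd2kgd->address_watch_disable(dev->kgd);
}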

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            | 26 +-----------
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            | 12 ------
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 10 -----
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 48 +---------------------
 .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  2 -
 .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  2 -
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          |  4 --
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      | 16 --------
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   | 17 --------
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  4 --
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 28 +------------
 drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  2 -
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 15 -------
 drivers/gpu/drm/amd/amdkfd/kfd_queue.c             |  2 -
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c          | 28 -------------
 15 files changed, 4 insertions(+), 212 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index 0ef9136..3841cad 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -42,8 +42,6 @@
 
 static void dbgdev_address_watch_disable_nodiq(struct kfd_dev *dev)
 {
-	BUG_ON(!dev || !dev->kfd2kgd);
-
 	dev->kfd2kgd->address_watch_disable(dev->kgd);
 }
 
@@ -62,7 +60,7 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
 	unsigned int *ib_packet_buff;
 	int status;
 
-	BUG_ON(!dbgdev || !dbgdev->kq || !packet_buff || !size_in_bytes);
+	BUG_ON(!size_in_bytes);
 
 	kq = dbgdev->kq;
 
@@ -168,8 +166,6 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
 
 static int dbgdev_register_nodiq(struct kfd_dbgdev *dbgdev)
 {
-	BUG_ON(!dbgdev);
-
 	/*
 	 * no action is needed in this case,
 	 * just make sure diq will not be used
@@ -187,8 +183,6 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
 	struct kernel_queue *kq = NULL;
 	int status;
 
-	BUG_ON(!dbgdev || !dbgdev->pqm || !dbgdev->dev);
-
 	status = pqm_create_queue(dbgdev->pqm, dbgdev->dev, NULL,
 				&properties, 0, KFD_QUEUE_TYPE_DIQ,
 				&qid);
@@ -215,8 +209,6 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
 
 static int dbgdev_unregister_nodiq(struct kfd_dbgdev *dbgdev)
 {
-	BUG_ON(!dbgdev || !dbgdev->dev);
-
 	/* disable watch address */
 	dbgdev_address_watch_disable_nodiq(dbgdev->dev);
 	return 0;
@@ -227,8 +219,6 @@ static int dbgdev_unregister_diq(struct kfd_dbgdev *dbgdev)
 	/* todo - disable address watch */
 	int status;
 
-	BUG_ON(!dbgdev || !dbgdev->pqm || !dbgdev->kq);
-
 	status = pqm_destroy_queue(dbgdev->pqm,
 			dbgdev->kq->queue->properties.queue_id);
 	dbgdev->kq = NULL;
@@ -245,8 +235,6 @@ static void dbgdev_address_watch_set_registers(
 {
 	union ULARGE_INTEGER addr;
 
-	BUG_ON(!adw_info || !addrHi || !addrLo || !cntl);
-
 	addr.quad_part = 0;
 	addrHi->u32All = 0;
 	addrLo->u32All = 0;
@@ -287,8 +275,6 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
 	struct kfd_process_device *pdd;
 	unsigned int i;
 
-	BUG_ON(!dbgdev || !dbgdev->dev || !adw_info);
-
 	/* taking the vmid for that process on the safe way using pdd */
 	pdd = kfd_get_process_device_data(dbgdev->dev,
 					adw_info->process);
@@ -362,8 +348,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 	/* we do not control the vmid in DIQ mode, just a place holder */
 	unsigned int vmid = 0;
 
-	BUG_ON(!dbgdev || !dbgdev->dev || !adw_info);
-
 	addrHi.u32All = 0;
 	addrLo.u32All = 0;
 	cntl.u32All = 0;
@@ -508,8 +492,6 @@ static int dbgdev_wave_control_set_registers(
 	union GRBM_GFX_INDEX_BITS reg_gfx_index;
 	struct HsaDbgWaveMsgAMDGen2 *pMsg;
 
-	BUG_ON(!wac_info || !in_reg_sq_cmd || !in_reg_gfx_index);
-
 	reg_sq_cmd.u32All = 0;
 	reg_gfx_index.u32All = 0;
 	pMsg = &wac_info->dbgWave_msg.DbgWaveMsg.WaveMsgInfoGen2;
@@ -610,8 +592,6 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
 	struct pm4__set_config_reg *packets_vec;
 	size_t ib_size = sizeof(struct pm4__set_config_reg) * 3;
 
-	BUG_ON(!dbgdev || !wac_info);
-
 	reg_sq_cmd.u32All = 0;
 
 	status = dbgdev_wave_control_set_registers(wac_info, &reg_sq_cmd,
@@ -725,8 +705,6 @@ static int dbgdev_wave_control_nodiq(struct kfd_dbgdev *dbgdev,
 	union GRBM_GFX_INDEX_BITS reg_gfx_index;
 	struct kfd_process_device *pdd;
 
-	BUG_ON(!dbgdev || !dbgdev->dev || !wac_info);
-
 	reg_sq_cmd.u32All = 0;
 
 	/* taking the VMID for that process on the safe way using PDD */
@@ -851,8 +829,6 @@ int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p)
 void kfd_dbgdev_init(struct kfd_dbgdev *pdbgdev, struct kfd_dev *pdev,
 			enum DBGDEV_TYPE type)
 {
-	BUG_ON(!pdbgdev || !pdev);
-
 	pdbgdev->dev = pdev;
 	pdbgdev->kq = NULL;
 	pdbgdev->type = type;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
index 210bdc1..2d5555c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
@@ -44,8 +44,6 @@ struct mutex *kfd_get_dbgmgr_mutex(void)
 
 static void kfd_dbgmgr_uninitialize(struct kfd_dbgmgr *pmgr)
 {
-	BUG_ON(!pmgr);
-
 	kfree(pmgr->dbgdev);
 
 	pmgr->dbgdev = NULL;
@@ -66,7 +64,6 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
 	enum DBGDEV_TYPE type = DBGDEV_TYPE_DIQ;
 	struct kfd_dbgmgr *new_buff;
 
-	BUG_ON(!pdev);
 	BUG_ON(!pdev->init_complete);
 
 	new_buff = kfd_alloc_struct(new_buff);
@@ -96,8 +93,6 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
 
 long kfd_dbgmgr_register(struct kfd_dbgmgr *pmgr, struct kfd_process *p)
 {
-	BUG_ON(!p || !pmgr || !pmgr->dbgdev);
-
 	if (pmgr->pasid != 0) {
 		pr_debug("H/W debugger is already active using pasid %d\n",
 				pmgr->pasid);
@@ -118,8 +113,6 @@ long kfd_dbgmgr_register(struct kfd_dbgmgr *pmgr, struct kfd_process *p)
 
 long kfd_dbgmgr_unregister(struct kfd_dbgmgr *pmgr, struct kfd_process *p)
 {
-	BUG_ON(!p || !pmgr || !pmgr->dbgdev);
-
 	/* Is the requests coming from the already registered process? */
 	if (pmgr->pasid != p->pasid) {
 		pr_debug("H/W debugger is not registered by calling pasid %d\n",
@@ -137,8 +130,6 @@ long kfd_dbgmgr_unregister(struct kfd_dbgmgr *pmgr, struct kfd_process *p)
 long kfd_dbgmgr_wave_control(struct kfd_dbgmgr *pmgr,
 				struct dbg_wave_control_info *wac_info)
 {
-	BUG_ON(!pmgr || !pmgr->dbgdev || !wac_info);
-
 	/* Is the requests coming from the already registered process? */
 	if (pmgr->pasid != wac_info->process->pasid) {
 		pr_debug("H/W debugger support was not registered for requester pasid %d\n",
@@ -152,9 +143,6 @@ long kfd_dbgmgr_wave_control(struct kfd_dbgmgr *pmgr,
 long kfd_dbgmgr_address_watch(struct kfd_dbgmgr *pmgr,
 				struct dbg_address_watch_info *adw_info)
 {
-	BUG_ON(!pmgr || !pmgr->dbgdev || !adw_info);
-
-
 	/* Is the requests coming from the already registered process? */
 	if (pmgr->pasid != adw_info->process->pasid) {
 		pr_debug("H/W debugger support was not registered for requester pasid %d\n",
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index d962342..e28e818 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -341,8 +341,6 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
 
 void kgd2kfd_suspend(struct kfd_dev *kfd)
 {
-	BUG_ON(!kfd);
-
 	if (kfd->init_complete) {
 		kfd->dqm->ops.stop(kfd->dqm);
 		amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
@@ -356,8 +354,6 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
 	unsigned int pasid_limit;
 	int err;
 
-	BUG_ON(kfd == NULL);
-
 	pasid_limit = kfd_get_pasid_limit();
 
 	if (kfd->init_complete) {
@@ -394,8 +390,6 @@ static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
 {
 	unsigned int num_of_bits;
 
-	BUG_ON(!kfd);
-	BUG_ON(!kfd->gtt_mem);
 	BUG_ON(buf_size < chunk_size);
 	BUG_ON(buf_size == 0);
 	BUG_ON(chunk_size == 0);
@@ -445,8 +439,6 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
 {
 	unsigned int found, start_search, cur_size;
 
-	BUG_ON(!kfd);
-
 	if (size == 0)
 		return -EINVAL;
 
@@ -551,8 +543,6 @@ int kfd_gtt_sa_free(struct kfd_dev *kfd, struct kfd_mem_obj *mem_obj)
 {
 	unsigned int bit;
 
-	BUG_ON(!kfd);
-
 	/* Act like kfree when trying to free a NULL object */
 	if (!mem_obj)
 		return 0;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 68fe6ed..43bc1b5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -79,20 +79,17 @@ static bool is_pipe_enabled(struct device_queue_manager *dqm, int mec, int pipe)
 
 unsigned int get_queues_num(struct device_queue_manager *dqm)
 {
-	BUG_ON(!dqm || !dqm->dev);
 	return bitmap_weight(dqm->dev->shared_resources.queue_bitmap,
 				KGD_MAX_QUEUES);
 }
 
 unsigned int get_queues_per_pipe(struct device_queue_manager *dqm)
 {
-	BUG_ON(!dqm || !dqm->dev);
 	return dqm->dev->shared_resources.num_queue_per_pipe;
 }
 
 unsigned int get_pipes_per_mec(struct device_queue_manager *dqm)
 {
-	BUG_ON(!dqm || !dqm->dev);
 	return dqm->dev->shared_resources.num_pipe_per_mec;
 }
 
@@ -152,8 +149,6 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
 {
 	int retval = 0;
 
-	BUG_ON(!dqm || !q || !qpd || !allocated_vmid);
-
 	print_queue(q);
 
 	mutex_lock(&dqm->lock);
@@ -257,8 +252,6 @@ static int create_compute_queue_nocpsch(struct device_queue_manager *dqm,
 	int retval;
 	struct mqd_manager *mqd;
 
-	BUG_ON(!dqm || !q || !qpd);
-
 	mqd = dqm->ops.get_mqd_manager(dqm, KFD_MQD_TYPE_COMPUTE);
 	if (!mqd)
 		return -ENOMEM;
@@ -297,8 +290,6 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
 	int retval;
 	struct mqd_manager *mqd;
 
-	BUG_ON(!dqm || !q || !q->mqd || !qpd);
-
 	retval = 0;
 
 	mutex_lock(&dqm->lock);
@@ -360,8 +351,6 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
 	struct mqd_manager *mqd;
 	bool prev_active = false;
 
-	BUG_ON(!dqm || !q || !q->mqd);
-
 	mutex_lock(&dqm->lock);
 	mqd = dqm->ops.get_mqd_manager(dqm,
 			get_mqd_type_from_queue_type(q->properties.type));
@@ -397,7 +386,7 @@ static struct mqd_manager *get_mqd_manager_nocpsch(
 {
 	struct mqd_manager *mqd;
 
-	BUG_ON(!dqm || type >= KFD_MQD_TYPE_MAX);
+	BUG_ON(type >= KFD_MQD_TYPE_MAX);
 
 	pr_debug("mqd type %d\n", type);
 
@@ -418,8 +407,6 @@ static int register_process_nocpsch(struct device_queue_manager *dqm,
 	struct device_process_node *n;
 	int retval;
 
-	BUG_ON(!dqm || !qpd);
-
 	n = kzalloc(sizeof(*n), GFP_KERNEL);
 	if (!n)
 		return -ENOMEM;
@@ -444,8 +431,6 @@ static int unregister_process_nocpsch(struct device_queue_manager *dqm,
 	int retval;
 	struct device_process_node *cur, *next;
 
-	BUG_ON(!dqm || !qpd);
-
 	pr_debug("qpd->queues_list is %s\n",
 			list_empty(&qpd->queues_list) ? "empty" : "not empty");
 
@@ -486,8 +471,6 @@ static void init_interrupts(struct device_queue_manager *dqm)
 {
 	unsigned int i;
 
-	BUG_ON(!dqm);
-
 	for (i = 0 ; i < get_pipes_per_mec(dqm) ; i++)
 		if (is_pipe_enabled(dqm, 0, i))
 			dqm->dev->kfd2kgd->init_interrupts(dqm->dev->kgd, i);
@@ -497,8 +480,6 @@ static int initialize_nocpsch(struct device_queue_manager *dqm)
 {
 	int pipe, queue;
 
-	BUG_ON(!dqm);
-
 	pr_debug("num of pipes: %d\n", get_pipes_per_mec(dqm));
 
 	dqm->allocated_queues = kcalloc(get_pipes_per_mec(dqm),
@@ -530,8 +511,6 @@ static void uninitialize_nocpsch(struct device_queue_manager *dqm)
 {
 	int i;
 
-	BUG_ON(!dqm);
-
 	BUG_ON(dqm->queue_count > 0 || dqm->processes_count > 0);
 
 	kfree(dqm->allocated_queues);
@@ -629,8 +608,6 @@ static int set_sched_resources(struct device_queue_manager *dqm)
 	int i, mec;
 	struct scheduling_resources res;
 
-	BUG_ON(!dqm);
-
 	res.vmid_mask = (1 << VMID_PER_DEVICE) - 1;
 	res.vmid_mask <<= KFD_VMID_START_OFFSET;
 
@@ -672,8 +649,6 @@ static int initialize_cpsch(struct device_queue_manager *dqm)
 {
 	int retval;
 
-	BUG_ON(!dqm);
-
 	pr_debug("num of pipes: %d\n", get_pipes_per_mec(dqm));
 
 	mutex_init(&dqm->lock);
@@ -693,8 +668,6 @@ static int start_cpsch(struct device_queue_manager *dqm)
 	struct device_process_node *node;
 	int retval;
 
-	BUG_ON(!dqm);
-
 	retval = 0;
 
 	retval = pm_init(&dqm->packets, dqm);
@@ -739,8 +712,6 @@ static int stop_cpsch(struct device_queue_manager *dqm)
 	struct device_process_node *node;
 	struct kfd_process_device *pdd;
 
-	BUG_ON(!dqm);
-
 	destroy_queues_cpsch(dqm, true, true);
 
 	list_for_each_entry(node, &dqm->queues, list) {
@@ -757,8 +728,6 @@ static int create_kernel_queue_cpsch(struct device_queue_manager *dqm,
 					struct kernel_queue *kq,
 					struct qcm_process_device *qpd)
 {
-	BUG_ON(!dqm || !kq || !qpd);
-
 	mutex_lock(&dqm->lock);
 	if (dqm->total_queue_count >= max_num_of_queues_per_device) {
 		pr_warn("Can't create new kernel queue because %d queues were already created\n",
@@ -788,8 +757,6 @@ static void destroy_kernel_queue_cpsch(struct device_queue_manager *dqm,
 					struct kernel_queue *kq,
 					struct qcm_process_device *qpd)
 {
-	BUG_ON(!dqm || !kq);
-
 	mutex_lock(&dqm->lock);
 	/* here we actually preempt the DIQ */
 	destroy_queues_cpsch(dqm, true, false);
@@ -821,8 +788,6 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 	int retval;
 	struct mqd_manager *mqd;
 
-	BUG_ON(!dqm || !q || !qpd);
-
 	retval = 0;
 
 	if (allocate_vmid)
@@ -880,7 +845,6 @@ int amdkfd_fence_wait_timeout(unsigned int *fence_addr,
 				unsigned int fence_value,
 				unsigned long timeout)
 {
-	BUG_ON(!fence_addr);
 	timeout += jiffies;
 
 	while (*fence_addr != fence_value) {
@@ -909,8 +873,6 @@ static int destroy_queues_cpsch(struct device_queue_manager *dqm,
 	enum kfd_preempt_type_filter preempt_type;
 	struct kfd_process_device *pdd;
 
-	BUG_ON(!dqm);
-
 	retval = 0;
 
 	if (lock)
@@ -960,8 +922,6 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm, bool lock)
 {
 	int retval;
 
-	BUG_ON(!dqm);
-
 	if (lock)
 		mutex_lock(&dqm->lock);
 
@@ -1002,8 +962,6 @@ static int destroy_queue_cpsch(struct device_queue_manager *dqm,
 	struct mqd_manager *mqd;
 	bool preempt_all_queues;
 
-	BUG_ON(!dqm || !qpd || !q);
-
 	preempt_all_queues = false;
 
 	retval = 0;
@@ -1129,8 +1087,6 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
 {
 	struct device_queue_manager *dqm;
 
-	BUG_ON(!dev);
-
 	pr_debug("Loading device queue manager\n");
 
 	dqm = kzalloc(sizeof(*dqm), GFP_KERNEL);
@@ -1195,8 +1151,6 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
 
 void device_queue_manager_uninit(struct device_queue_manager *dqm)
 {
-	BUG_ON(!dqm);
-
 	dqm->ops.uninitialize(dqm);
 	kfree(dqm);
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
index a263e2a..43194b4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
@@ -104,8 +104,6 @@ static int register_process_cik(struct device_queue_manager *dqm,
 	struct kfd_process_device *pdd;
 	unsigned int temp;
 
-	BUG_ON(!dqm || !qpd);
-
 	pdd = qpd_to_pdd(qpd);
 
 	/* check if sh_mem_config register already configured */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
index 8c45c86..47ef910 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
@@ -110,8 +110,6 @@ static int register_process_vi(struct device_queue_manager *dqm,
 	struct kfd_process_device *pdd;
 	unsigned int temp;
 
-	BUG_ON(!dqm || !qpd);
-
 	pdd = qpd_to_pdd(qpd);
 
 	/* check if sh_mem_config register already configured */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index 48018a3..0055270 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -165,8 +165,6 @@ u32 __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
 {
 	u32 inx;
 
-	BUG_ON(!kfd || !doorbell_off);
-
 	mutex_lock(&kfd->doorbell_mutex);
 	inx = find_first_zero_bit(kfd->doorbell_available_index,
 					KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
@@ -196,8 +194,6 @@ void kfd_release_kernel_doorbell(struct kfd_dev *kfd, u32 __iomem *db_addr)
 {
 	unsigned int inx;
 
-	BUG_ON(!kfd || !db_addr);
-
 	inx = (unsigned int)(db_addr - kfd->doorbell_kernel_ptr);
 
 	mutex_lock(&kfd->doorbell_mutex);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
index 47e2e8a..970bc07 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
@@ -41,7 +41,6 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 	int retval;
 	union PM4_MES_TYPE_3_HEADER nop;
 
-	BUG_ON(!kq || !dev);
 	BUG_ON(type != KFD_QUEUE_TYPE_DIQ && type != KFD_QUEUE_TYPE_HIQ);
 
 	pr_debug("Initializing queue type %d size %d\n", KFD_QUEUE_TYPE_HIQ,
@@ -180,8 +179,6 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 
 static void uninitialize(struct kernel_queue *kq)
 {
-	BUG_ON(!kq);
-
 	if (kq->queue->properties.type == KFD_QUEUE_TYPE_HIQ)
 		kq->mqd->destroy_mqd(kq->mqd,
 					NULL,
@@ -211,8 +208,6 @@ static int acquire_packet_buffer(struct kernel_queue *kq,
 	uint32_t wptr, rptr;
 	unsigned int *queue_address;
 
-	BUG_ON(!kq || !buffer_ptr);
-
 	rptr = *kq->rptr_kernel;
 	wptr = *kq->wptr_kernel;
 	queue_address = (unsigned int *)kq->pq_kernel_addr;
@@ -252,11 +247,7 @@ static void submit_packet(struct kernel_queue *kq)
 {
 #ifdef DEBUG
 	int i;
-#endif
-
-	BUG_ON(!kq);
 
-#ifdef DEBUG
 	for (i = *kq->wptr_kernel; i < kq->pending_wptr; i++) {
 		pr_debug("0x%2X ", kq->pq_kernel_addr[i]);
 		if (i % 15 == 0)
@@ -272,7 +263,6 @@ static void submit_packet(struct kernel_queue *kq)
 
 static void rollback_packet(struct kernel_queue *kq)
 {
-	BUG_ON(!kq);
 	kq->pending_wptr = *kq->queue->properties.write_ptr;
 }
 
@@ -281,8 +271,6 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
 {
 	struct kernel_queue *kq;
 
-	BUG_ON(!dev);
-
 	kq = kzalloc(sizeof(*kq), GFP_KERNEL);
 	if (!kq)
 		return NULL;
@@ -313,8 +301,6 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
 
 void kernel_queue_uninit(struct kernel_queue *kq)
 {
-	BUG_ON(!kq);
-
 	kq->ops.uninitialize(kq);
 	kfree(kq);
 }
@@ -325,8 +311,6 @@ static __attribute__((unused)) void test_kq(struct kfd_dev *dev)
 	uint32_t *buffer, i;
 	int retval;
 
-	BUG_ON(!dev);
-
 	pr_err("Starting kernel queue test\n");
 
 	kq = kernel_queue_init(dev, KFD_QUEUE_TYPE_HIQ);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index dca4fc7..a11477d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -44,8 +44,6 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
 	struct cik_mqd *m;
 	int retval;
 
-	BUG_ON(!mm || !q || !mqd);
-
 	retval = kfd_gtt_sa_allocate(mm->dev, sizeof(struct cik_mqd),
 					mqd_mem_obj);
 
@@ -113,8 +111,6 @@ static int init_mqd_sdma(struct mqd_manager *mm, void **mqd,
 	int retval;
 	struct cik_sdma_rlc_registers *m;
 
-	BUG_ON(!mm || !mqd || !mqd_mem_obj);
-
 	retval = kfd_gtt_sa_allocate(mm->dev,
 					sizeof(struct cik_sdma_rlc_registers),
 					mqd_mem_obj);
@@ -138,14 +134,12 @@ static int init_mqd_sdma(struct mqd_manager *mm, void **mqd,
 static void uninit_mqd(struct mqd_manager *mm, void *mqd,
 			struct kfd_mem_obj *mqd_mem_obj)
 {
-	BUG_ON(!mm || !mqd);
 	kfd_gtt_sa_free(mm->dev, mqd_mem_obj);
 }
 
 static void uninit_mqd_sdma(struct mqd_manager *mm, void *mqd,
 				struct kfd_mem_obj *mqd_mem_obj)
 {
-	BUG_ON(!mm || !mqd);
 	kfd_gtt_sa_free(mm->dev, mqd_mem_obj);
 }
 
@@ -168,8 +162,6 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
 {
 	struct cik_mqd *m;
 
-	BUG_ON(!mm || !q || !mqd);
-
 	m = get_mqd(mqd);
 	m->cp_hqd_pq_control = DEFAULT_RPTR_BLOCK_SIZE |
 				DEFAULT_MIN_AVAIL_SIZE | PQ_ATC_EN;
@@ -209,8 +201,6 @@ static int update_mqd_sdma(struct mqd_manager *mm, void *mqd,
 {
 	struct cik_sdma_rlc_registers *m;
 
-	BUG_ON(!mm || !mqd || !q);
-
 	m = get_sdma_mqd(mqd);
 	m->sdma_rlc_rb_cntl = ffs(q->queue_size / sizeof(unsigned int)) <<
 			SDMA0_RLC0_RB_CNTL__RB_SIZE__SHIFT |
@@ -296,8 +286,6 @@ static int init_mqd_hiq(struct mqd_manager *mm, void **mqd,
 	struct cik_mqd *m;
 	int retval;
 
-	BUG_ON(!mm || !q || !mqd || !mqd_mem_obj);
-
 	retval = kfd_gtt_sa_allocate(mm->dev, sizeof(struct cik_mqd),
 					mqd_mem_obj);
 
@@ -352,8 +340,6 @@ static int update_mqd_hiq(struct mqd_manager *mm, void *mqd,
 {
 	struct cik_mqd *m;
 
-	BUG_ON(!mm || !q || !mqd);
-
 	m = get_mqd(mqd);
 	m->cp_hqd_pq_control = DEFAULT_RPTR_BLOCK_SIZE |
 				DEFAULT_MIN_AVAIL_SIZE |
@@ -391,8 +377,6 @@ struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd)
 {
 	struct cik_sdma_rlc_registers *m;
 
-	BUG_ON(!mqd);
-
 	m = (struct cik_sdma_rlc_registers *)mqd;
 
 	return m;
@@ -403,7 +387,6 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
 {
 	struct mqd_manager *mqd;
 
-	BUG_ON(!dev);
 	BUG_ON(type >= KFD_MQD_TYPE_MAX);
 
 	mqd = kzalloc(sizeof(*mqd), GFP_KERNEL);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index aaaa87a..d638c2c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
@@ -106,8 +106,6 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
 {
 	struct vi_mqd *m;
 
-	BUG_ON(!mm || !q || !mqd);
-
 	m = get_mqd(mqd);
 
 	m->cp_hqd_pq_control = 5 << CP_HQD_PQ_CONTROL__RPTR_BLOCK_SIZE__SHIFT |
@@ -186,7 +184,6 @@ static int destroy_mqd(struct mqd_manager *mm, void *mqd,
 static void uninit_mqd(struct mqd_manager *mm, void *mqd,
 			struct kfd_mem_obj *mqd_mem_obj)
 {
-	BUG_ON(!mm || !mqd);
 	kfd_gtt_sa_free(mm->dev, mqd_mem_obj);
 }
 
@@ -236,7 +233,6 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
 {
 	struct mqd_manager *mqd;
 
-	BUG_ON(!dev);
 	BUG_ON(type >= KFD_MQD_TYPE_MAX);
 
 	mqd = kzalloc(sizeof(*mqd), GFP_KERNEL);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
index c4030b3..aacd5a3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -58,8 +58,6 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
 	unsigned int process_count, queue_count;
 	unsigned int map_queue_size;
 
-	BUG_ON(!pm || !rlib_size || !over_subscription);
-
 	process_count = pm->dqm->processes_count;
 	queue_count = pm->dqm->queue_count;
 
@@ -96,9 +94,7 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
 {
 	int retval;
 
-	BUG_ON(!pm);
 	BUG_ON(pm->allocated);
-	BUG_ON(!is_over_subscription);
 
 	pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
 
@@ -123,7 +119,7 @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
 {
 	struct pm4_runlist *packet;
 
-	BUG_ON(!pm || !buffer || !ib);
+	BUG_ON(!ib);
 
 	packet = (struct pm4_runlist *)buffer;
 
@@ -148,8 +144,6 @@ static int pm_create_map_process(struct packet_manager *pm, uint32_t *buffer,
 	struct queue *cur;
 	uint32_t num_queues;
 
-	BUG_ON(!pm || !buffer || !qpd);
-
 	packet = (struct pm4_map_process *)buffer;
 
 	memset(buffer, 0, sizeof(struct pm4_map_process));
@@ -185,8 +179,6 @@ static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
 	struct pm4_mes_map_queues *packet;
 	bool use_static = is_static;
 
-	BUG_ON(!pm || !buffer || !q);
-
 	packet = (struct pm4_mes_map_queues *)buffer;
 	memset(buffer, 0, sizeof(struct pm4_map_queues));
 
@@ -247,8 +239,6 @@ static int pm_create_map_queue(struct packet_manager *pm, uint32_t *buffer,
 	struct pm4_map_queues *packet;
 	bool use_static = is_static;
 
-	BUG_ON(!pm || !buffer || !q);
-
 	packet = (struct pm4_map_queues *)buffer;
 	memset(buffer, 0, sizeof(struct pm4_map_queues));
 
@@ -315,8 +305,6 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 	struct kernel_queue *kq;
 	bool is_over_subscription;
 
-	BUG_ON(!pm || !queues || !rl_size_bytes || !rl_gpu_addr);
-
 	rl_wptr = retval = proccesses_mapped = 0;
 
 	retval = pm_allocate_runlist_ib(pm, &rl_buffer, rl_gpu_addr,
@@ -416,8 +404,6 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 
 int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm)
 {
-	BUG_ON(!dqm);
-
 	pm->dqm = dqm;
 	mutex_init(&pm->lock);
 	pm->priv_queue = kernel_queue_init(dqm->dev, KFD_QUEUE_TYPE_HIQ);
@@ -432,8 +418,6 @@ int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm)
 
 void pm_uninit(struct packet_manager *pm)
 {
-	BUG_ON(!pm);
-
 	mutex_destroy(&pm->lock);
 	kernel_queue_uninit(pm->priv_queue);
 }
@@ -444,8 +428,6 @@ int pm_send_set_resources(struct packet_manager *pm,
 	struct pm4_set_resources *packet;
 	int retval = 0;
 
-	BUG_ON(!pm || !res);
-
 	mutex_lock(&pm->lock);
 	pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
 					sizeof(*packet) / sizeof(uint32_t),
@@ -489,8 +471,6 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
 	size_t rl_ib_size, packet_size_dwords;
 	int retval;
 
-	BUG_ON(!pm || !dqm_queues);
-
 	retval = pm_create_runlist_ib(pm, dqm_queues, &rl_gpu_ib_addr,
 					&rl_ib_size);
 	if (retval)
@@ -532,7 +512,7 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
 	int retval;
 	struct pm4_query_status *packet;
 
-	BUG_ON(!pm || !fence_address);
+	BUG_ON(!fence_address);
 
 	mutex_lock(&pm->lock);
 	retval = pm->priv_queue->ops.acquire_packet_buffer(
@@ -572,8 +552,6 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 	uint32_t *buffer;
 	struct pm4_unmap_queues *packet;
 
-	BUG_ON(!pm);
-
 	mutex_lock(&pm->lock);
 	retval = pm->priv_queue->ops.acquire_packet_buffer(
 			pm->priv_queue,
@@ -645,8 +623,6 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 
 void pm_release_ib(struct packet_manager *pm)
 {
-	BUG_ON(!pm);
-
 	mutex_lock(&pm->lock);
 	if (pm->allocated) {
 		kfd_gtt_sa_free(pm->dqm->dev, pm->ib_buffer_obj);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index d877cda..d030d76 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -407,8 +407,6 @@ void kfd_unbind_process_from_device(struct kfd_dev *dev, unsigned int pasid)
 	struct kfd_process *p;
 	struct kfd_process_device *pdd;
 
-	BUG_ON(!dev);
-
 	/*
 	 * Look for the process that matches the pasid. If there is no such
 	 * process, we either released it in amdkfd's own notifier, or there
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 1d056a6..f6ecdff 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -32,8 +32,6 @@ static inline struct process_queue_node *get_queue_by_qid(
 {
 	struct process_queue_node *pqn;
 
-	BUG_ON(!pqm);
-
 	list_for_each_entry(pqn, &pqm->queues, process_queue_list) {
 		if ((pqn->q && pqn->q->properties.queue_id == qid) ||
 		    (pqn->kq && pqn->kq->queue->properties.queue_id == qid))
@@ -48,8 +46,6 @@ static int find_available_queue_slot(struct process_queue_manager *pqm,
 {
 	unsigned long found;
 
-	BUG_ON(!pqm || !qid);
-
 	found = find_first_zero_bit(pqm->queue_slot_bitmap,
 			KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
 
@@ -69,8 +65,6 @@ static int find_available_queue_slot(struct process_queue_manager *pqm,
 
 int pqm_init(struct process_queue_manager *pqm, struct kfd_process *p)
 {
-	BUG_ON(!pqm);
-
 	INIT_LIST_HEAD(&pqm->queues);
 	pqm->queue_slot_bitmap =
 			kzalloc(DIV_ROUND_UP(KFD_MAX_NUM_OF_QUEUES_PER_PROCESS,
@@ -87,8 +81,6 @@ void pqm_uninit(struct process_queue_manager *pqm)
 	int retval;
 	struct process_queue_node *pqn, *next;
 
-	BUG_ON(!pqm);
-
 	list_for_each_entry_safe(pqn, next, &pqm->queues, process_queue_list) {
 		retval = pqm_destroy_queue(
 				pqm,
@@ -151,8 +143,6 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 	int num_queues = 0;
 	struct queue *cur;
 
-	BUG_ON(!pqm || !dev || !properties || !qid);
-
 	memset(&q_properties, 0, sizeof(struct queue_properties));
 	memcpy(&q_properties, properties, sizeof(struct queue_properties));
 	q = NULL;
@@ -269,7 +259,6 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
 
 	dqm = NULL;
 
-	BUG_ON(!pqm);
 	retval = 0;
 
 	pqn = get_queue_by_qid(pqm, qid);
@@ -323,8 +312,6 @@ int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid,
 	int retval;
 	struct process_queue_node *pqn;
 
-	BUG_ON(!pqm);
-
 	pqn = get_queue_by_qid(pqm, qid);
 	if (!pqn) {
 		pr_debug("No queue %d exists for update operation\n", qid);
@@ -350,8 +337,6 @@ struct kernel_queue *pqm_get_kernel_queue(
 {
 	struct process_queue_node *pqn;
 
-	BUG_ON(!pqm);
-
 	pqn = get_queue_by_qid(pqm, qid);
 	if (pqn && pqn->kq)
 		return pqn->kq;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
index 5ad9f6f..a5315d4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
@@ -67,8 +67,6 @@ int init_queue(struct queue **q, const struct queue_properties *properties)
 {
 	struct queue *tmp_q;
 
-	BUG_ON(!q);
-
 	tmp_q = kzalloc(sizeof(*tmp_q), GFP_KERNEL);
 	if (!tmp_q)
 		return -ENOMEM;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 113c1ce..e5486f4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -108,9 +108,6 @@ static int kfd_topology_get_crat_acpi(void *crat_image, size_t *size)
 static void kfd_populated_cu_info_cpu(struct kfd_topology_device *dev,
 		struct crat_subtype_computeunit *cu)
 {
-	BUG_ON(!dev);
-	BUG_ON(!cu);
-
 	dev->node_props.cpu_cores_count = cu->num_cpu_cores;
 	dev->node_props.cpu_core_id_base = cu->processor_id_low;
 	if (cu->hsa_capability & CRAT_CU_FLAGS_IOMMU_PRESENT)
@@ -123,9 +120,6 @@ static void kfd_populated_cu_info_cpu(struct kfd_topology_device *dev,
 static void kfd_populated_cu_info_gpu(struct kfd_topology_device *dev,
 		struct crat_subtype_computeunit *cu)
 {
-	BUG_ON(!dev);
-	BUG_ON(!cu);
-
 	dev->node_props.simd_id_base = cu->processor_id_low;
 	dev->node_props.simd_count = cu->num_simd_cores;
 	dev->node_props.lds_size_in_kb = cu->lds_size_in_kb;
@@ -148,8 +142,6 @@ static int kfd_parse_subtype_cu(struct crat_subtype_computeunit *cu)
 	struct kfd_topology_device *dev;
 	int i = 0;
 
-	BUG_ON(!cu);
-
 	pr_info("Found CU entry in CRAT table with proximity_domain=%d caps=%x\n",
 			cu->proximity_domain, cu->hsa_capability);
 	list_for_each_entry(dev, &topology_device_list, list) {
@@ -177,8 +169,6 @@ static int kfd_parse_subtype_mem(struct crat_subtype_memory *mem)
 	struct kfd_topology_device *dev;
 	int i = 0;
 
-	BUG_ON(!mem);
-
 	pr_info("Found memory entry in CRAT table with proximity_domain=%d\n",
 			mem->promixity_domain);
 	list_for_each_entry(dev, &topology_device_list, list) {
@@ -223,8 +213,6 @@ static int kfd_parse_subtype_cache(struct crat_subtype_cache *cache)
 	struct kfd_topology_device *dev;
 	uint32_t id;
 
-	BUG_ON(!cache);
-
 	id = cache->processor_id_low;
 
 	pr_info("Found cache entry in CRAT table with processor_id=%d\n", id);
@@ -274,8 +262,6 @@ static int kfd_parse_subtype_iolink(struct crat_subtype_iolink *iolink)
 	uint32_t id_from;
 	uint32_t id_to;
 
-	BUG_ON(!iolink);
-
 	id_from = iolink->proximity_domain_from;
 	id_to = iolink->proximity_domain_to;
 
@@ -323,8 +309,6 @@ static int kfd_parse_subtype(struct crat_subtype_generic *sub_type_hdr)
 	struct crat_subtype_iolink *iolink;
 	int ret = 0;
 
-	BUG_ON(!sub_type_hdr);
-
 	switch (sub_type_hdr->type) {
 	case CRAT_SUBTYPE_COMPUTEUNIT_AFFINITY:
 		cu = (struct crat_subtype_computeunit *)sub_type_hdr;
@@ -368,8 +352,6 @@ static void kfd_release_topology_device(struct kfd_topology_device *dev)
 	struct kfd_cache_properties *cache;
 	struct kfd_iolink_properties *iolink;
 
-	BUG_ON(!dev);
-
 	list_del(&dev->list);
 
 	while (dev->mem_props.next != &dev->mem_props) {
@@ -763,8 +745,6 @@ static void kfd_remove_sysfs_node_entry(struct kfd_topology_device *dev)
 	struct kfd_cache_properties *cache;
 	struct kfd_mem_properties *mem;
 
-	BUG_ON(!dev);
-
 	if (dev->kobj_iolink) {
 		list_for_each_entry(iolink, &dev->io_link_props, list)
 			if (iolink->kobj) {
@@ -819,8 +799,6 @@ static int kfd_build_sysfs_node_entry(struct kfd_topology_device *dev,
 	int ret;
 	uint32_t i;
 
-	BUG_ON(!dev);
-
 	/*
 	 * Creating the sysfs folders
 	 */
@@ -1117,8 +1095,6 @@ static struct kfd_topology_device *kfd_assign_gpu(struct kfd_dev *gpu)
 	struct kfd_topology_device *dev;
 	struct kfd_topology_device *out_dev = NULL;
 
-	BUG_ON(!gpu);
-
 	list_for_each_entry(dev, &topology_device_list, list)
 		if (!dev->gpu && (dev->node_props.simd_count > 0)) {
 			dev->gpu = gpu;
@@ -1143,8 +1119,6 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
 	struct kfd_topology_device *dev;
 	int res;
 
-	BUG_ON(!gpu);
-
 	gpu_id = kfd_generate_gpu_id(gpu);
 
 	pr_debug("Adding new GPU (ID: 0x%x) to topology\n", gpu_id);
@@ -1210,8 +1184,6 @@ int kfd_topology_remove_device(struct kfd_dev *gpu)
 	uint32_t gpu_id;
 	int res = -ENODEV;
 
-	BUG_ON(!gpu);
-
 	down_write(&topology_lock);
 
 	list_for_each_entry(dev, &topology_device_list, list)
-- 
2.7.4


* [PATCH 11/19] drm/amdkfd: Fix doorbell initialization and finalization
       [not found] ` <1502488589-30272-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (9 preceding siblings ...)
  2017-08-11 21:56   ` [PATCH 10/19] drm/amdkfd: Remove BUG_ONs for NULL pointer arguments Felix Kuehling
@ 2017-08-11 21:56   ` Felix Kuehling
       [not found]     ` <1502488589-30272-12-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-11 21:56   ` [PATCH 12/19] drm/amdkfd: Allocate gtt_sa_bitmap in long units Felix Kuehling
                     ` (8 subsequent siblings)
  19 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Handle errors in doorbell aperture initialization instead of BUG_ON,
and iounmap the doorbell aperture during finalization.
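
The error-handling shape this introduces, as a minimal sketch (the full
unwind order is in the diff below):

	if (kfd_doorbell_init(kfd)) {
		dev_err(kfd_device,
			"Error initializing doorbell aperture\n");
		goto kfd_doorbell_error;
	}

	/* Teardown (on the error path and in kgd2kfd_device_exit) runs in
	 * reverse order of init, so kfd_doorbell_fini(), which iounmaps the
	 * kernel doorbell mapping, comes before kfd_gtt_sa_fini().
	 */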

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  9 ++++++++-
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c | 13 +++++++++++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h     |  3 ++-
 3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index e28e818..cb7ed02 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -260,7 +260,11 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 		goto kfd_gtt_sa_init_error;
 	}
 
-	kfd_doorbell_init(kfd);
+	if (kfd_doorbell_init(kfd)) {
+		dev_err(kfd_device,
+			"Error initializing doorbell aperture\n");
+		goto kfd_doorbell_error;
+	}
 
 	if (kfd_topology_add_device(kfd)) {
 		dev_err(kfd_device, "Error adding device to topology\n");
@@ -315,6 +319,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 kfd_interrupt_error:
 	kfd_topology_remove_device(kfd);
 kfd_topology_add_device_error:
+	kfd_doorbell_fini(kfd);
+kfd_doorbell_error:
 	kfd_gtt_sa_fini(kfd);
 kfd_gtt_sa_init_error:
 	kfd->kfd2kgd->free_gtt_mem(kfd->kgd, kfd->gtt_mem);
@@ -332,6 +338,7 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
 		amd_iommu_free_device(kfd->pdev);
 		kfd_interrupt_exit(kfd);
 		kfd_topology_remove_device(kfd);
+		kfd_doorbell_fini(kfd);
 		kfd_gtt_sa_fini(kfd);
 		kfd->kfd2kgd->free_gtt_mem(kfd->kgd, kfd->gtt_mem);
 	}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index 0055270..acf4d2a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -59,7 +59,7 @@ static inline size_t doorbell_process_allocation(void)
 }
 
 /* Doorbell calculations for device init. */
-void kfd_doorbell_init(struct kfd_dev *kfd)
+int kfd_doorbell_init(struct kfd_dev *kfd)
 {
 	size_t doorbell_start_offset;
 	size_t doorbell_aperture_size;
@@ -95,7 +95,8 @@ void kfd_doorbell_init(struct kfd_dev *kfd)
 	kfd->doorbell_kernel_ptr = ioremap(kfd->doorbell_base,
 						doorbell_process_allocation());
 
-	BUG_ON(!kfd->doorbell_kernel_ptr);
+	if (!kfd->doorbell_kernel_ptr)
+		return -ENOMEM;
 
 	pr_debug("Doorbell initialization:\n");
 	pr_debug("doorbell base           == 0x%08lX\n",
@@ -115,6 +116,14 @@ void kfd_doorbell_init(struct kfd_dev *kfd)
 
 	pr_debug("doorbell kernel address == 0x%08lX\n",
 			(uintptr_t)kfd->doorbell_kernel_ptr);
+
+	return 0;
+}
+
+void kfd_doorbell_fini(struct kfd_dev *kfd)
+{
+	if (kfd->doorbell_kernel_ptr)
+		iounmap(kfd->doorbell_kernel_ptr);
 }
 
 int kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 469b7ea..f0d55cc0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -576,7 +576,8 @@ unsigned int kfd_pasid_alloc(void);
 void kfd_pasid_free(unsigned int pasid);
 
 /* Doorbells */
-void kfd_doorbell_init(struct kfd_dev *kfd);
+int kfd_doorbell_init(struct kfd_dev *kfd);
+void kfd_doorbell_fini(struct kfd_dev *kfd);
 int kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma);
 u32 __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
 					unsigned int *doorbell_off);
-- 
2.7.4


* [PATCH 12/19] drm/amdkfd: Allocate gtt_sa_bitmap in long units
       [not found] ` <1502488589-30272-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (10 preceding siblings ...)
  2017-08-11 21:56   ` [PATCH 11/19] drm/amdkfd: Fix doorbell initialization and finalization Felix Kuehling
@ 2017-08-11 21:56   ` Felix Kuehling
       [not found]     ` <1502488589-30272-13-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-11 21:56   ` [PATCH 13/19] drm/amdkfd: Handle remaining BUG_ONs more gracefully Felix Kuehling
                     ` (7 subsequent siblings)
  19 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

gtt_sa_bitmap is accessed by bitmap functions, which operate on longs.
Therefore the array should be allocated in long units. Also round up
in case the number of bits is not a multiple of BITS_PER_LONG.
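
As a worked example (illustrative numbers): 1000 chunks need 1000 bits.
The old byte-based sizing allocated 1000 / BITS_PER_BYTE = 125 bytes,
but the bitmap helpers access whole longs, i.e. ceil(1000 / 64) = 16
longs = 128 bytes on a 64-bit kernel, so they could touch memory past
the allocation. The rounded-up long-based allocation is:

	num_of_longs = (kfd->gtt_sa_num_of_chunks + BITS_PER_LONG - 1) /
		BITS_PER_LONG;
	kfd->gtt_sa_bitmap = kcalloc(num_of_longs, sizeof(long), GFP_KERNEL);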

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index cb7ed02..416955f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -395,7 +395,7 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
 static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
 				unsigned int chunk_size)
 {
-	unsigned int num_of_bits;
+	unsigned int num_of_longs;
 
 	BUG_ON(buf_size < chunk_size);
 	BUG_ON(buf_size == 0);
@@ -404,10 +404,10 @@ static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
 	kfd->gtt_sa_chunk_size = chunk_size;
 	kfd->gtt_sa_num_of_chunks = buf_size / chunk_size;
 
-	num_of_bits = kfd->gtt_sa_num_of_chunks / BITS_PER_BYTE;
-	BUG_ON(num_of_bits == 0);
+	num_of_longs = (kfd->gtt_sa_num_of_chunks + BITS_PER_LONG - 1) /
+		BITS_PER_LONG;
 
-	kfd->gtt_sa_bitmap = kzalloc(num_of_bits, GFP_KERNEL);
+	kfd->gtt_sa_bitmap = kcalloc(num_of_longs, sizeof(long), GFP_KERNEL);
 
 	if (!kfd->gtt_sa_bitmap)
 		return -ENOMEM;
-- 
2.7.4


* [PATCH 13/19] drm/amdkfd: Handle remaining BUG_ONs more gracefully
       [not found] ` <1502488589-30272-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (11 preceding siblings ...)
  2017-08-11 21:56   ` [PATCH 12/19] drm/amdkfd: Allocate gtt_sa_bitmap in long units Felix Kuehling
@ 2017-08-11 21:56   ` Felix Kuehling
       [not found]     ` <1502488589-30272-14-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-11 21:56   ` [PATCH 14/19] drm/amdkfd: Add more error printing to help bringup Felix Kuehling
                     ` (6 subsequent siblings)
  19 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

In most cases, BUG_ONs can be replaced with a WARN_ON and an error
return. In void functions, turn them into a WARN_ON and, where needed,
an early return.
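
The conversion pattern, sketched on a made-up argument check (the error
code varies per function in the hunks below):

	/* before: a bad argument brought the whole kernel down */
	BUG_ON(!buffer);

	/* after: warn once and fail the call */
	if (WARN_ON(!buffer))
		return -EINVAL;

	/* after, in void functions: warn and, where needed, return early */
	if (WARN_ON(!buffer))
		return;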

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            |  3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 16 ++++----
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 19 ++++-----
 .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      | 20 +++++++---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 45 +++++++++++++---------
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c             |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  9 ++---
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  7 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  4 +-
 14 files changed, 84 insertions(+), 56 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index 3841cad..0aa021a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -60,7 +60,8 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
 	unsigned int *ib_packet_buff;
 	int status;
 
-	BUG_ON(!size_in_bytes);
+	if (WARN_ON(!size_in_bytes))
+		return -EINVAL;
 
 	kq = dbgdev->kq;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
index 2d5555c..3da25f7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
@@ -64,7 +64,8 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
 	enum DBGDEV_TYPE type = DBGDEV_TYPE_DIQ;
 	struct kfd_dbgmgr *new_buff;
 
-	BUG_ON(!pdev->init_complete);
+	if (WARN_ON(!pdev->init_complete))
+		return false;
 
 	new_buff = kfd_alloc_struct(new_buff);
 	if (!new_buff) {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 416955f..f628ac3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -98,7 +98,7 @@ static const struct kfd_device_info *lookup_device_info(unsigned short did)
 
 	for (i = 0; i < ARRAY_SIZE(supported_devices); i++) {
 		if (supported_devices[i].did == did) {
-			BUG_ON(!supported_devices[i].device_info);
+			WARN_ON(!supported_devices[i].device_info);
 			return supported_devices[i].device_info;
 		}
 	}
@@ -212,9 +212,8 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
 			flags);
 
 	dev = kfd_device_by_pci_dev(pdev);
-	BUG_ON(!dev);
-
-	kfd_signal_iommu_event(dev, pasid, address,
+	if (!WARN_ON(!dev))
+		kfd_signal_iommu_event(dev, pasid, address,
 			flags & PPR_FAULT_WRITE, flags & PPR_FAULT_EXEC);
 
 	return AMD_IOMMU_INV_PRI_RSP_INVALID;
@@ -397,9 +396,12 @@ static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
 {
 	unsigned int num_of_longs;
 
-	BUG_ON(buf_size < chunk_size);
-	BUG_ON(buf_size == 0);
-	BUG_ON(chunk_size == 0);
+	if (WARN_ON(buf_size < chunk_size))
+		return -EINVAL;
+	if (WARN_ON(buf_size == 0))
+		return -EINVAL;
+	if (WARN_ON(chunk_size == 0))
+		return -EINVAL;
 
 	kfd->gtt_sa_chunk_size = chunk_size;
 	kfd->gtt_sa_num_of_chunks = buf_size / chunk_size;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 43bc1b5..5dac29d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -386,7 +386,8 @@ static struct mqd_manager *get_mqd_manager_nocpsch(
 {
 	struct mqd_manager *mqd;
 
-	BUG_ON(type >= KFD_MQD_TYPE_MAX);
+	if (WARN_ON(type >= KFD_MQD_TYPE_MAX))
+		return NULL;
 
 	pr_debug("mqd type %d\n", type);
 
@@ -511,7 +512,7 @@ static void uninitialize_nocpsch(struct device_queue_manager *dqm)
 {
 	int i;
 
-	BUG_ON(dqm->queue_count > 0 || dqm->processes_count > 0);
+	WARN_ON(dqm->queue_count > 0 || dqm->processes_count > 0);
 
 	kfree(dqm->allocated_queues);
 	for (i = 0 ; i < KFD_MQD_TYPE_MAX ; i++)
@@ -1127,8 +1128,8 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
 		dqm->ops.set_cache_memory_policy = set_cache_memory_policy;
 		break;
 	default:
-		BUG();
-		break;
+		pr_err("Invalid scheduling policy %d\n", sched_policy);
+		goto out_free;
 	}
 
 	switch (dev->device_info->asic_family) {
@@ -1141,12 +1142,12 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
 		break;
 	}
 
-	if (dqm->ops.initialize(dqm)) {
-		kfree(dqm);
-		return NULL;
-	}
+	if (!dqm->ops.initialize(dqm))
+		return dqm;
 
-	return dqm;
+out_free:
+	kfree(dqm);
+	return NULL;
 }
 
 void device_queue_manager_uninit(struct device_queue_manager *dqm)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
index 43194b4..fadc56a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
@@ -65,7 +65,7 @@ static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
 	 * for LDS/Scratch and GPUVM.
 	 */
 
-	BUG_ON((top_address_nybble & 1) || top_address_nybble > 0xE ||
+	WARN_ON((top_address_nybble & 1) || top_address_nybble > 0xE ||
 		top_address_nybble == 0);
 
 	return PRIVATE_BASE(top_address_nybble << 12) |
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
index 47ef910..15e81ae 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
@@ -67,7 +67,7 @@ static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
 	 * for LDS/Scratch and GPUVM.
 	 */
 
-	BUG_ON((top_address_nybble & 1) || top_address_nybble > 0xE ||
+	WARN_ON((top_address_nybble & 1) || top_address_nybble > 0xE ||
 		top_address_nybble == 0);
 
 	return top_address_nybble << 12 |
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
index 970bc07..0e4d4a9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
@@ -41,7 +41,8 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 	int retval;
 	union PM4_MES_TYPE_3_HEADER nop;
 
-	BUG_ON(type != KFD_QUEUE_TYPE_DIQ && type != KFD_QUEUE_TYPE_HIQ);
+	if (WARN_ON(type != KFD_QUEUE_TYPE_DIQ && type != KFD_QUEUE_TYPE_HIQ))
+		return false;
 
 	pr_debug("Initializing queue type %d size %d\n", KFD_QUEUE_TYPE_HIQ,
 			queue_size);
@@ -62,8 +63,8 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 						KFD_MQD_TYPE_HIQ);
 		break;
 	default:
-		BUG();
-		break;
+		pr_err("Invalid queue type %d\n", type);
+		return false;
 	}
 
 	if (!kq->mqd)
@@ -305,6 +306,7 @@ void kernel_queue_uninit(struct kernel_queue *kq)
 	kfree(kq);
 }
 
+/* FIXME: Can this test be removed? */
 static __attribute__((unused)) void test_kq(struct kfd_dev *dev)
 {
 	struct kernel_queue *kq;
@@ -314,10 +316,18 @@ static __attribute__((unused)) void test_kq(struct kfd_dev *dev)
 	pr_err("Starting kernel queue test\n");
 
 	kq = kernel_queue_init(dev, KFD_QUEUE_TYPE_HIQ);
-	BUG_ON(!kq);
+	if (unlikely(!kq)) {
+		pr_err("  Failed to initialize HIQ\n");
+		pr_err("Kernel queue test failed\n");
+		return;
+	}
 
 	retval = kq->ops.acquire_packet_buffer(kq, 5, &buffer);
-	BUG_ON(retval != 0);
+	if (unlikely(retval != 0)) {
+		pr_err("  Failed to acquire packet buffer\n");
+		pr_err("Kernel queue test failed\n");
+		return;
+	}
 	for (i = 0; i < 5; i++)
 		buffer[i] = kq->nop_packet;
 	kq->ops.submit_packet(kq);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index a11477d..7e0ec6b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -387,7 +387,8 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
 {
 	struct mqd_manager *mqd;
 
-	BUG_ON(type >= KFD_MQD_TYPE_MAX);
+	if (WARN_ON(type >= KFD_MQD_TYPE_MAX))
+		return NULL;
 
 	mqd = kzalloc(sizeof(*mqd), GFP_KERNEL);
 	if (!mqd)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index d638c2c..f4c8c23 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
@@ -233,7 +233,8 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
 {
 	struct mqd_manager *mqd;
 
-	BUG_ON(type >= KFD_MQD_TYPE_MAX);
+	if (WARN_ON(type >= KFD_MQD_TYPE_MAX))
+		return NULL;
 
 	mqd = kzalloc(sizeof(*mqd), GFP_KERNEL);
 	if (!mqd)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
index aacd5a3..77a6f2b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -35,7 +35,8 @@ static inline void inc_wptr(unsigned int *wptr, unsigned int increment_bytes,
 {
 	unsigned int temp = *wptr + increment_bytes / sizeof(uint32_t);
 
-	BUG_ON((temp * sizeof(uint32_t)) > buffer_size_bytes);
+	WARN((temp * sizeof(uint32_t)) > buffer_size_bytes,
+	     "Runlist IB overflow");
 	*wptr = temp;
 }
 
@@ -94,7 +95,8 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
 {
 	int retval;
 
-	BUG_ON(pm->allocated);
+	if (WARN_ON(pm->allocated))
+		return -EINVAL;
 
 	pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
 
@@ -119,7 +121,8 @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
 {
 	struct pm4_runlist *packet;
 
-	BUG_ON(!ib);
+	if (WARN_ON(!ib))
+		return -EFAULT;
 
 	packet = (struct pm4_runlist *)buffer;
 
@@ -211,9 +214,8 @@ static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
 		use_static = false; /* no static queues under SDMA */
 		break;
 	default:
-		pr_err("queue type %d\n", q->properties.type);
-		BUG();
-		break;
+		WARN(1, "queue type %d", q->properties.type);
+		return -EINVAL;
 	}
 	packet->bitfields3.doorbell_offset =
 			q->properties.doorbell_off;
@@ -266,8 +268,8 @@ static int pm_create_map_queue(struct packet_manager *pm, uint32_t *buffer,
 		use_static = false; /* no static queues under SDMA */
 		break;
 	default:
-		BUG();
-		break;
+		WARN(1, "queue type %d", q->properties.type);
+		return -EINVAL;
 	}
 
 	packet->mes_map_queues_ordinals[0].bitfields3.doorbell_offset =
@@ -392,14 +394,16 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 	pr_debug("Finished map process and queues to runlist\n");
 
 	if (is_over_subscription)
-		pm_create_runlist(pm, &rl_buffer[rl_wptr], *rl_gpu_addr,
-				alloc_size_bytes / sizeof(uint32_t), true);
+		retval = pm_create_runlist(pm, &rl_buffer[rl_wptr],
+					*rl_gpu_addr,
+					alloc_size_bytes / sizeof(uint32_t),
+					true);
 
 	for (i = 0; i < alloc_size_bytes / sizeof(uint32_t); i++)
 		pr_debug("0x%2X ", rl_buffer[i]);
 	pr_debug("\n");
 
-	return 0;
+	return retval;
 }
 
 int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm)
@@ -512,7 +516,8 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
 	int retval;
 	struct pm4_query_status *packet;
 
-	BUG_ON(!fence_address);
+	if (WARN_ON(!fence_address))
+		return -EFAULT;
 
 	mutex_lock(&pm->lock);
 	retval = pm->priv_queue->ops.acquire_packet_buffer(
@@ -577,8 +582,9 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 			engine_sel__mes_unmap_queues__sdma0 + sdma_engine;
 		break;
 	default:
-		BUG();
-		break;
+		WARN(1, "queue type %d", type);
+		retval = -EINVAL;
+		goto err_invalid;
 	}
 
 	if (reset)
@@ -610,12 +616,15 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 				queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only;
 		break;
 	default:
-		BUG();
-		break;
+		WARN(1, "filter %d", mode);
+		retval = -EINVAL;
 	}
 
-	pm->priv_queue->ops.submit_packet(pm->priv_queue);
-
+err_invalid:
+	if (!retval)
+		pm->priv_queue->ops.submit_packet(pm->priv_queue);
+	else
+		pm->priv_queue->ops.rollback_packet(pm->priv_queue);
 err_acquire_packet_buffer:
 	mutex_unlock(&pm->lock);
 	return retval;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
index b3f7d43..1e06de0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
@@ -92,6 +92,6 @@ unsigned int kfd_pasid_alloc(void)
 
 void kfd_pasid_free(unsigned int pasid)
 {
-	BUG_ON(pasid == 0 || pasid >= pasid_limit);
-	clear_bit(pasid, pasid_bitmap);
+	if (!WARN_ON(pasid == 0 || pasid >= pasid_limit))
+		clear_bit(pasid, pasid_bitmap);
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index d030d76..41a9976 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -79,8 +79,6 @@ struct kfd_process *kfd_create_process(const struct task_struct *thread)
 {
 	struct kfd_process *process;
 
-	BUG_ON(!kfd_process_wq);
-
 	if (!thread->mm)
 		return ERR_PTR(-EINVAL);
 
@@ -202,10 +200,10 @@ static void kfd_process_destroy_delayed(struct rcu_head *rcu)
 	struct kfd_process_release_work *work;
 	struct kfd_process *p;
 
-	BUG_ON(!kfd_process_wq);
+	WARN_ON(!kfd_process_wq);
 
 	p = container_of(rcu, struct kfd_process, rcu);
-	BUG_ON(atomic_read(&p->mm->mm_count) <= 0);
+	WARN_ON(atomic_read(&p->mm->mm_count) <= 0);
 
 	mmdrop(p->mm);
 
@@ -229,7 +227,8 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
 	 * mmu_notifier srcu is read locked
 	 */
 	p = container_of(mn, struct kfd_process, mmu_notifier);
-	BUG_ON(p->mm != mm);
+	if (WARN_ON(p->mm != mm))
+		return;
 
 	mutex_lock(&kfd_processes_mutex);
 	hash_del_rcu(&p->kfd_processes);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index f6ecdff..1cae95e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -218,8 +218,8 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 							kq, &pdd->qpd);
 		break;
 	default:
-		BUG();
-		break;
+		WARN(1, "Invalid queue type %d", type);
+		retval = -EINVAL;
 	}
 
 	if (retval != 0) {
@@ -272,7 +272,8 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
 		dev = pqn->kq->dev;
 	if (pqn->q)
 		dev = pqn->q->device;
-	BUG_ON(!dev);
+	if (WARN_ON(!dev))
+		return -ENODEV;
 
 	pdd = kfd_get_process_device_data(dev, pqm->process);
 	if (!pdd) {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index e5486f4..19ce590 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -799,10 +799,12 @@ static int kfd_build_sysfs_node_entry(struct kfd_topology_device *dev,
 	int ret;
 	uint32_t i;
 
+	if (WARN_ON(dev->kobj_node))
+		return -EEXIST;
+
 	/*
 	 * Creating the sysfs folders
 	 */
-	BUG_ON(dev->kobj_node);
 	dev->kobj_node = kfd_alloc_struct(dev->kobj_node);
 	if (!dev->kobj_node)
 		return -ENOMEM;
-- 
2.7.4


* [PATCH 14/19] drm/amdkfd: Add more error printing to help bringup
       [not found] ` <1502488589-30272-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (12 preceding siblings ...)
  2017-08-11 21:56   ` [PATCH 13/19] drm/amdkfd: Handle remaining BUG_ONs more gracefully Felix Kuehling
@ 2017-08-11 21:56   ` Felix Kuehling
       [not found]     ` <1502488589-30272-15-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-11 21:56   ` [PATCH 15/19] drm/amdkfd: Clamp EOP queue size correctly on Gfx8 Felix Kuehling
                     ` (5 subsequent siblings)
  19 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Yong Zhao, Felix Kuehling

From: Yong Zhao <Yong.Zhao@amd.com>

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index f628ac3..e1c2ad2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -103,6 +103,8 @@ static const struct kfd_device_info *lookup_device_info(unsigned short did)
 		}
 	}
 
+	WARN(1, "device is not added to supported_devices\n");
+
 	return NULL;
 }
 
@@ -114,8 +116,10 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
 	const struct kfd_device_info *device_info =
 					lookup_device_info(pdev->device);
 
-	if (!device_info)
+	if (!device_info) {
+		dev_err(kfd_device, "kgd2kfd_probe failed\n");
 		return NULL;
+	}
 
 	kfd = kzalloc(sizeof(*kfd), GFP_KERNEL);
 	if (!kfd)
@@ -364,8 +368,11 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
 
 	if (kfd->init_complete) {
 		err = amd_iommu_init_device(kfd->pdev, pasid_limit);
-		if (err < 0)
+		if (err < 0) {
+			dev_err(kfd_device, "failed to initialize iommu\n");
 			return -ENXIO;
+		}
+
 		amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
 						iommu_pasid_shutdown_callback);
 		amd_iommu_set_invalid_ppr_cb(kfd->pdev, iommu_invalid_ppr_cb);
-- 
2.7.4


* [PATCH 15/19] drm/amdkfd: Clamp EOP queue size correctly on Gfx8
       [not found] ` <1502488589-30272-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (13 preceding siblings ...)
  2017-08-11 21:56   ` [PATCH 14/19] drm/amdkfd: Add more error printing to help bringup Felix Kuehling
@ 2017-08-11 21:56   ` Felix Kuehling
       [not found]     ` <1502488589-30272-16-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-11 21:56   ` [PATCH 16/19] drm/amdkfd: Update PM4 packet headers Felix Kuehling
                     ` (4 subsequent siblings)
  19 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Jay Cornwall

From: Jay Cornwall <Jay.Cornwall@amd.com>

Gfx8 HW incorrectly clamps CP_HQD_EOP_CONTROL.EOP_SIZE, which can
lead to scheduling deadlock due to SE EOP done counter overflow.

Enforce an EOP queue size limit which prevents the CP from sending
more than 0xFF events at a time.
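
Worked example of the clamp, assuming eop_ring_buffer_size is in bytes
and one EOP entry is 8 dwords (0xFF entries = 0x7F8 dwords, as in the
comment added below):

	/* illustrative value: a 16 KB EOP ring buffer */
	dwords = 16384 / sizeof(unsigned int);	/* 4096 dwords         */
	field  = ffs(dwords) - 1 - 1;		/* log2(4096) - 1 = 11 */
	field  = min(0xA, field);		/* clamped to 0xA      */
	/* 0xA encodes 2^(0xA + 1) = 0x800 dwords = 256 entries; the CP
	 * submits at most 255 (0xFF) of them, so the 8-bit per-SE EOP
	 * done counter cannot overflow.
	 */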

Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index f4c8c23..98a930e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
@@ -135,8 +135,15 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
 			3 << CP_HQD_IB_CONTROL__MIN_IB_AVAIL_SIZE__SHIFT |
 			mtype << CP_HQD_IB_CONTROL__MTYPE__SHIFT;
 
-	m->cp_hqd_eop_control |=
-		ffs(q->eop_ring_buffer_size / sizeof(unsigned int)) - 1 - 1;
+	/*
+	 * HW does not clamp this field correctly. Maximum EOP queue size
+	 * is constrained by per-SE EOP done signal count, which is 8-bit.
+	 * Limit is 0xFF EOP entries (= 0x7F8 dwords). CP will not submit
+	 * more than (EOP entry count - 1) so a queue size of 0x800 dwords
+	 * is safe, giving a maximum field value of 0xA.
+	 */
+	m->cp_hqd_eop_control |= min(0xA,
+		ffs(q->eop_ring_buffer_size / sizeof(unsigned int)) - 1 - 1);
 	m->cp_hqd_eop_base_addr_lo =
 			lower_32_bits(q->eop_ring_buffer_address >> 8);
 	m->cp_hqd_eop_base_addr_hi =
-- 
2.7.4


* [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
       [not found] ` <1502488589-30272-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (14 preceding siblings ...)
  2017-08-11 21:56   ` [PATCH 15/19] drm/amdkfd: Clamp EOP queue size correctly on Gfx8 Felix Kuehling
@ 2017-08-11 21:56   ` Felix Kuehling
       [not found]     ` <1502488589-30272-17-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-11 21:56   ` [PATCH 17/19] drm/amdgpu: Remove hard-coded assumptions about compute pipes Felix Kuehling
                     ` (3 subsequent siblings)
  19 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

To match current firmware. The map process packet has been extended
to support scratch memory. This is a non-backwards-compatible change
that is about two years old, so there is no point in keeping the old
version around conditionally.
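
For reference, a minimal sketch of how the renamed VI structures are
filled in (mirrors build_pm4_header() in the hunk below; note that
u32all becomes u32All):

	union PM4_MES_TYPE_3_HEADER header;

	header.u32All = 0;
	header.opcode = IT_MAP_PROCESS;
	header.count  = sizeof(struct pm4_mes_map_process) /
			sizeof(uint32_t) - 2;	/* dwords after the header, minus 1 */
	header.type   = PM4_TYPE_3;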

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c         |   8 +-
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 161 ++++--------
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h    | 314 +++---------------------
 drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h | 130 +++++++++-
 4 files changed, 199 insertions(+), 414 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index e1c2ad2..e790e7f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -26,7 +26,7 @@
 #include <linux/slab.h>
 #include "kfd_priv.h"
 #include "kfd_device_queue_manager.h"
-#include "kfd_pm4_headers.h"
+#include "kfd_pm4_headers_vi.h"
 
 #define MQD_SIZE_ALIGNED 768
 
@@ -238,9 +238,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 	 * calculate max size of runlist packet.
 	 * There can be only 2 packets at once
 	 */
-	size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct pm4_map_process) +
-		max_num_of_queues_per_device *
-		sizeof(struct pm4_map_queues) + sizeof(struct pm4_runlist)) * 2;
+	size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct pm4_mes_map_process) +
+		max_num_of_queues_per_device * sizeof(struct pm4_mes_map_queues)
+		+ sizeof(struct pm4_mes_runlist)) * 2;
 
 	/* Add size of HIQ & DIQ */
 	size += KFD_KERNEL_QUEUE_SIZE * 2;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
index 77a6f2b..3141e05 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -26,7 +26,6 @@
 #include "kfd_device_queue_manager.h"
 #include "kfd_kernel_queue.h"
 #include "kfd_priv.h"
-#include "kfd_pm4_headers.h"
 #include "kfd_pm4_headers_vi.h"
 #include "kfd_pm4_opcodes.h"
 
@@ -44,12 +43,12 @@ static unsigned int build_pm4_header(unsigned int opcode, size_t packet_size)
 {
 	union PM4_MES_TYPE_3_HEADER header;
 
-	header.u32all = 0;
+	header.u32All = 0;
 	header.opcode = opcode;
 	header.count = packet_size/sizeof(uint32_t) - 2;
 	header.type = PM4_TYPE_3;
 
-	return header.u32all;
+	return header.u32All;
 }
 
 static void pm_calc_rlib_size(struct packet_manager *pm,
@@ -69,12 +68,9 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
 		pr_debug("Over subscribed runlist\n");
 	}
 
-	map_queue_size =
-		(pm->dqm->dev->device_info->asic_family == CHIP_CARRIZO) ?
-		sizeof(struct pm4_mes_map_queues) :
-		sizeof(struct pm4_map_queues);
+	map_queue_size = sizeof(struct pm4_mes_map_queues);
 	/* calculate run list ib allocation size */
-	*rlib_size = process_count * sizeof(struct pm4_map_process) +
+	*rlib_size = process_count * sizeof(struct pm4_mes_map_process) +
 		     queue_count * map_queue_size;
 
 	/*
@@ -82,7 +78,7 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
 	 * when over subscription
 	 */
 	if (*over_subscription)
-		*rlib_size += sizeof(struct pm4_runlist);
+		*rlib_size += sizeof(struct pm4_mes_runlist);
 
 	pr_debug("runlist ib size %d\n", *rlib_size);
 }
@@ -119,16 +115,16 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
 static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
 			uint64_t ib, size_t ib_size_in_dwords, bool chain)
 {
-	struct pm4_runlist *packet;
+	struct pm4_mes_runlist *packet;
 
 	if (WARN_ON(!ib))
 		return -EFAULT;
 
-	packet = (struct pm4_runlist *)buffer;
+	packet = (struct pm4_mes_runlist *)buffer;
 
-	memset(buffer, 0, sizeof(struct pm4_runlist));
-	packet->header.u32all = build_pm4_header(IT_RUN_LIST,
-						sizeof(struct pm4_runlist));
+	memset(buffer, 0, sizeof(struct pm4_mes_runlist));
+	packet->header.u32All = build_pm4_header(IT_RUN_LIST,
+						sizeof(struct pm4_mes_runlist));
 
 	packet->bitfields4.ib_size = ib_size_in_dwords;
 	packet->bitfields4.chain = chain ? 1 : 0;
@@ -143,16 +139,16 @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
 static int pm_create_map_process(struct packet_manager *pm, uint32_t *buffer,
 				struct qcm_process_device *qpd)
 {
-	struct pm4_map_process *packet;
+	struct pm4_mes_map_process *packet;
 	struct queue *cur;
 	uint32_t num_queues;
 
-	packet = (struct pm4_map_process *)buffer;
+	packet = (struct pm4_mes_map_process *)buffer;
 
-	memset(buffer, 0, sizeof(struct pm4_map_process));
+	memset(buffer, 0, sizeof(struct pm4_mes_map_process));
 
-	packet->header.u32all = build_pm4_header(IT_MAP_PROCESS,
-					sizeof(struct pm4_map_process));
+	packet->header.u32All = build_pm4_header(IT_MAP_PROCESS,
+					sizeof(struct pm4_mes_map_process));
 	packet->bitfields2.diq_enable = (qpd->is_debug) ? 1 : 0;
 	packet->bitfields2.process_quantum = 1;
 	packet->bitfields2.pasid = qpd->pqm->process->pasid;
@@ -170,23 +166,26 @@ static int pm_create_map_process(struct packet_manager *pm, uint32_t *buffer,
 	packet->sh_mem_ape1_base = qpd->sh_mem_ape1_base;
 	packet->sh_mem_ape1_limit = qpd->sh_mem_ape1_limit;
 
+	/* TODO: scratch support */
+	packet->sh_hidden_private_base_vmid = 0;
+
 	packet->gds_addr_lo = lower_32_bits(qpd->gds_context_area);
 	packet->gds_addr_hi = upper_32_bits(qpd->gds_context_area);
 
 	return 0;
 }
 
-static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
+static int pm_create_map_queue(struct packet_manager *pm, uint32_t *buffer,
 		struct queue *q, bool is_static)
 {
 	struct pm4_mes_map_queues *packet;
 	bool use_static = is_static;
 
 	packet = (struct pm4_mes_map_queues *)buffer;
-	memset(buffer, 0, sizeof(struct pm4_map_queues));
+	memset(buffer, 0, sizeof(struct pm4_mes_map_queues));
 
-	packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
-						sizeof(struct pm4_map_queues));
+	packet->header.u32All = build_pm4_header(IT_MAP_QUEUES,
+						sizeof(struct pm4_mes_map_queues));
 	packet->bitfields2.alloc_format =
 		alloc_format__mes_map_queues__one_per_pipe_vi;
 	packet->bitfields2.num_queues = 1;
@@ -235,64 +234,6 @@ static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
 	return 0;
 }
 
-static int pm_create_map_queue(struct packet_manager *pm, uint32_t *buffer,
-				struct queue *q, bool is_static)
-{
-	struct pm4_map_queues *packet;
-	bool use_static = is_static;
-
-	packet = (struct pm4_map_queues *)buffer;
-	memset(buffer, 0, sizeof(struct pm4_map_queues));
-
-	packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
-						sizeof(struct pm4_map_queues));
-	packet->bitfields2.alloc_format =
-				alloc_format__mes_map_queues__one_per_pipe;
-	packet->bitfields2.num_queues = 1;
-	packet->bitfields2.queue_sel =
-		queue_sel__mes_map_queues__map_to_hws_determined_queue_slots;
-
-	packet->bitfields2.vidmem = (q->properties.is_interop) ?
-			vidmem__mes_map_queues__uses_video_memory :
-			vidmem__mes_map_queues__uses_no_video_memory;
-
-	switch (q->properties.type) {
-	case KFD_QUEUE_TYPE_COMPUTE:
-	case KFD_QUEUE_TYPE_DIQ:
-		packet->bitfields2.engine_sel =
-				engine_sel__mes_map_queues__compute;
-		break;
-	case KFD_QUEUE_TYPE_SDMA:
-		packet->bitfields2.engine_sel =
-				engine_sel__mes_map_queues__sdma0;
-		use_static = false; /* no static queues under SDMA */
-		break;
-	default:
-		WARN(1, "queue type %d", q->properties.type);
-		return -EINVAL;
-	}
-
-	packet->mes_map_queues_ordinals[0].bitfields3.doorbell_offset =
-			q->properties.doorbell_off;
-
-	packet->mes_map_queues_ordinals[0].bitfields3.is_static =
-			(use_static) ? 1 : 0;
-
-	packet->mes_map_queues_ordinals[0].mqd_addr_lo =
-			lower_32_bits(q->gart_mqd_addr);
-
-	packet->mes_map_queues_ordinals[0].mqd_addr_hi =
-			upper_32_bits(q->gart_mqd_addr);
-
-	packet->mes_map_queues_ordinals[0].wptr_addr_lo =
-			lower_32_bits((uint64_t)q->properties.write_ptr);
-
-	packet->mes_map_queues_ordinals[0].wptr_addr_hi =
-			upper_32_bits((uint64_t)q->properties.write_ptr);
-
-	return 0;
-}
-
 static int pm_create_runlist_ib(struct packet_manager *pm,
 				struct list_head *queues,
 				uint64_t *rl_gpu_addr,
@@ -334,7 +275,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 			return retval;
 
 		proccesses_mapped++;
-		inc_wptr(&rl_wptr, sizeof(struct pm4_map_process),
+		inc_wptr(&rl_wptr, sizeof(struct pm4_mes_map_process),
 				alloc_size_bytes);
 
 		list_for_each_entry(kq, &qpd->priv_queue_list, list) {
@@ -344,14 +285,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 			pr_debug("static_queue, mapping kernel q %d, is debug status %d\n",
 				kq->queue->queue, qpd->is_debug);
 
-			if (pm->dqm->dev->device_info->asic_family ==
-					CHIP_CARRIZO)
-				retval = pm_create_map_queue_vi(pm,
-						&rl_buffer[rl_wptr],
-						kq->queue,
-						qpd->is_debug);
-			else
-				retval = pm_create_map_queue(pm,
+			retval = pm_create_map_queue(pm,
 						&rl_buffer[rl_wptr],
 						kq->queue,
 						qpd->is_debug);
@@ -359,7 +293,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 				return retval;
 
 			inc_wptr(&rl_wptr,
-				sizeof(struct pm4_map_queues),
+				sizeof(struct pm4_mes_map_queues),
 				alloc_size_bytes);
 		}
 
@@ -370,14 +304,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 			pr_debug("static_queue, mapping user queue %d, is debug status %d\n",
 				q->queue, qpd->is_debug);
 
-			if (pm->dqm->dev->device_info->asic_family ==
-					CHIP_CARRIZO)
-				retval = pm_create_map_queue_vi(pm,
-						&rl_buffer[rl_wptr],
-						q,
-						qpd->is_debug);
-			else
-				retval = pm_create_map_queue(pm,
+			retval = pm_create_map_queue(pm,
 						&rl_buffer[rl_wptr],
 						q,
 						qpd->is_debug);
@@ -386,7 +313,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 				return retval;
 
 			inc_wptr(&rl_wptr,
-				sizeof(struct pm4_map_queues),
+				sizeof(struct pm4_mes_map_queues),
 				alloc_size_bytes);
 		}
 	}
@@ -429,7 +356,7 @@ void pm_uninit(struct packet_manager *pm)
 int pm_send_set_resources(struct packet_manager *pm,
 				struct scheduling_resources *res)
 {
-	struct pm4_set_resources *packet;
+	struct pm4_mes_set_resources *packet;
 	int retval = 0;
 
 	mutex_lock(&pm->lock);
@@ -442,9 +369,9 @@ int pm_send_set_resources(struct packet_manager *pm,
 		goto out;
 	}
 
-	memset(packet, 0, sizeof(struct pm4_set_resources));
-	packet->header.u32all = build_pm4_header(IT_SET_RESOURCES,
-					sizeof(struct pm4_set_resources));
+	memset(packet, 0, sizeof(struct pm4_mes_set_resources));
+	packet->header.u32All = build_pm4_header(IT_SET_RESOURCES,
+					sizeof(struct pm4_mes_set_resources));
 
 	packet->bitfields2.queue_type =
 			queue_type__mes_set_resources__hsa_interface_queue_hiq;
@@ -482,7 +409,7 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
 
 	pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
 
-	packet_size_dwords = sizeof(struct pm4_runlist) / sizeof(uint32_t);
+	packet_size_dwords = sizeof(struct pm4_mes_runlist) / sizeof(uint32_t);
 	mutex_lock(&pm->lock);
 
 	retval = pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
@@ -514,7 +441,7 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
 			uint32_t fence_value)
 {
 	int retval;
-	struct pm4_query_status *packet;
+	struct pm4_mes_query_status *packet;
 
 	if (WARN_ON(!fence_address))
 		return -EFAULT;
@@ -522,13 +449,13 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
 	mutex_lock(&pm->lock);
 	retval = pm->priv_queue->ops.acquire_packet_buffer(
 			pm->priv_queue,
-			sizeof(struct pm4_query_status) / sizeof(uint32_t),
+			sizeof(struct pm4_mes_query_status) / sizeof(uint32_t),
 			(unsigned int **)&packet);
 	if (retval)
 		goto fail_acquire_packet_buffer;
 
-	packet->header.u32all = build_pm4_header(IT_QUERY_STATUS,
-					sizeof(struct pm4_query_status));
+	packet->header.u32All = build_pm4_header(IT_QUERY_STATUS,
+					sizeof(struct pm4_mes_query_status));
 
 	packet->bitfields2.context_id = 0;
 	packet->bitfields2.interrupt_sel =
@@ -555,22 +482,22 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 {
 	int retval;
 	uint32_t *buffer;
-	struct pm4_unmap_queues *packet;
+	struct pm4_mes_unmap_queues *packet;
 
 	mutex_lock(&pm->lock);
 	retval = pm->priv_queue->ops.acquire_packet_buffer(
 			pm->priv_queue,
-			sizeof(struct pm4_unmap_queues) / sizeof(uint32_t),
+			sizeof(struct pm4_mes_unmap_queues) / sizeof(uint32_t),
 			&buffer);
 	if (retval)
 		goto err_acquire_packet_buffer;
 
-	packet = (struct pm4_unmap_queues *)buffer;
-	memset(buffer, 0, sizeof(struct pm4_unmap_queues));
+	packet = (struct pm4_mes_unmap_queues *)buffer;
+	memset(buffer, 0, sizeof(struct pm4_mes_unmap_queues));
 	pr_debug("static_queue: unmapping queues: mode is %d , reset is %d , type is %d\n",
 		mode, reset, type);
-	packet->header.u32all = build_pm4_header(IT_UNMAP_QUEUES,
-					sizeof(struct pm4_unmap_queues));
+	packet->header.u32All = build_pm4_header(IT_UNMAP_QUEUES,
+					sizeof(struct pm4_mes_unmap_queues));
 	switch (type) {
 	case KFD_QUEUE_TYPE_COMPUTE:
 	case KFD_QUEUE_TYPE_DIQ:
@@ -608,12 +535,12 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 		break;
 	case KFD_PREEMPT_TYPE_FILTER_ALL_QUEUES:
 		packet->bitfields2.queue_sel =
-				queue_sel__mes_unmap_queues__perform_request_on_all_active_queues;
+				queue_sel__mes_unmap_queues__unmap_all_queues;
 		break;
 	case KFD_PREEMPT_TYPE_FILTER_DYNAMIC_QUEUES:
 		/* in this case, we do not preempt static queues */
 		packet->bitfields2.queue_sel =
-				queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only;
+				queue_sel__mes_unmap_queues__unmap_all_non_static_queues;
 		break;
 	default:
 		WARN(1, "filter %d", mode);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
index 97e5442..e50f73d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
@@ -41,99 +41,6 @@ union PM4_MES_TYPE_3_HEADER {
 };
 #endif /* PM4_MES_HEADER_DEFINED */
 
-/* --------------------MES_SET_RESOURCES-------------------- */
-
-#ifndef PM4_MES_SET_RESOURCES_DEFINED
-#define PM4_MES_SET_RESOURCES_DEFINED
-enum set_resources_queue_type_enum {
-	queue_type__mes_set_resources__kernel_interface_queue_kiq = 0,
-	queue_type__mes_set_resources__hsa_interface_queue_hiq = 1,
-	queue_type__mes_set_resources__hsa_debug_interface_queue = 4
-};
-
-struct pm4_set_resources {
-	union {
-		union PM4_MES_TYPE_3_HEADER header;	/* header */
-		uint32_t ordinal1;
-	};
-
-	union {
-		struct {
-			uint32_t vmid_mask:16;
-			uint32_t unmap_latency:8;
-			uint32_t reserved1:5;
-			enum set_resources_queue_type_enum queue_type:3;
-		} bitfields2;
-		uint32_t ordinal2;
-	};
-
-	uint32_t queue_mask_lo;
-	uint32_t queue_mask_hi;
-	uint32_t gws_mask_lo;
-	uint32_t gws_mask_hi;
-
-	union {
-		struct {
-			uint32_t oac_mask:16;
-			uint32_t reserved2:16;
-		} bitfields7;
-		uint32_t ordinal7;
-	};
-
-	union {
-		struct {
-			uint32_t gds_heap_base:6;
-			uint32_t reserved3:5;
-			uint32_t gds_heap_size:6;
-			uint32_t reserved4:15;
-		} bitfields8;
-		uint32_t ordinal8;
-	};
-
-};
-#endif
-
-/*--------------------MES_RUN_LIST-------------------- */
-
-#ifndef PM4_MES_RUN_LIST_DEFINED
-#define PM4_MES_RUN_LIST_DEFINED
-
-struct pm4_runlist {
-	union {
-		union PM4_MES_TYPE_3_HEADER header;	/* header */
-		uint32_t ordinal1;
-	};
-
-	union {
-		struct {
-			uint32_t reserved1:2;
-			uint32_t ib_base_lo:30;
-		} bitfields2;
-		uint32_t ordinal2;
-	};
-
-	union {
-		struct {
-			uint32_t ib_base_hi:16;
-			uint32_t reserved2:16;
-		} bitfields3;
-		uint32_t ordinal3;
-	};
-
-	union {
-		struct {
-			uint32_t ib_size:20;
-			uint32_t chain:1;
-			uint32_t offload_polling:1;
-			uint32_t reserved3:1;
-			uint32_t valid:1;
-			uint32_t reserved4:8;
-		} bitfields4;
-		uint32_t ordinal4;
-	};
-
-};
-#endif
 
 /*--------------------MES_MAP_PROCESS-------------------- */
 
@@ -186,217 +93,58 @@ struct pm4_map_process {
 };
 #endif
 
-/*--------------------MES_MAP_QUEUES--------------------*/
-
-#ifndef PM4_MES_MAP_QUEUES_DEFINED
-#define PM4_MES_MAP_QUEUES_DEFINED
-enum map_queues_queue_sel_enum {
-	queue_sel__mes_map_queues__map_to_specified_queue_slots = 0,
-	queue_sel__mes_map_queues__map_to_hws_determined_queue_slots = 1,
-	queue_sel__mes_map_queues__enable_process_queues = 2
-};
+#ifndef PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
+#define PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
 
-enum map_queues_vidmem_enum {
-	vidmem__mes_map_queues__uses_no_video_memory = 0,
-	vidmem__mes_map_queues__uses_video_memory = 1
-};
-
-enum map_queues_alloc_format_enum {
-	alloc_format__mes_map_queues__one_per_pipe = 0,
-	alloc_format__mes_map_queues__all_on_one_pipe = 1
-};
-
-enum map_queues_engine_sel_enum {
-	engine_sel__mes_map_queues__compute = 0,
-	engine_sel__mes_map_queues__sdma0 = 2,
-	engine_sel__mes_map_queues__sdma1 = 3
-};
-
-struct pm4_map_queues {
+struct pm4_map_process_scratch_kv {
 	union {
-		union PM4_MES_TYPE_3_HEADER header;	/* header */
-		uint32_t ordinal1;
-	};
-
-	union {
-		struct {
-			uint32_t reserved1:4;
-			enum map_queues_queue_sel_enum queue_sel:2;
-			uint32_t reserved2:2;
-			uint32_t vmid:4;
-			uint32_t reserved3:4;
-			enum map_queues_vidmem_enum vidmem:2;
-			uint32_t reserved4:6;
-			enum map_queues_alloc_format_enum alloc_format:2;
-			enum map_queues_engine_sel_enum engine_sel:3;
-			uint32_t num_queues:3;
-		} bitfields2;
-		uint32_t ordinal2;
-	};
-
-	struct {
-		union {
-			struct {
-				uint32_t is_static:1;
-				uint32_t reserved5:1;
-				uint32_t doorbell_offset:21;
-				uint32_t reserved6:3;
-				uint32_t queue:6;
-			} bitfields3;
-			uint32_t ordinal3;
-		};
-
-		uint32_t mqd_addr_lo;
-		uint32_t mqd_addr_hi;
-		uint32_t wptr_addr_lo;
-		uint32_t wptr_addr_hi;
-
-	} mes_map_queues_ordinals[1];	/* 1..N of these ordinal groups */
-
-};
-#endif
-
-/*--------------------MES_QUERY_STATUS--------------------*/
-
-#ifndef PM4_MES_QUERY_STATUS_DEFINED
-#define PM4_MES_QUERY_STATUS_DEFINED
-enum query_status_interrupt_sel_enum {
-	interrupt_sel__mes_query_status__completion_status = 0,
-	interrupt_sel__mes_query_status__process_status = 1,
-	interrupt_sel__mes_query_status__queue_status = 2
-};
-
-enum query_status_command_enum {
-	command__mes_query_status__interrupt_only = 0,
-	command__mes_query_status__fence_only_immediate = 1,
-	command__mes_query_status__fence_only_after_write_ack = 2,
-	command__mes_query_status__fence_wait_for_write_ack_send_interrupt = 3
-};
-
-enum query_status_engine_sel_enum {
-	engine_sel__mes_query_status__compute = 0,
-	engine_sel__mes_query_status__sdma0_queue = 2,
-	engine_sel__mes_query_status__sdma1_queue = 3
-};
-
-struct pm4_query_status {
-	union {
-		union PM4_MES_TYPE_3_HEADER header;	/* header */
-		uint32_t ordinal1;
-	};
-
-	union {
-		struct {
-			uint32_t context_id:28;
-			enum query_status_interrupt_sel_enum interrupt_sel:2;
-			enum query_status_command_enum command:2;
-		} bitfields2;
-		uint32_t ordinal2;
+		union PM4_MES_TYPE_3_HEADER   header; /* header */
+		uint32_t            ordinal1;
 	};
 
 	union {
 		struct {
 			uint32_t pasid:16;
-			uint32_t reserved1:16;
-		} bitfields3a;
-		struct {
-			uint32_t reserved2:2;
-			uint32_t doorbell_offset:21;
-			uint32_t reserved3:3;
-			enum query_status_engine_sel_enum engine_sel:3;
-			uint32_t reserved4:3;
-		} bitfields3b;
-		uint32_t ordinal3;
-	};
-
-	uint32_t addr_lo;
-	uint32_t addr_hi;
-	uint32_t data_lo;
-	uint32_t data_hi;
-};
-#endif
-
-/*--------------------MES_UNMAP_QUEUES--------------------*/
-
-#ifndef PM4_MES_UNMAP_QUEUES_DEFINED
-#define PM4_MES_UNMAP_QUEUES_DEFINED
-enum unmap_queues_action_enum {
-	action__mes_unmap_queues__preempt_queues = 0,
-	action__mes_unmap_queues__reset_queues = 1,
-	action__mes_unmap_queues__disable_process_queues = 2
-};
-
-enum unmap_queues_queue_sel_enum {
-	queue_sel__mes_unmap_queues__perform_request_on_specified_queues = 0,
-	queue_sel__mes_unmap_queues__perform_request_on_pasid_queues = 1,
-	queue_sel__mes_unmap_queues__perform_request_on_all_active_queues = 2,
-	queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only = 3
-};
-
-enum unmap_queues_engine_sel_enum {
-	engine_sel__mes_unmap_queues__compute = 0,
-	engine_sel__mes_unmap_queues__sdma0 = 2,
-	engine_sel__mes_unmap_queues__sdma1 = 3
-};
-
-struct pm4_unmap_queues {
-	union {
-		union PM4_MES_TYPE_3_HEADER header;	/* header */
-		uint32_t ordinal1;
-	};
-
-	union {
-		struct {
-			enum unmap_queues_action_enum action:2;
-			uint32_t reserved1:2;
-			enum unmap_queues_queue_sel_enum queue_sel:2;
-			uint32_t reserved2:20;
-			enum unmap_queues_engine_sel_enum engine_sel:3;
-			uint32_t num_queues:3;
+			uint32_t reserved1:8;
+			uint32_t diq_enable:1;
+			uint32_t process_quantum:7;
 		} bitfields2;
 		uint32_t ordinal2;
 	};
 
 	union {
 		struct {
-			uint32_t pasid:16;
-			uint32_t reserved3:16;
-		} bitfields3a;
-		struct {
-			uint32_t reserved4:2;
-			uint32_t doorbell_offset0:21;
-			uint32_t reserved5:9;
-		} bitfields3b;
+			uint32_t page_table_base:28;
+			uint32_t reserved2:4;
+		} bitfields3;
 		uint32_t ordinal3;
 	};
 
-	union {
-		struct {
-			uint32_t reserved6:2;
-			uint32_t doorbell_offset1:21;
-			uint32_t reserved7:9;
-		} bitfields4;
-		uint32_t ordinal4;
-	};
-
-	union {
-		struct {
-			uint32_t reserved8:2;
-			uint32_t doorbell_offset2:21;
-			uint32_t reserved9:9;
-		} bitfields5;
-		uint32_t ordinal5;
-	};
+	uint32_t reserved3;
+	uint32_t sh_mem_bases;
+	uint32_t sh_mem_config;
+	uint32_t sh_mem_ape1_base;
+	uint32_t sh_mem_ape1_limit;
+	uint32_t sh_hidden_private_base_vmid;
+	uint32_t reserved4;
+	uint32_t reserved5;
+	uint32_t gds_addr_lo;
+	uint32_t gds_addr_hi;
 
 	union {
 		struct {
-			uint32_t reserved10:2;
-			uint32_t doorbell_offset3:21;
-			uint32_t reserved11:9;
-		} bitfields6;
-		uint32_t ordinal6;
+			uint32_t num_gws:6;
+			uint32_t reserved6:2;
+			uint32_t num_oac:4;
+			uint32_t reserved7:4;
+			uint32_t gds_size:6;
+			uint32_t num_queues:10;
+		} bitfields14;
+		uint32_t ordinal14;
 	};
 
+	uint32_t completion_signal_lo32;
+	uint32_t completion_signal_hi32;
 };
 #endif
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
index c4eda6f..7c8d9b3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
@@ -126,9 +126,10 @@ struct pm4_mes_runlist {
 			uint32_t ib_size:20;
 			uint32_t chain:1;
 			uint32_t offload_polling:1;
-			uint32_t reserved3:1;
+			uint32_t reserved2:1;
 			uint32_t valid:1;
-			uint32_t reserved4:8;
+			uint32_t process_cnt:4;
+			uint32_t reserved3:4;
 		} bitfields4;
 		uint32_t ordinal4;
 	};
@@ -143,8 +144,8 @@ struct pm4_mes_runlist {
 
 struct pm4_mes_map_process {
 	union {
-		union PM4_MES_TYPE_3_HEADER   header;            /* header */
-		uint32_t            ordinal1;
+		union PM4_MES_TYPE_3_HEADER header;	/* header */
+		uint32_t ordinal1;
 	};
 
 	union {
@@ -155,36 +156,48 @@ struct pm4_mes_map_process {
 			uint32_t process_quantum:7;
 		} bitfields2;
 		uint32_t ordinal2;
-};
+	};
 
 	union {
 		struct {
 			uint32_t page_table_base:28;
-			uint32_t reserved2:4;
+			uint32_t reserved3:4;
 		} bitfields3;
 		uint32_t ordinal3;
 	};
 
+	uint32_t reserved;
+
 	uint32_t sh_mem_bases;
+	uint32_t sh_mem_config;
 	uint32_t sh_mem_ape1_base;
 	uint32_t sh_mem_ape1_limit;
-	uint32_t sh_mem_config;
+
+	uint32_t sh_hidden_private_base_vmid;
+
+	uint32_t reserved2;
+	uint32_t reserved3;
+
 	uint32_t gds_addr_lo;
 	uint32_t gds_addr_hi;
 
 	union {
 		struct {
 			uint32_t num_gws:6;
-			uint32_t reserved3:2;
+			uint32_t reserved4:2;
 			uint32_t num_oac:4;
-			uint32_t reserved4:4;
+			uint32_t reserved5:4;
 			uint32_t gds_size:6;
 			uint32_t num_queues:10;
 		} bitfields10;
 		uint32_t ordinal10;
 	};
 
+	uint32_t completion_signal_lo;
+	uint32_t completion_signal_hi;
+
 };
+
 #endif
 
 /*--------------------MES_MAP_QUEUES--------------------*/
@@ -337,7 +350,7 @@ enum mes_unmap_queues_engine_sel_enum {
 	engine_sel__mes_unmap_queues__sdmal = 3
 };
 
-struct PM4_MES_UNMAP_QUEUES {
+struct pm4_mes_unmap_queues {
 	union {
 		union PM4_MES_TYPE_3_HEADER   header;            /* header */
 		uint32_t            ordinal1;
@@ -397,4 +410,101 @@ struct PM4_MES_UNMAP_QUEUES {
 };
 #endif
 
+#ifndef PM4_MEC_RELEASE_MEM_DEFINED
+#define PM4_MEC_RELEASE_MEM_DEFINED
+enum RELEASE_MEM_event_index_enum {
+	event_index___release_mem__end_of_pipe = 5,
+	event_index___release_mem__shader_done = 6
+};
+
+enum RELEASE_MEM_cache_policy_enum {
+	cache_policy___release_mem__lru = 0,
+	cache_policy___release_mem__stream = 1,
+	cache_policy___release_mem__bypass = 2
+};
+
+enum RELEASE_MEM_dst_sel_enum {
+	dst_sel___release_mem__memory_controller = 0,
+	dst_sel___release_mem__tc_l2 = 1,
+	dst_sel___release_mem__queue_write_pointer_register = 2,
+	dst_sel___release_mem__queue_write_pointer_poll_mask_bit = 3
+};
+
+enum RELEASE_MEM_int_sel_enum {
+	int_sel___release_mem__none = 0,
+	int_sel___release_mem__send_interrupt_only = 1,
+	int_sel___release_mem__send_interrupt_after_write_confirm = 2,
+	int_sel___release_mem__send_data_after_write_confirm = 3
+};
+
+enum RELEASE_MEM_data_sel_enum {
+	data_sel___release_mem__none = 0,
+	data_sel___release_mem__send_32_bit_low = 1,
+	data_sel___release_mem__send_64_bit_data = 2,
+	data_sel___release_mem__send_gpu_clock_counter = 3,
+	data_sel___release_mem__send_cp_perfcounter_hi_lo = 4,
+	data_sel___release_mem__store_gds_data_to_memory = 5
+};
+
+struct pm4_mec_release_mem {
+	union {
+		union PM4_MES_TYPE_3_HEADER header;     /*header */
+		unsigned int ordinal1;
+	};
+
+	union {
+		struct {
+			unsigned int event_type:6;
+			unsigned int reserved1:2;
+			enum RELEASE_MEM_event_index_enum event_index:4;
+			unsigned int tcl1_vol_action_ena:1;
+			unsigned int tc_vol_action_ena:1;
+			unsigned int reserved2:1;
+			unsigned int tc_wb_action_ena:1;
+			unsigned int tcl1_action_ena:1;
+			unsigned int tc_action_ena:1;
+			unsigned int reserved3:6;
+			unsigned int atc:1;
+			enum RELEASE_MEM_cache_policy_enum cache_policy:2;
+			unsigned int reserved4:5;
+		} bitfields2;
+		unsigned int ordinal2;
+	};
+
+	union {
+		struct {
+			unsigned int reserved5:16;
+			enum RELEASE_MEM_dst_sel_enum dst_sel:2;
+			unsigned int reserved6:6;
+			enum RELEASE_MEM_int_sel_enum int_sel:3;
+			unsigned int reserved7:2;
+			enum RELEASE_MEM_data_sel_enum data_sel:3;
+		} bitfields3;
+		unsigned int ordinal3;
+	};
+
+	union {
+		struct {
+			unsigned int reserved8:2;
+			unsigned int address_lo_32b:30;
+		} bitfields4;
+		struct {
+			unsigned int reserved9:3;
+			unsigned int address_lo_64b:29;
+		} bitfields5;
+		unsigned int ordinal4;
+	};
+
+	unsigned int address_hi;
+
+	unsigned int data_lo;
+
+	unsigned int data_hi;
+};
+#endif
+
+enum {
+	CACHE_FLUSH_AND_INV_TS_EVENT = 0x00000014
+};
+
 #endif
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 70+ messages in thread
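
For reference, a hypothetical helper sketching one way the new
pm4_mec_release_mem packet could be filled for a 64-bit end-of-pipe fence
that raises an interrupt after the write is confirmed. It assumes the
struct, enums and CACHE_FLUSH_AND_INV_TS_EVENT definition added above plus
the kernel's lower_32_bits()/upper_32_bits() helpers; build_release_mem()
itself is illustrative and not part of the patch.

#include <linux/kernel.h>
#include <linux/string.h>

static void build_release_mem(struct pm4_mec_release_mem *packet,
			      uint64_t fence_gpu_addr, uint64_t fence_value)
{
	memset(packet, 0, sizeof(*packet));

	/* The TYPE_3 header (opcode/count) is left to the caller. */

	/* End-of-pipe timestamp event, flush/invalidate TC and TCL1 */
	packet->bitfields2.event_type = CACHE_FLUSH_AND_INV_TS_EVENT;
	packet->bitfields2.event_index =
		event_index___release_mem__end_of_pipe;
	packet->bitfields2.tcl1_action_ena = 1;
	packet->bitfields2.tc_action_ena = 1;
	packet->bitfields2.cache_policy = cache_policy___release_mem__lru;

	/* Write a 64-bit fence value and interrupt once it has landed */
	packet->bitfields3.data_sel = data_sel___release_mem__send_64_bit_data;
	packet->bitfields3.int_sel =
		int_sel___release_mem__send_interrupt_after_write_confirm;

	/* Dword-aligned destination address, split into low/high parts */
	packet->bitfields4.address_lo_32b = lower_32_bits(fence_gpu_addr) >> 2;
	packet->address_hi = upper_32_bits(fence_gpu_addr);
	packet->data_lo = lower_32_bits(fence_value);
	packet->data_hi = upper_32_bits(fence_value);
}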

* [PATCH 17/19] drm/amdgpu: Remove hard-coded assumptions about compute pipes
       [not found] ` <1502488589-30272-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (15 preceding siblings ...)
  2017-08-11 21:56   ` [PATCH 16/19] drm/amdkfd: Update PM4 packet headers Felix Kuehling
@ 2017-08-11 21:56   ` Felix Kuehling
       [not found]     ` <1502488589-30272-18-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-11 21:56   ` [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ Felix Kuehling
                     ` (2 subsequent siblings)
  19 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Remove hard-coded assumption that the first compute pipe is
reserved for amdgpu. Pipe 0 actually means pipe 0 now.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index 5936222..dfb8c74 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -186,7 +186,7 @@ static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
 {
 	struct amdgpu_device *adev = get_amdgpu_device(kgd);
 
-	uint32_t mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
+	uint32_t mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
 	uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
 	lock_srbm(kgd, mec, pipe, queue_id, 0);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 90271f6..0fccd30 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -147,7 +147,7 @@ static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
 {
 	struct amdgpu_device *adev = get_amdgpu_device(kgd);
 
-	uint32_t mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
+	uint32_t mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
 	uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
 	lock_srbm(kgd, mec, pipe, queue_id, 0);
@@ -216,7 +216,7 @@ static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id)
 	uint32_t mec;
 	uint32_t pipe;
 
-	mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
+	mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
 	pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
 	lock_srbm(kgd, mec, pipe, 0, 0);
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 70+ messages in thread
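
For reference, a minimal stand-alone sketch (not part of the patch) of what
dropping the "++" in acquire_queue()/kgd_init_interrupts() changes: with the
old code a KFD pipe_id of 0 landed on the second hardware pipe, while now
pipe 0 really is pipe 0. The value 4 below is only an example standing in
for adev->gfx.mec.num_pipe_per_mec.

#include <stdint.h>
#include <stdio.h>

static void map_pipe(uint32_t pipe_id, uint32_t num_pipe_per_mec)
{
	/* Old code: ++pipe_id skipped the first pipe. */
	uint32_t old_id = pipe_id + 1;
	uint32_t old_mec = (old_id / num_pipe_per_mec) + 1;
	uint32_t old_pipe = old_id % num_pipe_per_mec;

	/* New code: use pipe_id as-is. */
	uint32_t new_mec = (pipe_id / num_pipe_per_mec) + 1;
	uint32_t new_pipe = pipe_id % num_pipe_per_mec;

	printf("pipe_id %u: old MEC%u pipe %u -> new MEC%u pipe %u\n",
	       pipe_id, old_mec, old_pipe, new_mec, new_pipe);
}

int main(void)
{
	uint32_t id;

	for (id = 0; id < 8; id++)
		map_pipe(id, 4);	/* e.g. 4 pipes per MEC */
	return 0;
}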

* [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
       [not found] ` <1502488589-30272-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (16 preceding siblings ...)
  2017-08-11 21:56   ` [PATCH 17/19] drm/amdgpu: Remove hard-coded assumptions about compute pipes Felix Kuehling
@ 2017-08-11 21:56   ` Felix Kuehling
       [not found]     ` <1502488589-30272-19-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-11 21:56   ` [PATCH 19/19] drm/amd: Update MEC HQD loading code for KFD Felix Kuehling
  2017-08-12 12:28   ` [PATCH 00/19] KFD fixes and cleanups Oded Gabbay
  19 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

It's causing problems with user mode queues and the HIQ, and can
lead to hard hangs during boot after programming RLC_CP_SCHEDULERS.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/vi.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
index 18bb3cb..495c8a3 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -1029,8 +1029,7 @@ static int vi_common_early_init(void *handle)
 		/* rev0 hardware requires workarounds to support PG */
 		adev->pg_flags = 0;
 		if (adev->rev_id != 0x00 || CZ_REV_BRISTOL(adev->pdev->revision)) {
-			adev->pg_flags |= AMD_PG_SUPPORT_GFX_PG |
-				AMD_PG_SUPPORT_GFX_SMG |
+			adev->pg_flags |= AMD_PG_SUPPORT_GFX_SMG |
 				AMD_PG_SUPPORT_GFX_PIPELINE |
 				AMD_PG_SUPPORT_CP |
 				AMD_PG_SUPPORT_UVD |
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 19/19] drm/amd: Update MEC HQD loading code for KFD
       [not found] ` <1502488589-30272-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (17 preceding siblings ...)
  2017-08-11 21:56   ` [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ Felix Kuehling
@ 2017-08-11 21:56   ` Felix Kuehling
       [not found]     ` <1502488589-30272-20-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-12 12:28   ` [PATCH 00/19] KFD fixes and cleanups Oded Gabbay
  19 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-11 21:56 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Various bug fixes and improvements that accumulated over the last two
years: program the HQD registers directly from the MQD instead of
calling gfx_v*_0_mqd_commit, read user-mode write pointers safely via
the new read_user_wptr() helper, honor the requested preemption type
in hqd_destroy, and work around dequeueing while the IQ timer is
active.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h         |  16 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  | 130 +++++++++++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  | 165 ++++++++++++++++++---
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |   7 +-
 drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |   3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h       |   3 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  23 +--
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  16 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |   5 -
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h    |  11 +-
 drivers/gpu/drm/radeon/radeon_kfd.c                |  12 +-
 11 files changed, 322 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index b8802a5..8d689ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -26,6 +26,7 @@
 #define AMDGPU_AMDKFD_H_INCLUDED
 
 #include <linux/types.h>
+#include <linux/mmu_context.h>
 #include <kgd_kfd_interface.h>
 
 struct amdgpu_device;
@@ -60,4 +61,19 @@ uint64_t get_gpu_clock_counter(struct kgd_dev *kgd);
 
 uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);
 
+#define read_user_wptr(mmptr, wptr, dst)				\
+	({								\
+		bool valid = false;					\
+		if ((mmptr) && (wptr)) {				\
+			if ((mmptr) == current->mm) {			\
+				valid = !get_user((dst), (wptr));	\
+			} else if (current->mm == NULL) {		\
+				use_mm(mmptr);				\
+				valid = !get_user((dst), (wptr));	\
+				unuse_mm(mmptr);			\
+			}						\
+		}							\
+		valid;							\
+	})
+
 #endif /* AMDGPU_AMDKFD_H_INCLUDED */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index dfb8c74..994d262 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -39,6 +39,12 @@
 #include "gmc/gmc_7_1_sh_mask.h"
 #include "cik_structs.h"
 
+enum hqd_dequeue_request_type {
+	NO_ACTION = 0,
+	DRAIN_PIPE,
+	RESET_WAVES
+};
+
 enum {
 	MAX_TRAPID = 8,		/* 3 bits in the bitfield. */
 	MAX_WATCH_ADDRESSES = 4
@@ -96,12 +102,15 @@ static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id,
 				uint32_t hpd_size, uint64_t hpd_gpu_addr);
 static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id);
 static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-			uint32_t queue_id, uint32_t __user *wptr);
+			uint32_t queue_id, uint32_t __user *wptr,
+			uint32_t wptr_shift, uint32_t wptr_mask,
+			struct mm_struct *mm);
 static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd);
 static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
 				uint32_t pipe_id, uint32_t queue_id);
 
-static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
+static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd,
+				enum kfd_preempt_type reset_type,
 				unsigned int utimeout, uint32_t pipe_id,
 				uint32_t queue_id);
 static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd);
@@ -290,20 +299,38 @@ static inline struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd)
 }
 
 static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-			uint32_t queue_id, uint32_t __user *wptr)
+			uint32_t queue_id, uint32_t __user *wptr,
+			uint32_t wptr_shift, uint32_t wptr_mask,
+			struct mm_struct *mm)
 {
 	struct amdgpu_device *adev = get_amdgpu_device(kgd);
-	uint32_t wptr_shadow, is_wptr_shadow_valid;
 	struct cik_mqd *m;
+	uint32_t *mqd_hqd;
+	uint32_t reg, wptr_val, data;
 
 	m = get_mqd(mqd);
 
-	is_wptr_shadow_valid = !get_user(wptr_shadow, wptr);
-	if (is_wptr_shadow_valid)
-		m->cp_hqd_pq_wptr = wptr_shadow;
-
 	acquire_queue(kgd, pipe_id, queue_id);
-	gfx_v7_0_mqd_commit(adev, m);
+
+	/* HQD registers extend from CP_MQD_BASE_ADDR to CP_MQD_CONTROL. */
+	mqd_hqd = &m->cp_mqd_base_addr_lo;
+
+	for (reg = mmCP_MQD_BASE_ADDR; reg <= mmCP_MQD_CONTROL; reg++)
+		WREG32(reg, mqd_hqd[reg - mmCP_MQD_BASE_ADDR]);
+
+	/* Copy userspace write pointer value to register.
+	 * Activate doorbell logic to monitor subsequent changes.
+	 */
+	data = REG_SET_FIELD(m->cp_hqd_pq_doorbell_control,
+			     CP_HQD_PQ_DOORBELL_CONTROL, DOORBELL_EN, 1);
+	WREG32(mmCP_HQD_PQ_DOORBELL_CONTROL, data);
+
+	if (read_user_wptr(mm, wptr, wptr_val))
+		WREG32(mmCP_HQD_PQ_WPTR, (wptr_val << wptr_shift) & wptr_mask);
+
+	data = REG_SET_FIELD(m->cp_hqd_active, CP_HQD_ACTIVE, ACTIVE, 1);
+	WREG32(mmCP_HQD_ACTIVE, data);
+
 	release_queue(kgd);
 
 	return 0;
@@ -382,30 +409,99 @@ static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd)
 	return false;
 }
 
-static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
+static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd,
+				enum kfd_preempt_type reset_type,
 				unsigned int utimeout, uint32_t pipe_id,
 				uint32_t queue_id)
 {
 	struct amdgpu_device *adev = get_amdgpu_device(kgd);
 	uint32_t temp;
-	int timeout = utimeout;
+	enum hqd_dequeue_request_type type;
+	unsigned long flags, end_jiffies;
+	int retry;
 
 	acquire_queue(kgd, pipe_id, queue_id);
 	WREG32(mmCP_HQD_PQ_DOORBELL_CONTROL, 0);
 
-	WREG32(mmCP_HQD_DEQUEUE_REQUEST, reset_type);
+	switch (reset_type) {
+	case KFD_PREEMPT_TYPE_WAVEFRONT_DRAIN:
+		type = DRAIN_PIPE;
+		break;
+	case KFD_PREEMPT_TYPE_WAVEFRONT_RESET:
+		type = RESET_WAVES;
+		break;
+	default:
+		type = DRAIN_PIPE;
+		break;
+	}
 
+	/* Workaround: If IQ timer is active and the wait time is close to or
+	 * equal to 0, dequeueing is not safe. Wait until either the wait time
+	 * is larger or timer is cleared. Also, ensure that IQ_REQ_PEND is
+	 * cleared before continuing. Also, ensure wait times are set to at
+	 * least 0x3.
+	 */
+	local_irq_save(flags);
+	preempt_disable();
+	retry = 5000; /* wait for 500 usecs at maximum */
+	while (true) {
+		temp = RREG32(mmCP_HQD_IQ_TIMER);
+		if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, PROCESSING_IQ)) {
+			pr_debug("HW is processing IQ\n");
+			goto loop;
+		}
+		if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, ACTIVE)) {
+			if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, RETRY_TYPE)
+					== 3) /* SEM-rearm is safe */
+				break;
+			/* Wait time 3 is safe for CP, but our MMIO read/write
+			 * time is close to 1 microsecond, so check for 10 to
+			 * leave more buffer room
+			 */
+			if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, WAIT_TIME)
+					>= 10)
+				break;
+			pr_debug("IQ timer is active\n");
+		} else
+			break;
+loop:
+		if (!retry) {
+			pr_err("CP HQD IQ timer status time out\n");
+			break;
+		}
+		ndelay(100);
+		--retry;
+	}
+	retry = 1000;
+	while (true) {
+		temp = RREG32(mmCP_HQD_DEQUEUE_REQUEST);
+		if (!(temp & CP_HQD_DEQUEUE_REQUEST__IQ_REQ_PEND_MASK))
+			break;
+		pr_debug("Dequeue request is pending\n");
+
+		if (!retry) {
+			pr_err("CP HQD dequeue request time out\n");
+			break;
+		}
+		ndelay(100);
+		--retry;
+	}
+	local_irq_restore(flags);
+	preempt_enable();
+
+	WREG32(mmCP_HQD_DEQUEUE_REQUEST, type);
+
+	end_jiffies = (utimeout * HZ / 1000) + jiffies;
 	while (true) {
 		temp = RREG32(mmCP_HQD_ACTIVE);
-		if (temp & CP_HQD_ACTIVE__ACTIVE_MASK)
+		if (!(temp & CP_HQD_ACTIVE__ACTIVE_MASK))
 			break;
-		if (timeout <= 0) {
-			pr_err("kfd: cp queue preemption time out.\n");
+		if (time_after(jiffies, end_jiffies)) {
+			pr_err("cp queue preemption time out\n");
 			release_queue(kgd);
 			return -ETIME;
 		}
-		msleep(20);
-		timeout -= 20;
+		usleep_range(500, 1000);
 	}
 
 	release_queue(kgd);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 0fccd30..29a6f5d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -39,6 +39,12 @@
 #include "vi_structs.h"
 #include "vid.h"
 
+enum hqd_dequeue_request_type {
+	NO_ACTION = 0,
+	DRAIN_PIPE,
+	RESET_WAVES
+};
+
 struct cik_sdma_rlc_registers;
 
 /*
@@ -55,12 +61,15 @@ static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id,
 		uint32_t hpd_size, uint64_t hpd_gpu_addr);
 static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id);
 static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-		uint32_t queue_id, uint32_t __user *wptr);
+			uint32_t queue_id, uint32_t __user *wptr,
+			uint32_t wptr_shift, uint32_t wptr_mask,
+			struct mm_struct *mm);
 static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd);
 static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
 		uint32_t pipe_id, uint32_t queue_id);
 static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd);
-static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
+static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd,
+				enum kfd_preempt_type reset_type,
 				unsigned int utimeout, uint32_t pipe_id,
 				uint32_t queue_id);
 static int kgd_hqd_sdma_destroy(struct kgd_dev *kgd, void *mqd,
@@ -244,20 +253,67 @@ static inline struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd)
 }
 
 static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-			uint32_t queue_id, uint32_t __user *wptr)
+			uint32_t queue_id, uint32_t __user *wptr,
+			uint32_t wptr_shift, uint32_t wptr_mask,
+			struct mm_struct *mm)
 {
-	struct vi_mqd *m;
-	uint32_t shadow_wptr, valid_wptr;
 	struct amdgpu_device *adev = get_amdgpu_device(kgd);
+	struct vi_mqd *m;
+	uint32_t *mqd_hqd;
+	uint32_t reg, wptr_val, data;
 
 	m = get_mqd(mqd);
 
-	valid_wptr = copy_from_user(&shadow_wptr, wptr, sizeof(shadow_wptr));
-	if (valid_wptr == 0)
-		m->cp_hqd_pq_wptr = shadow_wptr;
-
 	acquire_queue(kgd, pipe_id, queue_id);
-	gfx_v8_0_mqd_commit(adev, mqd);
+
+	/* HIQ is set during driver init period with vmid set to 0*/
+	if (m->cp_hqd_vmid == 0) {
+		uint32_t value, mec, pipe;
+
+		mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
+		pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
+
+		pr_debug("kfd: set HIQ, mec:%d, pipe:%d, queue:%d.\n",
+			mec, pipe, queue_id);
+		value = RREG32(mmRLC_CP_SCHEDULERS);
+		value = REG_SET_FIELD(value, RLC_CP_SCHEDULERS, scheduler1,
+			((mec << 5) | (pipe << 3) | queue_id | 0x80));
+		WREG32(mmRLC_CP_SCHEDULERS, value);
+	}
+
+	/* HQD registers extend from CP_MQD_BASE_ADDR to CP_HQD_EOP_WPTR_MEM. */
+	mqd_hqd = &m->cp_mqd_base_addr_lo;
+
+	for (reg = mmCP_MQD_BASE_ADDR; reg <= mmCP_HQD_EOP_CONTROL; reg++)
+		WREG32(reg, mqd_hqd[reg - mmCP_MQD_BASE_ADDR]);
+
+	/* Tonga errata: EOP RPTR/WPTR should be left unmodified.
+	 * This is safe since EOP RPTR==WPTR for any inactive HQD
+	 * on ASICs that do not support context-save.
+	 * EOP writes/reads can start anywhere in the ring.
+	 */
+	if (get_amdgpu_device(kgd)->asic_type != CHIP_TONGA) {
+		WREG32(mmCP_HQD_EOP_RPTR, m->cp_hqd_eop_rptr);
+		WREG32(mmCP_HQD_EOP_WPTR, m->cp_hqd_eop_wptr);
+		WREG32(mmCP_HQD_EOP_WPTR_MEM, m->cp_hqd_eop_wptr_mem);
+	}
+
+	for (reg = mmCP_HQD_EOP_EVENTS; reg <= mmCP_HQD_ERROR; reg++)
+		WREG32(reg, mqd_hqd[reg - mmCP_MQD_BASE_ADDR]);
+
+	/* Copy userspace write pointer value to register.
+	 * Activate doorbell logic to monitor subsequent changes.
+	 */
+	data = REG_SET_FIELD(m->cp_hqd_pq_doorbell_control,
+			     CP_HQD_PQ_DOORBELL_CONTROL, DOORBELL_EN, 1);
+	WREG32(mmCP_HQD_PQ_DOORBELL_CONTROL, data);
+
+	if (read_user_wptr(mm, wptr, wptr_val))
+		WREG32(mmCP_HQD_PQ_WPTR, (wptr_val << wptr_shift) & wptr_mask);
+
+	data = REG_SET_FIELD(m->cp_hqd_active, CP_HQD_ACTIVE, ACTIVE, 1);
+	WREG32(mmCP_HQD_ACTIVE, data);
+
 	release_queue(kgd);
 
 	return 0;
@@ -308,29 +364,102 @@ static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd)
 	return false;
 }
 
-static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
+static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd,
+				enum kfd_preempt_type reset_type,
 				unsigned int utimeout, uint32_t pipe_id,
 				uint32_t queue_id)
 {
 	struct amdgpu_device *adev = get_amdgpu_device(kgd);
 	uint32_t temp;
-	int timeout = utimeout;
+	enum hqd_dequeue_request_type type;
+	unsigned long flags, end_jiffies;
+	int retry;
+	struct vi_mqd *m = get_mqd(mqd);
 
 	acquire_queue(kgd, pipe_id, queue_id);
 
-	WREG32(mmCP_HQD_DEQUEUE_REQUEST, reset_type);
+	if (m->cp_hqd_vmid == 0)
+		WREG32_FIELD(RLC_CP_SCHEDULERS, scheduler1, 0);
+
+	switch (reset_type) {
+	case KFD_PREEMPT_TYPE_WAVEFRONT_DRAIN:
+		type = DRAIN_PIPE;
+		break;
+	case KFD_PREEMPT_TYPE_WAVEFRONT_RESET:
+		type = RESET_WAVES;
+		break;
+	default:
+		type = DRAIN_PIPE;
+		break;
+	}
+
+	/* Workaround: If IQ timer is active and the wait time is close to or
+	 * equal to 0, dequeueing is not safe. Wait until either the wait time
+	 * is larger or timer is cleared. Also, ensure that IQ_REQ_PEND is
+	 * cleared before continuing. Also, ensure wait times are set to at
+	 * least 0x3.
+	 */
+	local_irq_save(flags);
+	preempt_disable();
+	retry = 5000; /* wait for 500 usecs at maximum */
+	while (true) {
+		temp = RREG32(mmCP_HQD_IQ_TIMER);
+		if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, PROCESSING_IQ)) {
+			pr_debug("HW is processing IQ\n");
+			goto loop;
+		}
+		if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, ACTIVE)) {
+			if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, RETRY_TYPE)
+					== 3) /* SEM-rearm is safe */
+				break;
+			/* Wait time 3 is safe for CP, but our MMIO read/write
+			 * time is close to 1 microsecond, so check for 10 to
+			 * leave more buffer room
+			 */
+			if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, WAIT_TIME)
+					>= 10)
+				break;
+			pr_debug("IQ timer is active\n");
+		} else
+			break;
+loop:
+		if (!retry) {
+			pr_err("CP HQD IQ timer status time out\n");
+			break;
+		}
+		ndelay(100);
+		--retry;
+	}
+	retry = 1000;
+	while (true) {
+		temp = RREG32(mmCP_HQD_DEQUEUE_REQUEST);
+		if (!(temp & CP_HQD_DEQUEUE_REQUEST__IQ_REQ_PEND_MASK))
+			break;
+		pr_debug("Dequeue request is pending\n");
 
+		if (!retry) {
+			pr_err("CP HQD dequeue request time out\n");
+			break;
+		}
+		ndelay(100);
+		--retry;
+	}
+	local_irq_restore(flags);
+	preempt_enable();
+
+	WREG32(mmCP_HQD_DEQUEUE_REQUEST, type);
+
+	end_jiffies = (utimeout * HZ / 1000) + jiffies;
 	while (true) {
 		temp = RREG32(mmCP_HQD_ACTIVE);
-		if (temp & CP_HQD_ACTIVE__ACTIVE_MASK)
+		if (!(temp & CP_HQD_ACTIVE__ACTIVE_MASK))
 			break;
-		if (timeout <= 0) {
-			pr_err("kfd: cp queue preemption time out.\n");
+		if (time_after(jiffies, end_jiffies)) {
+			pr_err("cp queue preemption time out.\n");
 			release_queue(kgd);
 			return -ETIME;
 		}
-		msleep(20);
-		timeout -= 20;
+		usleep_range(500, 1000);
 	}
 
 	release_queue(kgd);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 5dac29d..3891fe5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -268,8 +268,8 @@ static int create_compute_queue_nocpsch(struct device_queue_manager *dqm,
 	pr_debug("Loading mqd to hqd on pipe %d, queue %d\n",
 			q->pipe, q->queue);
 
-	retval = mqd->load_mqd(mqd, q->mqd, q->pipe,
-			q->queue, (uint32_t __user *) q->properties.write_ptr);
+	retval = mqd->load_mqd(mqd, q->mqd, q->pipe, q->queue, &q->properties,
+			       q->process->mm);
 	if (retval)
 		goto out_uninit_mqd;
 
@@ -585,8 +585,7 @@ static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
 	if (retval)
 		goto out_deallocate_sdma_queue;
 
-	retval = mqd->load_mqd(mqd, q->mqd, 0,
-				0, NULL);
+	retval = mqd->load_mqd(mqd, q->mqd, 0, 0, &q->properties, NULL);
 	if (retval)
 		goto out_uninit_mqd;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
index 0e4d4a9..681b639 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
@@ -143,7 +143,8 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 		kq->queue->pipe = KFD_CIK_HIQ_PIPE;
 		kq->queue->queue = KFD_CIK_HIQ_QUEUE;
 		kq->mqd->load_mqd(kq->mqd, kq->queue->mqd, kq->queue->pipe,
-					kq->queue->queue, NULL);
+				  kq->queue->queue, &kq->queue->properties,
+				  NULL);
 	} else {
 		/* allocate fence for DIQ */
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
index 213a71e..1f3a6ba 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
@@ -67,7 +67,8 @@ struct mqd_manager {
 
 	int	(*load_mqd)(struct mqd_manager *mm, void *mqd,
 				uint32_t pipe_id, uint32_t queue_id,
-				uint32_t __user *wptr);
+				struct queue_properties *p,
+				struct mm_struct *mms);
 
 	int	(*update_mqd)(struct mqd_manager *mm, void *mqd,
 				struct queue_properties *q);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index 7e0ec6b..44ffd23 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -144,15 +144,21 @@ static void uninit_mqd_sdma(struct mqd_manager *mm, void *mqd,
 }
 
 static int load_mqd(struct mqd_manager *mm, void *mqd, uint32_t pipe_id,
-			uint32_t queue_id, uint32_t __user *wptr)
+		    uint32_t queue_id, struct queue_properties *p,
+		    struct mm_struct *mms)
 {
-	return mm->dev->kfd2kgd->hqd_load
-		(mm->dev->kgd, mqd, pipe_id, queue_id, wptr);
+	/* AQL write pointer counts in 64B packets, PM4/CP counts in dwords. */
+	uint32_t wptr_shift = (p->format == KFD_QUEUE_FORMAT_AQL ? 4 : 0);
+	uint32_t wptr_mask = (uint32_t)((p->queue_size / sizeof(uint32_t)) - 1);
+
+	return mm->dev->kfd2kgd->hqd_load(mm->dev->kgd, mqd, pipe_id, queue_id,
+					  (uint32_t __user *)p->write_ptr,
+					  wptr_shift, wptr_mask, mms);
 }
 
 static int load_mqd_sdma(struct mqd_manager *mm, void *mqd,
-			uint32_t pipe_id, uint32_t queue_id,
-			uint32_t __user *wptr)
+			 uint32_t pipe_id, uint32_t queue_id,
+			 struct queue_properties *p, struct mm_struct *mms)
 {
 	return mm->dev->kfd2kgd->hqd_sdma_load(mm->dev->kgd, mqd);
 }
@@ -176,20 +182,17 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
 	m->cp_hqd_pq_base_hi = upper_32_bits((uint64_t)q->queue_address >> 8);
 	m->cp_hqd_pq_rptr_report_addr_lo = lower_32_bits((uint64_t)q->read_ptr);
 	m->cp_hqd_pq_rptr_report_addr_hi = upper_32_bits((uint64_t)q->read_ptr);
-	m->cp_hqd_pq_doorbell_control = DOORBELL_EN |
-					DOORBELL_OFFSET(q->doorbell_off);
+	m->cp_hqd_pq_doorbell_control = DOORBELL_OFFSET(q->doorbell_off);
 
 	m->cp_hqd_vmid = q->vmid;
 
 	if (q->format == KFD_QUEUE_FORMAT_AQL)
 		m->cp_hqd_pq_control |= NO_UPDATE_RPTR;
 
-	m->cp_hqd_active = 0;
 	q->is_active = false;
 	if (q->queue_size > 0 &&
 			q->queue_address != 0 &&
 			q->queue_percent > 0) {
-		m->cp_hqd_active = 1;
 		q->is_active = true;
 	}
 
@@ -239,7 +242,7 @@ static int destroy_mqd(struct mqd_manager *mm, void *mqd,
 			unsigned int timeout, uint32_t pipe_id,
 			uint32_t queue_id)
 {
-	return mm->dev->kfd2kgd->hqd_destroy(mm->dev->kgd, type, timeout,
+	return mm->dev->kfd2kgd->hqd_destroy(mm->dev->kgd, mqd, type, timeout,
 					pipe_id, queue_id);
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index 98a930e..73cbfe1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
@@ -94,10 +94,15 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
 
 static int load_mqd(struct mqd_manager *mm, void *mqd,
 			uint32_t pipe_id, uint32_t queue_id,
-			uint32_t __user *wptr)
+			struct queue_properties *p, struct mm_struct *mms)
 {
-	return mm->dev->kfd2kgd->hqd_load
-		(mm->dev->kgd, mqd, pipe_id, queue_id, wptr);
+	/* AQL write pointer counts in 64B packets, PM4/CP counts in dwords. */
+	uint32_t wptr_shift = (p->format == KFD_QUEUE_FORMAT_AQL ? 4 : 0);
+	uint32_t wptr_mask = (uint32_t)((p->queue_size / sizeof(uint32_t)) - 1);
+
+	return mm->dev->kfd2kgd->hqd_load(mm->dev->kgd, mqd, pipe_id, queue_id,
+					  (uint32_t __user *)p->write_ptr,
+					  wptr_shift, wptr_mask, mms);
 }
 
 static int __update_mqd(struct mqd_manager *mm, void *mqd,
@@ -122,7 +127,6 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
 	m->cp_hqd_pq_rptr_report_addr_hi = upper_32_bits((uint64_t)q->read_ptr);
 
 	m->cp_hqd_pq_doorbell_control =
-		1 << CP_HQD_PQ_DOORBELL_CONTROL__DOORBELL_EN__SHIFT |
 		q->doorbell_off <<
 			CP_HQD_PQ_DOORBELL_CONTROL__DOORBELL_OFFSET__SHIFT;
 	pr_debug("cp_hqd_pq_doorbell_control 0x%x\n",
@@ -159,12 +163,10 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
 				2 << CP_HQD_PQ_CONTROL__SLOT_BASED_WPTR__SHIFT;
 	}
 
-	m->cp_hqd_active = 0;
 	q->is_active = false;
 	if (q->queue_size > 0 &&
 			q->queue_address != 0 &&
 			q->queue_percent > 0) {
-		m->cp_hqd_active = 1;
 		q->is_active = true;
 	}
 
@@ -184,7 +186,7 @@ static int destroy_mqd(struct mqd_manager *mm, void *mqd,
 			uint32_t queue_id)
 {
 	return mm->dev->kfd2kgd->hqd_destroy
-		(mm->dev->kgd, type, timeout,
+		(mm->dev->kgd, mqd, type, timeout,
 		pipe_id, queue_id);
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index f0d55cc0..30ce92c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -239,11 +239,6 @@ enum kfd_preempt_type_filter {
 	KFD_PREEMPT_TYPE_FILTER_BY_PASID
 };
 
-enum kfd_preempt_type {
-	KFD_PREEMPT_TYPE_WAVEFRONT,
-	KFD_PREEMPT_TYPE_WAVEFRONT_RESET
-};
-
 /**
  * enum kfd_queue_type
  *
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index 36f3766..ffafda0 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -41,6 +41,11 @@ struct kgd_dev;
 
 struct kgd_mem;
 
+enum kfd_preempt_type {
+	KFD_PREEMPT_TYPE_WAVEFRONT_DRAIN = 0,
+	KFD_PREEMPT_TYPE_WAVEFRONT_RESET,
+};
+
 enum kgd_memory_pool {
 	KGD_POOL_SYSTEM_CACHEABLE = 1,
 	KGD_POOL_SYSTEM_WRITECOMBINE = 2,
@@ -153,14 +158,16 @@ struct kfd2kgd_calls {
 	int (*init_interrupts)(struct kgd_dev *kgd, uint32_t pipe_id);
 
 	int (*hqd_load)(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-			uint32_t queue_id, uint32_t __user *wptr);
+			uint32_t queue_id, uint32_t __user *wptr,
+			uint32_t wptr_shift, uint32_t wptr_mask,
+			struct mm_struct *mm);
 
 	int (*hqd_sdma_load)(struct kgd_dev *kgd, void *mqd);
 
 	bool (*hqd_is_occupied)(struct kgd_dev *kgd, uint64_t queue_address,
 				uint32_t pipe_id, uint32_t queue_id);
 
-	int (*hqd_destroy)(struct kgd_dev *kgd, uint32_t reset_type,
+	int (*hqd_destroy)(struct kgd_dev *kgd, void *mqd, uint32_t reset_type,
 				unsigned int timeout, uint32_t pipe_id,
 				uint32_t queue_id);
 
diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index a2ab6dc..695117a 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -75,12 +75,14 @@ static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id,
 				uint32_t hpd_size, uint64_t hpd_gpu_addr);
 static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id);
 static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-			uint32_t queue_id, uint32_t __user *wptr);
+			uint32_t queue_id, uint32_t __user *wptr,
+			uint32_t wptr_shift, uint32_t wptr_mask,
+			struct mm_struct *mm);
 static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd);
 static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
 				uint32_t pipe_id, uint32_t queue_id);
 
-static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
+static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd, uint32_t reset_type,
 				unsigned int timeout, uint32_t pipe_id,
 				uint32_t queue_id);
 static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd);
@@ -482,7 +484,9 @@ static inline struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd)
 }
 
 static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-			uint32_t queue_id, uint32_t __user *wptr)
+			uint32_t queue_id, uint32_t __user *wptr,
+			uint32_t wptr_shift, uint32_t wptr_mask,
+			struct mm_struct *mm)
 {
 	uint32_t wptr_shadow, is_wptr_shadow_valid;
 	struct cik_mqd *m;
@@ -636,7 +640,7 @@ static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd)
 	return false;
 }
 
-static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
+static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd, uint32_t reset_type,
 				unsigned int timeout, uint32_t pipe_id,
 				uint32_t queue_id)
 {
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 70+ messages in thread
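
For reference, a stand-alone sketch (not part of the patch) of how the new
wptr_shift/wptr_mask arguments computed in load_mqd() are applied to the
user-mode write pointer before it is written to CP_HQD_PQ_WPTR in
kgd_hqd_load(). The queue size and pointer value below are made-up
examples, and the register write is modeled with a plain variable.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* AQL write pointers count 64-byte packets; PM4/CP counts dwords. */
static uint32_t wptr_shift_for(bool is_aql)
{
	return is_aql ? 4 : 0;
}

/* Mask wraps the pointer to the ring size, expressed in dwords. */
static uint32_t wptr_mask_for(uint32_t queue_size_bytes)
{
	return (uint32_t)(queue_size_bytes / sizeof(uint32_t)) - 1;
}

int main(void)
{
	uint32_t queue_size = 4096;	/* example ring size in bytes */
	uint32_t user_wptr = 37;	/* as fetched by read_user_wptr() */
	bool is_aql = true;

	uint32_t shift = wptr_shift_for(is_aql);
	uint32_t mask = wptr_mask_for(queue_size);

	/* Equivalent of WREG32(mmCP_HQD_PQ_WPTR, (wptr_val << shift) & mask) */
	uint32_t cp_hqd_pq_wptr = (user_wptr << shift) & mask;

	printf("CP_HQD_PQ_WPTR = 0x%x\n", (unsigned int)cp_hqd_pq_wptr);
	return 0;
}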

* Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
       [not found]     ` <1502488589-30272-19-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-11 23:45       ` StDenis, Tom
       [not found]         ` <DM5PR1201MB0074CE2415F056F739B6A8B7F7890-grEf7a3NxMAwZliakWjjqGrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: StDenis, Tom @ 2017-08-11 23:45 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w

Hi Felix,

I'm assuming your tree is up to date with amd-staging-4.11 or 4.12 but we did previously have issues with compute rings if PG was enabled (specifically CGCG + PG) on Carrizo.  Then David committed some KIQ upgrades and it started working properly.

Could that be related?  Because GFX PG "should work" on Carrizo is the official line last I heard from the GFX IP team.

Cheers,
Tom
________________________________________
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Felix Kuehling <Felix.Kuehling@amd.com>
Sent: Friday, August 11, 2017 17:56
To: amd-gfx@lists.freedesktop.org; oded.gabbay@gmail.com
Cc: Kuehling, Felix
Subject: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ

It's causing problems with user mode queues and the HIQ, and can
lead to hard hangs during boot after programming RLC_CP_SCHEDULERS.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/vi.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
index 18bb3cb..495c8a3 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -1029,8 +1029,7 @@ static int vi_common_early_init(void *handle)
                /* rev0 hardware requires workarounds to support PG */
                adev->pg_flags = 0;
                if (adev->rev_id != 0x00 || CZ_REV_BRISTOL(adev->pdev->revision)) {
-                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_PG |
-                               AMD_PG_SUPPORT_GFX_SMG |
+                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_SMG |
                                AMD_PG_SUPPORT_GFX_PIPELINE |
                                AMD_PG_SUPPORT_CP |
                                AMD_PG_SUPPORT_UVD |
--
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
       [not found]         ` <DM5PR1201MB0074CE2415F056F739B6A8B7F7890-grEf7a3NxMAwZliakWjjqGrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2017-08-11 23:56           ` Felix Kuehling
       [not found]             ` <b5039bc6-5be9-d8b3-c995-082a60b40490-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-11 23:56 UTC (permalink / raw)
  To: StDenis, Tom, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w

Yes, I'm up-to-date. KFD doesn't use the KIQ to map the HIQ. And HIQ
maps all our other queues (unless we're disabling the hardware scheduler).

Regards,
  Felix


On 2017-08-11 07:45 PM, StDenis, Tom wrote:
> Hi Felix,
>
> I'm assuming your tree is up to date with amd-staging-4.11 or 4.12 but we did previously have issues with compute rings if PG was enabled (specifically CGCG + PG) on Carrizo.  Then David committed some KIQ upgrades and it started working properly.
>
> Could that be related?  Because GFX PG "should work" on Carrizo is the official line last I heard from the GFX IP team.
>
> Cheers,
> Tom
> ________________________________________
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Felix Kuehling <Felix.Kuehling@amd.com>
> Sent: Friday, August 11, 2017 17:56
> To: amd-gfx@lists.freedesktop.org; oded.gabbay@gmail.com
> Cc: Kuehling, Felix
> Subject: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
>
> It's causing problems with user mode queues and the HIQ, and can
> lead to hard hangs during boot after programming RLC_CP_SCHEDULERS.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/vi.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
> index 18bb3cb..495c8a3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
> @@ -1029,8 +1029,7 @@ static int vi_common_early_init(void *handle)
>                 /* rev0 hardware requires workarounds to support PG */
>                 adev->pg_flags = 0;
>                 if (adev->rev_id != 0x00 || CZ_REV_BRISTOL(adev->pdev->revision)) {
> -                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_PG |
> -                               AMD_PG_SUPPORT_GFX_SMG |
> +                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_SMG |
>                                 AMD_PG_SUPPORT_GFX_PIPELINE |
>                                 AMD_PG_SUPPORT_CP |
>                                 AMD_PG_SUPPORT_UVD |
> --
> 2.7.4
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
       [not found]             ` <b5039bc6-5be9-d8b3-c995-082a60b40490-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12  0:08               ` StDenis, Tom
       [not found]                 ` <DM5PR1201MB0074364796DD927ADB4746A7F78E0-grEf7a3NxMAwZliakWjjqGrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: StDenis, Tom @ 2017-08-12  0:08 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w

Hmm, I'd still be careful about disabling GFX PG since we may fail to meet energy star requirements.

Does the system hard hang or simply GPU hang?

Tom

________________________________________
From: Kuehling, Felix
Sent: Friday, August 11, 2017 19:56
To: StDenis, Tom; amd-gfx@lists.freedesktop.org; oded.gabbay@gmail.com
Subject: Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ

Yes, I'm up-to-date. KFD doesn't use the KIQ to map the HIQ. And HIQ
maps all our other queues (unless we're disabling the hardware scheduler).

Regards,
  Felix


On 2017-08-11 07:45 PM, StDenis, Tom wrote:
> Hi Felix,
>
> I'm assuming your tree is up to date with amd-staging-4.11 or 4.12 but we did previously have issues with compute rings if PG was enabled (specifically CGCG + PG) on Carrizo.  Then David committed some KIQ upgrades and it started working properly.
>
> Could that be related?  Because GFX PG "should work" on Carrizo is the official line last I heard from the GFX IP team.
>
> Cheers,
> Tom
> ________________________________________
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Felix Kuehling <Felix.Kuehling@amd.com>
> Sent: Friday, August 11, 2017 17:56
> To: amd-gfx@lists.freedesktop.org; oded.gabbay@gmail.com
> Cc: Kuehling, Felix
> Subject: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
>
> It's causing problems with user mode queues and the HIQ, and can
> lead to hard hangs during boot after programming RLC_CP_SCHEDULERS.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/vi.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
> index 18bb3cb..495c8a3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
> @@ -1029,8 +1029,7 @@ static int vi_common_early_init(void *handle)
>                 /* rev0 hardware requires workarounds to support PG */
>                 adev->pg_flags = 0;
>                 if (adev->rev_id != 0x00 || CZ_REV_BRISTOL(adev->pdev->revision)) {
> -                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_PG |
> -                               AMD_PG_SUPPORT_GFX_SMG |
> +                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_SMG |
>                                 AMD_PG_SUPPORT_GFX_PIPELINE |
>                                 AMD_PG_SUPPORT_CP |
>                                 AMD_PG_SUPPORT_UVD |
> --
> 2.7.4
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
       [not found]                 ` <DM5PR1201MB0074364796DD927ADB4746A7F78E0-grEf7a3NxMAwZliakWjjqGrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2017-08-12  0:40                   ` Felix Kuehling
       [not found]                     ` <1d263ee5-e1e3-aa3a-a8aa-c1fbfbcbb8ab-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-12  0:40 UTC (permalink / raw)
  To: StDenis, Tom, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w

With the next change that adds programming of RLC_CP_SCHEDULERS it's a
VM fault and hard hang during boot, just after HWS initialization.
Without that change it's only a MEC hang when the first application
tries to create a user mode queue.

Regards,
  Felix

On 2017-08-11 08:08 PM, StDenis, Tom wrote:
> Hmm, I'd still be careful about disabling GFX PG since we may fail to meet energy star requirements.
>
> Does the system hard hang or simply GPU hang?
>
> Tom
>
> ________________________________________
> From: Kuehling, Felix
> Sent: Friday, August 11, 2017 19:56
> To: StDenis, Tom; amd-gfx@lists.freedesktop.org; oded.gabbay@gmail.com
> Subject: Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
>
> Yes, I'm up-to-date. KFD doesn't use the KIQ to map the HIQ. And HIQ
> maps all our other queues (unless we're disabling the hardware scheduler).
>
> Regards,
>   Felix
>
>
> On 2017-08-11 07:45 PM, StDenis, Tom wrote:
>> Hi Felix,
>>
>> I'm assuming your tree is up to date with amd-staging-4.11 or 4.12 but we did previously have issues with compute rings if PG was enabled (specifically CGCG + PG) on Carrizo.  Then David committed some KIQ upgrades and it started working properly.
>>
>> Could that be related?  Because GFX PG "should work" on Carrizo is the official line last I heard from the GFX IP team.
>>
>> Cheers,
>> Tom
>> ________________________________________
>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Felix Kuehling <Felix.Kuehling@amd.com>
>> Sent: Friday, August 11, 2017 17:56
>> To: amd-gfx@lists.freedesktop.org; oded.gabbay@gmail.com
>> Cc: Kuehling, Felix
>> Subject: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
>>
>> It's causing problems with user mode queues and the HIQ, and can
>> lead to hard hangs during boot after programming RLC_CP_SCHEDULERS.
>>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/vi.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
>> index 18bb3cb..495c8a3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
>> @@ -1029,8 +1029,7 @@ static int vi_common_early_init(void *handle)
>>                 /* rev0 hardware requires workarounds to support PG */
>>                 adev->pg_flags = 0;
>>                 if (adev->rev_id != 0x00 || CZ_REV_BRISTOL(adev->pdev->revision)) {
>> -                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_PG |
>> -                               AMD_PG_SUPPORT_GFX_SMG |
>> +                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_SMG |
>>                                 AMD_PG_SUPPORT_GFX_PIPELINE |
>>                                 AMD_PG_SUPPORT_CP |
>>                                 AMD_PG_SUPPORT_UVD |
>> --
>> 2.7.4
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
       [not found]                     ` <1d263ee5-e1e3-aa3a-a8aa-c1fbfbcbb8ab-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12  0:54                       ` StDenis, Tom
       [not found]                         ` <DM5PR1201MB00745120AE898231F7BAD8DFF78E0-grEf7a3NxMAwZliakWjjqGrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  2017-08-14 15:28                       ` Deucher, Alexander
  1 sibling, 1 reply; 70+ messages in thread
From: StDenis, Tom @ 2017-08-12  0:54 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w

Hi Felix,

Well it's really up to Christian and Alex but I'd keep an eye on this since it'll cause issues with embedded down the road.

I happen to have a CZ system so I could possibly try and bisect 4.11/4.12 and see if there's any stable points for you guys.  Is there a short and simple KFD setup I can install/run to test it?  Or is simply loading a KFD merged/rebased kernel enough to cause the hang (and thus I guess a bisect doesn't make sense).

Cheers,
Tom

________________________________________
From: Kuehling, Felix
Sent: Friday, August 11, 2017 20:40
To: StDenis, Tom; amd-gfx@lists.freedesktop.org; oded.gabbay@gmail.com
Subject: Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ

With the next change that adds programming of RLC_CP_SCHEDULERS it's a
VM fault and hard hang during boot, just after HWS initialization.
Without that change it's only a MEC hang when the first application
tries to create a user mode queue.

Regards,
  Felix

On 2017-08-11 08:08 PM, StDenis, Tom wrote:
> Hmm, I'd still be careful about disabling GFX PG since we may fail to meet energy star requirements.
>
> Does the system hard hang or simply GPU hang?
>
> Tom
>
> ________________________________________
> From: Kuehling, Felix
> Sent: Friday, August 11, 2017 19:56
> To: StDenis, Tom; amd-gfx@lists.freedesktop.org; oded.gabbay@gmail.com
> Subject: Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
>
> Yes, I'm up-to-date. KFD doesn't use the KIQ to map the HIQ. And HIQ
> maps all our other queues (unless we're disabling the hardware scheduler).
>
> Regards,
>   Felix
>
>
> On 2017-08-11 07:45 PM, StDenis, Tom wrote:
>> Hi Felix,
>>
>> I'm assuming your tree is up to date with amd-staging-4.11 or 4.12 but we did previously have issues with compute rings if PG was enabled (specifically CGCG + PG) on Carrizo.  Then David committed some KIQ upgrades and it started working properly.
>>
>> Could that be related?  Because GFX PG "should work" on Carrizo is the official line last I heard from the GFX IP team.
>>
>> Cheers,
>> Tom
>> ________________________________________
>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Felix Kuehling <Felix.Kuehling@amd.com>
>> Sent: Friday, August 11, 2017 17:56
>> To: amd-gfx@lists.freedesktop.org; oded.gabbay@gmail.com
>> Cc: Kuehling, Felix
>> Subject: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
>>
>> It's causing problems with user mode queues and the HIQ, and can
>> lead to hard hangs during boot after programming RLC_CP_SCHEDULERS.
>>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/vi.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
>> index 18bb3cb..495c8a3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
>> @@ -1029,8 +1029,7 @@ static int vi_common_early_init(void *handle)
>>                 /* rev0 hardware requires workarounds to support PG */
>>                 adev->pg_flags = 0;
>>                 if (adev->rev_id != 0x00 || CZ_REV_BRISTOL(adev->pdev->revision)) {
>> -                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_PG |
>> -                               AMD_PG_SUPPORT_GFX_SMG |
>> +                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_SMG |
>>                                 AMD_PG_SUPPORT_GFX_PIPELINE |
>>                                 AMD_PG_SUPPORT_CP |
>>                                 AMD_PG_SUPPORT_UVD |
>> --
>> 2.7.4
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
       [not found]                         ` <DM5PR1201MB00745120AE898231F7BAD8DFF78E0-grEf7a3NxMAwZliakWjjqGrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2017-08-12  1:18                           ` Felix Kuehling
  0 siblings, 0 replies; 70+ messages in thread
From: Felix Kuehling @ 2017-08-12  1:18 UTC (permalink / raw)
  To: StDenis, Tom, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w


On 2017-08-11 08:54 PM, StDenis, Tom wrote:
> Hi Felix,
>
> Well it's really up to Christian and Alex but I'd keep an eye on this since it'll cause issues with embedded down the road.
>
> I happen to have a CZ system, so I could possibly try to bisect 4.11/4.12 and see if there are any stable points for you guys.

I doubt there is a stable point. On the KFD branch we've always had GFX
power gating disabled, because it was causing us problems as soon as we
picked up kernel 4.6 in August 2016, which first introduced CZ power
gating to the KFD branch.

>   Is there a short and simple KFD setup I can install/run to test it?  Or is simply loading a KFD merged/rebased kernel enough to cause the hang (in which case I guess a bisect doesn't make sense)?

With patch 19 in this series, it's a hang during boot. Without it, you
can boot, and you'll get errors from kfdtest due to MEC hangs as soon as
a user mode queue is created. You'd need a modified Thunk and KFDTest
for this experiment. You could get both from a recent roc-master build.
The rest of the ROCm stack isn't needed.

KFDTest isn't released to the public, and the last public release
doesn't include the necessary Thunk changes yet. I think the Thunk
change will make it into ROCm 1.6.3.

I've also been able to run hsaconformance (which I think is included in
our public releases) with 74% of tests passing. OCL tests currently
segfault in the HSA runtime, as do some of the conformance tests. I'm
going to look into the HSA runtime a bit more to see if I can get OCL to
work for more realistic testing.

Regards,
  Felix

>
> Cheers,
> Tom
>
> ________________________________________
> From: Kuehling, Felix
> Sent: Friday, August 11, 2017 20:40
> To: StDenis, Tom; amd-gfx@lists.freedesktop.org; oded.gabbay@gmail.com
> Subject: Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
>
> With the next change that adds programming of RLC_CP_SCHEDULERS it's a
> VM fault and hard hang during boot, just after HWS initialization.
> Without that change it's only a MEC hang when the first application
> tries to create a user mode queue.
>
> Regards,
>   Felix
>
> On 2017-08-11 08:08 PM, StDenis, Tom wrote:
>> Hmm, I'd still be careful about disabling GFX PG since we may fail to meet energy star requirements.
>>
>> Does the system hard hang or simply GPU hang?
>>
>> Tom
>>
>> ________________________________________
>> From: Kuehling, Felix
>> Sent: Friday, August 11, 2017 19:56
>> To: StDenis, Tom; amd-gfx@lists.freedesktop.org; oded.gabbay@gmail.com
>> Subject: Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
>>
>> Yes, I'm up-to-date. KFD doesn't use the KIQ to map the HIQ. And HIQ
>> maps all our other queues (unless we're disabling the hardware scheduler).
>>
>> Regards,
>>   Felix
>>
>>
>> On 2017-08-11 07:45 PM, StDenis, Tom wrote:
>>> Hi Felix,
>>>
>>> I'm assuming your tree is up to date with amd-staging-4.11 or 4.12 but we did previously have issues with compute rings if PG was enabled (specifically CGCG + PG) on Carrizo.  Then David committed some KIQ upgrades and it started working properly.
>>>
>>> Could that be related?  The official line from the GFX IP team, last I heard, is that GFX PG "should work" on Carrizo.
>>>
>>> Cheers,
>>> Tom
>>> ________________________________________
>>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Felix Kuehling <Felix.Kuehling@amd.com>
>>> Sent: Friday, August 11, 2017 17:56
>>> To: amd-gfx@lists.freedesktop.org; oded.gabbay@gmail.com
>>> Cc: Kuehling, Felix
>>> Subject: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
>>>
>>> It's causing problems with user mode queues and the HIQ, and can
>>> lead to hard hangs during boot after programming RLC_CP_SCHEDULERS.
>>>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/vi.c | 3 +--
>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
>>> index 18bb3cb..495c8a3 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
>>> @@ -1029,8 +1029,7 @@ static int vi_common_early_init(void *handle)
>>>                 /* rev0 hardware requires workarounds to support PG */
>>>                 adev->pg_flags = 0;
>>>                 if (adev->rev_id != 0x00 || CZ_REV_BRISTOL(adev->pdev->revision)) {
>>> -                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_PG |
>>> -                               AMD_PG_SUPPORT_GFX_SMG |
>>> +                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_SMG |
>>>                                 AMD_PG_SUPPORT_GFX_PIPELINE |
>>>                                 AMD_PG_SUPPORT_CP |
>>>                                 AMD_PG_SUPPORT_UVD |
>>> --
>>> 2.7.4
>>>


* Re: [PATCH 01/19] drm/amdkfd: Fix double Mutex lock order
       [not found]     ` <1502488589-30272-2-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 12:23       ` Oded Gabbay
       [not found]         ` <CAFCwf12A9Qr-HCyQFR2eDN_TEExzxHEBVK9XQ9_xuwPKErHg3w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 12:23 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>
> From: Yair Shachar <yair.shachar@amd.com>
>
> Signed-off-by: Yair Shachar <yair.shachar@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 6316aad..2a45718e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -451,8 +451,8 @@ static int kfd_ioctl_dbg_register(struct file *filep,
>                 return -EINVAL;
>         }
>
> -       mutex_lock(kfd_get_dbgmgr_mutex());
>         mutex_lock(&p->mutex);
> +       mutex_lock(kfd_get_dbgmgr_mutex());
>
>         /*
>          * make sure that we have pdd, if this the first queue created for
> @@ -460,8 +460,8 @@ static int kfd_ioctl_dbg_register(struct file *filep,
>          */
>         pdd = kfd_bind_process_to_device(dev, p);
>         if (IS_ERR(pdd)) {
> -               mutex_unlock(&p->mutex);
>                 mutex_unlock(kfd_get_dbgmgr_mutex());
> +               mutex_unlock(&p->mutex);
>                 return PTR_ERR(pdd);
>         }
>
> @@ -480,8 +480,8 @@ static int kfd_ioctl_dbg_register(struct file *filep,
>                 status = -EINVAL;
>         }
>
> -       mutex_unlock(&p->mutex);
>         mutex_unlock(kfd_get_dbgmgr_mutex());
> +       mutex_unlock(&p->mutex);
>
>         return status;
>  }
> --
> 2.7.4
>

Hi Felix,
Could you please explain why this change is necessary ?

It seems to me this actually makes things a bit worse in a
multi-process environment, because p->mutex is per process while the
dbgmgr mutex is global. If process A takes its process mutex first,
and process B takes the dbgmgr mutex (in this function or in some
other function, such as kfd_ioctl_dbg_address_watch) *before* process
A manages to take it, then process A is stuck holding p->mutex and is
blocked from doing other, totally unrelated ioctls, such as
kfd_ioctl_create_queue.

Whereas if we keep things as they are now, process A takes the dbgmgr
mutex first, making process B stuck on it, but process B can still do
other unrelated ioctls because it hasn't taken its own process mutex.
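
To make the contention concrete, here is a small self-contained
pthreads sketch of the scenario I mean (illustrative only, not the
actual driver paths; the lock and function names are made up):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t dbg_lock  = PTHREAD_MUTEX_INITIALIZER; /* global dbgmgr   */
static pthread_mutex_t proc_lock = PTHREAD_MUTEX_INITIALIZER; /* process A mutex */

/* dbg_register of "process A" with the proposed lock order */
static void *a_dbg_register(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&proc_lock);  /* process lock taken first     */
	pthread_mutex_lock(&dbg_lock);   /* blocks: "process B" holds it */
	pthread_mutex_unlock(&dbg_lock);
	pthread_mutex_unlock(&proc_lock);
	return NULL;
}

/* unrelated ioctl of "process A", e.g. create_queue */
static void *a_create_queue(void *arg)
{
	(void)arg;
	sleep(1);                        /* let a_dbg_register go first  */
	pthread_mutex_lock(&proc_lock);  /* stuck until B drops dbg_lock */
	puts("create_queue ran");
	pthread_mutex_unlock(&proc_lock);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_mutex_lock(&dbg_lock);   /* "process B" holds the global lock */
	pthread_create(&t1, NULL, a_dbg_register, NULL);
	pthread_create(&t2, NULL, a_create_queue, NULL);
	sleep(3);
	puts("process B releases dbg_lock");
	pthread_mutex_unlock(&dbg_lock); /* only now does create_queue run    */
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return 0;
}

With the current order, process A would block on the dbgmgr mutex
before taking its own mutex, so its unrelated ioctls keep running.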

Thanks,
Oded

* Re: [PATCH 00/19] KFD fixes and cleanups
       [not found] ` <1502488589-30272-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (18 preceding siblings ...)
  2017-08-11 21:56   ` [PATCH 19/19] drm/amd: Update MEC HQD loading code for KFD Felix Kuehling
@ 2017-08-12 12:28   ` Oded Gabbay
       [not found]     ` <CAFCwf10L0sMCWnPxOi=zLuXLor4X90--m-a6UnervTmEguGL9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  19 siblings, 1 reply; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 12:28 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

Hi Felix,
Thanks for all the patches.
I have started to review them, but I have a small request from you
while I'm doing the review.
Could you please rebase them onto my amdkfd-next branch, or
alternatively onto Alex's drm-next-4.14 or Dave Airlie's drm-next
branch (which amdkfd-next currently points to)?
I tried to apply this patch set to amdkfd-next, but it fails at patch
5. I can't upstream the patches to Dave when they don't apply to his
upstream branch.

Thanks,
Oded

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> This is the first round of changes preparing for upstreaming KFD
> changes made internally in the last 2 years at AMD. A big part of it
> is coding style and messaging cleanup. I have tried to avoid making
> gratuitous formatting changes. All coding style changes should have a
> justification based on the Linux style guide.
>
> The last few patches (15-19) enable running pieces of the current ROCm
> user mode stack (with minor Thunk fixes for backwards compatibility)
> on this soon-to-be upstream kernel on CZ. At this time I can run some
> KFDTest unit tests, which are currently not open source. I'm trying to
> find other more substantial tests using a real compute API as a
> baseline for testing further KFD upstreaming patches.
>
> This patch series is freshly rebased on amd-staging-4.12.
>
> Felix Kuehling (11):
>   drm/amdkfd: Fix typo in dbgdev_wave_reset_wavefronts
>   drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t)
>   drm/amdkfd: Fix allocated_queues bitmap initialization
>   drm/amdkfd: Remove BUG_ONs for NULL pointer arguments
>   drm/amdkfd: Fix doorbell initialization and finalization
>   drm/amdkfd: Allocate gtt_sa_bitmap in long units
>   drm/amdkfd: Handle remaining BUG_ONs more gracefully
>   drm/amdkfd: Update PM4 packet headers
>   drm/amdgpu: Remove hard-coded assumptions about compute pipes
>   drm/amdgpu: Disable GFX PG on CZ
>   drm/amd: Update MEC HQD loading code for KFD
>
> Jay Cornwall (1):
>   drm/amdkfd: Clamp EOP queue size correctly on Gfx8
>
> Kent Russell (5):
>   drm/amdkfd: Clean up KFD style errors and warnings
>   drm/amdkfd: Consolidate and clean up log commands
>   drm/amdkfd: Change x==NULL/false references to !x
>   drm/amdkfd: Fix goto usage
>   drm/amdkfd: Remove usage of alloc(sizeof(struct...
>
> Yair Shachar (1):
>   drm/amdkfd: Fix double Mutex lock order
>
> Yong Zhao (1):
>   drm/amdkfd: Add more error printing to help bringup
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |   4 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h         |  16 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  | 156 +++++++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  | 185 ++++++++++--
>  drivers/gpu/drm/amd/amdgpu/vi.c                    |   3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           | 107 +++----
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            | 102 +++----
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  21 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h            |  27 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 122 ++++----
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 313 ++++++++-----------
>  .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |   6 +-
>  .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |   6 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          |  40 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  33 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c       |   2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c         |   2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |  63 ++--
>  drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  10 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h       |   3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  62 ++--
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  46 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 301 +++++++------------
>  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c             |   7 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h       | 330 +++------------------
>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h    | 140 ++++++++-
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  31 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  25 +-
>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  71 ++---
>  drivers/gpu/drm/amd/amdkfd/kfd_queue.c             |  12 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  46 +--
>  drivers/gpu/drm/amd/include/kgd_kfd_interface.h    |  11 +-
>  drivers/gpu/drm/radeon/radeon_kfd.c                |  12 +-
>  33 files changed, 1054 insertions(+), 1261 deletions(-)
>
> --
> 2.7.4
>

* Re: [PATCH 02/19] drm/amdkfd: Fix typo in dbgdev_wave_reset_wavefronts
       [not found]     ` <1502488589-30272-3-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 12:29       ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 12:29 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> index d5e19b5..8b14a4e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> @@ -823,7 +823,7 @@ int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p)
>         for (vmid = first_vmid_to_scan; vmid <= last_vmid_to_scan; vmid++) {
>                 if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_valid
>                                 (dev->kgd, vmid)) {
> -                       if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_valid
> +                       if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_pasid
>                                         (dev->kgd, vmid) == p->pasid) {
>                                 pr_debug("Killing wave fronts of vmid %d and pasid %d\n",
>                                                 vmid, p->pasid);
> --
> 2.7.4
>
This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>

* Re: [PATCH 03/19] drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t)
       [not found]     ` <1502488589-30272-4-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 12:37       ` Oded Gabbay
       [not found]         ` <CAFCwf11Bg41FNg2sChh6EZkczb6quSzxdFXoJ-qhoE8JwqgGJw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 12:37 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> kfd2kgd->address_watch_get_offset returns dword register offsets.
> The divide-by-sizeof(uint32_t) is incorrect.

In amdgpu that's true, but in radeon it's incorrect.
If you look at cik_reg.h in the radeon driver, you will see that the
addresses of all the TCP_WATCH_* registers are multiplied by 4, i.e.
they are byte addresses, and that's why Yair originally divided the
offset by sizeof(uint32_t).
I think this patch should move the divide-by-sizeof operation into the
radeon function instead of just deleting it from kfd_dbgdev.c.
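
Something like the following is what I have in mind (just a sketch
against radeon_kfd.c from memory, so the watchRegs[] table and macro
names may not be exact, and it is not compile-tested):

static uint32_t kgd_address_watch_get_offset(struct kgd_dev *kgd,
					unsigned int watch_point_id,
					unsigned int reg_offset)
{
	/* watchRegs[] holds the cik_reg.h values, which are byte
	 * addresses (already multiplied by 4), so convert to a dword
	 * offset here instead of in kfd_dbgdev.c.
	 */
	return watchRegs[watch_point_id * ADDRESS_WATCH_REG_MAX + reg_offset]
		/ sizeof(uint32_t);
}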

Oded

>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 8 --------
>  1 file changed, 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> index 8b14a4e..faa0790 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> @@ -442,8 +442,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>                                         i,
>                                         ADDRESS_WATCH_REG_CNTL);
>
> -               aw_reg_add_dword /= sizeof(uint32_t);
> -
>                 packets_vec[0].bitfields2.reg_offset =
>                                         aw_reg_add_dword - AMD_CONFIG_REG_BASE;
>
> @@ -455,8 +453,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>                                         i,
>                                         ADDRESS_WATCH_REG_ADDR_HI);
>
> -               aw_reg_add_dword /= sizeof(uint32_t);
> -
>                 packets_vec[1].bitfields2.reg_offset =
>                                         aw_reg_add_dword - AMD_CONFIG_REG_BASE;
>                 packets_vec[1].reg_data[0] = addrHi.u32All;
> @@ -467,8 +463,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>                                         i,
>                                         ADDRESS_WATCH_REG_ADDR_LO);
>
> -               aw_reg_add_dword /= sizeof(uint32_t);
> -
>                 packets_vec[2].bitfields2.reg_offset =
>                                 aw_reg_add_dword - AMD_CONFIG_REG_BASE;
>                 packets_vec[2].reg_data[0] = addrLo.u32All;
> @@ -485,8 +479,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>                                         i,
>                                         ADDRESS_WATCH_REG_CNTL);
>
> -               aw_reg_add_dword /= sizeof(uint32_t);
> -
>                 packets_vec[3].bitfields2.reg_offset =
>                                         aw_reg_add_dword - AMD_CONFIG_REG_BASE;
>                 packets_vec[3].reg_data[0] = cntl.u32All;
> --
> 2.7.4
>

* Re: [PATCH 04/19] drm/amdkfd: Fix allocated_queues bitmap initialization
       [not found]     ` <1502488589-30272-5-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 12:45       ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 12:45 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Use shared_resources.queue_bitmap to determine the queues available
> for KFD in each pipe.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 42de22b..9d2796b 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -513,7 +513,7 @@ static int init_scheduler(struct device_queue_manager *dqm)
>
>  static int initialize_nocpsch(struct device_queue_manager *dqm)
>  {
> -       int i;
> +       int pipe, queue;
>
>         BUG_ON(!dqm);
>
> @@ -531,8 +531,14 @@ static int initialize_nocpsch(struct device_queue_manager *dqm)
>                 return -ENOMEM;
>         }
>
> -       for (i = 0; i < get_pipes_per_mec(dqm); i++)
> -               dqm->allocated_queues[i] = (1 << get_queues_per_pipe(dqm)) - 1;
> +       for (pipe = 0; pipe < get_pipes_per_mec(dqm); pipe++) {
> +               int pipe_offset = pipe * get_queues_per_pipe(dqm);
> +
> +               for (queue = 0; queue < get_queues_per_pipe(dqm); queue++)
> +                       if (test_bit(pipe_offset + queue,
> +                                    dqm->dev->shared_resources.queue_bitmap))
> +                               dqm->allocated_queues[pipe] |= 1 << queue;
> +       }
>
>         dqm->vmid_bitmap = (1 << VMID_PER_DEVICE) - 1;
>         dqm->sdma_bitmap = (1 << CIK_SDMA_QUEUES) - 1;
> --
> 2.7.4
>

This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>

* Re: [PATCH 05/19] drm/amdkfd: Clean up KFD style errors and warnings
       [not found]     ` <1502488589-30272-6-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 12:46       ` Oded Gabbay
       [not found]         ` <CAFCwf13wgCm9qPk4XX5DNOx0toPmZxogpxt5zBvEKzcMd2x3tw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 12:46 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Kent Russell, amd-gfx list

I'd like to check this patch, but it doesn't apply cleanly on the
upstream tree.
Please fix and re-send.

Thanks,
Oded

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Kent Russell <kent.russell@amd.com>
>
> Using checkpatch.pl -f <file> showed a number of style issues. This
> patch addresses as many of them as possible. Some long lines have been
> left for readability, but attempts to minimize them have been made.
>
> Signed-off-by: Kent Russell <kent.russell@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |  4 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  | 24 +++++++------------
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  | 16 ++++++-------
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |  6 +++--
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            |  7 +++---
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h            | 27 +++++++++++-----------
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            |  5 ++--
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  8 ++++---
>  drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  5 ++--
>  drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c         |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  3 ++-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  3 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    |  5 ++--
>  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c             |  3 ++-
>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h       | 16 ++++++-------
>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h    | 10 ++++----
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h              | 23 ++++++++++--------
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  6 +++--
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  4 ++--
>  19 files changed, 91 insertions(+), 86 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 3949736..342dc3e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -28,14 +28,14 @@
>  #include <linux/module.h>
>
>  const struct kgd2kfd_calls *kgd2kfd;
> -bool (*kgd2kfd_init_p)(unsigned, const struct kgd2kfd_calls**);
> +bool (*kgd2kfd_init_p)(unsigned int, const struct kgd2kfd_calls**);
>
>  int amdgpu_amdkfd_init(void)
>  {
>         int ret;
>
>  #if defined(CONFIG_HSA_AMD_MODULE)
> -       int (*kgd2kfd_init_p)(unsigned, const struct kgd2kfd_calls**);
> +       int (*kgd2kfd_init_p)(unsigned int, const struct kgd2kfd_calls**);
>
>         kgd2kfd_init_p = symbol_request(kgd2kfd_init);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> index 5254562..5936222 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> @@ -565,43 +565,35 @@ static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type)
>
>         switch (type) {
>         case KGD_ENGINE_PFP:
> -               hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->gfx.pfp_fw->data;
> +               hdr = (const union amdgpu_firmware_header *)adev->gfx.pfp_fw->data;
>                 break;
>
>         case KGD_ENGINE_ME:
> -               hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->gfx.me_fw->data;
> +               hdr = (const union amdgpu_firmware_header *)adev->gfx.me_fw->data;
>                 break;
>
>         case KGD_ENGINE_CE:
> -               hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->gfx.ce_fw->data;
> +               hdr = (const union amdgpu_firmware_header *)adev->gfx.ce_fw->data;
>                 break;
>
>         case KGD_ENGINE_MEC1:
> -               hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->gfx.mec_fw->data;
> +               hdr = (const union amdgpu_firmware_header *)adev->gfx.mec_fw->data;
>                 break;
>
>         case KGD_ENGINE_MEC2:
> -               hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->gfx.mec2_fw->data;
> +               hdr = (const union amdgpu_firmware_header *)adev->gfx.mec2_fw->data;
>                 break;
>
>         case KGD_ENGINE_RLC:
> -               hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->gfx.rlc_fw->data;
> +               hdr = (const union amdgpu_firmware_header *)adev->gfx.rlc_fw->data;
>                 break;
>
>         case KGD_ENGINE_SDMA1:
> -               hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->sdma.instance[0].fw->data;
> +               hdr = (const union amdgpu_firmware_header *)adev->sdma.instance[0].fw->data;
>                 break;
>
>         case KGD_ENGINE_SDMA2:
> -               hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->sdma.instance[1].fw->data;
> +               hdr = (const union amdgpu_firmware_header *)adev->sdma.instance[1].fw->data;
>                 break;
>
>         default:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> index 133d066..90271f6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> @@ -454,42 +454,42 @@ static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type)
>         switch (type) {
>         case KGD_ENGINE_PFP:
>                 hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->gfx.pfp_fw->data;
> +                                               adev->gfx.pfp_fw->data;
>                 break;
>
>         case KGD_ENGINE_ME:
>                 hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->gfx.me_fw->data;
> +                                               adev->gfx.me_fw->data;
>                 break;
>
>         case KGD_ENGINE_CE:
>                 hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->gfx.ce_fw->data;
> +                                               adev->gfx.ce_fw->data;
>                 break;
>
>         case KGD_ENGINE_MEC1:
>                 hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->gfx.mec_fw->data;
> +                                               adev->gfx.mec_fw->data;
>                 break;
>
>         case KGD_ENGINE_MEC2:
>                 hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->gfx.mec2_fw->data;
> +                                               adev->gfx.mec2_fw->data;
>                 break;
>
>         case KGD_ENGINE_RLC:
>                 hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->gfx.rlc_fw->data;
> +                                               adev->gfx.rlc_fw->data;
>                 break;
>
>         case KGD_ENGINE_SDMA1:
>                 hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->sdma.instance[0].fw->data;
> +                                               adev->sdma.instance[0].fw->data;
>                 break;
>
>         case KGD_ENGINE_SDMA2:
>                 hdr = (const union amdgpu_firmware_header *)
> -                                                       adev->sdma.instance[1].fw->data;
> +                                               adev->sdma.instance[1].fw->data;
>                 break;
>
>         default:
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 2a45718e..98f4dbf 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -782,7 +782,8 @@ static int kfd_ioctl_get_process_apertures(struct file *filp,
>                                 "scratch_limit %llX\n", pdd->scratch_limit);
>
>                         args->num_of_nodes++;
> -               } while ((pdd = kfd_get_next_process_device_data(p, pdd)) != NULL &&
> +               } while ((pdd = kfd_get_next_process_device_data(p, pdd)) !=
> +                               NULL &&
>                                 (args->num_of_nodes < NUM_OF_SUPPORTED_GPUS));
>         }
>
> @@ -848,7 +849,8 @@ static int kfd_ioctl_wait_events(struct file *filp, struct kfd_process *p,
>  }
>
>  #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \
> -       [_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, .cmd_drv = 0, .name = #ioctl}
> +       [_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \
> +                           .cmd_drv = 0, .name = #ioctl}
>
>  /** Ioctl table */
>  static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> index faa0790..a7548a5 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> @@ -313,7 +313,7 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
>                 return -EINVAL;
>         }
>
> -       for (i = 0 ; i < adw_info->num_watch_points ; i++) {
> +       for (i = 0; i < adw_info->num_watch_points; i++) {
>                 dbgdev_address_watch_set_registers(adw_info, &addrHi, &addrLo,
>                                                 &cntl, i, pdd->qpd.vmid);
>
> @@ -623,7 +623,7 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
>                 return status;
>         }
>
> -       /* we do not control the VMID in DIQ,so reset it to a known value */
> +       /* we do not control the VMID in DIQ, so reset it to a known value */
>         reg_sq_cmd.bits.vm_id = 0;
>
>         pr_debug("\t\t %30s\n", "* * * * * * * * * * * * * * * * * *");
> @@ -810,7 +810,8 @@ int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p)
>
>         /* Scan all registers in the range ATC_VMID8_PASID_MAPPING ..
>          * ATC_VMID15_PASID_MAPPING
> -        * to check which VMID the current process is mapped to. */
> +        * to check which VMID the current process is mapped to.
> +        */
>
>         for (vmid = first_vmid_to_scan; vmid <= last_vmid_to_scan; vmid++) {
>                 if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_valid
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h
> index 257a745..a04a1fe 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h
> @@ -30,13 +30,11 @@
>  #pragma pack(push, 4)
>
>  enum HSA_DBG_WAVEOP {
> -       HSA_DBG_WAVEOP_HALT = 1,        /* Halts a wavefront            */
> -       HSA_DBG_WAVEOP_RESUME = 2,      /* Resumes a wavefront          */
> -       HSA_DBG_WAVEOP_KILL = 3,        /* Kills a wavefront            */
> -       HSA_DBG_WAVEOP_DEBUG = 4,       /* Causes wavefront to enter
> -                                               debug mode              */
> -       HSA_DBG_WAVEOP_TRAP = 5,        /* Causes wavefront to take
> -                                               a trap                  */
> +       HSA_DBG_WAVEOP_HALT = 1,   /* Halts a wavefront */
> +       HSA_DBG_WAVEOP_RESUME = 2, /* Resumes a wavefront */
> +       HSA_DBG_WAVEOP_KILL = 3,   /* Kills a wavefront */
> +       HSA_DBG_WAVEOP_DEBUG = 4,  /* Causes wavefront to enter dbg mode */
> +       HSA_DBG_WAVEOP_TRAP = 5,   /* Causes wavefront to take a trap */
>         HSA_DBG_NUM_WAVEOP = 5,
>         HSA_DBG_MAX_WAVEOP = 0xFFFFFFFF
>  };
> @@ -81,15 +79,13 @@ struct HsaDbgWaveMsgAMDGen2 {
>                         uint32_t UserData:8;    /* user data */
>                         uint32_t ShaderArray:1; /* Shader array */
>                         uint32_t Priv:1;        /* Privileged */
> -                       uint32_t Reserved0:4;   /* This field is reserved,
> -                                                  should be 0 */
> +                       uint32_t Reserved0:4;   /* Reserved, should be 0 */
>                         uint32_t WaveId:4;      /* wave id */
>                         uint32_t SIMD:2;        /* SIMD id */
>                         uint32_t HSACU:4;       /* Compute unit */
>                         uint32_t ShaderEngine:2;/* Shader engine */
>                         uint32_t MessageType:2; /* see HSA_DBG_WAVEMSG_TYPE */
> -                       uint32_t Reserved1:4;   /* This field is reserved,
> -                                                  should be 0 */
> +                       uint32_t Reserved1:4;   /* Reserved, should be 0 */
>                 } ui32;
>                 uint32_t Value;
>         };
> @@ -121,20 +117,23 @@ struct HsaDbgWaveMessage {
>   * in the user mode instruction stream. The OS scheduler event is typically
>   * associated and signaled by an interrupt issued by the GPU, but other HSA
>   * system interrupt conditions from other HW (e.g. IOMMUv2) may be surfaced
> - * by the KFD by this mechanism, too. */
> + * by the KFD by this mechanism, too.
> + */
>
>  /* these are the new definitions for events */
>  enum HSA_EVENTTYPE {
>         HSA_EVENTTYPE_SIGNAL = 0,       /* user-mode generated GPU signal */
>         HSA_EVENTTYPE_NODECHANGE = 1,   /* HSA node change (attach/detach) */
>         HSA_EVENTTYPE_DEVICESTATECHANGE = 2,    /* HSA device state change
> -                                                  (start/stop) */
> +                                                * (start/stop)
> +                                                */
>         HSA_EVENTTYPE_HW_EXCEPTION = 3, /* GPU shader exception event */
>         HSA_EVENTTYPE_SYSTEM_EVENT = 4, /* GPU SYSCALL with parameter info */
>         HSA_EVENTTYPE_DEBUG_EVENT = 5,  /* GPU signal for debugging */
>         HSA_EVENTTYPE_PROFILE_EVENT = 6,/* GPU signal for profiling */
>         HSA_EVENTTYPE_QUEUE_EVENT = 7,  /* GPU signal queue idle state
> -                                          (EOP pm4) */
> +                                        * (EOP pm4)
> +                                        */
>         /* ...  */
>         HSA_EVENTTYPE_MAXID,
>         HSA_EVENTTYPE_TYPE_SIZE = 0xFFFFFFFF
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 3f95f7c..1f50325 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -155,12 +155,13 @@ static bool device_iommu_pasid_init(struct kfd_dev *kfd)
>                 dev_err(kfd_device, "error required iommu flags ats(%i), pri(%i), pasid(%i)\n",
>                        (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP) != 0,
>                        (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP) != 0,
> -                      (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP) != 0);
> +                      (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP)
> +                                                                       != 0);
>                 return false;
>         }
>
>         pasid_limit = min_t(unsigned int,
> -                       (unsigned int)1 << kfd->device_info->max_pasid_bits,
> +                       (unsigned int)(1 << kfd->device_info->max_pasid_bits),
>                         iommu_info.max_pasids);
>         /*
>          * last pasid is used for kernel queues doorbells
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 9d2796b..3b850da 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -216,7 +216,8 @@ static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q)
>
>         set = false;
>
> -       for (pipe = dqm->next_pipe_to_allocate, i = 0; i < get_pipes_per_mec(dqm);
> +       for (pipe = dqm->next_pipe_to_allocate, i = 0;
> +                       i < get_pipes_per_mec(dqm);
>                         pipe = ((pipe + 1) % get_pipes_per_mec(dqm)), ++i) {
>
>                 if (!is_pipe_enabled(dqm, 0, pipe))
> @@ -669,7 +670,8 @@ static int set_sched_resources(struct device_queue_manager *dqm)
>
>                 /* This situation may be hit in the future if a new HW
>                  * generation exposes more than 64 queues. If so, the
> -                * definition of res.queue_mask needs updating */
> +                * definition of res.queue_mask needs updating
> +                */
>                 if (WARN_ON(i >= (sizeof(res.queue_mask)*8))) {
>                         pr_err("Invalid queue enabled by amdgpu: %d\n", i);
>                         break;
> @@ -890,7 +892,7 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
>         }
>
>         if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
> -                       dqm->sdma_queue_count++;
> +               dqm->sdma_queue_count++;
>         /*
>          * Unconditionally increment this counter, regardless of the queue's
>          * type or whether the queue is active.
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> index d1ce83d..d8b9b3c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> @@ -194,7 +194,8 @@ static void release_event_notification_slot(struct signal_page *page,
>         page->free_slots++;
>
>         /* We don't free signal pages, they are retained by the process
> -        * and reused until it exits. */
> +        * and reused until it exits.
> +        */
>  }
>
>  static struct signal_page *lookup_signal_page_by_index(struct kfd_process *p,
> @@ -584,7 +585,7 @@ void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
>                  * search faster.
>                  */
>                 struct signal_page *page;
> -               unsigned i;
> +               unsigned int i;
>
>                 list_for_each_entry(page, &p->signal_event_pages, event_pages)
>                         for (i = 0; i < SLOTS_PER_PAGE; i++)
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
> index 7f134aa..70b3a99c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
> @@ -179,7 +179,7 @@ static void interrupt_wq(struct work_struct *work)
>  bool interrupt_is_wanted(struct kfd_dev *dev, const uint32_t *ih_ring_entry)
>  {
>         /* integer and bitwise OR so there is no boolean short-circuiting */
> -       unsigned wanted = 0;
> +       unsigned int wanted = 0;
>
>         wanted |= dev->device_info->event_interrupt_class->interrupt_isr(dev,
>                                                                 ih_ring_entry);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> index 850a562..af5bfc1 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> @@ -61,7 +61,8 @@ MODULE_PARM_DESC(send_sigterm,
>
>  static int amdkfd_init_completed;
>
> -int kgd2kfd_init(unsigned interface_version, const struct kgd2kfd_calls **g2f)
> +int kgd2kfd_init(unsigned int interface_version,
> +               const struct kgd2kfd_calls **g2f)
>  {
>         if (!amdkfd_init_completed)
>                 return -EPROBE_DEFER;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> index 6acc431..ac59229 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> @@ -193,9 +193,8 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
>
>         m->cp_hqd_vmid = q->vmid;
>
> -       if (q->format == KFD_QUEUE_FORMAT_AQL) {
> +       if (q->format == KFD_QUEUE_FORMAT_AQL)
>                 m->cp_hqd_pq_control |= NO_UPDATE_RPTR;
> -       }
>
>         m->cp_hqd_active = 0;
>         q->is_active = false;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> index 7131998..99c11a4 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> @@ -458,7 +458,7 @@ int pm_send_set_resources(struct packet_manager *pm,
>         mutex_lock(&pm->lock);
>         pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
>                                         sizeof(*packet) / sizeof(uint32_t),
> -                       (unsigned int **)&packet);
> +                                       (unsigned int **)&packet);
>         if (packet == NULL) {
>                 mutex_unlock(&pm->lock);
>                 pr_err("kfd: failed to allocate buffer on kernel queue\n");
> @@ -530,8 +530,7 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
>  fail_acquire_packet_buffer:
>         mutex_unlock(&pm->lock);
>  fail_create_runlist_ib:
> -       if (pm->allocated)
> -               pm_release_ib(pm);
> +       pm_release_ib(pm);
>         return retval;
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
> index 6cfe7f1..b3f7d43 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
> @@ -32,7 +32,8 @@ int kfd_pasid_init(void)
>  {
>         pasid_limit = KFD_MAX_NUM_OF_PROCESSES;
>
> -       pasid_bitmap = kcalloc(BITS_TO_LONGS(pasid_limit), sizeof(long), GFP_KERNEL);
> +       pasid_bitmap = kcalloc(BITS_TO_LONGS(pasid_limit), sizeof(long),
> +                               GFP_KERNEL);
>         if (!pasid_bitmap)
>                 return -ENOMEM;
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
> index 5b393f3..97e5442 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
> @@ -28,14 +28,14 @@
>  #define PM4_MES_HEADER_DEFINED
>  union PM4_MES_TYPE_3_HEADER {
>         struct {
> -               uint32_t reserved1:8;   /* < reserved */
> -               uint32_t opcode:8;      /* < IT opcode */
> -               uint32_t count:14;      /* < number of DWORDs - 1
> -                                        * in the information body.
> -                                        */
> -               uint32_t type:2;        /* < packet identifier.
> -                                        * It should be 3 for type 3 packets
> -                                        */
> +               /* reserved */
> +               uint32_t reserved1:8;
> +               /* IT opcode */
> +               uint32_t opcode:8;
> +               /* number of DWORDs - 1 in the information body */
> +               uint32_t count:14;
> +               /* packet identifier. It should be 3 for type 3 packets */
> +               uint32_t type:2;
>         };
>         uint32_t u32all;
>  };
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
> index 08c72192..c4eda6f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
> @@ -30,10 +30,12 @@ union PM4_MES_TYPE_3_HEADER {
>         struct {
>                 uint32_t reserved1 : 8; /* < reserved */
>                 uint32_t opcode    : 8; /* < IT opcode */
> -               uint32_t count     : 14;/* < number of DWORDs - 1 in the
> -               information body. */
> -               uint32_t type      : 2; /* < packet identifier.
> -                                       It should be 3 for type 3 packets */
> +               uint32_t count     : 14;/* < Number of DWORDS - 1 in the
> +                                        *   information body
> +                                        */
> +               uint32_t type      : 2; /* < packet identifier
> +                                        *   It should be 3 for type 3 packets
> +                                        */
>         };
>         uint32_t u32All;
>  };
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index 4750cab..469b7ea 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -294,13 +294,13 @@ enum kfd_queue_format {
>   * @write_ptr: Defines the number of dwords written to the ring buffer.
>   *
>   * @doorbell_ptr: This field aim is to notify the H/W of new packet written to
> - * the queue ring buffer. This field should be similar to write_ptr and the user
> - * should update this field after he updated the write_ptr.
> + * the queue ring buffer. This field should be similar to write_ptr and the
> + * user should update this field after he updated the write_ptr.
>   *
>   * @doorbell_off: The doorbell offset in the doorbell pci-bar.
>   *
> - * @is_interop: Defines if this is a interop queue. Interop queue means that the
> - * queue can access both graphics and compute resources.
> + * @is_interop: Defines if this is a interop queue. Interop queue means that
> + * the queue can access both graphics and compute resources.
>   *
>   * @is_active: Defines if the queue is active or not.
>   *
> @@ -352,9 +352,10 @@ struct queue_properties {
>   * @properties: The queue properties.
>   *
>   * @mec: Used only in no cp scheduling mode and identifies to micro engine id
> - * that the queue should be execute on.
> + *      that the queue should be execute on.
>   *
> - * @pipe: Used only in no cp scheduling mode and identifies the queue's pipe id.
> + * @pipe: Used only in no cp scheduling mode and identifies the queue's pipe
> + *       id.
>   *
>   * @queue: Used only in no cp scheduliong mode and identifies the queue's slot.
>   *
> @@ -520,8 +521,8 @@ struct kfd_process {
>         struct mutex event_mutex;
>         /* All events in process hashed by ID, linked on kfd_event.events. */
>         DECLARE_HASHTABLE(events, 4);
> -       struct list_head signal_event_pages;    /* struct slot_page_header.
> -                                                               event_pages */
> +       /* struct slot_page_header.event_pages */
> +       struct list_head signal_event_pages;
>         u32 next_nonsignal_event_id;
>         size_t signal_event_count;
>  };
> @@ -559,8 +560,10 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
>                                                         struct kfd_process *p);
>
>  /* Process device data iterator */
> -struct kfd_process_device *kfd_get_first_process_device_data(struct kfd_process *p);
> -struct kfd_process_device *kfd_get_next_process_device_data(struct kfd_process *p,
> +struct kfd_process_device *kfd_get_first_process_device_data(
> +                                                       struct kfd_process *p);
> +struct kfd_process_device *kfd_get_next_process_device_data(
> +                                               struct kfd_process *p,
>                                                 struct kfd_process_device *pdd);
>  bool kfd_has_process_device_data(struct kfd_process *p);
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 035bbc9..a4e4a2d 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -449,14 +449,16 @@ void kfd_unbind_process_from_device(struct kfd_dev *dev, unsigned int pasid)
>         mutex_unlock(&p->mutex);
>  }
>
> -struct kfd_process_device *kfd_get_first_process_device_data(struct kfd_process *p)
> +struct kfd_process_device *kfd_get_first_process_device_data(
> +                                               struct kfd_process *p)
>  {
>         return list_first_entry(&p->per_device_data,
>                                 struct kfd_process_device,
>                                 per_device_list);
>  }
>
> -struct kfd_process_device *kfd_get_next_process_device_data(struct kfd_process *p,
> +struct kfd_process_device *kfd_get_next_process_device_data(
> +                                               struct kfd_process *p,
>                                                 struct kfd_process_device *pdd)
>  {
>         if (list_is_last(&pdd->per_device_list, &p->per_device_data))
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 1e50647..0200dae 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -1170,8 +1170,8 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
>                  * GPU vBIOS
>                  */
>
> -               /*
> -                * Update the SYSFS tree, since we added another topology device
> +               /* Update the SYSFS tree, since we added another topology
> +                * device
>                  */
>                 if (kfd_topology_update_sysfs() < 0)
>                         kfd_topology_release_sysfs();
> --
> 2.7.4
>

* Re: [PATCH 05/19] drm/amdkfd: Clean up KFD style errors and warnings
       [not found]         ` <CAFCwf13wgCm9qPk4XX5DNOx0toPmZxogpxt5zBvEKzcMd2x3tw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-12 12:58           ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 12:58 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Kent Russell, amd-gfx list

On Sat, Aug 12, 2017 at 3:46 PM, Oded Gabbay <oded.gabbay@gmail.com> wrote:
> I'd like to check this patch, but it doesn't apply cleanly on the
> upstream tree.
> Please fix and re-send.
>
> Thanks,
> Oded
>
> On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>> From: Kent Russell <kent.russell@amd.com>
>>
>> Using checkpatch.pl -f <file> showed a number of style issues. This
>> patch addresses as many of them as possible. Some long lines have been
>> left for readability, but attempts to minimize them have been made.
>>
>> Signed-off-by: Kent Russell <kent.russell@amd.com>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |  4 ++--
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  | 24 +++++++------------
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  | 16 ++++++-------
>>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |  6 +++--
>>  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            |  7 +++---
>>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h            | 27 +++++++++++-----------
>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            |  5 ++--
>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  8 ++++---
>>  drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  5 ++--
>>  drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c         |  2 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  3 ++-
>>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  3 +--
>>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    |  5 ++--
>>  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c             |  3 ++-
>>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h       | 16 ++++++-------
>>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h    | 10 ++++----
>>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h              | 23 ++++++++++--------
>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  6 +++--
>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  4 ++--
>>  19 files changed, 91 insertions(+), 86 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> index 3949736..342dc3e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>> @@ -28,14 +28,14 @@
>>  #include <linux/module.h>
>>
>>  const struct kgd2kfd_calls *kgd2kfd;
>> -bool (*kgd2kfd_init_p)(unsigned, const struct kgd2kfd_calls**);
>> +bool (*kgd2kfd_init_p)(unsigned int, const struct kgd2kfd_calls**);
>>
>>  int amdgpu_amdkfd_init(void)
>>  {
>>         int ret;
>>
>>  #if defined(CONFIG_HSA_AMD_MODULE)
>> -       int (*kgd2kfd_init_p)(unsigned, const struct kgd2kfd_calls**);
>> +       int (*kgd2kfd_init_p)(unsigned int, const struct kgd2kfd_calls**);
>>
>>         kgd2kfd_init_p = symbol_request(kgd2kfd_init);
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
>> index 5254562..5936222 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
>> @@ -565,43 +565,35 @@ static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type)
>>
>>         switch (type) {
>>         case KGD_ENGINE_PFP:
>> -               hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->gfx.pfp_fw->data;
>> +               hdr = (const union amdgpu_firmware_header *)adev->gfx.pfp_fw->data;
>>                 break;
>>
>>         case KGD_ENGINE_ME:
>> -               hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->gfx.me_fw->data;
>> +               hdr = (const union amdgpu_firmware_header *)adev->gfx.me_fw->data;
>>                 break;
>>
>>         case KGD_ENGINE_CE:
>> -               hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->gfx.ce_fw->data;
>> +               hdr = (const union amdgpu_firmware_header *)adev->gfx.ce_fw->data;
>>                 break;
>>
>>         case KGD_ENGINE_MEC1:
>> -               hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->gfx.mec_fw->data;
>> +               hdr = (const union amdgpu_firmware_header *)adev->gfx.mec_fw->data;
>>                 break;
>>
>>         case KGD_ENGINE_MEC2:
>> -               hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->gfx.mec2_fw->data;
>> +               hdr = (const union amdgpu_firmware_header *)adev->gfx.mec2_fw->data;
>>                 break;
>>
>>         case KGD_ENGINE_RLC:
>> -               hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->gfx.rlc_fw->data;
>> +               hdr = (const union amdgpu_firmware_header *)adev->gfx.rlc_fw->data;
>>                 break;
>>
>>         case KGD_ENGINE_SDMA1:
>> -               hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->sdma.instance[0].fw->data;
>> +               hdr = (const union amdgpu_firmware_header *)adev->sdma.instance[0].fw->data;
>>                 break;
>>
>>         case KGD_ENGINE_SDMA2:
>> -               hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->sdma.instance[1].fw->data;
>> +               hdr = (const union amdgpu_firmware_header *)adev->sdma.instance[1].fw->data;
>>                 break;

The above 8 changes trigger the "line over 80 characters" checkpatch warning.
I suggest dropping these 8 changes (see the wrapping sketch below).

Thanks,
Oded.
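
For reference, a minimal sketch of the two formattings in question, based
on the SDMA1 case quoted above (illustration only, not part of the patch):

        /* joined on one line: exceeds 80 columns, checkpatch warns */
        hdr = (const union amdgpu_firmware_header *)adev->sdma.instance[0].fw->data;

        /* kept as a continuation line: stays within the 80-column limit */
        hdr = (const union amdgpu_firmware_header *)
                                adev->sdma.instance[0].fw->data;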

>>
>>         default:
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
>> index 133d066..90271f6 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
>> @@ -454,42 +454,42 @@ static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type)
>>         switch (type) {
>>         case KGD_ENGINE_PFP:
>>                 hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->gfx.pfp_fw->data;
>> +                                               adev->gfx.pfp_fw->data;
>>                 break;
>>
>>         case KGD_ENGINE_ME:
>>                 hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->gfx.me_fw->data;
>> +                                               adev->gfx.me_fw->data;
>>                 break;
>>
>>         case KGD_ENGINE_CE:
>>                 hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->gfx.ce_fw->data;
>> +                                               adev->gfx.ce_fw->data;
>>                 break;
>>
>>         case KGD_ENGINE_MEC1:
>>                 hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->gfx.mec_fw->data;
>> +                                               adev->gfx.mec_fw->data;
>>                 break;
>>
>>         case KGD_ENGINE_MEC2:
>>                 hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->gfx.mec2_fw->data;
>> +                                               adev->gfx.mec2_fw->data;
>>                 break;
>>
>>         case KGD_ENGINE_RLC:
>>                 hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->gfx.rlc_fw->data;
>> +                                               adev->gfx.rlc_fw->data;
>>                 break;
>>
>>         case KGD_ENGINE_SDMA1:
>>                 hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->sdma.instance[0].fw->data;
>> +                                               adev->sdma.instance[0].fw->data;
>>                 break;
>>
>>         case KGD_ENGINE_SDMA2:
>>                 hdr = (const union amdgpu_firmware_header *)
>> -                                                       adev->sdma.instance[1].fw->data;
>> +                                               adev->sdma.instance[1].fw->data;
>>                 break;
>>
>>         default:
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> index 2a45718e..98f4dbf 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> @@ -782,7 +782,8 @@ static int kfd_ioctl_get_process_apertures(struct file *filp,
>>                                 "scratch_limit %llX\n", pdd->scratch_limit);
>>
>>                         args->num_of_nodes++;
>> -               } while ((pdd = kfd_get_next_process_device_data(p, pdd)) != NULL &&
>> +               } while ((pdd = kfd_get_next_process_device_data(p, pdd)) !=
>> +                               NULL &&
>>                                 (args->num_of_nodes < NUM_OF_SUPPORTED_GPUS));
>>         }
>>
>> @@ -848,7 +849,8 @@ static int kfd_ioctl_wait_events(struct file *filp, struct kfd_process *p,
>>  }
>>
>>  #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \
>> -       [_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, .cmd_drv = 0, .name = #ioctl}
>> +       [_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \
>> +                           .cmd_drv = 0, .name = #ioctl}
>>
>>  /** Ioctl table */
>>  static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
>> index faa0790..a7548a5 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
>> @@ -313,7 +313,7 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
>>                 return -EINVAL;
>>         }
>>
>> -       for (i = 0 ; i < adw_info->num_watch_points ; i++) {
>> +       for (i = 0; i < adw_info->num_watch_points; i++) {
>>                 dbgdev_address_watch_set_registers(adw_info, &addrHi, &addrLo,
>>                                                 &cntl, i, pdd->qpd.vmid);
>>
>> @@ -623,7 +623,7 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
>>                 return status;
>>         }
>>
>> -       /* we do not control the VMID in DIQ,so reset it to a known value */
>> +       /* we do not control the VMID in DIQ, so reset it to a known value */
>>         reg_sq_cmd.bits.vm_id = 0;
>>
>>         pr_debug("\t\t %30s\n", "* * * * * * * * * * * * * * * * * *");
>> @@ -810,7 +810,8 @@ int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p)
>>
>>         /* Scan all registers in the range ATC_VMID8_PASID_MAPPING ..
>>          * ATC_VMID15_PASID_MAPPING
>> -        * to check which VMID the current process is mapped to. */
>> +        * to check which VMID the current process is mapped to.
>> +        */
>>
>>         for (vmid = first_vmid_to_scan; vmid <= last_vmid_to_scan; vmid++) {
>>                 if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_valid
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h
>> index 257a745..a04a1fe 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h
>> @@ -30,13 +30,11 @@
>>  #pragma pack(push, 4)
>>
>>  enum HSA_DBG_WAVEOP {
>> -       HSA_DBG_WAVEOP_HALT = 1,        /* Halts a wavefront            */
>> -       HSA_DBG_WAVEOP_RESUME = 2,      /* Resumes a wavefront          */
>> -       HSA_DBG_WAVEOP_KILL = 3,        /* Kills a wavefront            */
>> -       HSA_DBG_WAVEOP_DEBUG = 4,       /* Causes wavefront to enter
>> -                                               debug mode              */
>> -       HSA_DBG_WAVEOP_TRAP = 5,        /* Causes wavefront to take
>> -                                               a trap                  */
>> +       HSA_DBG_WAVEOP_HALT = 1,   /* Halts a wavefront */
>> +       HSA_DBG_WAVEOP_RESUME = 2, /* Resumes a wavefront */
>> +       HSA_DBG_WAVEOP_KILL = 3,   /* Kills a wavefront */
>> +       HSA_DBG_WAVEOP_DEBUG = 4,  /* Causes wavefront to enter dbg mode */
>> +       HSA_DBG_WAVEOP_TRAP = 5,   /* Causes wavefront to take a trap */
>>         HSA_DBG_NUM_WAVEOP = 5,
>>         HSA_DBG_MAX_WAVEOP = 0xFFFFFFFF
>>  };
>> @@ -81,15 +79,13 @@ struct HsaDbgWaveMsgAMDGen2 {
>>                         uint32_t UserData:8;    /* user data */
>>                         uint32_t ShaderArray:1; /* Shader array */
>>                         uint32_t Priv:1;        /* Privileged */
>> -                       uint32_t Reserved0:4;   /* This field is reserved,
>> -                                                  should be 0 */
>> +                       uint32_t Reserved0:4;   /* Reserved, should be 0 */
>>                         uint32_t WaveId:4;      /* wave id */
>>                         uint32_t SIMD:2;        /* SIMD id */
>>                         uint32_t HSACU:4;       /* Compute unit */
>>                         uint32_t ShaderEngine:2;/* Shader engine */
>>                         uint32_t MessageType:2; /* see HSA_DBG_WAVEMSG_TYPE */
>> -                       uint32_t Reserved1:4;   /* This field is reserved,
>> -                                                  should be 0 */
>> +                       uint32_t Reserved1:4;   /* Reserved, should be 0 */
>>                 } ui32;
>>                 uint32_t Value;
>>         };
>> @@ -121,20 +117,23 @@ struct HsaDbgWaveMessage {
>>   * in the user mode instruction stream. The OS scheduler event is typically
>>   * associated and signaled by an interrupt issued by the GPU, but other HSA
>>   * system interrupt conditions from other HW (e.g. IOMMUv2) may be surfaced
>> - * by the KFD by this mechanism, too. */
>> + * by the KFD by this mechanism, too.
>> + */
>>
>>  /* these are the new definitions for events */
>>  enum HSA_EVENTTYPE {
>>         HSA_EVENTTYPE_SIGNAL = 0,       /* user-mode generated GPU signal */
>>         HSA_EVENTTYPE_NODECHANGE = 1,   /* HSA node change (attach/detach) */
>>         HSA_EVENTTYPE_DEVICESTATECHANGE = 2,    /* HSA device state change
>> -                                                  (start/stop) */
>> +                                                * (start/stop)
>> +                                                */
>>         HSA_EVENTTYPE_HW_EXCEPTION = 3, /* GPU shader exception event */
>>         HSA_EVENTTYPE_SYSTEM_EVENT = 4, /* GPU SYSCALL with parameter info */
>>         HSA_EVENTTYPE_DEBUG_EVENT = 5,  /* GPU signal for debugging */
>>         HSA_EVENTTYPE_PROFILE_EVENT = 6,/* GPU signal for profiling */
>>         HSA_EVENTTYPE_QUEUE_EVENT = 7,  /* GPU signal queue idle state
>> -                                          (EOP pm4) */
>> +                                        * (EOP pm4)
>> +                                        */
>>         /* ...  */
>>         HSA_EVENTTYPE_MAXID,
>>         HSA_EVENTTYPE_TYPE_SIZE = 0xFFFFFFFF
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> index 3f95f7c..1f50325 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> @@ -155,12 +155,13 @@ static bool device_iommu_pasid_init(struct kfd_dev *kfd)
>>                 dev_err(kfd_device, "error required iommu flags ats(%i), pri(%i), pasid(%i)\n",
>>                        (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP) != 0,
>>                        (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP) != 0,
>> -                      (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP) != 0);
>> +                      (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP)
>> +                                                                       != 0);
>>                 return false;
>>         }
>>
>>         pasid_limit = min_t(unsigned int,
>> -                       (unsigned int)1 << kfd->device_info->max_pasid_bits,
>> +                       (unsigned int)(1 << kfd->device_info->max_pasid_bits),
>>                         iommu_info.max_pasids);
>>         /*
>>          * last pasid is used for kernel queues doorbells
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> index 9d2796b..3b850da 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
>> @@ -216,7 +216,8 @@ static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q)
>>
>>         set = false;
>>
>> -       for (pipe = dqm->next_pipe_to_allocate, i = 0; i < get_pipes_per_mec(dqm);
>> +       for (pipe = dqm->next_pipe_to_allocate, i = 0;
>> +                       i < get_pipes_per_mec(dqm);
>>                         pipe = ((pipe + 1) % get_pipes_per_mec(dqm)), ++i) {
>>
>>                 if (!is_pipe_enabled(dqm, 0, pipe))
>> @@ -669,7 +670,8 @@ static int set_sched_resources(struct device_queue_manager *dqm)
>>
>>                 /* This situation may be hit in the future if a new HW
>>                  * generation exposes more than 64 queues. If so, the
>> -                * definition of res.queue_mask needs updating */
>> +                * definition of res.queue_mask needs updating
>> +                */
>>                 if (WARN_ON(i >= (sizeof(res.queue_mask)*8))) {
>>                         pr_err("Invalid queue enabled by amdgpu: %d\n", i);
>>                         break;
>> @@ -890,7 +892,7 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
>>         }
>>
>>         if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
>> -                       dqm->sdma_queue_count++;
>> +               dqm->sdma_queue_count++;
>>         /*
>>          * Unconditionally increment this counter, regardless of the queue's
>>          * type or whether the queue is active.
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>> index d1ce83d..d8b9b3c 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
>> @@ -194,7 +194,8 @@ static void release_event_notification_slot(struct signal_page *page,
>>         page->free_slots++;
>>
>>         /* We don't free signal pages, they are retained by the process
>> -        * and reused until it exits. */
>> +        * and reused until it exits.
>> +        */
>>  }
>>
>>  static struct signal_page *lookup_signal_page_by_index(struct kfd_process *p,
>> @@ -584,7 +585,7 @@ void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
>>                  * search faster.
>>                  */
>>                 struct signal_page *page;
>> -               unsigned i;
>> +               unsigned int i;
>>
>>                 list_for_each_entry(page, &p->signal_event_pages, event_pages)
>>                         for (i = 0; i < SLOTS_PER_PAGE; i++)
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
>> index 7f134aa..70b3a99c 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
>> @@ -179,7 +179,7 @@ static void interrupt_wq(struct work_struct *work)
>>  bool interrupt_is_wanted(struct kfd_dev *dev, const uint32_t *ih_ring_entry)
>>  {
>>         /* integer and bitwise OR so there is no boolean short-circuiting */
>> -       unsigned wanted = 0;
>> +       unsigned int wanted = 0;
>>
>>         wanted |= dev->device_info->event_interrupt_class->interrupt_isr(dev,
>>                                                                 ih_ring_entry);
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>> index 850a562..af5bfc1 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
>> @@ -61,7 +61,8 @@ MODULE_PARM_DESC(send_sigterm,
>>
>>  static int amdkfd_init_completed;
>>
>> -int kgd2kfd_init(unsigned interface_version, const struct kgd2kfd_calls **g2f)
>> +int kgd2kfd_init(unsigned int interface_version,
>> +               const struct kgd2kfd_calls **g2f)
>>  {
>>         if (!amdkfd_init_completed)
>>                 return -EPROBE_DEFER;
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
>> index 6acc431..ac59229 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
>> @@ -193,9 +193,8 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
>>
>>         m->cp_hqd_vmid = q->vmid;
>>
>> -       if (q->format == KFD_QUEUE_FORMAT_AQL) {
>> +       if (q->format == KFD_QUEUE_FORMAT_AQL)
>>                 m->cp_hqd_pq_control |= NO_UPDATE_RPTR;
>> -       }
>>
>>         m->cp_hqd_active = 0;
>>         q->is_active = false;
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> index 7131998..99c11a4 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> @@ -458,7 +458,7 @@ int pm_send_set_resources(struct packet_manager *pm,
>>         mutex_lock(&pm->lock);
>>         pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
>>                                         sizeof(*packet) / sizeof(uint32_t),
>> -                       (unsigned int **)&packet);
>> +                                       (unsigned int **)&packet);
>>         if (packet == NULL) {
>>                 mutex_unlock(&pm->lock);
>>                 pr_err("kfd: failed to allocate buffer on kernel queue\n");
>> @@ -530,8 +530,7 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
>>  fail_acquire_packet_buffer:
>>         mutex_unlock(&pm->lock);
>>  fail_create_runlist_ib:
>> -       if (pm->allocated)
>> -               pm_release_ib(pm);
>> +       pm_release_ib(pm);
>>         return retval;
>>  }
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
>> index 6cfe7f1..b3f7d43 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
>> @@ -32,7 +32,8 @@ int kfd_pasid_init(void)
>>  {
>>         pasid_limit = KFD_MAX_NUM_OF_PROCESSES;
>>
>> -       pasid_bitmap = kcalloc(BITS_TO_LONGS(pasid_limit), sizeof(long), GFP_KERNEL);
>> +       pasid_bitmap = kcalloc(BITS_TO_LONGS(pasid_limit), sizeof(long),
>> +                               GFP_KERNEL);
>>         if (!pasid_bitmap)
>>                 return -ENOMEM;
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>> index 5b393f3..97e5442 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>> @@ -28,14 +28,14 @@
>>  #define PM4_MES_HEADER_DEFINED
>>  union PM4_MES_TYPE_3_HEADER {
>>         struct {
>> -               uint32_t reserved1:8;   /* < reserved */
>> -               uint32_t opcode:8;      /* < IT opcode */
>> -               uint32_t count:14;      /* < number of DWORDs - 1
>> -                                        * in the information body.
>> -                                        */
>> -               uint32_t type:2;        /* < packet identifier.
>> -                                        * It should be 3 for type 3 packets
>> -                                        */
>> +               /* reserved */
>> +               uint32_t reserved1:8;
>> +               /* IT opcode */
>> +               uint32_t opcode:8;
>> +               /* number of DWORDs - 1 in the information body */
>> +               uint32_t count:14;
>> +               /* packet identifier. It should be 3 for type 3 packets */
>> +               uint32_t type:2;
>>         };
>>         uint32_t u32all;
>>  };
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>> index 08c72192..c4eda6f 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>> @@ -30,10 +30,12 @@ union PM4_MES_TYPE_3_HEADER {
>>         struct {
>>                 uint32_t reserved1 : 8; /* < reserved */
>>                 uint32_t opcode    : 8; /* < IT opcode */
>> -               uint32_t count     : 14;/* < number of DWORDs - 1 in the
>> -               information body. */
>> -               uint32_t type      : 2; /* < packet identifier.
>> -                                       It should be 3 for type 3 packets */
>> +               uint32_t count     : 14;/* < Number of DWORDS - 1 in the
>> +                                        *   information body
>> +                                        */
>> +               uint32_t type      : 2; /* < packet identifier
>> +                                        *   It should be 3 for type 3 packets
>> +                                        */
>>         };
>>         uint32_t u32All;
>>  };
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> index 4750cab..469b7ea 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> @@ -294,13 +294,13 @@ enum kfd_queue_format {
>>   * @write_ptr: Defines the number of dwords written to the ring buffer.
>>   *
>>   * @doorbell_ptr: This field aim is to notify the H/W of new packet written to
>> - * the queue ring buffer. This field should be similar to write_ptr and the user
>> - * should update this field after he updated the write_ptr.
>> + * the queue ring buffer. This field should be similar to write_ptr and the
>> + * user should update this field after he updated the write_ptr.
>>   *
>>   * @doorbell_off: The doorbell offset in the doorbell pci-bar.
>>   *
>> - * @is_interop: Defines if this is a interop queue. Interop queue means that the
>> - * queue can access both graphics and compute resources.
>> + * @is_interop: Defines if this is a interop queue. Interop queue means that
>> + * the queue can access both graphics and compute resources.
>>   *
>>   * @is_active: Defines if the queue is active or not.
>>   *
>> @@ -352,9 +352,10 @@ struct queue_properties {
>>   * @properties: The queue properties.
>>   *
>>   * @mec: Used only in no cp scheduling mode and identifies to micro engine id
>> - * that the queue should be execute on.
>> + *      that the queue should be execute on.
>>   *
>> - * @pipe: Used only in no cp scheduling mode and identifies the queue's pipe id.
>> + * @pipe: Used only in no cp scheduling mode and identifies the queue's pipe
>> + *       id.
>>   *
>>   * @queue: Used only in no cp scheduliong mode and identifies the queue's slot.
>>   *
>> @@ -520,8 +521,8 @@ struct kfd_process {
>>         struct mutex event_mutex;
>>         /* All events in process hashed by ID, linked on kfd_event.events. */
>>         DECLARE_HASHTABLE(events, 4);
>> -       struct list_head signal_event_pages;    /* struct slot_page_header.
>> -                                                               event_pages */
>> +       /* struct slot_page_header.event_pages */
>> +       struct list_head signal_event_pages;
>>         u32 next_nonsignal_event_id;
>>         size_t signal_event_count;
>>  };
>> @@ -559,8 +560,10 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
>>                                                         struct kfd_process *p);
>>
>>  /* Process device data iterator */
>> -struct kfd_process_device *kfd_get_first_process_device_data(struct kfd_process *p);
>> -struct kfd_process_device *kfd_get_next_process_device_data(struct kfd_process *p,
>> +struct kfd_process_device *kfd_get_first_process_device_data(
>> +                                                       struct kfd_process *p);
>> +struct kfd_process_device *kfd_get_next_process_device_data(
>> +                                               struct kfd_process *p,
>>                                                 struct kfd_process_device *pdd);
>>  bool kfd_has_process_device_data(struct kfd_process *p);
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> index 035bbc9..a4e4a2d 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> @@ -449,14 +449,16 @@ void kfd_unbind_process_from_device(struct kfd_dev *dev, unsigned int pasid)
>>         mutex_unlock(&p->mutex);
>>  }
>>
>> -struct kfd_process_device *kfd_get_first_process_device_data(struct kfd_process *p)
>> +struct kfd_process_device *kfd_get_first_process_device_data(
>> +                                               struct kfd_process *p)
>>  {
>>         return list_first_entry(&p->per_device_data,
>>                                 struct kfd_process_device,
>>                                 per_device_list);
>>  }
>>
>> -struct kfd_process_device *kfd_get_next_process_device_data(struct kfd_process *p,
>> +struct kfd_process_device *kfd_get_next_process_device_data(
>> +                                               struct kfd_process *p,
>>                                                 struct kfd_process_device *pdd)
>>  {
>>         if (list_is_last(&pdd->per_device_list, &p->per_device_data))
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> index 1e50647..0200dae 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> @@ -1170,8 +1170,8 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
>>                  * GPU vBIOS
>>                  */
>>
>> -               /*
>> -                * Update the SYSFS tree, since we added another topology device
>> +               /* Update the SYSFS tree, since we added another topology
>> +                * device
>>                  */
>>                 if (kfd_topology_update_sysfs() < 0)
>>                         kfd_topology_release_sysfs();
>> --
>> 2.7.4
>>

* Re: [PATCH 06/19] drm/amdkfd: Consolidate and clean up log commands
       [not found]     ` <1502488589-30272-7-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 13:01       ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 13:01 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Yong Zhao, Kent Russell, amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Kent Russell <kent.russell@amd.com>
>
> Consolidate log commands: replace dev_info(NULL, "Error...") calls with
> the more appropriate pr_err, and drop the module name (visible via
> dynamic debugging with +m) and the function name (visible via dynamic
> debugging with +f) from the messages. We also don't need debug messages
> saying which function we're in; those can be added by developers when
> needed.
>
> Don't print vendor and device IDs in error messages. They are typically
> the same for all GPUs in a multi-GPU system, so they add no value to
> the message.
>
> Lastly, remove parentheses around %d, %i and 0x%llX.
> According to kernel.org:
> "Printing numbers in parentheses (%d) adds no value and should be
> avoided."
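
A minimal before/after sketch of the conventions described above (the
messages and variable names here are made up for illustration; only the
style follows the patch):

        /* before: NULL device pointer, hard-coded "kfd:" prefix, and
         * "In func" chatter
         */
        dev_info(NULL, "Error: kfd: failed to create queue\n");
        pr_debug("kfd: In func %s\n", __func__);
        pr_debug("Queue Size (0x%llX, %u)\n", queue_size, ring_size);

        /* after: pr_err for errors, no module or function name in the
         * text, no parentheses around plain numbers; the prefix and
         * function name can still be shown via dynamic debug, e.g.
         * echo 'module amdkfd +pmf' > /sys/kernel/debug/dynamic_debug/control
         */
        pr_err("Failed to create queue\n");
        pr_debug("Queue Size: 0x%llX, %u\n", queue_size, ring_size);
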
>
> Signed-off-by: Kent Russell <kent.russell@amd.com>
> Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           | 64 ++++++++---------
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            | 38 +++++-----
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  4 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 51 ++++++--------
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 81 +++++++---------------
>  .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  2 +-
>  .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          | 21 +++---
>  drivers/gpu/drm/amd/amdkfd/kfd_events.c            | 22 +++---
>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      | 16 ++---
>  drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  4 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   | 10 ---
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  8 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 34 ++++-----
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  4 +-
>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 27 +++-----
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  6 +-
>  17 files changed, 158 insertions(+), 236 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 98f4dbf..6244958 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -142,12 +142,12 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
>                                 struct kfd_ioctl_create_queue_args *args)
>  {
>         if (args->queue_percentage > KFD_MAX_QUEUE_PERCENTAGE) {
> -               pr_err("kfd: queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
> +               pr_err("Queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
>                 return -EINVAL;
>         }
>
>         if (args->queue_priority > KFD_MAX_QUEUE_PRIORITY) {
> -               pr_err("kfd: queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
> +               pr_err("Queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
>                 return -EINVAL;
>         }
>
> @@ -155,26 +155,26 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
>                 (!access_ok(VERIFY_WRITE,
>                         (const void __user *) args->ring_base_address,
>                         sizeof(uint64_t)))) {
> -               pr_err("kfd: can't access ring base address\n");
> +               pr_err("Can't access ring base address\n");
>                 return -EFAULT;
>         }
>
>         if (!is_power_of_2(args->ring_size) && (args->ring_size != 0)) {
> -               pr_err("kfd: ring size must be a power of 2 or 0\n");
> +               pr_err("Ring size must be a power of 2 or 0\n");
>                 return -EINVAL;
>         }
>
>         if (!access_ok(VERIFY_WRITE,
>                         (const void __user *) args->read_pointer_address,
>                         sizeof(uint32_t))) {
> -               pr_err("kfd: can't access read pointer\n");
> +               pr_err("Can't access read pointer\n");
>                 return -EFAULT;
>         }
>
>         if (!access_ok(VERIFY_WRITE,
>                         (const void __user *) args->write_pointer_address,
>                         sizeof(uint32_t))) {
> -               pr_err("kfd: can't access write pointer\n");
> +               pr_err("Can't access write pointer\n");
>                 return -EFAULT;
>         }
>
> @@ -182,7 +182,7 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
>                 !access_ok(VERIFY_WRITE,
>                         (const void __user *) args->eop_buffer_address,
>                         sizeof(uint32_t))) {
> -               pr_debug("kfd: can't access eop buffer");
> +               pr_debug("Can't access eop buffer");
>                 return -EFAULT;
>         }
>
> @@ -190,7 +190,7 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
>                 !access_ok(VERIFY_WRITE,
>                         (const void __user *) args->ctx_save_restore_address,
>                         sizeof(uint32_t))) {
> -               pr_debug("kfd: can't access ctx save restore buffer");
> +               pr_debug("Can't access ctx save restore buffer");
>                 return -EFAULT;
>         }
>
> @@ -219,27 +219,27 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
>         else
>                 q_properties->format = KFD_QUEUE_FORMAT_PM4;
>
> -       pr_debug("Queue Percentage (%d, %d)\n",
> +       pr_debug("Queue Percentage: %d, %d\n",
>                         q_properties->queue_percent, args->queue_percentage);
>
> -       pr_debug("Queue Priority (%d, %d)\n",
> +       pr_debug("Queue Priority: %d, %d\n",
>                         q_properties->priority, args->queue_priority);
>
> -       pr_debug("Queue Address (0x%llX, 0x%llX)\n",
> +       pr_debug("Queue Address: 0x%llX, 0x%llX\n",
>                         q_properties->queue_address, args->ring_base_address);
>
> -       pr_debug("Queue Size (0x%llX, %u)\n",
> +       pr_debug("Queue Size: 0x%llX, %u\n",
>                         q_properties->queue_size, args->ring_size);
>
> -       pr_debug("Queue r/w Pointers (0x%llX, 0x%llX)\n",
> -                       (uint64_t) q_properties->read_ptr,
> -                       (uint64_t) q_properties->write_ptr);
> +       pr_debug("Queue r/w Pointers: %p, %p\n",
> +                       q_properties->read_ptr,
> +                       q_properties->write_ptr);
>
> -       pr_debug("Queue Format (%d)\n", q_properties->format);
> +       pr_debug("Queue Format: %d\n", q_properties->format);
>
> -       pr_debug("Queue EOP (0x%llX)\n", q_properties->eop_ring_buffer_address);
> +       pr_debug("Queue EOP: 0x%llX\n", q_properties->eop_ring_buffer_address);
>
> -       pr_debug("Queue CTX save arex (0x%llX)\n",
> +       pr_debug("Queue CTX save area: 0x%llX\n",
>                         q_properties->ctx_save_restore_area_address);
>
>         return 0;
> @@ -257,16 +257,16 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
>
>         memset(&q_properties, 0, sizeof(struct queue_properties));
>
> -       pr_debug("kfd: creating queue ioctl\n");
> +       pr_debug("Creating queue ioctl\n");
>
>         err = set_queue_properties_from_user(&q_properties, args);
>         if (err)
>                 return err;
>
> -       pr_debug("kfd: looking for gpu id 0x%x\n", args->gpu_id);
> +       pr_debug("Looking for gpu id 0x%x\n", args->gpu_id);
>         dev = kfd_device_by_id(args->gpu_id);
>         if (dev == NULL) {
> -               pr_debug("kfd: gpu id 0x%x was not found\n", args->gpu_id);
> +               pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
>                 return -EINVAL;
>         }
>
> @@ -278,7 +278,7 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
>                 goto err_bind_process;
>         }
>
> -       pr_debug("kfd: creating queue for PASID %d on GPU 0x%x\n",
> +       pr_debug("Creating queue for PASID %d on gpu 0x%x\n",
>                         p->pasid,
>                         dev->id);
>
> @@ -296,15 +296,15 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
>
>         mutex_unlock(&p->mutex);
>
> -       pr_debug("kfd: queue id %d was created successfully\n", args->queue_id);
> +       pr_debug("Queue id %d was created successfully\n", args->queue_id);
>
> -       pr_debug("ring buffer address == 0x%016llX\n",
> +       pr_debug("Ring buffer address == 0x%016llX\n",
>                         args->ring_base_address);
>
> -       pr_debug("read ptr address    == 0x%016llX\n",
> +       pr_debug("Read ptr address    == 0x%016llX\n",
>                         args->read_pointer_address);
>
> -       pr_debug("write ptr address   == 0x%016llX\n",
> +       pr_debug("Write ptr address   == 0x%016llX\n",
>                         args->write_pointer_address);
>
>         return 0;
> @@ -321,7 +321,7 @@ static int kfd_ioctl_destroy_queue(struct file *filp, struct kfd_process *p,
>         int retval;
>         struct kfd_ioctl_destroy_queue_args *args = data;
>
> -       pr_debug("kfd: destroying queue id %d for PASID %d\n",
> +       pr_debug("Destroying queue id %d for pasid %d\n",
>                                 args->queue_id,
>                                 p->pasid);
>
> @@ -341,12 +341,12 @@ static int kfd_ioctl_update_queue(struct file *filp, struct kfd_process *p,
>         struct queue_properties properties;
>
>         if (args->queue_percentage > KFD_MAX_QUEUE_PERCENTAGE) {
> -               pr_err("kfd: queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
> +               pr_err("Queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
>                 return -EINVAL;
>         }
>
>         if (args->queue_priority > KFD_MAX_QUEUE_PRIORITY) {
> -               pr_err("kfd: queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
> +               pr_err("Queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
>                 return -EINVAL;
>         }
>
> @@ -354,12 +354,12 @@ static int kfd_ioctl_update_queue(struct file *filp, struct kfd_process *p,
>                 (!access_ok(VERIFY_WRITE,
>                         (const void __user *) args->ring_base_address,
>                         sizeof(uint64_t)))) {
> -               pr_err("kfd: can't access ring base address\n");
> +               pr_err("Can't access ring base address\n");
>                 return -EFAULT;
>         }
>
>         if (!is_power_of_2(args->ring_size) && (args->ring_size != 0)) {
> -               pr_err("kfd: ring size must be a power of 2 or 0\n");
> +               pr_err("Ring size must be a power of 2 or 0\n");
>                 return -EINVAL;
>         }
>
> @@ -368,7 +368,7 @@ static int kfd_ioctl_update_queue(struct file *filp, struct kfd_process *p,
>         properties.queue_percent = args->queue_percentage;
>         properties.priority = args->queue_priority;
>
> -       pr_debug("kfd: updating queue id %d for PASID %d\n",
> +       pr_debug("Updating queue id %d for pasid %d\n",
>                         args->queue_id, p->pasid);
>
>         mutex_lock(&p->mutex);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> index a7548a5..bf8ee19 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> @@ -78,7 +78,7 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
>                                 pq_packets_size_in_bytes / sizeof(uint32_t),
>                                 &ib_packet_buff);
>         if (status != 0) {
> -               pr_err("amdkfd: acquire_packet_buffer failed\n");
> +               pr_err("acquire_packet_buffer failed\n");
>                 return status;
>         }
>
> @@ -116,7 +116,7 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
>                                         &mem_obj);
>
>         if (status != 0) {
> -               pr_err("amdkfd: Failed to allocate GART memory\n");
> +               pr_err("Failed to allocate GART memory\n");
>                 kq->ops.rollback_packet(kq);
>                 return status;
>         }
> @@ -194,7 +194,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
>                                 &qid);
>
>         if (status) {
> -               pr_err("amdkfd: Failed to create DIQ\n");
> +               pr_err("Failed to create DIQ\n");
>                 return status;
>         }
>
> @@ -203,7 +203,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
>         kq = pqm_get_kernel_queue(dbgdev->pqm, qid);
>
>         if (kq == NULL) {
> -               pr_err("amdkfd: Error getting DIQ\n");
> +               pr_err("Error getting DIQ\n");
>                 pqm_destroy_queue(dbgdev->pqm, qid);
>                 return -EFAULT;
>         }
> @@ -279,7 +279,7 @@ static void dbgdev_address_watch_set_registers(
>  }
>
>  static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
> -                                       struct dbg_address_watch_info *adw_info)
> +                                     struct dbg_address_watch_info *adw_info)
>  {
>         union TCP_WATCH_ADDR_H_BITS addrHi;
>         union TCP_WATCH_ADDR_L_BITS addrLo;
> @@ -293,7 +293,7 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
>         pdd = kfd_get_process_device_data(dbgdev->dev,
>                                         adw_info->process);
>         if (!pdd) {
> -               pr_err("amdkfd: Failed to get pdd for wave control no DIQ\n");
> +               pr_err("Failed to get pdd for wave control no DIQ\n");
>                 return -EFAULT;
>         }
>
> @@ -303,13 +303,13 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
>
>         if ((adw_info->num_watch_points > MAX_WATCH_ADDRESSES) ||
>                         (adw_info->num_watch_points == 0)) {
> -               pr_err("amdkfd: num_watch_points is invalid\n");
> +               pr_err("num_watch_points is invalid\n");
>                 return -EINVAL;
>         }
>
>         if ((adw_info->watch_mode == NULL) ||
>                 (adw_info->watch_address == NULL)) {
> -               pr_err("amdkfd: adw_info fields are not valid\n");
> +               pr_err("adw_info fields are not valid\n");
>                 return -EINVAL;
>         }
>
> @@ -348,7 +348,7 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
>  }
>
>  static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
> -                                       struct dbg_address_watch_info *adw_info)
> +                                   struct dbg_address_watch_info *adw_info)
>  {
>         struct pm4__set_config_reg *packets_vec;
>         union TCP_WATCH_ADDR_H_BITS addrHi;
> @@ -371,20 +371,20 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>
>         if ((adw_info->num_watch_points > MAX_WATCH_ADDRESSES) ||
>                         (adw_info->num_watch_points == 0)) {
> -               pr_err("amdkfd: num_watch_points is invalid\n");
> +               pr_err("num_watch_points is invalid\n");
>                 return -EINVAL;
>         }
>
>         if ((NULL == adw_info->watch_mode) ||
>                         (NULL == adw_info->watch_address)) {
> -               pr_err("amdkfd: adw_info fields are not valid\n");
> +               pr_err("adw_info fields are not valid\n");
>                 return -EINVAL;
>         }
>
>         status = kfd_gtt_sa_allocate(dbgdev->dev, ib_size, &mem_obj);
>
>         if (status != 0) {
> -               pr_err("amdkfd: Failed to allocate GART memory\n");
> +               pr_err("Failed to allocate GART memory\n");
>                 return status;
>         }
>
> @@ -491,7 +491,7 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>                                         ib_size);
>
>                 if (status != 0) {
> -                       pr_err("amdkfd: Failed to submit IB to DIQ\n");
> +                       pr_err("Failed to submit IB to DIQ\n");
>                         break;
>                 }
>         }
> @@ -619,7 +619,7 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
>         status = dbgdev_wave_control_set_registers(wac_info, &reg_sq_cmd,
>                                                         &reg_gfx_index);
>         if (status) {
> -               pr_err("amdkfd: Failed to set wave control registers\n");
> +               pr_err("Failed to set wave control registers\n");
>                 return status;
>         }
>
> @@ -659,7 +659,7 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
>         status = kfd_gtt_sa_allocate(dbgdev->dev, ib_size, &mem_obj);
>
>         if (status != 0) {
> -               pr_err("amdkfd: Failed to allocate GART memory\n");
> +               pr_err("Failed to allocate GART memory\n");
>                 return status;
>         }
>
> @@ -712,7 +712,7 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
>                         ib_size);
>
>         if (status != 0)
> -               pr_err("amdkfd: Failed to submit IB to DIQ\n");
> +               pr_err("Failed to submit IB to DIQ\n");
>
>         kfd_gtt_sa_free(dbgdev->dev, mem_obj);
>
> @@ -735,13 +735,13 @@ static int dbgdev_wave_control_nodiq(struct kfd_dbgdev *dbgdev,
>         pdd = kfd_get_process_device_data(dbgdev->dev, wac_info->process);
>
>         if (!pdd) {
> -               pr_err("amdkfd: Failed to get pdd for wave control no DIQ\n");
> +               pr_err("Failed to get pdd for wave control no DIQ\n");
>                 return -EFAULT;
>         }
>         status = dbgdev_wave_control_set_registers(wac_info, &reg_sq_cmd,
>                                                         &reg_gfx_index);
>         if (status) {
> -               pr_err("amdkfd: Failed to set wave control registers\n");
> +               pr_err("Failed to set wave control registers\n");
>                 return status;
>         }
>
> @@ -826,7 +826,7 @@ int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p)
>         }
>
>         if (vmid > last_vmid_to_scan) {
> -               pr_err("amdkfd: didn't found vmid for pasid (%d)\n", p->pasid);
> +               pr_err("Didn't find vmid for pasid %d\n", p->pasid);
>                 return -EFAULT;
>         }
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> index 56d6763..7225789 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> @@ -71,7 +71,7 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
>
>         new_buff = kfd_alloc_struct(new_buff);
>         if (!new_buff) {
> -               pr_err("amdkfd: Failed to allocate dbgmgr instance\n");
> +               pr_err("Failed to allocate dbgmgr instance\n");
>                 return false;
>         }
>
> @@ -79,7 +79,7 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
>         new_buff->dev = pdev;
>         new_buff->dbgdev = kfd_alloc_struct(new_buff->dbgdev);
>         if (!new_buff->dbgdev) {
> -               pr_err("amdkfd: Failed to allocate dbgdev instance\n");
> +               pr_err("Failed to allocate dbgdev instance\n");
>                 kfree(new_buff);
>                 return false;
>         }
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 1f50325..87df8bf 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -152,7 +152,7 @@ static bool device_iommu_pasid_init(struct kfd_dev *kfd)
>         }
>
>         if ((iommu_info.flags & required_iommu_flags) != required_iommu_flags) {
> -               dev_err(kfd_device, "error required iommu flags ats(%i), pri(%i), pasid(%i)\n",
> +               dev_err(kfd_device, "error required iommu flags ats %i, pri %i, pasid %i\n",
>                        (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP) != 0,
>                        (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP) != 0,
>                        (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP)
> @@ -248,42 +248,33 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>         if (kfd->kfd2kgd->init_gtt_mem_allocation(
>                         kfd->kgd, size, &kfd->gtt_mem,
>                         &kfd->gtt_start_gpu_addr, &kfd->gtt_start_cpu_ptr)){
> -               dev_err(kfd_device,
> -                       "Could not allocate %d bytes for device (%x:%x)\n",
> -                       size, kfd->pdev->vendor, kfd->pdev->device);
> +               dev_err(kfd_device, "Could not allocate %d bytes\n", size);
>                 goto out;
>         }
>
> -       dev_info(kfd_device,
> -               "Allocated %d bytes on gart for device(%x:%x)\n",
> -               size, kfd->pdev->vendor, kfd->pdev->device);
> +       dev_info(kfd_device, "Allocated %d bytes on gart\n", size);
>
>         /* Initialize GTT sa with 512 byte chunk size */
>         if (kfd_gtt_sa_init(kfd, size, 512) != 0) {
> -               dev_err(kfd_device,
> -                       "Error initializing gtt sub-allocator\n");
> +               dev_err(kfd_device, "Error initializing gtt sub-allocator\n");
>                 goto kfd_gtt_sa_init_error;
>         }
>
>         kfd_doorbell_init(kfd);
>
>         if (kfd_topology_add_device(kfd) != 0) {
> -               dev_err(kfd_device,
> -                       "Error adding device (%x:%x) to topology\n",
> -                       kfd->pdev->vendor, kfd->pdev->device);
> +               dev_err(kfd_device, "Error adding device to topology\n");
>                 goto kfd_topology_add_device_error;
>         }
>
>         if (kfd_interrupt_init(kfd)) {
> -               dev_err(kfd_device,
> -                       "Error initializing interrupts for device (%x:%x)\n",
> -                       kfd->pdev->vendor, kfd->pdev->device);
> +               dev_err(kfd_device, "Error initializing interrupts\n");
>                 goto kfd_interrupt_error;
>         }
>
>         if (!device_iommu_pasid_init(kfd)) {
>                 dev_err(kfd_device,
> -                       "Error initializing iommuv2 for device (%x:%x)\n",
> +                       "Error initializing iommuv2 for device %x:%x\n",
>                         kfd->pdev->vendor, kfd->pdev->device);
>                 goto device_iommu_pasid_error;
>         }
> @@ -293,15 +284,13 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>
>         kfd->dqm = device_queue_manager_init(kfd);
>         if (!kfd->dqm) {
> -               dev_err(kfd_device,
> -                       "Error initializing queue manager for device (%x:%x)\n",
> -                       kfd->pdev->vendor, kfd->pdev->device);
> +               dev_err(kfd_device, "Error initializing queue manager\n");
>                 goto device_queue_manager_error;
>         }
>
>         if (kfd->dqm->ops.start(kfd->dqm) != 0) {
>                 dev_err(kfd_device,
> -                       "Error starting queuen manager for device (%x:%x)\n",
> +                       "Error starting queue manager for device %x:%x\n",
>                         kfd->pdev->vendor, kfd->pdev->device);
>                 goto dqm_start_error;
>         }
> @@ -309,10 +298,10 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>         kfd->dbgmgr = NULL;
>
>         kfd->init_complete = true;
> -       dev_info(kfd_device, "added device (%x:%x)\n", kfd->pdev->vendor,
> +       dev_info(kfd_device, "added device %x:%x\n", kfd->pdev->vendor,
>                  kfd->pdev->device);
>
> -       pr_debug("kfd: Starting kfd with the following scheduling policy %d\n",
> +       pr_debug("Starting kfd with the following scheduling policy %d\n",
>                 sched_policy);
>
>         goto out;
> @@ -330,7 +319,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>  kfd_gtt_sa_init_error:
>         kfd->kfd2kgd->free_gtt_mem(kfd->kgd, kfd->gtt_mem);
>         dev_err(kfd_device,
> -               "device (%x:%x) NOT added due to errors\n",
> +               "device %x:%x NOT added due to errors\n",
>                 kfd->pdev->vendor, kfd->pdev->device);
>  out:
>         return kfd->init_complete;
> @@ -422,7 +411,7 @@ static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
>         if (!kfd->gtt_sa_bitmap)
>                 return -ENOMEM;
>
> -       pr_debug("kfd: gtt_sa_num_of_chunks = %d, gtt_sa_bitmap = %p\n",
> +       pr_debug("gtt_sa_num_of_chunks = %d, gtt_sa_bitmap = %p\n",
>                         kfd->gtt_sa_num_of_chunks, kfd->gtt_sa_bitmap);
>
>         mutex_init(&kfd->gtt_sa_lock);
> @@ -468,7 +457,7 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
>         if ((*mem_obj) == NULL)
>                 return -ENOMEM;
>
> -       pr_debug("kfd: allocated mem_obj = %p for size = %d\n", *mem_obj, size);
> +       pr_debug("Allocated mem_obj = %p for size = %d\n", *mem_obj, size);
>
>         start_search = 0;
>
> @@ -480,7 +469,7 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
>                                         kfd->gtt_sa_num_of_chunks,
>                                         start_search);
>
> -       pr_debug("kfd: found = %d\n", found);
> +       pr_debug("Found = %d\n", found);
>
>         /* If there wasn't any free chunk, bail out */
>         if (found == kfd->gtt_sa_num_of_chunks)
> @@ -498,12 +487,12 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
>                                         found,
>                                         kfd->gtt_sa_chunk_size);
>
> -       pr_debug("kfd: gpu_addr = %p, cpu_addr = %p\n",
> +       pr_debug("gpu_addr = %p, cpu_addr = %p\n",
>                         (uint64_t *) (*mem_obj)->gpu_addr, (*mem_obj)->cpu_ptr);
>
>         /* If we need only one chunk, mark it as allocated and get out */
>         if (size <= kfd->gtt_sa_chunk_size) {
> -               pr_debug("kfd: single bit\n");
> +               pr_debug("Single bit\n");
>                 set_bit(found, kfd->gtt_sa_bitmap);
>                 goto kfd_gtt_out;
>         }
> @@ -538,7 +527,7 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
>
>         } while (cur_size > 0);
>
> -       pr_debug("kfd: range_start = %d, range_end = %d\n",
> +       pr_debug("range_start = %d, range_end = %d\n",
>                 (*mem_obj)->range_start, (*mem_obj)->range_end);
>
>         /* Mark the chunks as allocated */
> @@ -552,7 +541,7 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
>         return 0;
>
>  kfd_gtt_no_free_chunk:
> -       pr_debug("kfd: allocation failed with mem_obj = %p\n", mem_obj);
> +       pr_debug("Allocation failed with mem_obj = %p\n", mem_obj);
>         mutex_unlock(&kfd->gtt_sa_lock);
>         kfree(mem_obj);
>         return -ENOMEM;
> @@ -568,7 +557,7 @@ int kfd_gtt_sa_free(struct kfd_dev *kfd, struct kfd_mem_obj *mem_obj)
>         if (!mem_obj)
>                 return 0;
>
> -       pr_debug("kfd: free mem_obj = %p, range_start = %d, range_end = %d\n",
> +       pr_debug("Free mem_obj = %p, range_start = %d, range_end = %d\n",
>                         mem_obj, mem_obj->range_start, mem_obj->range_end);
>
>         mutex_lock(&kfd->gtt_sa_lock);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 3b850da..8b147e4 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -121,7 +121,7 @@ static int allocate_vmid(struct device_queue_manager *dqm,
>
>         /* Kaveri kfd vmid's starts from vmid 8 */
>         allocated_vmid = bit + KFD_VMID_START_OFFSET;
> -       pr_debug("kfd: vmid allocation %d\n", allocated_vmid);
> +       pr_debug("vmid allocation %d\n", allocated_vmid);
>         qpd->vmid = allocated_vmid;
>         q->properties.vmid = allocated_vmid;
>
> @@ -154,13 +154,12 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
>
>         BUG_ON(!dqm || !q || !qpd || !allocated_vmid);
>
> -       pr_debug("kfd: In func %s\n", __func__);
>         print_queue(q);
>
>         mutex_lock(&dqm->lock);
>
>         if (dqm->total_queue_count >= max_num_of_queues_per_device) {
> -               pr_warn("amdkfd: Can't create new usermode queue because %d queues were already created\n",
> +               pr_warn("Can't create new usermode queue because %d queues were already created\n",
>                                 dqm->total_queue_count);
>                 mutex_unlock(&dqm->lock);
>                 return -EPERM;
> @@ -240,8 +239,7 @@ static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q)
>         if (!set)
>                 return -EBUSY;
>
> -       pr_debug("kfd: DQM %s hqd slot - pipe (%d) queue(%d)\n",
> -                               __func__, q->pipe, q->queue);
> +       pr_debug("hqd slot - pipe %d, queue %d\n", q->pipe, q->queue);
>         /* horizontal hqd allocation */
>         dqm->next_pipe_to_allocate = (pipe + 1) % get_pipes_per_mec(dqm);
>
> @@ -278,9 +276,8 @@ static int create_compute_queue_nocpsch(struct device_queue_manager *dqm,
>                 return retval;
>         }
>
> -       pr_debug("kfd: loading mqd to hqd on pipe (%d) queue (%d)\n",
> -                       q->pipe,
> -                       q->queue);
> +       pr_debug("Loading mqd to hqd on pipe %d, queue %d\n",
> +                       q->pipe, q->queue);
>
>         retval = mqd->load_mqd(mqd, q->mqd, q->pipe,
>                         q->queue, (uint32_t __user *) q->properties.write_ptr);
> @@ -304,8 +301,6 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
>
>         retval = 0;
>
> -       pr_debug("kfd: In Func %s\n", __func__);
> -
>         mutex_lock(&dqm->lock);
>
>         if (q->properties.type == KFD_QUEUE_TYPE_COMPUTE) {
> @@ -324,7 +319,7 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
>                 dqm->sdma_queue_count--;
>                 deallocate_sdma_queue(dqm, q->sdma_id);
>         } else {
> -               pr_debug("q->properties.type is invalid (%d)\n",
> +               pr_debug("q->properties.type %d is invalid\n",
>                                 q->properties.type);
>                 retval = -EINVAL;
>                 goto out;
> @@ -403,13 +398,13 @@ static struct mqd_manager *get_mqd_manager_nocpsch(
>
>         BUG_ON(!dqm || type >= KFD_MQD_TYPE_MAX);
>
> -       pr_debug("kfd: In func %s mqd type %d\n", __func__, type);
> +       pr_debug("mqd type %d\n", type);
>
>         mqd = dqm->mqds[type];
>         if (!mqd) {
>                 mqd = mqd_manager_init(type, dqm->dev);
>                 if (mqd == NULL)
> -                       pr_err("kfd: mqd manager is NULL");
> +                       pr_err("mqd manager is NULL");
>                 dqm->mqds[type] = mqd;
>         }
>
> @@ -424,8 +419,6 @@ static int register_process_nocpsch(struct device_queue_manager *dqm,
>
>         BUG_ON(!dqm || !qpd);
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         n = kzalloc(sizeof(struct device_process_node), GFP_KERNEL);
>         if (!n)
>                 return -ENOMEM;
> @@ -452,8 +445,6 @@ static int unregister_process_nocpsch(struct device_queue_manager *dqm,
>
>         BUG_ON(!dqm || !qpd);
>
> -       pr_debug("In func %s\n", __func__);
> -
>         pr_debug("qpd->queues_list is %s\n",
>                         list_empty(&qpd->queues_list) ? "empty" : "not empty");
>
> @@ -501,25 +492,13 @@ static void init_interrupts(struct device_queue_manager *dqm)
>                         dqm->dev->kfd2kgd->init_interrupts(dqm->dev->kgd, i);
>  }
>
> -static int init_scheduler(struct device_queue_manager *dqm)
> -{
> -       int retval = 0;
> -
> -       BUG_ON(!dqm);
> -
> -       pr_debug("kfd: In %s\n", __func__);
> -
> -       return retval;
> -}
> -
>  static int initialize_nocpsch(struct device_queue_manager *dqm)
>  {
>         int pipe, queue;
>
>         BUG_ON(!dqm);
>
> -       pr_debug("kfd: In func %s num of pipes: %d\n",
> -                       __func__, get_pipes_per_mec(dqm));
> +       pr_debug("num of pipes: %d\n", get_pipes_per_mec(dqm));
>
>         mutex_init(&dqm->lock);
>         INIT_LIST_HEAD(&dqm->queues);
> @@ -544,7 +523,6 @@ static int initialize_nocpsch(struct device_queue_manager *dqm)
>         dqm->vmid_bitmap = (1 << VMID_PER_DEVICE) - 1;
>         dqm->sdma_bitmap = (1 << CIK_SDMA_QUEUES) - 1;
>
> -       init_scheduler(dqm);
>         return 0;
>  }
>
> @@ -617,9 +595,9 @@ static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
>         q->properties.sdma_queue_id = q->sdma_id % CIK_SDMA_QUEUES_PER_ENGINE;
>         q->properties.sdma_engine_id = q->sdma_id / CIK_SDMA_ENGINE_NUM;
>
> -       pr_debug("kfd: sdma id is:    %d\n", q->sdma_id);
> -       pr_debug("     sdma queue id: %d\n", q->properties.sdma_queue_id);
> -       pr_debug("     sdma engine id: %d\n", q->properties.sdma_engine_id);
> +       pr_debug("SDMA id is:    %d\n", q->sdma_id);
> +       pr_debug("SDMA queue id: %d\n", q->properties.sdma_queue_id);
> +       pr_debug("SDMA engine id: %d\n", q->properties.sdma_engine_id);
>
>         dqm->ops_asic_specific.init_sdma_vm(dqm, q, qpd);
>         retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
> @@ -651,8 +629,6 @@ static int set_sched_resources(struct device_queue_manager *dqm)
>
>         BUG_ON(!dqm);
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         res.vmid_mask = (1 << VMID_PER_DEVICE) - 1;
>         res.vmid_mask <<= KFD_VMID_START_OFFSET;
>
> @@ -682,9 +658,9 @@ static int set_sched_resources(struct device_queue_manager *dqm)
>         res.gws_mask = res.oac_mask = res.gds_heap_base =
>                                                 res.gds_heap_size = 0;
>
> -       pr_debug("kfd: scheduling resources:\n"
> -                       "      vmid mask: 0x%8X\n"
> -                       "      queue mask: 0x%8llX\n",
> +       pr_debug("Scheduling resources:\n"
> +                       "vmid mask: 0x%8X\n"
> +                       "queue mask: 0x%8llX\n",
>                         res.vmid_mask, res.queue_mask);
>
>         return pm_send_set_resources(&dqm->packets, &res);
> @@ -696,8 +672,7 @@ static int initialize_cpsch(struct device_queue_manager *dqm)
>
>         BUG_ON(!dqm);
>
> -       pr_debug("kfd: In func %s num of pipes: %d\n",
> -                       __func__, get_pipes_per_mec(dqm));
> +       pr_debug("num of pipes: %d\n", get_pipes_per_mec(dqm));
>
>         mutex_init(&dqm->lock);
>         INIT_LIST_HEAD(&dqm->queues);
> @@ -732,7 +707,7 @@ static int start_cpsch(struct device_queue_manager *dqm)
>         if (retval != 0)
>                 goto fail_set_sched_resources;
>
> -       pr_debug("kfd: allocating fence memory\n");
> +       pr_debug("Allocating fence memory\n");
>
>         /* allocate fence memory on the gart */
>         retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(*dqm->fence_addr),
> @@ -786,11 +761,9 @@ static int create_kernel_queue_cpsch(struct device_queue_manager *dqm,
>  {
>         BUG_ON(!dqm || !kq || !qpd);
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         mutex_lock(&dqm->lock);
>         if (dqm->total_queue_count >= max_num_of_queues_per_device) {
> -               pr_warn("amdkfd: Can't create new kernel queue because %d queues were already created\n",
> +               pr_warn("Can't create new kernel queue because %d queues were already created\n",
>                                 dqm->total_queue_count);
>                 mutex_unlock(&dqm->lock);
>                 return -EPERM;
> @@ -819,8 +792,6 @@ static void destroy_kernel_queue_cpsch(struct device_queue_manager *dqm,
>  {
>         BUG_ON(!dqm || !kq);
>
> -       pr_debug("kfd: In %s\n", __func__);
> -
>         mutex_lock(&dqm->lock);
>         /* here we actually preempt the DIQ */
>         destroy_queues_cpsch(dqm, true, false);
> @@ -862,7 +833,7 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
>         mutex_lock(&dqm->lock);
>
>         if (dqm->total_queue_count >= max_num_of_queues_per_device) {
> -               pr_warn("amdkfd: Can't create new usermode queue because %d queues were already created\n",
> +               pr_warn("Can't create new usermode queue because %d queues were already created\n",
>                                 dqm->total_queue_count);
>                 retval = -EPERM;
>                 goto out;
> @@ -916,7 +887,7 @@ int amdkfd_fence_wait_timeout(unsigned int *fence_addr,
>
>         while (*fence_addr != fence_value) {
>                 if (time_after(jiffies, timeout)) {
> -                       pr_err("kfd: qcm fence wait loop timeout expired\n");
> +                       pr_err("qcm fence wait loop timeout expired\n");
>                         return -ETIME;
>                 }
>                 schedule();
> @@ -949,7 +920,7 @@ static int destroy_queues_cpsch(struct device_queue_manager *dqm,
>         if (!dqm->active_runlist)
>                 goto out;
>
> -       pr_debug("kfd: Before destroying queues, sdma queue count is : %u\n",
> +       pr_debug("Before destroying queues, sdma queue count is : %u\n",
>                 dqm->sdma_queue_count);
>
>         if (dqm->sdma_queue_count > 0) {
> @@ -998,7 +969,7 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm, bool lock)
>
>         retval = destroy_queues_cpsch(dqm, false, false);
>         if (retval != 0) {
> -               pr_err("kfd: the cp might be in an unrecoverable state due to an unsuccessful queues preemption");
> +               pr_err("The cp might be in an unrecoverable state due to an unsuccessful queues preemption");
>                 goto out;
>         }
>
> @@ -1014,7 +985,7 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm, bool lock)
>
>         retval = pm_send_runlist(&dqm->packets, &dqm->queues);
>         if (retval != 0) {
> -               pr_err("kfd: failed to execute runlist");
> +               pr_err("failed to execute runlist");
>                 goto out;
>         }
>         dqm->active_runlist = true;
> @@ -1106,8 +1077,6 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
>  {
>         bool retval;
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         mutex_lock(&dqm->lock);
>
>         if (alternate_aperture_size == 0) {
> @@ -1152,7 +1121,7 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
>         if ((sched_policy == KFD_SCHED_POLICY_NO_HWS) && (qpd->vmid != 0))
>                 program_sh_mem_settings(dqm, qpd);
>
> -       pr_debug("kfd: sh_mem_config: 0x%x, ape1_base: 0x%x, ape1_limit: 0x%x\n",
> +       pr_debug("sh_mem_config: 0x%x, ape1_base: 0x%x, ape1_limit: 0x%x\n",
>                 qpd->sh_mem_config, qpd->sh_mem_ape1_base,
>                 qpd->sh_mem_ape1_limit);
>
> @@ -1170,7 +1139,7 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
>
>         BUG_ON(!dev);
>
> -       pr_debug("kfd: loading device queue manager\n");
> +       pr_debug("Loading device queue manager\n");
>
>         dqm = kzalloc(sizeof(struct device_queue_manager), GFP_KERNEL);
>         if (!dqm)
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
> index 48dc056..a263e2a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
> @@ -127,7 +127,7 @@ static int register_process_cik(struct device_queue_manager *dqm,
>                 qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
>         }
>
> -       pr_debug("kfd: is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
> +       pr_debug("is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
>                 qpd->pqm->process->is_32bit_user_mode, temp, qpd->sh_mem_bases);
>
>         return 0;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
> index 7e9cae9..8c45c86 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
> @@ -139,7 +139,7 @@ static int register_process_vi(struct device_queue_manager *dqm,
>                         SH_MEM_CONFIG__ADDRESS_MODE__SHIFT;
>         }
>
> -       pr_debug("kfd: is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
> +       pr_debug("is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
>                 qpd->pqm->process->is_32bit_user_mode, temp, qpd->sh_mem_bases);
>
>         return 0;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> index 453c5d6..ca21538 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> @@ -97,23 +97,23 @@ void kfd_doorbell_init(struct kfd_dev *kfd)
>
>         BUG_ON(!kfd->doorbell_kernel_ptr);
>
> -       pr_debug("kfd: doorbell initialization:\n");
> -       pr_debug("kfd: doorbell base           == 0x%08lX\n",
> +       pr_debug("Doorbell initialization:\n");
> +       pr_debug("doorbell base           == 0x%08lX\n",
>                         (uintptr_t)kfd->doorbell_base);
>
> -       pr_debug("kfd: doorbell_id_offset      == 0x%08lX\n",
> +       pr_debug("doorbell_id_offset      == 0x%08lX\n",
>                         kfd->doorbell_id_offset);
>
> -       pr_debug("kfd: doorbell_process_limit  == 0x%08lX\n",
> +       pr_debug("doorbell_process_limit  == 0x%08lX\n",
>                         doorbell_process_limit);
>
> -       pr_debug("kfd: doorbell_kernel_offset  == 0x%08lX\n",
> +       pr_debug("doorbell_kernel_offset  == 0x%08lX\n",
>                         (uintptr_t)kfd->doorbell_base);
>
> -       pr_debug("kfd: doorbell aperture size  == 0x%08lX\n",
> +       pr_debug("doorbell aperture size  == 0x%08lX\n",
>                         kfd->shared_resources.doorbell_aperture_size);
>
> -       pr_debug("kfd: doorbell kernel address == 0x%08lX\n",
> +       pr_debug("doorbell kernel address == 0x%08lX\n",
>                         (uintptr_t)kfd->doorbell_kernel_ptr);
>  }
>
> @@ -142,12 +142,11 @@ int kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma)
>
>         vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
>
> -       pr_debug("kfd: mapping doorbell page in %s\n"
> +       pr_debug("Mapping doorbell page\n"
>                  "     target user address == 0x%08llX\n"
>                  "     physical address    == 0x%08llX\n"
>                  "     vm_flags            == 0x%04lX\n"
>                  "     size                == 0x%04lX\n",
> -                __func__,
>                  (unsigned long long) vma->vm_start, address, vma->vm_flags,
>                  doorbell_process_allocation());
>
> @@ -185,7 +184,7 @@ u32 __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
>         *doorbell_off = KERNEL_DOORBELL_PASID * (doorbell_process_allocation() /
>                                                         sizeof(u32)) + inx;
>
> -       pr_debug("kfd: get kernel queue doorbell\n"
> +       pr_debug("Get kernel queue doorbell\n"
>                          "     doorbell offset   == 0x%08X\n"
>                          "     kernel address    == 0x%08lX\n",
>                 *doorbell_off, (uintptr_t)(kfd->doorbell_kernel_ptr + inx));
> @@ -210,7 +209,7 @@ inline void write_kernel_doorbell(u32 __iomem *db, u32 value)
>  {
>         if (db) {
>                 writel(value, db);
> -               pr_debug("writing %d to doorbell address 0x%p\n", value, db);
> +               pr_debug("Writing %d to doorbell address 0x%p\n", value, db);
>         }
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> index d8b9b3c..abdaf95 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> @@ -110,7 +110,7 @@ static bool allocate_free_slot(struct kfd_process *process,
>                         *out_page = page;
>                         *out_slot_index = slot;
>
> -                       pr_debug("allocated event signal slot in page %p, slot %d\n",
> +                       pr_debug("Allocated event signal slot in page %p, slot %d\n",
>                                         page, slot);
>
>                         return true;
> @@ -155,9 +155,9 @@ static bool allocate_signal_page(struct file *devkfd, struct kfd_process *p)
>                                                    struct signal_page,
>                                                    event_pages)->page_index + 1;
>
> -       pr_debug("allocated new event signal page at %p, for process %p\n",
> +       pr_debug("Allocated new event signal page at %p, for process %p\n",
>                         page, p);
> -       pr_debug("page index is %d\n", page->page_index);
> +       pr_debug("Page index is %d\n", page->page_index);
>
>         list_add(&page->event_pages, &p->signal_event_pages);
>
> @@ -292,13 +292,13 @@ static int create_signal_event(struct file *devkfd,
>                                 struct kfd_event *ev)
>  {
>         if (p->signal_event_count == KFD_SIGNAL_EVENT_LIMIT) {
> -               pr_warn("amdkfd: Signal event wasn't created because limit was reached\n");
> +               pr_warn("Signal event wasn't created because limit was reached\n");
>                 return -ENOMEM;
>         }
>
>         if (!allocate_event_notification_slot(devkfd, p, &ev->signal_page,
>                                                 &ev->signal_slot_index)) {
> -               pr_warn("amdkfd: Signal event wasn't created because out of kernel memory\n");
> +               pr_warn("Signal event wasn't created because out of kernel memory\n");
>                 return -ENOMEM;
>         }
>
> @@ -310,11 +310,7 @@ static int create_signal_event(struct file *devkfd,
>         ev->event_id = make_signal_event_id(ev->signal_page,
>                                                 ev->signal_slot_index);
>
> -       pr_debug("signal event number %zu created with id %d, address %p\n",
> -                       p->signal_event_count, ev->event_id,
> -                       ev->user_signal_address);
> -
> -       pr_debug("signal event number %zu created with id %d, address %p\n",
> +       pr_debug("Signal event number %zu created with id %d, address %p\n",
>                         p->signal_event_count, ev->event_id,
>                         ev->user_signal_address);
>
> @@ -817,7 +813,7 @@ int kfd_event_mmap(struct kfd_process *p, struct vm_area_struct *vma)
>         /* check required size is logical */
>         if (get_order(KFD_SIGNAL_EVENT_LIMIT * 8) !=
>                         get_order(vma->vm_end - vma->vm_start)) {
> -               pr_err("amdkfd: event page mmap requested illegal size\n");
> +               pr_err("Event page mmap requested illegal size\n");
>                 return -EINVAL;
>         }
>
> @@ -826,7 +822,7 @@ int kfd_event_mmap(struct kfd_process *p, struct vm_area_struct *vma)
>         page = lookup_signal_page_by_index(p, page_index);
>         if (!page) {
>                 /* Probably KFD bug, but mmap is user-accessible. */
> -               pr_debug("signal page could not be found for page_index %u\n",
> +               pr_debug("Signal page could not be found for page_index %u\n",
>                                 page_index);
>                 return -EINVAL;
>         }
> @@ -837,7 +833,7 @@ int kfd_event_mmap(struct kfd_process *p, struct vm_area_struct *vma)
>         vma->vm_flags |= VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE
>                        | VM_DONTDUMP | VM_PFNMAP;
>
> -       pr_debug("mapping signal page\n");
> +       pr_debug("Mapping signal page\n");
>         pr_debug("     start user address  == 0x%08lx\n", vma->vm_start);
>         pr_debug("     end user address    == 0x%08lx\n", vma->vm_end);
>         pr_debug("     pfn                 == 0x%016lX\n", pfn);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> index d135cd0..f89d366 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> @@ -44,8 +44,8 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
>         BUG_ON(!kq || !dev);
>         BUG_ON(type != KFD_QUEUE_TYPE_DIQ && type != KFD_QUEUE_TYPE_HIQ);
>
> -       pr_debug("amdkfd: In func %s initializing queue type %d size %d\n",
> -                       __func__, KFD_QUEUE_TYPE_HIQ, queue_size);
> +       pr_debug("Initializing queue type %d size %d\n", KFD_QUEUE_TYPE_HIQ,
> +                       queue_size);
>
>         memset(&prop, 0, sizeof(prop));
>         memset(&nop, 0, sizeof(nop));
> @@ -73,13 +73,13 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
>         prop.doorbell_ptr = kfd_get_kernel_doorbell(dev, &prop.doorbell_off);
>
>         if (prop.doorbell_ptr == NULL) {
> -               pr_err("amdkfd: error init doorbell");
> +               pr_err("Failed to initialize doorbell");
>                 goto err_get_kernel_doorbell;
>         }
>
>         retval = kfd_gtt_sa_allocate(dev, queue_size, &kq->pq);
>         if (retval != 0) {
> -               pr_err("amdkfd: error init pq queues size (%d)\n", queue_size);
> +               pr_err("Failed to init pq queues size %d\n", queue_size);
>                 goto err_pq_allocate_vidmem;
>         }
>
> @@ -139,7 +139,7 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
>
>         /* assign HIQ to HQD */
>         if (type == KFD_QUEUE_TYPE_HIQ) {
> -               pr_debug("assigning hiq to hqd\n");
> +               pr_debug("Assigning hiq to hqd\n");
>                 kq->queue->pipe = KFD_CIK_HIQ_PIPE;
>                 kq->queue->queue = KFD_CIK_HIQ_QUEUE;
>                 kq->mqd->load_mqd(kq->mqd, kq->queue->mqd, kq->queue->pipe,
> @@ -304,7 +304,7 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
>         }
>
>         if (!kq->ops.initialize(kq, dev, type, KFD_KERNEL_QUEUE_SIZE)) {
> -               pr_err("amdkfd: failed to init kernel queue\n");
> +               pr_err("Failed to init kernel queue\n");
>                 kfree(kq);
>                 return NULL;
>         }
> @@ -327,7 +327,7 @@ static __attribute__((unused)) void test_kq(struct kfd_dev *dev)
>
>         BUG_ON(!dev);
>
> -       pr_err("amdkfd: starting kernel queue test\n");
> +       pr_err("Starting kernel queue test\n");
>
>         kq = kernel_queue_init(dev, KFD_QUEUE_TYPE_HIQ);
>         BUG_ON(!kq);
> @@ -338,7 +338,7 @@ static __attribute__((unused)) void test_kq(struct kfd_dev *dev)
>                 buffer[i] = kq->nop_packet;
>         kq->ops.submit_packet(kq);
>
> -       pr_err("amdkfd: ending kernel queue test\n");
> +       pr_err("Ending kernel queue test\n");
>  }
>
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> index af5bfc1..819a442 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> @@ -91,7 +91,7 @@ static int __init kfd_module_init(void)
>         /* Verify module parameters */
>         if ((sched_policy < KFD_SCHED_POLICY_HWS) ||
>                 (sched_policy > KFD_SCHED_POLICY_NO_HWS)) {
> -               pr_err("kfd: sched_policy has invalid value\n");
> +               pr_err("sched_policy has invalid value\n");
>                 return -1;
>         }
>
> @@ -99,7 +99,7 @@ static int __init kfd_module_init(void)
>         if ((max_num_of_queues_per_device < 1) ||
>                 (max_num_of_queues_per_device >
>                         KFD_MAX_NUM_OF_QUEUES_PER_DEVICE)) {
> -               pr_err("kfd: max_num_of_queues_per_device must be between 1 to KFD_MAX_NUM_OF_QUEUES_PER_DEVICE\n");
> +               pr_err("max_num_of_queues_per_device must be between 1 to KFD_MAX_NUM_OF_QUEUES_PER_DEVICE\n");
>                 return -1;
>         }
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> index ac59229..27fd930 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> @@ -46,8 +46,6 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
>
>         BUG_ON(!mm || !q || !mqd);
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         retval = kfd_gtt_sa_allocate(mm->dev, sizeof(struct cik_mqd),
>                                         mqd_mem_obj);
>
> @@ -172,8 +170,6 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
>
>         BUG_ON(!mm || !q || !mqd);
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         m = get_mqd(mqd);
>         m->cp_hqd_pq_control = DEFAULT_RPTR_BLOCK_SIZE |
>                                 DEFAULT_MIN_AVAIL_SIZE | PQ_ATC_EN;
> @@ -302,8 +298,6 @@ static int init_mqd_hiq(struct mqd_manager *mm, void **mqd,
>
>         BUG_ON(!mm || !q || !mqd || !mqd_mem_obj);
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         retval = kfd_gtt_sa_allocate(mm->dev, sizeof(struct cik_mqd),
>                                         mqd_mem_obj);
>
> @@ -360,8 +354,6 @@ static int update_mqd_hiq(struct mqd_manager *mm, void *mqd,
>
>         BUG_ON(!mm || !q || !mqd);
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         m = get_mqd(mqd);
>         m->cp_hqd_pq_control = DEFAULT_RPTR_BLOCK_SIZE |
>                                 DEFAULT_MIN_AVAIL_SIZE |
> @@ -414,8 +406,6 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
>         BUG_ON(!dev);
>         BUG_ON(type >= KFD_MQD_TYPE_MAX);
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         mqd = kzalloc(sizeof(struct mqd_manager), GFP_KERNEL);
>         if (!mqd)
>                 return NULL;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> index a9b9882..5dc30f5 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> @@ -108,8 +108,6 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
>
>         BUG_ON(!mm || !q || !mqd);
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         m = get_mqd(mqd);
>
>         m->cp_hqd_pq_control = 5 << CP_HQD_PQ_CONTROL__RPTR_BLOCK_SIZE__SHIFT |
> @@ -117,7 +115,7 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
>                         mtype << CP_HQD_PQ_CONTROL__MTYPE__SHIFT;
>         m->cp_hqd_pq_control |=
>                         ffs(q->queue_size / sizeof(unsigned int)) - 1 - 1;
> -       pr_debug("kfd: cp_hqd_pq_control 0x%x\n", m->cp_hqd_pq_control);
> +       pr_debug("cp_hqd_pq_control 0x%x\n", m->cp_hqd_pq_control);
>
>         m->cp_hqd_pq_base_lo = lower_32_bits((uint64_t)q->queue_address >> 8);
>         m->cp_hqd_pq_base_hi = upper_32_bits((uint64_t)q->queue_address >> 8);
> @@ -129,7 +127,7 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
>                 1 << CP_HQD_PQ_DOORBELL_CONTROL__DOORBELL_EN__SHIFT |
>                 q->doorbell_off <<
>                         CP_HQD_PQ_DOORBELL_CONTROL__DOORBELL_OFFSET__SHIFT;
> -       pr_debug("kfd: cp_hqd_pq_doorbell_control 0x%x\n",
> +       pr_debug("cp_hqd_pq_doorbell_control 0x%x\n",
>                         m->cp_hqd_pq_doorbell_control);
>
>         m->cp_hqd_eop_control = atc_bit << CP_HQD_EOP_CONTROL__EOP_ATC__SHIFT |
> @@ -241,8 +239,6 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
>         BUG_ON(!dev);
>         BUG_ON(type >= KFD_MQD_TYPE_MAX);
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         mqd = kzalloc(sizeof(struct mqd_manager), GFP_KERNEL);
>         if (!mqd)
>                 return NULL;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> index 99c11a4..31d7d46 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> @@ -67,7 +67,7 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>         *over_subscription = false;
>         if ((process_count > 1) || queue_count > get_queues_num(pm->dqm)) {
>                 *over_subscription = true;
> -               pr_debug("kfd: over subscribed runlist\n");
> +               pr_debug("Over subscribed runlist\n");
>         }
>
>         map_queue_size =
> @@ -85,7 +85,7 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>         if (*over_subscription)
>                 *rlib_size += sizeof(struct pm4_runlist);
>
> -       pr_debug("kfd: runlist ib size %d\n", *rlib_size);
> +       pr_debug("runlist ib size %d\n", *rlib_size);
>  }
>
>  static int pm_allocate_runlist_ib(struct packet_manager *pm,
> @@ -106,7 +106,7 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
>                                         &pm->ib_buffer_obj);
>
>         if (retval != 0) {
> -               pr_err("kfd: failed to allocate runlist IB\n");
> +               pr_err("Failed to allocate runlist IB\n");
>                 return retval;
>         }
>
> @@ -152,8 +152,6 @@ static int pm_create_map_process(struct packet_manager *pm, uint32_t *buffer,
>
>         packet = (struct pm4_map_process *)buffer;
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         memset(buffer, 0, sizeof(struct pm4_map_process));
>
>         packet->header.u32all = build_pm4_header(IT_MAP_PROCESS,
> @@ -189,8 +187,6 @@ static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
>
>         BUG_ON(!pm || !buffer || !q);
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         packet = (struct pm4_mes_map_queues *)buffer;
>         memset(buffer, 0, sizeof(struct pm4_map_queues));
>
> @@ -223,8 +219,7 @@ static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
>                 use_static = false; /* no static queues under SDMA */
>                 break;
>         default:
> -               pr_err("kfd: in %s queue type %d\n", __func__,
> -                               q->properties.type);
> +               pr_err("queue type %d\n", q->properties.type);
>                 BUG();
>                 break;
>         }
> @@ -254,8 +249,6 @@ static int pm_create_map_queue(struct packet_manager *pm, uint32_t *buffer,
>
>         BUG_ON(!pm || !buffer || !q);
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         packet = (struct pm4_map_queues *)buffer;
>         memset(buffer, 0, sizeof(struct pm4_map_queues));
>
> @@ -333,8 +326,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>
>         *rl_size_bytes = alloc_size_bytes;
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -       pr_debug("kfd: building runlist ib process count: %d queues count %d\n",
> +       pr_debug("Building runlist ib process count: %d queues count %d\n",
>                 pm->dqm->processes_count, pm->dqm->queue_count);
>
>         /* build the run list ib packet */
> @@ -342,7 +334,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                 qpd = cur->qpd;
>                 /* build map process packet */
>                 if (proccesses_mapped >= pm->dqm->processes_count) {
> -                       pr_debug("kfd: not enough space left in runlist IB\n");
> +                       pr_debug("Not enough space left in runlist IB\n");
>                         pm_release_ib(pm);
>                         return -ENOMEM;
>                 }
> @@ -359,7 +351,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                         if (!kq->queue->properties.is_active)
>                                 continue;
>
> -                       pr_debug("kfd: static_queue, mapping kernel q %d, is debug status %d\n",
> +                       pr_debug("static_queue, mapping kernel q %d, is debug status %d\n",
>                                 kq->queue->queue, qpd->is_debug);
>
>                         if (pm->dqm->dev->device_info->asic_family ==
> @@ -385,7 +377,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                         if (!q->properties.is_active)
>                                 continue;
>
> -                       pr_debug("kfd: static_queue, mapping user queue %d, is debug status %d\n",
> +                       pr_debug("static_queue, mapping user queue %d, is debug status %d\n",
>                                 q->queue, qpd->is_debug);
>
>                         if (pm->dqm->dev->device_info->asic_family ==
> @@ -409,7 +401,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                 }
>         }
>
> -       pr_debug("kfd: finished map process and queues to runlist\n");
> +       pr_debug("Finished map process and queues to runlist\n");
>
>         if (is_over_subscription)
>                 pm_create_runlist(pm, &rl_buffer[rl_wptr], *rl_gpu_addr,
> @@ -453,15 +445,13 @@ int pm_send_set_resources(struct packet_manager *pm,
>
>         BUG_ON(!pm || !res);
>
> -       pr_debug("kfd: In func %s\n", __func__);
> -
>         mutex_lock(&pm->lock);
>         pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
>                                         sizeof(*packet) / sizeof(uint32_t),
>                                         (unsigned int **)&packet);
>         if (packet == NULL) {
>                 mutex_unlock(&pm->lock);
> -               pr_err("kfd: failed to allocate buffer on kernel queue\n");
> +               pr_err("Failed to allocate buffer on kernel queue\n");
>                 return -ENOMEM;
>         }
>
> @@ -504,7 +494,7 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
>         if (retval != 0)
>                 goto fail_create_runlist_ib;
>
> -       pr_debug("kfd: runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
> +       pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
>
>         packet_size_dwords = sizeof(struct pm4_runlist) / sizeof(uint32_t);
>         mutex_lock(&pm->lock);
> @@ -595,7 +585,7 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>
>         packet = (struct pm4_unmap_queues *)buffer;
>         memset(buffer, 0, sizeof(struct pm4_unmap_queues));
> -       pr_debug("kfd: static_queue: unmapping queues: mode is %d , reset is %d , type is %d\n",
> +       pr_debug("static_queue: unmapping queues: mode is %d , reset is %d , type is %d\n",
>                 mode, reset, type);
>         packet->header.u32all = build_pm4_header(IT_UNMAP_QUEUES,
>                                         sizeof(struct pm4_unmap_queues));
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index a4e4a2d..86032bd 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -101,7 +101,7 @@ struct kfd_process *kfd_create_process(const struct task_struct *thread)
>         /* A prior open of /dev/kfd could have already created the process. */
>         process = find_process(thread);
>         if (process)
> -               pr_debug("kfd: process already found\n");
> +               pr_debug("Process already found\n");
>
>         if (!process)
>                 process = create_process(thread);
> @@ -250,7 +250,7 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
>                         kfd_dbgmgr_destroy(pdd->dev->dbgmgr);
>
>                 if (pdd->reset_wavefronts) {
> -                       pr_warn("amdkfd: Resetting all wave fronts\n");
> +                       pr_warn("Resetting all wave fronts\n");
>                         dbgdev_wave_reset_wavefronts(pdd->dev, p);
>                         pdd->reset_wavefronts = false;
>                 }
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> index 32cdf2b..9482a5a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> @@ -51,15 +51,13 @@ static int find_available_queue_slot(struct process_queue_manager *pqm,
>
>         BUG_ON(!pqm || !qid);
>
> -       pr_debug("kfd: in %s\n", __func__);
> -
>         found = find_first_zero_bit(pqm->queue_slot_bitmap,
>                         KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
>
> -       pr_debug("kfd: the new slot id %lu\n", found);
> +       pr_debug("The new slot id %lu\n", found);
>
>         if (found >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS) {
> -               pr_info("amdkfd: Can not open more queues for process with pasid %d\n",
> +               pr_info("Cannot open more queues for process with pasid %d\n",
>                                 pqm->process->pasid);
>                 return -ENOMEM;
>         }
> @@ -92,8 +90,6 @@ void pqm_uninit(struct process_queue_manager *pqm)
>
>         BUG_ON(!pqm);
>
> -       pr_debug("In func %s\n", __func__);
> -
>         list_for_each_entry_safe(pqn, next, &pqm->queues, process_queue_list) {
>                 retval = pqm_destroy_queue(
>                                 pqm,
> @@ -102,7 +98,7 @@ void pqm_uninit(struct process_queue_manager *pqm)
>                                         pqn->kq->queue->properties.queue_id);
>
>                 if (retval != 0) {
> -                       pr_err("kfd: failed to destroy queue\n");
> +                       pr_err("failed to destroy queue\n");
>                         return;
>                 }
>         }
> @@ -136,7 +132,7 @@ static int create_cp_queue(struct process_queue_manager *pqm,
>         (*q)->device = dev;
>         (*q)->process = pqm->process;
>
> -       pr_debug("kfd: PQM After init queue");
> +       pr_debug("PQM After init queue");
>
>         return retval;
>
> @@ -210,7 +206,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
>                 if ((sched_policy == KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION) &&
>                 ((dev->dqm->processes_count >= VMID_PER_DEVICE) ||
>                 (dev->dqm->queue_count >= get_queues_num(dev->dqm)))) {
> -                       pr_err("kfd: over-subscription is not allowed in radeon_kfd.sched_policy == 1\n");
> +                       pr_err("Over-subscription is not allowed in radeon_kfd.sched_policy == 1\n");
>                         retval = -EPERM;
>                         goto err_create_queue;
>                 }
> @@ -243,17 +239,17 @@ int pqm_create_queue(struct process_queue_manager *pqm,
>         }
>
>         if (retval != 0) {
> -               pr_debug("Error dqm create queue\n");
> +               pr_err("DQM create queue failed\n");
>                 goto err_create_queue;
>         }
>
> -       pr_debug("kfd: PQM After DQM create queue\n");
> +       pr_debug("PQM After DQM create queue\n");
>
>         list_add(&pqn->process_queue_list, &pqm->queues);
>
>         if (q) {
>                 *properties = q->properties;
> -               pr_debug("kfd: PQM done creating queue\n");
> +               pr_debug("PQM done creating queue\n");
>                 print_queue_properties(properties);
>         }
>
> @@ -282,11 +278,9 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
>         BUG_ON(!pqm);
>         retval = 0;
>
> -       pr_debug("kfd: In Func %s\n", __func__);
> -
>         pqn = get_queue_by_qid(pqm, qid);
>         if (pqn == NULL) {
> -               pr_err("kfd: queue id does not match any known queue\n");
> +               pr_err("Queue id does not match any known queue\n");
>                 return -EINVAL;
>         }
>
> @@ -339,8 +333,7 @@ int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid,
>
>         pqn = get_queue_by_qid(pqm, qid);
>         if (!pqn) {
> -               pr_debug("amdkfd: No queue %d exists for update operation\n",
> -                               qid);
> +               pr_debug("No queue %d exists for update operation\n", qid);
>                 return -EFAULT;
>         }
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 0200dae..72d566a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -666,7 +666,7 @@ static ssize_t node_show(struct kobject *kobj, struct attribute *attr,
>                         dev->node_props.simd_count);
>
>         if (dev->mem_bank_count < dev->node_props.mem_banks_count) {
> -               pr_info_once("kfd: mem_banks_count truncated from %d to %d\n",
> +               pr_info_once("mem_banks_count truncated from %d to %d\n",
>                                 dev->node_props.mem_banks_count,
>                                 dev->mem_bank_count);
>                 sysfs_show_32bit_prop(buffer, "mem_banks_count",
> @@ -1147,7 +1147,7 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
>
>         gpu_id = kfd_generate_gpu_id(gpu);
>
> -       pr_debug("kfd: Adding new GPU (ID: 0x%x) to topology\n", gpu_id);
> +       pr_debug("Adding new GPU (ID: 0x%x) to topology\n", gpu_id);
>
>         down_write(&topology_lock);
>         /*
> @@ -1190,7 +1190,7 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
>
>         if (dev->gpu->device_info->asic_family == CHIP_CARRIZO) {
>                 dev->node_props.capability |= HSA_CAP_DOORBELL_PACKET_TYPE;
> -               pr_info("amdkfd: adding doorbell packet type capability\n");
> +               pr_info("Adding doorbell packet type capability\n");
>         }
>
>         res = 0;
> --
> 2.7.4
>
This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
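
Dropping the hard-coded "kfd: " / "amdkfd: " prefixes from every message, as the diff above does, is usually backed by a subsystem-wide pr_fmt define so dmesg output keeps its prefix without each call site spelling it out. The hunk that adds it is not part of this excerpt; the lines below are only a minimal sketch of the conventional pattern, with the placement assumed:

    /* Assumed sketch, not a quote from the patch: conventionally the first
     * line of each source file (or a private header pulled in before any
     * kernel headers), so the pr_*() helpers in <linux/printk.h> expand
     * their format strings with this prefix.
     */
    #define pr_fmt(fmt) "kfd: " fmt

    #include <linux/printk.h>

    /* pr_err("Failed to init kernel queue\n") then still reads
     * "kfd: Failed to init kernel queue" in the log.
     */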


* Re: [PATCH 07/19] drm/amdkfd: Change x==NULL/false references to !x
       [not found]     ` <1502488589-30272-8-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 13:07       ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 13:07 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Kent Russell, amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Kent Russell <kent.russell@amd.com>
>
> Upstream prefers the !x notation to x==NULL or x==false. Along those
> lines, change the ==true and !=NULL references as well. Also make the
> existing !x references consistent, dropping the parentheses around them
> for readability.
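
Purely as an illustration of the forms the commit message describes (not a quote from the patch; the identifiers are taken from the hunks further down), a minimal before/after:

    /* before: explicit comparisons and redundant parentheses */
    if (dev == NULL)
            return -EINVAL;
    if (retval != 0)
            goto err;
    if ((!q->properties.is_active) && (prev_active))
            dqm->queue_count--;

    /* after: plain truth-value tests, no parentheses around !x */
    if (!dev)
            return -EINVAL;
    if (retval)
            goto err;
    if (!q->properties.is_active && prev_active)
            dqm->queue_count--;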
>
> Signed-off-by: Kent Russell <kent.russell@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           | 22 +++++-----
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            | 20 ++++-----
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  4 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 10 ++---
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 50 +++++++++++-----------
>  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  6 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c       |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |  6 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  4 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 26 +++++------
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  6 +--
>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  6 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  6 +--
>  15 files changed, 85 insertions(+), 87 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 6244958..c22401e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -265,7 +265,7 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
>
>         pr_debug("Looking for gpu id 0x%x\n", args->gpu_id);
>         dev = kfd_device_by_id(args->gpu_id);
> -       if (dev == NULL) {
> +       if (!dev) {
>                 pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
>                 return -EINVAL;
>         }
> @@ -400,7 +400,7 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
>         }
>
>         dev = kfd_device_by_id(args->gpu_id);
> -       if (dev == NULL)
> +       if (!dev)
>                 return -EINVAL;
>
>         mutex_lock(&p->mutex);
> @@ -443,7 +443,7 @@ static int kfd_ioctl_dbg_register(struct file *filep,
>         long status = 0;
>
>         dev = kfd_device_by_id(args->gpu_id);
> -       if (dev == NULL)
> +       if (!dev)
>                 return -EINVAL;
>
>         if (dev->device_info->asic_family == CHIP_CARRIZO) {
> @@ -465,7 +465,7 @@ static int kfd_ioctl_dbg_register(struct file *filep,
>                 return PTR_ERR(pdd);
>         }
>
> -       if (dev->dbgmgr == NULL) {
> +       if (!dev->dbgmgr) {
>                 /* In case of a legal call, we have no dbgmgr yet */
>                 create_ok = kfd_dbgmgr_create(&dbgmgr_ptr, dev);
>                 if (create_ok) {
> @@ -494,7 +494,7 @@ static int kfd_ioctl_dbg_unregister(struct file *filep,
>         long status;
>
>         dev = kfd_device_by_id(args->gpu_id);
> -       if (dev == NULL)
> +       if (!dev)
>                 return -EINVAL;
>
>         if (dev->device_info->asic_family == CHIP_CARRIZO) {
> @@ -505,7 +505,7 @@ static int kfd_ioctl_dbg_unregister(struct file *filep,
>         mutex_lock(kfd_get_dbgmgr_mutex());
>
>         status = kfd_dbgmgr_unregister(dev->dbgmgr, p);
> -       if (status == 0) {
> +       if (!status) {
>                 kfd_dbgmgr_destroy(dev->dbgmgr);
>                 dev->dbgmgr = NULL;
>         }
> @@ -539,7 +539,7 @@ static int kfd_ioctl_dbg_address_watch(struct file *filep,
>         memset((void *) &aw_info, 0, sizeof(struct dbg_address_watch_info));
>
>         dev = kfd_device_by_id(args->gpu_id);
> -       if (dev == NULL)
> +       if (!dev)
>                 return -EINVAL;
>
>         if (dev->device_info->asic_family == CHIP_CARRIZO) {
> @@ -646,7 +646,7 @@ static int kfd_ioctl_dbg_wave_control(struct file *filep,
>                                 sizeof(wac_info.trapId);
>
>         dev = kfd_device_by_id(args->gpu_id);
> -       if (dev == NULL)
> +       if (!dev)
>                 return -EINVAL;
>
>         if (dev->device_info->asic_family == CHIP_CARRIZO) {
> @@ -782,9 +782,9 @@ static int kfd_ioctl_get_process_apertures(struct file *filp,
>                                 "scratch_limit %llX\n", pdd->scratch_limit);
>
>                         args->num_of_nodes++;
> -               } while ((pdd = kfd_get_next_process_device_data(p, pdd)) !=
> -                               NULL &&
> -                               (args->num_of_nodes < NUM_OF_SUPPORTED_GPUS));
> +
> +                       pdd = kfd_get_next_process_device_data(p, pdd);
> +               } while (pdd && (args->num_of_nodes < NUM_OF_SUPPORTED_GPUS));
>         }
>
>         mutex_unlock(&p->mutex);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> index bf8ee19..0ef9136 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> @@ -77,7 +77,7 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
>         status = kq->ops.acquire_packet_buffer(kq,
>                                 pq_packets_size_in_bytes / sizeof(uint32_t),
>                                 &ib_packet_buff);
> -       if (status != 0) {
> +       if (status) {
>                 pr_err("acquire_packet_buffer failed\n");
>                 return status;
>         }
> @@ -115,7 +115,7 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
>         status = kfd_gtt_sa_allocate(dbgdev->dev, sizeof(uint64_t),
>                                         &mem_obj);
>
> -       if (status != 0) {
> +       if (status) {
>                 pr_err("Failed to allocate GART memory\n");
>                 kq->ops.rollback_packet(kq);
>                 return status;
> @@ -202,7 +202,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
>
>         kq = pqm_get_kernel_queue(dbgdev->pqm, qid);
>
> -       if (kq == NULL) {
> +       if (!kq) {
>                 pr_err("Error getting DIQ\n");
>                 pqm_destroy_queue(dbgdev->pqm, qid);
>                 return -EFAULT;
> @@ -252,7 +252,7 @@ static void dbgdev_address_watch_set_registers(
>         addrLo->u32All = 0;
>         cntl->u32All = 0;
>
> -       if (adw_info->watch_mask != NULL)
> +       if (adw_info->watch_mask)
>                 cntl->bitfields.mask =
>                         (uint32_t) (adw_info->watch_mask[index] &
>                                         ADDRESS_WATCH_REG_CNTL_DEFAULT_MASK);
> @@ -307,8 +307,7 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
>                 return -EINVAL;
>         }
>
> -       if ((adw_info->watch_mode == NULL) ||
> -               (adw_info->watch_address == NULL)) {
> +       if (!adw_info->watch_mode || !adw_info->watch_address) {
>                 pr_err("adw_info fields are not valid\n");
>                 return -EINVAL;
>         }
> @@ -375,15 +374,14 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>                 return -EINVAL;
>         }
>
> -       if ((NULL == adw_info->watch_mode) ||
> -                       (NULL == adw_info->watch_address)) {
> +       if (!adw_info->watch_mode || !adw_info->watch_address) {
>                 pr_err("adw_info fields are not valid\n");
>                 return -EINVAL;
>         }
>
>         status = kfd_gtt_sa_allocate(dbgdev->dev, ib_size, &mem_obj);
>
> -       if (status != 0) {
> +       if (status) {
>                 pr_err("Failed to allocate GART memory\n");
>                 return status;
>         }
> @@ -490,7 +488,7 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>                                         packet_buff_uint,
>                                         ib_size);
>
> -               if (status != 0) {
> +               if (status) {
>                         pr_err("Failed to submit IB to DIQ\n");
>                         break;
>                 }
> @@ -711,7 +709,7 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
>                         packet_buff_uint,
>                         ib_size);
>
> -       if (status != 0)
> +       if (status)
>                 pr_err("Failed to submit IB to DIQ\n");
>
>         kfd_gtt_sa_free(dbgdev->dev, mem_obj);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> index 7225789..210bdc1 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> @@ -55,7 +55,7 @@ static void kfd_dbgmgr_uninitialize(struct kfd_dbgmgr *pmgr)
>
>  void kfd_dbgmgr_destroy(struct kfd_dbgmgr *pmgr)
>  {
> -       if (pmgr != NULL) {
> +       if (pmgr) {
>                 kfd_dbgmgr_uninitialize(pmgr);
>                 kfree(pmgr);
>         }
> @@ -66,7 +66,7 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
>         enum DBGDEV_TYPE type = DBGDEV_TYPE_DIQ;
>         struct kfd_dbgmgr *new_buff;
>
> -       BUG_ON(pdev == NULL);
> +       BUG_ON(!pdev);
>         BUG_ON(!pdev->init_complete);
>
>         new_buff = kfd_alloc_struct(new_buff);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 87df8bf..d962342 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -98,7 +98,7 @@ static const struct kfd_device_info *lookup_device_info(unsigned short did)
>
>         for (i = 0; i < ARRAY_SIZE(supported_devices); i++) {
>                 if (supported_devices[i].did == did) {
> -                       BUG_ON(supported_devices[i].device_info == NULL);
> +                       BUG_ON(!supported_devices[i].device_info);
>                         return supported_devices[i].device_info;
>                 }
>         }
> @@ -212,7 +212,7 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
>                         flags);
>
>         dev = kfd_device_by_pci_dev(pdev);
> -       BUG_ON(dev == NULL);
> +       BUG_ON(!dev);
>
>         kfd_signal_iommu_event(dev, pasid, address,
>                         flags & PPR_FAULT_WRITE, flags & PPR_FAULT_EXEC);
> @@ -262,7 +262,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>
>         kfd_doorbell_init(kfd);
>
> -       if (kfd_topology_add_device(kfd) != 0) {
> +       if (kfd_topology_add_device(kfd)) {
>                 dev_err(kfd_device, "Error adding device to topology\n");
>                 goto kfd_topology_add_device_error;
>         }
> @@ -288,7 +288,7 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>                 goto device_queue_manager_error;
>         }
>
> -       if (kfd->dqm->ops.start(kfd->dqm) != 0) {
> +       if (kfd->dqm->ops.start(kfd->dqm)) {
>                 dev_err(kfd_device,
>                         "Error starting queue manager for device %x:%x\n",
>                         kfd->pdev->vendor, kfd->pdev->device);
> @@ -341,7 +341,7 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
>
>  void kgd2kfd_suspend(struct kfd_dev *kfd)
>  {
> -       BUG_ON(kfd == NULL);
> +       BUG_ON(!kfd);
>
>         if (kfd->init_complete) {
>                 kfd->dqm->ops.stop(kfd->dqm);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 8b147e4..df93531 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -167,7 +167,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
>
>         if (list_empty(&qpd->queues_list)) {
>                 retval = allocate_vmid(dqm, qpd, q);
> -               if (retval != 0) {
> +               if (retval) {
>                         mutex_unlock(&dqm->lock);
>                         return retval;
>                 }
> @@ -180,7 +180,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
>         if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
>                 retval = create_sdma_queue_nocpsch(dqm, q, qpd);
>
> -       if (retval != 0) {
> +       if (retval) {
>                 if (list_empty(&qpd->queues_list)) {
>                         deallocate_vmid(dqm, qpd, q);
>                         *allocated_vmid = 0;
> @@ -262,16 +262,16 @@ static int create_compute_queue_nocpsch(struct device_queue_manager *dqm,
>         BUG_ON(!dqm || !q || !qpd);
>
>         mqd = dqm->ops.get_mqd_manager(dqm, KFD_MQD_TYPE_COMPUTE);
> -       if (mqd == NULL)
> +       if (!mqd)
>                 return -ENOMEM;
>
>         retval = allocate_hqd(dqm, q);
> -       if (retval != 0)
> +       if (retval)
>                 return retval;
>
>         retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
>                                 &q->gart_mqd_addr, &q->properties);
> -       if (retval != 0) {
> +       if (retval) {
>                 deallocate_hqd(dqm, q);
>                 return retval;
>         }
> @@ -281,7 +281,7 @@ static int create_compute_queue_nocpsch(struct device_queue_manager *dqm,
>
>         retval = mqd->load_mqd(mqd, q->mqd, q->pipe,
>                         q->queue, (uint32_t __user *) q->properties.write_ptr);
> -       if (retval != 0) {
> +       if (retval) {
>                 deallocate_hqd(dqm, q);
>                 mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
>                 return retval;
> @@ -330,7 +330,7 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
>                                 QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS,
>                                 q->pipe, q->queue);
>
> -       if (retval != 0)
> +       if (retval)
>                 goto out;
>
>         mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
> @@ -365,7 +365,7 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
>         mutex_lock(&dqm->lock);
>         mqd = dqm->ops.get_mqd_manager(dqm,
>                         get_mqd_type_from_queue_type(q->properties.type));
> -       if (mqd == NULL) {
> +       if (!mqd) {
>                 mutex_unlock(&dqm->lock);
>                 return -ENOMEM;
>         }
> @@ -381,7 +381,7 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
>         retval = mqd->update_mqd(mqd, q->mqd, &q->properties);
>         if ((q->properties.is_active) && (!prev_active))
>                 dqm->queue_count++;
> -       else if ((!q->properties.is_active) && (prev_active))
> +       else if (!q->properties.is_active && prev_active)
>                 dqm->queue_count--;
>
>         if (sched_policy != KFD_SCHED_POLICY_NO_HWS)
> @@ -403,7 +403,7 @@ static struct mqd_manager *get_mqd_manager_nocpsch(
>         mqd = dqm->mqds[type];
>         if (!mqd) {
>                 mqd = mqd_manager_init(type, dqm->dev);
> -               if (mqd == NULL)
> +               if (!mqd)
>                         pr_err("mqd manager is NULL");
>                 dqm->mqds[type] = mqd;
>         }
> @@ -485,7 +485,7 @@ static void init_interrupts(struct device_queue_manager *dqm)
>  {
>         unsigned int i;
>
> -       BUG_ON(dqm == NULL);
> +       BUG_ON(!dqm);
>
>         for (i = 0 ; i < get_pipes_per_mec(dqm) ; i++)
>                 if (is_pipe_enabled(dqm, 0, i))
> @@ -589,7 +589,7 @@ static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
>                 return -ENOMEM;
>
>         retval = allocate_sdma_queue(dqm, &q->sdma_id);
> -       if (retval != 0)
> +       if (retval)
>                 return retval;
>
>         q->properties.sdma_queue_id = q->sdma_id % CIK_SDMA_QUEUES_PER_ENGINE;
> @@ -602,14 +602,14 @@ static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
>         dqm->ops_asic_specific.init_sdma_vm(dqm, q, qpd);
>         retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
>                                 &q->gart_mqd_addr, &q->properties);
> -       if (retval != 0) {
> +       if (retval) {
>                 deallocate_sdma_queue(dqm, q->sdma_id);
>                 return retval;
>         }
>
>         retval = mqd->load_mqd(mqd, q->mqd, 0,
>                                 0, NULL);
> -       if (retval != 0) {
> +       if (retval) {
>                 deallocate_sdma_queue(dqm, q->sdma_id);
>                 mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
>                 return retval;
> @@ -680,7 +680,7 @@ static int initialize_cpsch(struct device_queue_manager *dqm)
>         dqm->sdma_queue_count = 0;
>         dqm->active_runlist = false;
>         retval = dqm->ops_asic_specific.initialize(dqm);
> -       if (retval != 0)
> +       if (retval)
>                 goto fail_init_pipelines;
>
>         return 0;
> @@ -700,11 +700,11 @@ static int start_cpsch(struct device_queue_manager *dqm)
>         retval = 0;
>
>         retval = pm_init(&dqm->packets, dqm);
> -       if (retval != 0)
> +       if (retval)
>                 goto fail_packet_manager_init;
>
>         retval = set_sched_resources(dqm);
> -       if (retval != 0)
> +       if (retval)
>                 goto fail_set_sched_resources;
>
>         pr_debug("Allocating fence memory\n");
> @@ -713,7 +713,7 @@ static int start_cpsch(struct device_queue_manager *dqm)
>         retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(*dqm->fence_addr),
>                                         &dqm->fence_mem);
>
> -       if (retval != 0)
> +       if (retval)
>                 goto fail_allocate_vidmem;
>
>         dqm->fence_addr = dqm->fence_mem->cpu_ptr;
> @@ -845,7 +845,7 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
>         mqd = dqm->ops.get_mqd_manager(dqm,
>                         get_mqd_type_from_queue_type(q->properties.type));
>
> -       if (mqd == NULL) {
> +       if (!mqd) {
>                 mutex_unlock(&dqm->lock);
>                 return -ENOMEM;
>         }
> @@ -853,7 +853,7 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
>         dqm->ops_asic_specific.init_sdma_vm(dqm, q, qpd);
>         retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
>                                 &q->gart_mqd_addr, &q->properties);
> -       if (retval != 0)
> +       if (retval)
>                 goto out;
>
>         list_add(&q->list, &qpd->queues_list);
> @@ -934,7 +934,7 @@ static int destroy_queues_cpsch(struct device_queue_manager *dqm,
>
>         retval = pm_send_unmap_queue(&dqm->packets, KFD_QUEUE_TYPE_COMPUTE,
>                         preempt_type, 0, false, 0);
> -       if (retval != 0)
> +       if (retval)
>                 goto out;
>
>         *dqm->fence_addr = KFD_FENCE_INIT;
> @@ -943,7 +943,7 @@ static int destroy_queues_cpsch(struct device_queue_manager *dqm,
>         /* should be timed out */
>         retval = amdkfd_fence_wait_timeout(dqm->fence_addr, KFD_FENCE_COMPLETED,
>                                 QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS);
> -       if (retval != 0) {
> +       if (retval) {
>                 pdd = kfd_get_process_device_data(dqm->dev,
>                                 kfd_get_process(current));
>                 pdd->reset_wavefronts = true;
> @@ -968,7 +968,7 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm, bool lock)
>                 mutex_lock(&dqm->lock);
>
>         retval = destroy_queues_cpsch(dqm, false, false);
> -       if (retval != 0) {
> +       if (retval) {
>                 pr_err("The cp might be in an unrecoverable state due to an unsuccessful queues preemption");
>                 goto out;
>         }
> @@ -984,7 +984,7 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm, bool lock)
>         }
>
>         retval = pm_send_runlist(&dqm->packets, &dqm->queues);
> -       if (retval != 0) {
> +       if (retval) {
>                 pr_err("failed to execute runlist");
>                 goto out;
>         }
> @@ -1193,7 +1193,7 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
>                 break;
>         }
>
> -       if (dqm->ops.initialize(dqm) != 0) {
> +       if (dqm->ops.initialize(dqm)) {
>                 kfree(dqm);
>                 return NULL;
>         }
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> index ca21538..48018a3 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> @@ -131,7 +131,7 @@ int kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma)
>
>         /* Find kfd device according to gpu id */
>         dev = kfd_device_by_id(vma->vm_pgoff);
> -       if (dev == NULL)
> +       if (!dev)
>                 return -EINVAL;
>
>         /* Calculate physical address of doorbell */
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> index abdaf95..5979158 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> @@ -247,7 +247,7 @@ static u32 make_nonsignal_event_id(struct kfd_process *p)
>
>         for (id = p->next_nonsignal_event_id;
>                 id < KFD_LAST_NONSIGNAL_EVENT_ID &&
> -               lookup_event_by_id(p, id) != NULL;
> +               lookup_event_by_id(p, id);
>                 id++)
>                 ;
>
> @@ -266,7 +266,7 @@ static u32 make_nonsignal_event_id(struct kfd_process *p)
>
>         for (id = KFD_FIRST_NONSIGNAL_EVENT_ID;
>                 id < KFD_LAST_NONSIGNAL_EVENT_ID &&
> -               lookup_event_by_id(p, id) != NULL;
> +               lookup_event_by_id(p, id);
>                 id++)
>                 ;
>
> @@ -342,7 +342,7 @@ void kfd_event_init_process(struct kfd_process *p)
>
>  static void destroy_event(struct kfd_process *p, struct kfd_event *ev)
>  {
> -       if (ev->signal_page != NULL) {
> +       if (ev->signal_page) {
>                 release_event_notification_slot(ev->signal_page,
>                                                 ev->signal_slot_index);
>                 p->signal_event_count--;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
> index 2b65510..c59384b 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
> @@ -304,7 +304,7 @@ int kfd_init_apertures(struct kfd_process *process)
>                 id < NUM_OF_SUPPORTED_GPUS) {
>
>                 pdd = kfd_create_process_device_data(dev, process);
> -               if (pdd == NULL) {
> +               if (!pdd) {
>                         pr_err("Failed to create process device data\n");
>                         return -1;
>                 }
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> index f89d366..8844798 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> @@ -67,12 +67,12 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
>                 break;
>         }
>
> -       if (kq->mqd == NULL)
> +       if (!kq->mqd)
>                 return false;
>
>         prop.doorbell_ptr = kfd_get_kernel_doorbell(dev, &prop.doorbell_off);
>
> -       if (prop.doorbell_ptr == NULL) {
> +       if (!prop.doorbell_ptr) {
>                 pr_err("Failed to initialize doorbell");
>                 goto err_get_kernel_doorbell;
>         }
> @@ -87,7 +87,7 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
>         kq->pq_gpu_addr = kq->pq->gpu_addr;
>
>         retval = kq->ops_asic_specific.initialize(kq, dev, type, queue_size);
> -       if (retval == false)
> +       if (!retval)
>                 goto err_eop_allocate_vidmem;
>
>         retval = kfd_gtt_sa_allocate(dev, sizeof(*kq->rptr_kernel),
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> index 27fd930..9908227 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> @@ -99,7 +99,7 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
>                 m->cp_hqd_iq_rptr = AQL_ENABLE;
>
>         *mqd = m;
> -       if (gart_addr != NULL)
> +       if (gart_addr)
>                 *gart_addr = addr;
>         retval = mm->update_mqd(mm, m, q);
>
> @@ -127,7 +127,7 @@ static int init_mqd_sdma(struct mqd_manager *mm, void **mqd,
>         memset(m, 0, sizeof(struct cik_sdma_rlc_registers));
>
>         *mqd = m;
> -       if (gart_addr != NULL)
> +       if (gart_addr)
>                 *gart_addr = (*mqd_mem_obj)->gpu_addr;
>
>         retval = mm->update_mqd(mm, m, q);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> index 5dc30f5..5ba3b40 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> @@ -85,7 +85,7 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
>                 m->cp_hqd_iq_rptr = 1;
>
>         *mqd = m;
> -       if (gart_addr != NULL)
> +       if (gart_addr)
>                 *gart_addr = addr;
>         retval = mm->update_mqd(mm, m, q);
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> index 31d7d46..f3b8cc8 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> @@ -98,14 +98,14 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
>
>         BUG_ON(!pm);
>         BUG_ON(pm->allocated);
> -       BUG_ON(is_over_subscription == NULL);
> +       BUG_ON(!is_over_subscription);
>
>         pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
>
>         retval = kfd_gtt_sa_allocate(pm->dqm->dev, *rl_buffer_size,
>                                         &pm->ib_buffer_obj);
>
> -       if (retval != 0) {
> +       if (retval) {
>                 pr_err("Failed to allocate runlist IB\n");
>                 return retval;
>         }
> @@ -321,7 +321,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>
>         retval = pm_allocate_runlist_ib(pm, &rl_buffer, rl_gpu_addr,
>                                 &alloc_size_bytes, &is_over_subscription);
> -       if (retval != 0)
> +       if (retval)
>                 return retval;
>
>         *rl_size_bytes = alloc_size_bytes;
> @@ -340,7 +340,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                 }
>
>                 retval = pm_create_map_process(pm, &rl_buffer[rl_wptr], qpd);
> -               if (retval != 0)
> +               if (retval)
>                         return retval;
>
>                 proccesses_mapped++;
> @@ -365,7 +365,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                                                 &rl_buffer[rl_wptr],
>                                                 kq->queue,
>                                                 qpd->is_debug);
> -                       if (retval != 0)
> +                       if (retval)
>                                 return retval;
>
>                         inc_wptr(&rl_wptr,
> @@ -392,7 +392,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                                                 q,
>                                                 qpd->is_debug);
>
> -                       if (retval != 0)
> +                       if (retval)
>                                 return retval;
>
>                         inc_wptr(&rl_wptr,
> @@ -421,7 +421,7 @@ int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm)
>         pm->dqm = dqm;
>         mutex_init(&pm->lock);
>         pm->priv_queue = kernel_queue_init(dqm->dev, KFD_QUEUE_TYPE_HIQ);
> -       if (pm->priv_queue == NULL) {
> +       if (!pm->priv_queue) {
>                 mutex_destroy(&pm->lock);
>                 return -ENOMEM;
>         }
> @@ -449,7 +449,7 @@ int pm_send_set_resources(struct packet_manager *pm,
>         pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
>                                         sizeof(*packet) / sizeof(uint32_t),
>                                         (unsigned int **)&packet);
> -       if (packet == NULL) {
> +       if (!packet) {
>                 mutex_unlock(&pm->lock);
>                 pr_err("Failed to allocate buffer on kernel queue\n");
>                 return -ENOMEM;
> @@ -491,7 +491,7 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
>
>         retval = pm_create_runlist_ib(pm, dqm_queues, &rl_gpu_ib_addr,
>                                         &rl_ib_size);
> -       if (retval != 0)
> +       if (retval)
>                 goto fail_create_runlist_ib;
>
>         pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
> @@ -501,12 +501,12 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
>
>         retval = pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
>                                         packet_size_dwords, &rl_buffer);
> -       if (retval != 0)
> +       if (retval)
>                 goto fail_acquire_packet_buffer;
>
>         retval = pm_create_runlist(pm, rl_buffer, rl_gpu_ib_addr,
>                                         rl_ib_size / sizeof(uint32_t), false);
> -       if (retval != 0)
> +       if (retval)
>                 goto fail_create_runlist;
>
>         pm->priv_queue->ops.submit_packet(pm->priv_queue);
> @@ -537,7 +537,7 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
>                         pm->priv_queue,
>                         sizeof(struct pm4_query_status) / sizeof(uint32_t),
>                         (unsigned int **)&packet);
> -       if (retval != 0)
> +       if (retval)
>                 goto fail_acquire_packet_buffer;
>
>         packet->header.u32all = build_pm4_header(IT_QUERY_STATUS,
> @@ -580,7 +580,7 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>                         pm->priv_queue,
>                         sizeof(struct pm4_unmap_queues) / sizeof(uint32_t),
>                         &buffer);
> -       if (retval != 0)
> +       if (retval)
>                 goto err_acquire_packet_buffer;
>
>         packet = (struct pm4_unmap_queues *)buffer;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 86032bd..d877cda 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -81,7 +81,7 @@ struct kfd_process *kfd_create_process(const struct task_struct *thread)
>
>         BUG_ON(!kfd_process_wq);
>
> -       if (thread->mm == NULL)
> +       if (!thread->mm)
>                 return ERR_PTR(-EINVAL);
>
>         /* Only the pthreads threading model is supported. */
> @@ -117,7 +117,7 @@ struct kfd_process *kfd_get_process(const struct task_struct *thread)
>  {
>         struct kfd_process *process;
>
> -       if (thread->mm == NULL)
> +       if (!thread->mm)
>                 return ERR_PTR(-EINVAL);
>
>         /* Only the pthreads threading model is supported. */
> @@ -407,7 +407,7 @@ void kfd_unbind_process_from_device(struct kfd_dev *dev, unsigned int pasid)
>         struct kfd_process *p;
>         struct kfd_process_device *pdd;
>
> -       BUG_ON(dev == NULL);
> +       BUG_ON(!dev);
>
>         /*
>          * Look for the process that matches the pasid. If there is no such
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> index 9482a5a..d4f8bae 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> @@ -76,7 +76,7 @@ int pqm_init(struct process_queue_manager *pqm, struct kfd_process *p)
>         pqm->queue_slot_bitmap =
>                         kzalloc(DIV_ROUND_UP(KFD_MAX_NUM_OF_QUEUES_PER_PROCESS,
>                                         BITS_PER_BYTE), GFP_KERNEL);
> -       if (pqm->queue_slot_bitmap == NULL)
> +       if (!pqm->queue_slot_bitmap)
>                 return -ENOMEM;
>         pqm->process = p;
>
> @@ -223,7 +223,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
>                 break;
>         case KFD_QUEUE_TYPE_DIQ:
>                 kq = kernel_queue_init(dev, KFD_QUEUE_TYPE_DIQ);
> -               if (kq == NULL) {
> +               if (!kq) {
>                         retval = -ENOMEM;
>                         goto err_create_queue;
>                 }
> @@ -279,7 +279,7 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
>         retval = 0;
>
>         pqn = get_queue_by_qid(pqm, qid);
> -       if (pqn == NULL) {
> +       if (!pqn) {
>                 pr_err("Queue id does not match any known queue\n");
>                 return -EINVAL;
>         }
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 72d566a..113c1ce 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -416,7 +416,7 @@ static struct kfd_topology_device *kfd_create_topology_device(void)
>         struct kfd_topology_device *dev;
>
>         dev = kfd_alloc_struct(dev);
> -       if (dev == NULL) {
> +       if (!dev) {
>                 pr_err("No memory to allocate a topology device");
>                 return NULL;
>         }
> @@ -957,7 +957,7 @@ static int kfd_topology_update_sysfs(void)
>         int ret;
>
>         pr_info("Creating topology SYSFS entries\n");
> -       if (sys_props.kobj_topology == NULL) {
> +       if (!sys_props.kobj_topology) {
>                 sys_props.kobj_topology =
>                                 kfd_alloc_struct(sys_props.kobj_topology);
>                 if (!sys_props.kobj_topology)
> @@ -1120,7 +1120,7 @@ static struct kfd_topology_device *kfd_assign_gpu(struct kfd_dev *gpu)
>         BUG_ON(!gpu);
>
>         list_for_each_entry(dev, &topology_device_list, list)
> -               if (dev->gpu == NULL && dev->node_props.simd_count > 0) {
> +               if (!dev->gpu && (dev->node_props.simd_count > 0)) {
>                         dev->gpu = gpu;
>                         out_dev = dev;
>                         break;
> --
> 2.7.4
>

This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 08/19] drm/amdkfd: Fix goto usage
       [not found]     ` <1502488589-30272-9-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 13:21       ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 13:21 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Kent Russell, amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Kent Russell <kent.russell@amd.com>
>
> Remove gotos that do not feature any common cleanup, and use gotos
> instead of repeating cleanup commands.
>
> According to kernel.org: "The goto statement comes in handy when a
> function exits from multiple locations and some common work such as
> cleanup has to be done. If there is no cleanup needed then just return
> directly."
>
> Signed-off-by: Kent Russell <kent.russell@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |  15 +--
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 102 ++++++++++-----------
>  drivers/gpu/drm/amd/amdkfd/kfd_module.c            |   3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    |  14 +--
>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  14 +--
>  5 files changed, 65 insertions(+), 83 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index c22401e..7d78119 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -460,9 +460,8 @@ static int kfd_ioctl_dbg_register(struct file *filep,
>          */
>         pdd = kfd_bind_process_to_device(dev, p);
>         if (IS_ERR(pdd)) {
> -               mutex_unlock(kfd_get_dbgmgr_mutex());
> -               mutex_unlock(&p->mutex);
> -               return PTR_ERR(pdd);
> +               status = PTR_ERR(pdd);
> +               goto out;
>         }
>
>         if (!dev->dbgmgr) {
> @@ -480,6 +479,7 @@ static int kfd_ioctl_dbg_register(struct file *filep,
>                 status = -EINVAL;
>         }
>
> +out:
>         mutex_unlock(kfd_get_dbgmgr_mutex());
>         mutex_unlock(&p->mutex);
>
> @@ -580,8 +580,8 @@ static int kfd_ioctl_dbg_address_watch(struct file *filep,
>         args_idx += sizeof(aw_info.watch_address) * aw_info.num_watch_points;
>
>         if (args_idx >= args->buf_size_in_bytes - sizeof(*args)) {
> -               kfree(args_buff);
> -               return -EINVAL;
> +               status = -EINVAL;
> +               goto out;
>         }
>
>         watch_mask_value = (uint64_t) args_buff[args_idx];
> @@ -604,8 +604,8 @@ static int kfd_ioctl_dbg_address_watch(struct file *filep,
>         }
>
>         if (args_idx >= args->buf_size_in_bytes - sizeof(args)) {
> -               kfree(args_buff);
> -               return -EINVAL;
> +               status = -EINVAL;
> +               goto out;
>         }
>
>         /* Currently HSA Event is not supported for DBG */
> @@ -617,6 +617,7 @@ static int kfd_ioctl_dbg_address_watch(struct file *filep,
>
>         mutex_unlock(kfd_get_dbgmgr_mutex());
>
> +out:
>         kfree(args_buff);
>
>         return status;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index df93531..2003a7e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -150,7 +150,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
>                                 struct qcm_process_device *qpd,
>                                 int *allocated_vmid)
>  {
> -       int retval;
> +       int retval = 0;
This is redundant. retval will *always* be initialized by the call to
create_*_queue_nocpsch() later in the function.

>
>         BUG_ON(!dqm || !q || !qpd || !allocated_vmid);
>
> @@ -161,23 +161,21 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
>         if (dqm->total_queue_count >= max_num_of_queues_per_device) {
>                 pr_warn("Can't create new usermode queue because %d queues were already created\n",
>                                 dqm->total_queue_count);
> -               mutex_unlock(&dqm->lock);
> -               return -EPERM;
> +               retval = -EPERM;
> +               goto out_unlock;
>         }
>
>         if (list_empty(&qpd->queues_list)) {
>                 retval = allocate_vmid(dqm, qpd, q);
> -               if (retval) {
> -                       mutex_unlock(&dqm->lock);
> -                       return retval;
> -               }
> +               if (retval)
> +                       goto out_unlock;
>         }
>         *allocated_vmid = qpd->vmid;
>         q->properties.vmid = qpd->vmid;
>
>         if (q->properties.type == KFD_QUEUE_TYPE_COMPUTE)
>                 retval = create_compute_queue_nocpsch(dqm, q, qpd);
> -       if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
> +       else if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
>                 retval = create_sdma_queue_nocpsch(dqm, q, qpd);
Just for completeness (although it should never occur), I think we should add:

else
      retval = -EINVAL;
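
Taking the two comments above together, a minimal sketch of what the
queue-type dispatch in create_queue_nocpsch() could look like (illustration
only, not the actual patch; the vmid cleanup from the hunk above is elided
into a comment):

        int retval;     /* the "= 0" initializer is no longer needed */

        ...

        if (q->properties.type == KFD_QUEUE_TYPE_COMPUTE)
                retval = create_compute_queue_nocpsch(dqm, q, qpd);
        else if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
                retval = create_sdma_queue_nocpsch(dqm, q, qpd);
        else
                retval = -EINVAL;   /* should never happen, but keeps retval defined */

        if (retval) {
                /* vmid cleanup as in the hunk above */
                goto out_unlock;
        }

With the explicit else, every path assigns retval before it is tested, so
the declaration can stay uninitialized and the error path still flows
through the single out_unlock label.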

>
>         if (retval) {
> @@ -185,8 +183,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
>                         deallocate_vmid(dqm, qpd, q);
>                         *allocated_vmid = 0;
>                 }
> -               mutex_unlock(&dqm->lock);
> -               return retval;
> +               goto out_unlock;
>         }
>
>         list_add(&q->list, &qpd->queues_list);
> @@ -204,8 +201,9 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
>         pr_debug("Total of %d queues are accountable so far\n",
>                         dqm->total_queue_count);
>
> +out_unlock:
>         mutex_unlock(&dqm->lock);
> -       return 0;
> +       return retval;
>  }
>
>  static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q)
> @@ -271,23 +269,25 @@ static int create_compute_queue_nocpsch(struct device_queue_manager *dqm,
>
>         retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
>                                 &q->gart_mqd_addr, &q->properties);
> -       if (retval) {
> -               deallocate_hqd(dqm, q);
> -               return retval;
> -       }
> +       if (retval)
> +               goto out_deallocate_hqd;
>
>         pr_debug("Loading mqd to hqd on pipe %d, queue %d\n",
>                         q->pipe, q->queue);
>
>         retval = mqd->load_mqd(mqd, q->mqd, q->pipe,
>                         q->queue, (uint32_t __user *) q->properties.write_ptr);
> -       if (retval) {
> -               deallocate_hqd(dqm, q);
> -               mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
> -               return retval;
> -       }
> +       if (retval)
> +               goto out_uninit_mqd;
>
>         return 0;
> +
> +out_uninit_mqd:
> +       mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
> +out_deallocate_hqd:
> +       deallocate_hqd(dqm, q);
> +
> +       return retval;
>  }
>
>  static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
> @@ -366,8 +366,8 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
>         mqd = dqm->ops.get_mqd_manager(dqm,
>                         get_mqd_type_from_queue_type(q->properties.type));
>         if (!mqd) {
> -               mutex_unlock(&dqm->lock);
> -               return -ENOMEM;
> +               retval = -ENOMEM;
> +               goto out_unlock;
>         }
>
>         if (q->properties.is_active)
> @@ -387,6 +387,7 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
>         if (sched_policy != KFD_SCHED_POLICY_NO_HWS)
>                 retval = execute_queues_cpsch(dqm, false);
>
> +out_unlock:
>         mutex_unlock(&dqm->lock);
>         return retval;
>  }
> @@ -500,16 +501,15 @@ static int initialize_nocpsch(struct device_queue_manager *dqm)
>
>         pr_debug("num of pipes: %d\n", get_pipes_per_mec(dqm));
>
> +       dqm->allocated_queues = kcalloc(get_pipes_per_mec(dqm),
> +                                       sizeof(unsigned int), GFP_KERNEL);
> +       if (!dqm->allocated_queues)
> +               return -ENOMEM;
> +
>         mutex_init(&dqm->lock);
>         INIT_LIST_HEAD(&dqm->queues);
>         dqm->queue_count = dqm->next_pipe_to_allocate = 0;
>         dqm->sdma_queue_count = 0;
> -       dqm->allocated_queues = kcalloc(get_pipes_per_mec(dqm),
> -                                       sizeof(unsigned int), GFP_KERNEL);
> -       if (!dqm->allocated_queues) {
> -               mutex_destroy(&dqm->lock);
> -               return -ENOMEM;
> -       }
>
>         for (pipe = 0; pipe < get_pipes_per_mec(dqm); pipe++) {
>                 int pipe_offset = pipe * get_queues_per_pipe(dqm);
> @@ -602,20 +602,22 @@ static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
>         dqm->ops_asic_specific.init_sdma_vm(dqm, q, qpd);
>         retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
>                                 &q->gart_mqd_addr, &q->properties);
> -       if (retval) {
> -               deallocate_sdma_queue(dqm, q->sdma_id);
> -               return retval;
> -       }
> +       if (retval)
> +               goto out_deallocate_sdma_queue;
>
>         retval = mqd->load_mqd(mqd, q->mqd, 0,
>                                 0, NULL);
> -       if (retval) {
> -               deallocate_sdma_queue(dqm, q->sdma_id);
> -               mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
> -               return retval;
> -       }
> +       if (retval)
> +               goto out_uninit_mqd;
>
>         return 0;
> +
> +out_uninit_mqd:
> +       mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
> +out_deallocate_sdma_queue:
> +       deallocate_sdma_queue(dqm, q->sdma_id);
> +
> +       return retval;
>  }
>
>  /*
> @@ -681,12 +683,8 @@ static int initialize_cpsch(struct device_queue_manager *dqm)
>         dqm->active_runlist = false;
>         retval = dqm->ops_asic_specific.initialize(dqm);
>         if (retval)
> -               goto fail_init_pipelines;
> -
> -       return 0;
> +               mutex_destroy(&dqm->lock);
>
> -fail_init_pipelines:
> -       mutex_destroy(&dqm->lock);
>         return retval;
>  }
>
> @@ -846,8 +844,8 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
>                         get_mqd_type_from_queue_type(q->properties.type));
>
>         if (!mqd) {
> -               mutex_unlock(&dqm->lock);
> -               return -ENOMEM;
> +               retval = -ENOMEM;
> +               goto out;
>         }
>
>         dqm->ops_asic_specific.init_sdma_vm(dqm, q, qpd);
> @@ -1097,14 +1095,11 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
>                 uint64_t base = (uintptr_t)alternate_aperture_base;
>                 uint64_t limit = base + alternate_aperture_size - 1;
>
> -               if (limit <= base)
> -                       goto out;
> -
> -               if ((base & APE1_FIXED_BITS_MASK) != 0)
> -                       goto out;
> -
> -               if ((limit & APE1_FIXED_BITS_MASK) != APE1_LIMIT_ALIGNMENT)
> +               if (limit <= base || (base & APE1_FIXED_BITS_MASK) != 0 ||
> +                  (limit & APE1_FIXED_BITS_MASK) != APE1_LIMIT_ALIGNMENT) {
> +                       retval = false;
>                         goto out;
> +               }
>
>                 qpd->sh_mem_ape1_base = base >> 16;
>                 qpd->sh_mem_ape1_limit = limit >> 16;
> @@ -1125,12 +1120,9 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
>                 qpd->sh_mem_config, qpd->sh_mem_ape1_base,
>                 qpd->sh_mem_ape1_limit);
>
> -       mutex_unlock(&dqm->lock);
> -       return retval;
> -
>  out:
>         mutex_unlock(&dqm->lock);
> -       return false;
> +       return retval;
>  }
>
>  struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> index 819a442..0d73bea 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> @@ -105,7 +105,7 @@ static int __init kfd_module_init(void)
>
>         err = kfd_pasid_init();
>         if (err < 0)
> -               goto err_pasid;
> +               return err;
>
>         err = kfd_chardev_init();
>         if (err < 0)
> @@ -127,7 +127,6 @@ static int __init kfd_module_init(void)
>         kfd_chardev_exit();
>  err_ioctl:
>         kfd_pasid_exit();
> -err_pasid:
>         return err;
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> index f3b8cc8..c4030b3 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> @@ -442,6 +442,7 @@ int pm_send_set_resources(struct packet_manager *pm,
>                                 struct scheduling_resources *res)
>  {
>         struct pm4_set_resources *packet;
> +       int retval = 0;
>
>         BUG_ON(!pm || !res);
>
> @@ -450,9 +451,9 @@ int pm_send_set_resources(struct packet_manager *pm,
>                                         sizeof(*packet) / sizeof(uint32_t),
>                                         (unsigned int **)&packet);
>         if (!packet) {
> -               mutex_unlock(&pm->lock);
>                 pr_err("Failed to allocate buffer on kernel queue\n");
> -               return -ENOMEM;
> +               retval = -ENOMEM;
> +               goto out;
>         }
>
>         memset(packet, 0, sizeof(struct pm4_set_resources));
> @@ -475,9 +476,10 @@ int pm_send_set_resources(struct packet_manager *pm,
>
>         pm->priv_queue->ops.submit_packet(pm->priv_queue);
>
> +out:
>         mutex_unlock(&pm->lock);
>
> -       return 0;
> +       return retval;
>  }
>
>  int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
> @@ -555,9 +557,6 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
>         packet->data_lo = lower_32_bits((uint64_t)fence_value);
>
>         pm->priv_queue->ops.submit_packet(pm->priv_queue);
> -       mutex_unlock(&pm->lock);
> -
> -       return 0;
>
>  fail_acquire_packet_buffer:
>         mutex_unlock(&pm->lock);
> @@ -639,9 +638,6 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>
>         pm->priv_queue->ops.submit_packet(pm->priv_queue);
>
> -       mutex_unlock(&pm->lock);
> -       return 0;
> -
>  err_acquire_packet_buffer:
>         mutex_unlock(&pm->lock);
>         return retval;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> index d4f8bae..8432f5f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> @@ -35,9 +35,8 @@ static inline struct process_queue_node *get_queue_by_qid(
>         BUG_ON(!pqm);
>
>         list_for_each_entry(pqn, &pqm->queues, process_queue_list) {
> -               if (pqn->q && pqn->q->properties.queue_id == qid)
> -                       return pqn;
> -               if (pqn->kq && pqn->kq->queue->properties.queue_id == qid)
> +               if ((pqn->q && pqn->q->properties.queue_id == qid) ||
> +                   (pqn->kq && pqn->kq->queue->properties.queue_id == qid))
>                         return pqn;
>         }
>
> @@ -113,8 +112,6 @@ static int create_cp_queue(struct process_queue_manager *pqm,
>  {
>         int retval;
>
> -       retval = 0;
> -
>         /* Doorbell initialized in user space*/
>         q_properties->doorbell_ptr = NULL;
>
> @@ -127,7 +124,7 @@ static int create_cp_queue(struct process_queue_manager *pqm,
>
>         retval = init_queue(q, q_properties);
>         if (retval != 0)
> -               goto err_init_queue;
> +               return retval;
>
>         (*q)->device = dev;
>         (*q)->process = pqm->process;
> @@ -135,9 +132,6 @@ static int create_cp_queue(struct process_queue_manager *pqm,
>         pr_debug("PQM After init queue");
>
>         return retval;
> -
> -err_init_queue:
> -       return retval;
>  }
>
>  int pqm_create_queue(struct process_queue_manager *pqm,
> @@ -181,7 +175,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
>                 list_for_each_entry(cur, &pdd->qpd.queues_list, list)
>                         num_queues++;
>                 if (num_queues >= dev->device_info->max_no_of_hqd/2)
> -                       return (-ENOSPC);
> +                       return -ENOSPC;
>         }
>
>         retval = find_available_queue_slot(pqm, qid);
> --
> 2.7.4
>

Excellent cleanup!
With the above comments fixed, this patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 09/19] drm/amdkfd: Remove usage of alloc(sizeof(struct...
       [not found]     ` <1502488589-30272-10-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 13:23       ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 13:23 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Kent Russell, amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Kent Russell <kent.russell@amd.com>
>
> See https://kernel.org/doc/html/latest/process/coding-style.html
> under "14) Allocating Memory" for the rationale behind removing the
> x=alloc(sizeof(struct ...)) style and using x=alloc(sizeof(*x)) instead.
>
> Signed-off-by: Kent Russell <kent.russell@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  4 ++--
>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c          |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c       |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c        |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_queue.c                 | 10 +++++-----
>  6 files changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 2003a7e..68fe6ed 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -420,7 +420,7 @@ static int register_process_nocpsch(struct device_queue_manager *dqm,
>
>         BUG_ON(!dqm || !qpd);
>
> -       n = kzalloc(sizeof(struct device_process_node), GFP_KERNEL);
> +       n = kzalloc(sizeof(*n), GFP_KERNEL);
>         if (!n)
>                 return -ENOMEM;
>
> @@ -1133,7 +1133,7 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
>
>         pr_debug("Loading device queue manager\n");
>
> -       dqm = kzalloc(sizeof(struct device_queue_manager), GFP_KERNEL);
> +       dqm = kzalloc(sizeof(*dqm), GFP_KERNEL);
>         if (!dqm)
>                 return NULL;
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> index 8844798..47e2e8a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> @@ -283,7 +283,7 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
>
>         BUG_ON(!dev);
>
> -       kq = kzalloc(sizeof(struct kernel_queue), GFP_KERNEL);
> +       kq = kzalloc(sizeof(*kq), GFP_KERNEL);
>         if (!kq)
>                 return NULL;
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> index 9908227..dca4fc7 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> @@ -406,7 +406,7 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
>         BUG_ON(!dev);
>         BUG_ON(type >= KFD_MQD_TYPE_MAX);
>
> -       mqd = kzalloc(sizeof(struct mqd_manager), GFP_KERNEL);
> +       mqd = kzalloc(sizeof(*mqd), GFP_KERNEL);
>         if (!mqd)
>                 return NULL;
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> index 5ba3b40..aaaa87a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> @@ -239,7 +239,7 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
>         BUG_ON(!dev);
>         BUG_ON(type >= KFD_MQD_TYPE_MAX);
>
> -       mqd = kzalloc(sizeof(struct mqd_manager), GFP_KERNEL);
> +       mqd = kzalloc(sizeof(*mqd), GFP_KERNEL);
>         if (!mqd)
>                 return NULL;
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> index 8432f5f..1d056a6 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> @@ -187,7 +187,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
>                 dev->dqm->ops.register_process(dev->dqm, &pdd->qpd);
>         }
>
> -       pqn = kzalloc(sizeof(struct process_queue_node), GFP_KERNEL);
> +       pqn = kzalloc(sizeof(*pqn), GFP_KERNEL);
>         if (!pqn) {
>                 retval = -ENOMEM;
>                 goto err_allocate_pqn;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
> index 0ab1970..5ad9f6f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
> @@ -65,17 +65,17 @@ void print_queue(struct queue *q)
>
>  int init_queue(struct queue **q, const struct queue_properties *properties)
>  {
> -       struct queue *tmp;
> +       struct queue *tmp_q;
>
>         BUG_ON(!q);
>
> -       tmp = kzalloc(sizeof(struct queue), GFP_KERNEL);
> -       if (!tmp)
> +       tmp_q = kzalloc(sizeof(*tmp_q), GFP_KERNEL);
> +       if (!tmp_q)
>                 return -ENOMEM;
>
> -       memcpy(&tmp->properties, properties, sizeof(struct queue_properties));
> +       memcpy(&tmp_q->properties, properties, sizeof(*properties));
>
> -       *q = tmp;
> +       *q = tmp_q;
>         return 0;
>  }
>
> --
> 2.7.4
>

This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 10/19] drm/amdkfd: Remove BUG_ONs for NULL pointer arguments
       [not found]     ` <1502488589-30272-11-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 14:19       ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 14:19 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Remove BUG_ONs that check for NULL pointer arguments that are
> dereferenced in the same function. Dereferencing the NULL pointer
> will generate a BUG anyway, so the explicit check is redundant and
> unnecessary overhead.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            | 26 +-----------
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            | 12 ------
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 10 -----
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 48 +---------------------
>  .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  2 -
>  .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  2 -
>  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          |  4 --
>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      | 16 --------
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   | 17 --------
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  4 --
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 28 +------------
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  2 -
>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 15 -------
>  drivers/gpu/drm/amd/amdkfd/kfd_queue.c             |  2 -
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          | 28 -------------
>  15 files changed, 4 insertions(+), 212 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> index 0ef9136..3841cad 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> @@ -42,8 +42,6 @@
>
>  static void dbgdev_address_watch_disable_nodiq(struct kfd_dev *dev)
>  {
> -       BUG_ON(!dev || !dev->kfd2kgd);
> -
>         dev->kfd2kgd->address_watch_disable(dev->kgd);
>  }
>
> @@ -62,7 +60,7 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
>         unsigned int *ib_packet_buff;
>         int status;
>
> -       BUG_ON(!dbgdev || !dbgdev->kq || !packet_buff || !size_in_bytes);
> +       BUG_ON(!size_in_bytes);
>
>         kq = dbgdev->kq;
>
> @@ -168,8 +166,6 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
>
>  static int dbgdev_register_nodiq(struct kfd_dbgdev *dbgdev)
>  {
> -       BUG_ON(!dbgdev);
> -
>         /*
>          * no action is needed in this case,
>          * just make sure diq will not be used
> @@ -187,8 +183,6 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
>         struct kernel_queue *kq = NULL;
>         int status;
>
> -       BUG_ON(!dbgdev || !dbgdev->pqm || !dbgdev->dev);
> -
>         status = pqm_create_queue(dbgdev->pqm, dbgdev->dev, NULL,
>                                 &properties, 0, KFD_QUEUE_TYPE_DIQ,
>                                 &qid);
> @@ -215,8 +209,6 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
>
>  static int dbgdev_unregister_nodiq(struct kfd_dbgdev *dbgdev)
>  {
> -       BUG_ON(!dbgdev || !dbgdev->dev);
> -
>         /* disable watch address */
>         dbgdev_address_watch_disable_nodiq(dbgdev->dev);
>         return 0;
> @@ -227,8 +219,6 @@ static int dbgdev_unregister_diq(struct kfd_dbgdev *dbgdev)
>         /* todo - disable address watch */
>         int status;
>
> -       BUG_ON(!dbgdev || !dbgdev->pqm || !dbgdev->kq);
> -
>         status = pqm_destroy_queue(dbgdev->pqm,
>                         dbgdev->kq->queue->properties.queue_id);
>         dbgdev->kq = NULL;
> @@ -245,8 +235,6 @@ static void dbgdev_address_watch_set_registers(
>  {
>         union ULARGE_INTEGER addr;
>
> -       BUG_ON(!adw_info || !addrHi || !addrLo || !cntl);
> -
>         addr.quad_part = 0;
>         addrHi->u32All = 0;
>         addrLo->u32All = 0;
> @@ -287,8 +275,6 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
>         struct kfd_process_device *pdd;
>         unsigned int i;
>
> -       BUG_ON(!dbgdev || !dbgdev->dev || !adw_info);
> -
>         /* taking the vmid for that process on the safe way using pdd */
>         pdd = kfd_get_process_device_data(dbgdev->dev,
>                                         adw_info->process);
> @@ -362,8 +348,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>         /* we do not control the vmid in DIQ mode, just a place holder */
>         unsigned int vmid = 0;
>
> -       BUG_ON(!dbgdev || !dbgdev->dev || !adw_info);
> -
>         addrHi.u32All = 0;
>         addrLo.u32All = 0;
>         cntl.u32All = 0;
> @@ -508,8 +492,6 @@ static int dbgdev_wave_control_set_registers(
>         union GRBM_GFX_INDEX_BITS reg_gfx_index;
>         struct HsaDbgWaveMsgAMDGen2 *pMsg;
>
> -       BUG_ON(!wac_info || !in_reg_sq_cmd || !in_reg_gfx_index);
> -
>         reg_sq_cmd.u32All = 0;
>         reg_gfx_index.u32All = 0;
>         pMsg = &wac_info->dbgWave_msg.DbgWaveMsg.WaveMsgInfoGen2;
> @@ -610,8 +592,6 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
>         struct pm4__set_config_reg *packets_vec;
>         size_t ib_size = sizeof(struct pm4__set_config_reg) * 3;
>
> -       BUG_ON(!dbgdev || !wac_info);
> -
>         reg_sq_cmd.u32All = 0;
>
>         status = dbgdev_wave_control_set_registers(wac_info, &reg_sq_cmd,
> @@ -725,8 +705,6 @@ static int dbgdev_wave_control_nodiq(struct kfd_dbgdev *dbgdev,
>         union GRBM_GFX_INDEX_BITS reg_gfx_index;
>         struct kfd_process_device *pdd;
>
> -       BUG_ON(!dbgdev || !dbgdev->dev || !wac_info);
> -
>         reg_sq_cmd.u32All = 0;
>
>         /* taking the VMID for that process on the safe way using PDD */
> @@ -851,8 +829,6 @@ int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p)
>  void kfd_dbgdev_init(struct kfd_dbgdev *pdbgdev, struct kfd_dev *pdev,
>                         enum DBGDEV_TYPE type)
>  {
> -       BUG_ON(!pdbgdev || !pdev);
> -
>         pdbgdev->dev = pdev;
>         pdbgdev->kq = NULL;
>         pdbgdev->type = type;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> index 210bdc1..2d5555c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> @@ -44,8 +44,6 @@ struct mutex *kfd_get_dbgmgr_mutex(void)
>
>  static void kfd_dbgmgr_uninitialize(struct kfd_dbgmgr *pmgr)
>  {
> -       BUG_ON(!pmgr);
> -
>         kfree(pmgr->dbgdev);
>
>         pmgr->dbgdev = NULL;
> @@ -66,7 +64,6 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
>         enum DBGDEV_TYPE type = DBGDEV_TYPE_DIQ;
>         struct kfd_dbgmgr *new_buff;
>
> -       BUG_ON(!pdev);
>         BUG_ON(!pdev->init_complete);
>
>         new_buff = kfd_alloc_struct(new_buff);
> @@ -96,8 +93,6 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
>
>  long kfd_dbgmgr_register(struct kfd_dbgmgr *pmgr, struct kfd_process *p)
>  {
> -       BUG_ON(!p || !pmgr || !pmgr->dbgdev);
> -
>         if (pmgr->pasid != 0) {
>                 pr_debug("H/W debugger is already active using pasid %d\n",
>                                 pmgr->pasid);
> @@ -118,8 +113,6 @@ long kfd_dbgmgr_register(struct kfd_dbgmgr *pmgr, struct kfd_process *p)
>
>  long kfd_dbgmgr_unregister(struct kfd_dbgmgr *pmgr, struct kfd_process *p)
>  {
> -       BUG_ON(!p || !pmgr || !pmgr->dbgdev);
> -
>         /* Is the requests coming from the already registered process? */
>         if (pmgr->pasid != p->pasid) {
>                 pr_debug("H/W debugger is not registered by calling pasid %d\n",
> @@ -137,8 +130,6 @@ long kfd_dbgmgr_unregister(struct kfd_dbgmgr *pmgr, struct kfd_process *p)
>  long kfd_dbgmgr_wave_control(struct kfd_dbgmgr *pmgr,
>                                 struct dbg_wave_control_info *wac_info)
>  {
> -       BUG_ON(!pmgr || !pmgr->dbgdev || !wac_info);
> -
>         /* Is the requests coming from the already registered process? */
>         if (pmgr->pasid != wac_info->process->pasid) {
>                 pr_debug("H/W debugger support was not registered for requester pasid %d\n",
> @@ -152,9 +143,6 @@ long kfd_dbgmgr_wave_control(struct kfd_dbgmgr *pmgr,
>  long kfd_dbgmgr_address_watch(struct kfd_dbgmgr *pmgr,
>                                 struct dbg_address_watch_info *adw_info)
>  {
> -       BUG_ON(!pmgr || !pmgr->dbgdev || !adw_info);
> -
> -
>         /* Is the requests coming from the already registered process? */
>         if (pmgr->pasid != adw_info->process->pasid) {
>                 pr_debug("H/W debugger support was not registered for requester pasid %d\n",
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index d962342..e28e818 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -341,8 +341,6 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
>
>  void kgd2kfd_suspend(struct kfd_dev *kfd)
>  {
> -       BUG_ON(!kfd);
> -
>         if (kfd->init_complete) {
>                 kfd->dqm->ops.stop(kfd->dqm);
>                 amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
> @@ -356,8 +354,6 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
>         unsigned int pasid_limit;
>         int err;
>
> -       BUG_ON(kfd == NULL);
> -
>         pasid_limit = kfd_get_pasid_limit();
>
>         if (kfd->init_complete) {
> @@ -394,8 +390,6 @@ static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
>  {
>         unsigned int num_of_bits;
>
> -       BUG_ON(!kfd);
> -       BUG_ON(!kfd->gtt_mem);
>         BUG_ON(buf_size < chunk_size);
>         BUG_ON(buf_size == 0);
>         BUG_ON(chunk_size == 0);
> @@ -445,8 +439,6 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
>  {
>         unsigned int found, start_search, cur_size;
>
> -       BUG_ON(!kfd);
> -
>         if (size == 0)
>                 return -EINVAL;
>
> @@ -551,8 +543,6 @@ int kfd_gtt_sa_free(struct kfd_dev *kfd, struct kfd_mem_obj *mem_obj)
>  {
>         unsigned int bit;
>
> -       BUG_ON(!kfd);
> -
>         /* Act like kfree when trying to free a NULL object */
>         if (!mem_obj)
>                 return 0;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 68fe6ed..43bc1b5 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -79,20 +79,17 @@ static bool is_pipe_enabled(struct device_queue_manager *dqm, int mec, int pipe)
>
>  unsigned int get_queues_num(struct device_queue_manager *dqm)
>  {
> -       BUG_ON(!dqm || !dqm->dev);
>         return bitmap_weight(dqm->dev->shared_resources.queue_bitmap,
>                                 KGD_MAX_QUEUES);
>  }
>
>  unsigned int get_queues_per_pipe(struct device_queue_manager *dqm)
>  {
> -       BUG_ON(!dqm || !dqm->dev);
>         return dqm->dev->shared_resources.num_queue_per_pipe;
>  }
>
>  unsigned int get_pipes_per_mec(struct device_queue_manager *dqm)
>  {
> -       BUG_ON(!dqm || !dqm->dev);
>         return dqm->dev->shared_resources.num_pipe_per_mec;
>  }
>
> @@ -152,8 +149,6 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
>  {
>         int retval = 0;
>
> -       BUG_ON(!dqm || !q || !qpd || !allocated_vmid);
> -
>         print_queue(q);
>
>         mutex_lock(&dqm->lock);
> @@ -257,8 +252,6 @@ static int create_compute_queue_nocpsch(struct device_queue_manager *dqm,
>         int retval;
>         struct mqd_manager *mqd;
>
> -       BUG_ON(!dqm || !q || !qpd);
> -
>         mqd = dqm->ops.get_mqd_manager(dqm, KFD_MQD_TYPE_COMPUTE);
>         if (!mqd)
>                 return -ENOMEM;
> @@ -297,8 +290,6 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
>         int retval;
>         struct mqd_manager *mqd;
>
> -       BUG_ON(!dqm || !q || !q->mqd || !qpd);
> -
>         retval = 0;
>
>         mutex_lock(&dqm->lock);
> @@ -360,8 +351,6 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
>         struct mqd_manager *mqd;
>         bool prev_active = false;
>
> -       BUG_ON(!dqm || !q || !q->mqd);
> -
>         mutex_lock(&dqm->lock);
>         mqd = dqm->ops.get_mqd_manager(dqm,
>                         get_mqd_type_from_queue_type(q->properties.type));
> @@ -397,7 +386,7 @@ static struct mqd_manager *get_mqd_manager_nocpsch(
>  {
>         struct mqd_manager *mqd;
>
> -       BUG_ON(!dqm || type >= KFD_MQD_TYPE_MAX);
> +       BUG_ON(type >= KFD_MQD_TYPE_MAX);
>
>         pr_debug("mqd type %d\n", type);
>
> @@ -418,8 +407,6 @@ static int register_process_nocpsch(struct device_queue_manager *dqm,
>         struct device_process_node *n;
>         int retval;
>
> -       BUG_ON(!dqm || !qpd);
> -
>         n = kzalloc(sizeof(*n), GFP_KERNEL);
>         if (!n)
>                 return -ENOMEM;
> @@ -444,8 +431,6 @@ static int unregister_process_nocpsch(struct device_queue_manager *dqm,
>         int retval;
>         struct device_process_node *cur, *next;
>
> -       BUG_ON(!dqm || !qpd);
> -
>         pr_debug("qpd->queues_list is %s\n",
>                         list_empty(&qpd->queues_list) ? "empty" : "not empty");
>
> @@ -486,8 +471,6 @@ static void init_interrupts(struct device_queue_manager *dqm)
>  {
>         unsigned int i;
>
> -       BUG_ON(!dqm);
> -
>         for (i = 0 ; i < get_pipes_per_mec(dqm) ; i++)
>                 if (is_pipe_enabled(dqm, 0, i))
>                         dqm->dev->kfd2kgd->init_interrupts(dqm->dev->kgd, i);
> @@ -497,8 +480,6 @@ static int initialize_nocpsch(struct device_queue_manager *dqm)
>  {
>         int pipe, queue;
>
> -       BUG_ON(!dqm);
> -
>         pr_debug("num of pipes: %d\n", get_pipes_per_mec(dqm));
>
>         dqm->allocated_queues = kcalloc(get_pipes_per_mec(dqm),
> @@ -530,8 +511,6 @@ static void uninitialize_nocpsch(struct device_queue_manager *dqm)
>  {
>         int i;
>
> -       BUG_ON(!dqm);
> -
>         BUG_ON(dqm->queue_count > 0 || dqm->processes_count > 0);
>
>         kfree(dqm->allocated_queues);
> @@ -629,8 +608,6 @@ static int set_sched_resources(struct device_queue_manager *dqm)
>         int i, mec;
>         struct scheduling_resources res;
>
> -       BUG_ON(!dqm);
> -
>         res.vmid_mask = (1 << VMID_PER_DEVICE) - 1;
>         res.vmid_mask <<= KFD_VMID_START_OFFSET;
>
> @@ -672,8 +649,6 @@ static int initialize_cpsch(struct device_queue_manager *dqm)
>  {
>         int retval;
>
> -       BUG_ON(!dqm);
> -
>         pr_debug("num of pipes: %d\n", get_pipes_per_mec(dqm));
>
>         mutex_init(&dqm->lock);
> @@ -693,8 +668,6 @@ static int start_cpsch(struct device_queue_manager *dqm)
>         struct device_process_node *node;
>         int retval;
>
> -       BUG_ON(!dqm);
> -
>         retval = 0;
>
>         retval = pm_init(&dqm->packets, dqm);
> @@ -739,8 +712,6 @@ static int stop_cpsch(struct device_queue_manager *dqm)
>         struct device_process_node *node;
>         struct kfd_process_device *pdd;
>
> -       BUG_ON(!dqm);
> -
>         destroy_queues_cpsch(dqm, true, true);
>
>         list_for_each_entry(node, &dqm->queues, list) {
> @@ -757,8 +728,6 @@ static int create_kernel_queue_cpsch(struct device_queue_manager *dqm,
>                                         struct kernel_queue *kq,
>                                         struct qcm_process_device *qpd)
>  {
> -       BUG_ON(!dqm || !kq || !qpd);
> -
>         mutex_lock(&dqm->lock);
>         if (dqm->total_queue_count >= max_num_of_queues_per_device) {
>                 pr_warn("Can't create new kernel queue because %d queues were already created\n",
> @@ -788,8 +757,6 @@ static void destroy_kernel_queue_cpsch(struct device_queue_manager *dqm,
>                                         struct kernel_queue *kq,
>                                         struct qcm_process_device *qpd)
>  {
> -       BUG_ON(!dqm || !kq);
> -
>         mutex_lock(&dqm->lock);
>         /* here we actually preempt the DIQ */
>         destroy_queues_cpsch(dqm, true, false);
> @@ -821,8 +788,6 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
>         int retval;
>         struct mqd_manager *mqd;
>
> -       BUG_ON(!dqm || !q || !qpd);
> -
>         retval = 0;
>
>         if (allocate_vmid)
> @@ -880,7 +845,6 @@ int amdkfd_fence_wait_timeout(unsigned int *fence_addr,
>                                 unsigned int fence_value,
>                                 unsigned long timeout)
>  {
> -       BUG_ON(!fence_addr);
>         timeout += jiffies;
>
>         while (*fence_addr != fence_value) {
> @@ -909,8 +873,6 @@ static int destroy_queues_cpsch(struct device_queue_manager *dqm,
>         enum kfd_preempt_type_filter preempt_type;
>         struct kfd_process_device *pdd;
>
> -       BUG_ON(!dqm);
> -
>         retval = 0;
>
>         if (lock)
> @@ -960,8 +922,6 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm, bool lock)
>  {
>         int retval;
>
> -       BUG_ON(!dqm);
> -
>         if (lock)
>                 mutex_lock(&dqm->lock);
>
> @@ -1002,8 +962,6 @@ static int destroy_queue_cpsch(struct device_queue_manager *dqm,
>         struct mqd_manager *mqd;
>         bool preempt_all_queues;
>
> -       BUG_ON(!dqm || !qpd || !q);
> -
>         preempt_all_queues = false;
>
>         retval = 0;
> @@ -1129,8 +1087,6 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
>  {
>         struct device_queue_manager *dqm;
>
> -       BUG_ON(!dev);
> -
>         pr_debug("Loading device queue manager\n");
>
>         dqm = kzalloc(sizeof(*dqm), GFP_KERNEL);
> @@ -1195,8 +1151,6 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
>
>  void device_queue_manager_uninit(struct device_queue_manager *dqm)
>  {
> -       BUG_ON(!dqm);
> -
>         dqm->ops.uninitialize(dqm);
>         kfree(dqm);
>  }
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
> index a263e2a..43194b4 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
> @@ -104,8 +104,6 @@ static int register_process_cik(struct device_queue_manager *dqm,
>         struct kfd_process_device *pdd;
>         unsigned int temp;
>
> -       BUG_ON(!dqm || !qpd);
> -
>         pdd = qpd_to_pdd(qpd);
>
>         /* check if sh_mem_config register already configured */
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
> index 8c45c86..47ef910 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
> @@ -110,8 +110,6 @@ static int register_process_vi(struct device_queue_manager *dqm,
>         struct kfd_process_device *pdd;
>         unsigned int temp;
>
> -       BUG_ON(!dqm || !qpd);
> -
>         pdd = qpd_to_pdd(qpd);
>
>         /* check if sh_mem_config register already configured */
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> index 48018a3..0055270 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> @@ -165,8 +165,6 @@ u32 __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
>  {
>         u32 inx;
>
> -       BUG_ON(!kfd || !doorbell_off);
> -
>         mutex_lock(&kfd->doorbell_mutex);
>         inx = find_first_zero_bit(kfd->doorbell_available_index,
>                                         KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
> @@ -196,8 +194,6 @@ void kfd_release_kernel_doorbell(struct kfd_dev *kfd, u32 __iomem *db_addr)
>  {
>         unsigned int inx;
>
> -       BUG_ON(!kfd || !db_addr);
> -
>         inx = (unsigned int)(db_addr - kfd->doorbell_kernel_ptr);
>
>         mutex_lock(&kfd->doorbell_mutex);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> index 47e2e8a..970bc07 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> @@ -41,7 +41,6 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
>         int retval;
>         union PM4_MES_TYPE_3_HEADER nop;
>
> -       BUG_ON(!kq || !dev);
>         BUG_ON(type != KFD_QUEUE_TYPE_DIQ && type != KFD_QUEUE_TYPE_HIQ);
>
>         pr_debug("Initializing queue type %d size %d\n", KFD_QUEUE_TYPE_HIQ,
> @@ -180,8 +179,6 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
>
>  static void uninitialize(struct kernel_queue *kq)
>  {
> -       BUG_ON(!kq);
> -
>         if (kq->queue->properties.type == KFD_QUEUE_TYPE_HIQ)
>                 kq->mqd->destroy_mqd(kq->mqd,
>                                         NULL,
> @@ -211,8 +208,6 @@ static int acquire_packet_buffer(struct kernel_queue *kq,
>         uint32_t wptr, rptr;
>         unsigned int *queue_address;
>
> -       BUG_ON(!kq || !buffer_ptr);
> -
>         rptr = *kq->rptr_kernel;
>         wptr = *kq->wptr_kernel;
>         queue_address = (unsigned int *)kq->pq_kernel_addr;
> @@ -252,11 +247,7 @@ static void submit_packet(struct kernel_queue *kq)
>  {
>  #ifdef DEBUG
>         int i;
> -#endif
> -
> -       BUG_ON(!kq);
>
> -#ifdef DEBUG
>         for (i = *kq->wptr_kernel; i < kq->pending_wptr; i++) {
>                 pr_debug("0x%2X ", kq->pq_kernel_addr[i]);
>                 if (i % 15 == 0)
> @@ -272,7 +263,6 @@ static void submit_packet(struct kernel_queue *kq)
>
>  static void rollback_packet(struct kernel_queue *kq)
>  {
> -       BUG_ON(!kq);
>         kq->pending_wptr = *kq->queue->properties.write_ptr;
>  }
>
> @@ -281,8 +271,6 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
>  {
>         struct kernel_queue *kq;
>
> -       BUG_ON(!dev);
> -
>         kq = kzalloc(sizeof(*kq), GFP_KERNEL);
>         if (!kq)
>                 return NULL;
> @@ -313,8 +301,6 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
>
>  void kernel_queue_uninit(struct kernel_queue *kq)
>  {
> -       BUG_ON(!kq);
> -
>         kq->ops.uninitialize(kq);
>         kfree(kq);
>  }
> @@ -325,8 +311,6 @@ static __attribute__((unused)) void test_kq(struct kfd_dev *dev)
>         uint32_t *buffer, i;
>         int retval;
>
> -       BUG_ON(!dev);
> -
>         pr_err("Starting kernel queue test\n");
>
>         kq = kernel_queue_init(dev, KFD_QUEUE_TYPE_HIQ);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> index dca4fc7..a11477d 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> @@ -44,8 +44,6 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
>         struct cik_mqd *m;
>         int retval;
>
> -       BUG_ON(!mm || !q || !mqd);
> -
>         retval = kfd_gtt_sa_allocate(mm->dev, sizeof(struct cik_mqd),
>                                         mqd_mem_obj);
>
> @@ -113,8 +111,6 @@ static int init_mqd_sdma(struct mqd_manager *mm, void **mqd,
>         int retval;
>         struct cik_sdma_rlc_registers *m;
>
> -       BUG_ON(!mm || !mqd || !mqd_mem_obj);
> -
>         retval = kfd_gtt_sa_allocate(mm->dev,
>                                         sizeof(struct cik_sdma_rlc_registers),
>                                         mqd_mem_obj);
> @@ -138,14 +134,12 @@ static int init_mqd_sdma(struct mqd_manager *mm, void **mqd,
>  static void uninit_mqd(struct mqd_manager *mm, void *mqd,
>                         struct kfd_mem_obj *mqd_mem_obj)
>  {
> -       BUG_ON(!mm || !mqd);
>         kfd_gtt_sa_free(mm->dev, mqd_mem_obj);
>  }
>
>  static void uninit_mqd_sdma(struct mqd_manager *mm, void *mqd,
>                                 struct kfd_mem_obj *mqd_mem_obj)
>  {
> -       BUG_ON(!mm || !mqd);
>         kfd_gtt_sa_free(mm->dev, mqd_mem_obj);
>  }
>
> @@ -168,8 +162,6 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
>  {
>         struct cik_mqd *m;
>
> -       BUG_ON(!mm || !q || !mqd);
> -
>         m = get_mqd(mqd);
>         m->cp_hqd_pq_control = DEFAULT_RPTR_BLOCK_SIZE |
>                                 DEFAULT_MIN_AVAIL_SIZE | PQ_ATC_EN;
> @@ -209,8 +201,6 @@ static int update_mqd_sdma(struct mqd_manager *mm, void *mqd,
>  {
>         struct cik_sdma_rlc_registers *m;
>
> -       BUG_ON(!mm || !mqd || !q);
> -
>         m = get_sdma_mqd(mqd);
>         m->sdma_rlc_rb_cntl = ffs(q->queue_size / sizeof(unsigned int)) <<
>                         SDMA0_RLC0_RB_CNTL__RB_SIZE__SHIFT |
> @@ -296,8 +286,6 @@ static int init_mqd_hiq(struct mqd_manager *mm, void **mqd,
>         struct cik_mqd *m;
>         int retval;
>
> -       BUG_ON(!mm || !q || !mqd || !mqd_mem_obj);
> -
>         retval = kfd_gtt_sa_allocate(mm->dev, sizeof(struct cik_mqd),
>                                         mqd_mem_obj);
>
> @@ -352,8 +340,6 @@ static int update_mqd_hiq(struct mqd_manager *mm, void *mqd,
>  {
>         struct cik_mqd *m;
>
> -       BUG_ON(!mm || !q || !mqd);
> -
>         m = get_mqd(mqd);
>         m->cp_hqd_pq_control = DEFAULT_RPTR_BLOCK_SIZE |
>                                 DEFAULT_MIN_AVAIL_SIZE |
> @@ -391,8 +377,6 @@ struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd)
>  {
>         struct cik_sdma_rlc_registers *m;
>
> -       BUG_ON(!mqd);
> -
>         m = (struct cik_sdma_rlc_registers *)mqd;
>
>         return m;
> @@ -403,7 +387,6 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
>  {
>         struct mqd_manager *mqd;
>
> -       BUG_ON(!dev);
>         BUG_ON(type >= KFD_MQD_TYPE_MAX);
>
>         mqd = kzalloc(sizeof(*mqd), GFP_KERNEL);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> index aaaa87a..d638c2c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> @@ -106,8 +106,6 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
>  {
>         struct vi_mqd *m;
>
> -       BUG_ON(!mm || !q || !mqd);
> -
>         m = get_mqd(mqd);
>
>         m->cp_hqd_pq_control = 5 << CP_HQD_PQ_CONTROL__RPTR_BLOCK_SIZE__SHIFT |
> @@ -186,7 +184,6 @@ static int destroy_mqd(struct mqd_manager *mm, void *mqd,
>  static void uninit_mqd(struct mqd_manager *mm, void *mqd,
>                         struct kfd_mem_obj *mqd_mem_obj)
>  {
> -       BUG_ON(!mm || !mqd);
>         kfd_gtt_sa_free(mm->dev, mqd_mem_obj);
>  }
>
> @@ -236,7 +233,6 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
>  {
>         struct mqd_manager *mqd;
>
> -       BUG_ON(!dev);
>         BUG_ON(type >= KFD_MQD_TYPE_MAX);
>
>         mqd = kzalloc(sizeof(*mqd), GFP_KERNEL);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> index c4030b3..aacd5a3 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> @@ -58,8 +58,6 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>         unsigned int process_count, queue_count;
>         unsigned int map_queue_size;
>
> -       BUG_ON(!pm || !rlib_size || !over_subscription);
> -
>         process_count = pm->dqm->processes_count;
>         queue_count = pm->dqm->queue_count;
>
> @@ -96,9 +94,7 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
>  {
>         int retval;
>
> -       BUG_ON(!pm);
>         BUG_ON(pm->allocated);
> -       BUG_ON(!is_over_subscription);
>
>         pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
>
> @@ -123,7 +119,7 @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
>  {
>         struct pm4_runlist *packet;
>
> -       BUG_ON(!pm || !buffer || !ib);
> +       BUG_ON(!ib);
>
>         packet = (struct pm4_runlist *)buffer;
>
> @@ -148,8 +144,6 @@ static int pm_create_map_process(struct packet_manager *pm, uint32_t *buffer,
>         struct queue *cur;
>         uint32_t num_queues;
>
> -       BUG_ON(!pm || !buffer || !qpd);
> -
>         packet = (struct pm4_map_process *)buffer;
>
>         memset(buffer, 0, sizeof(struct pm4_map_process));
> @@ -185,8 +179,6 @@ static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
>         struct pm4_mes_map_queues *packet;
>         bool use_static = is_static;
>
> -       BUG_ON(!pm || !buffer || !q);
> -
>         packet = (struct pm4_mes_map_queues *)buffer;
>         memset(buffer, 0, sizeof(struct pm4_map_queues));
>
> @@ -247,8 +239,6 @@ static int pm_create_map_queue(struct packet_manager *pm, uint32_t *buffer,
>         struct pm4_map_queues *packet;
>         bool use_static = is_static;
>
> -       BUG_ON(!pm || !buffer || !q);
> -
>         packet = (struct pm4_map_queues *)buffer;
>         memset(buffer, 0, sizeof(struct pm4_map_queues));
>
> @@ -315,8 +305,6 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>         struct kernel_queue *kq;
>         bool is_over_subscription;
>
> -       BUG_ON(!pm || !queues || !rl_size_bytes || !rl_gpu_addr);
> -
>         rl_wptr = retval = proccesses_mapped = 0;
>
>         retval = pm_allocate_runlist_ib(pm, &rl_buffer, rl_gpu_addr,
> @@ -416,8 +404,6 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>
>  int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm)
>  {
> -       BUG_ON(!dqm);
> -
>         pm->dqm = dqm;
>         mutex_init(&pm->lock);
>         pm->priv_queue = kernel_queue_init(dqm->dev, KFD_QUEUE_TYPE_HIQ);
> @@ -432,8 +418,6 @@ int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm)
>
>  void pm_uninit(struct packet_manager *pm)
>  {
> -       BUG_ON(!pm);
> -
>         mutex_destroy(&pm->lock);
>         kernel_queue_uninit(pm->priv_queue);
>  }
> @@ -444,8 +428,6 @@ int pm_send_set_resources(struct packet_manager *pm,
>         struct pm4_set_resources *packet;
>         int retval = 0;
>
> -       BUG_ON(!pm || !res);
> -
>         mutex_lock(&pm->lock);
>         pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
>                                         sizeof(*packet) / sizeof(uint32_t),
> @@ -489,8 +471,6 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
>         size_t rl_ib_size, packet_size_dwords;
>         int retval;
>
> -       BUG_ON(!pm || !dqm_queues);
> -
>         retval = pm_create_runlist_ib(pm, dqm_queues, &rl_gpu_ib_addr,
>                                         &rl_ib_size);
>         if (retval)
> @@ -532,7 +512,7 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
>         int retval;
>         struct pm4_query_status *packet;
>
> -       BUG_ON(!pm || !fence_address);
> +       BUG_ON(!fence_address);
>
>         mutex_lock(&pm->lock);
>         retval = pm->priv_queue->ops.acquire_packet_buffer(
> @@ -572,8 +552,6 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>         uint32_t *buffer;
>         struct pm4_unmap_queues *packet;
>
> -       BUG_ON(!pm);
> -
>         mutex_lock(&pm->lock);
>         retval = pm->priv_queue->ops.acquire_packet_buffer(
>                         pm->priv_queue,
> @@ -645,8 +623,6 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>
>  void pm_release_ib(struct packet_manager *pm)
>  {
> -       BUG_ON(!pm);
> -
>         mutex_lock(&pm->lock);
>         if (pm->allocated) {
>                 kfd_gtt_sa_free(pm->dqm->dev, pm->ib_buffer_obj);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index d877cda..d030d76 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -407,8 +407,6 @@ void kfd_unbind_process_from_device(struct kfd_dev *dev, unsigned int pasid)
>         struct kfd_process *p;
>         struct kfd_process_device *pdd;
>
> -       BUG_ON(!dev);
> -
>         /*
>          * Look for the process that matches the pasid. If there is no such
>          * process, we either released it in amdkfd's own notifier, or there
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> index 1d056a6..f6ecdff 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> @@ -32,8 +32,6 @@ static inline struct process_queue_node *get_queue_by_qid(
>  {
>         struct process_queue_node *pqn;
>
> -       BUG_ON(!pqm);
> -
>         list_for_each_entry(pqn, &pqm->queues, process_queue_list) {
>                 if ((pqn->q && pqn->q->properties.queue_id == qid) ||
>                     (pqn->kq && pqn->kq->queue->properties.queue_id == qid))
> @@ -48,8 +46,6 @@ static int find_available_queue_slot(struct process_queue_manager *pqm,
>  {
>         unsigned long found;
>
> -       BUG_ON(!pqm || !qid);
> -
>         found = find_first_zero_bit(pqm->queue_slot_bitmap,
>                         KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
>
> @@ -69,8 +65,6 @@ static int find_available_queue_slot(struct process_queue_manager *pqm,
>
>  int pqm_init(struct process_queue_manager *pqm, struct kfd_process *p)
>  {
> -       BUG_ON(!pqm);
> -
>         INIT_LIST_HEAD(&pqm->queues);
>         pqm->queue_slot_bitmap =
>                         kzalloc(DIV_ROUND_UP(KFD_MAX_NUM_OF_QUEUES_PER_PROCESS,
> @@ -87,8 +81,6 @@ void pqm_uninit(struct process_queue_manager *pqm)
>         int retval;
>         struct process_queue_node *pqn, *next;
>
> -       BUG_ON(!pqm);
> -
>         list_for_each_entry_safe(pqn, next, &pqm->queues, process_queue_list) {
>                 retval = pqm_destroy_queue(
>                                 pqm,
> @@ -151,8 +143,6 @@ int pqm_create_queue(struct process_queue_manager *pqm,
>         int num_queues = 0;
>         struct queue *cur;
>
> -       BUG_ON(!pqm || !dev || !properties || !qid);
> -
>         memset(&q_properties, 0, sizeof(struct queue_properties));
>         memcpy(&q_properties, properties, sizeof(struct queue_properties));
>         q = NULL;
> @@ -269,7 +259,6 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
>
>         dqm = NULL;
>
> -       BUG_ON(!pqm);
>         retval = 0;
>
>         pqn = get_queue_by_qid(pqm, qid);
> @@ -323,8 +312,6 @@ int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid,
>         int retval;
>         struct process_queue_node *pqn;
>
> -       BUG_ON(!pqm);
> -
>         pqn = get_queue_by_qid(pqm, qid);
>         if (!pqn) {
>                 pr_debug("No queue %d exists for update operation\n", qid);
> @@ -350,8 +337,6 @@ struct kernel_queue *pqm_get_kernel_queue(
>  {
>         struct process_queue_node *pqn;
>
> -       BUG_ON(!pqm);
> -
>         pqn = get_queue_by_qid(pqm, qid);
>         if (pqn && pqn->kq)
>                 return pqn->kq;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
> index 5ad9f6f..a5315d4 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
> @@ -67,8 +67,6 @@ int init_queue(struct queue **q, const struct queue_properties *properties)
>  {
>         struct queue *tmp_q;
>
> -       BUG_ON(!q);
> -
>         tmp_q = kzalloc(sizeof(*tmp_q), GFP_KERNEL);
>         if (!tmp_q)
>                 return -ENOMEM;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 113c1ce..e5486f4 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -108,9 +108,6 @@ static int kfd_topology_get_crat_acpi(void *crat_image, size_t *size)
>  static void kfd_populated_cu_info_cpu(struct kfd_topology_device *dev,
>                 struct crat_subtype_computeunit *cu)
>  {
> -       BUG_ON(!dev);
> -       BUG_ON(!cu);
> -
>         dev->node_props.cpu_cores_count = cu->num_cpu_cores;
>         dev->node_props.cpu_core_id_base = cu->processor_id_low;
>         if (cu->hsa_capability & CRAT_CU_FLAGS_IOMMU_PRESENT)
> @@ -123,9 +120,6 @@ static void kfd_populated_cu_info_cpu(struct kfd_topology_device *dev,
>  static void kfd_populated_cu_info_gpu(struct kfd_topology_device *dev,
>                 struct crat_subtype_computeunit *cu)
>  {
> -       BUG_ON(!dev);
> -       BUG_ON(!cu);
> -
>         dev->node_props.simd_id_base = cu->processor_id_low;
>         dev->node_props.simd_count = cu->num_simd_cores;
>         dev->node_props.lds_size_in_kb = cu->lds_size_in_kb;
> @@ -148,8 +142,6 @@ static int kfd_parse_subtype_cu(struct crat_subtype_computeunit *cu)
>         struct kfd_topology_device *dev;
>         int i = 0;
>
> -       BUG_ON(!cu);
> -
>         pr_info("Found CU entry in CRAT table with proximity_domain=%d caps=%x\n",
>                         cu->proximity_domain, cu->hsa_capability);
>         list_for_each_entry(dev, &topology_device_list, list) {
> @@ -177,8 +169,6 @@ static int kfd_parse_subtype_mem(struct crat_subtype_memory *mem)
>         struct kfd_topology_device *dev;
>         int i = 0;
>
> -       BUG_ON(!mem);
> -
>         pr_info("Found memory entry in CRAT table with proximity_domain=%d\n",
>                         mem->promixity_domain);
>         list_for_each_entry(dev, &topology_device_list, list) {
> @@ -223,8 +213,6 @@ static int kfd_parse_subtype_cache(struct crat_subtype_cache *cache)
>         struct kfd_topology_device *dev;
>         uint32_t id;
>
> -       BUG_ON(!cache);
> -
>         id = cache->processor_id_low;
>
>         pr_info("Found cache entry in CRAT table with processor_id=%d\n", id);
> @@ -274,8 +262,6 @@ static int kfd_parse_subtype_iolink(struct crat_subtype_iolink *iolink)
>         uint32_t id_from;
>         uint32_t id_to;
>
> -       BUG_ON(!iolink);
> -
>         id_from = iolink->proximity_domain_from;
>         id_to = iolink->proximity_domain_to;
>
> @@ -323,8 +309,6 @@ static int kfd_parse_subtype(struct crat_subtype_generic *sub_type_hdr)
>         struct crat_subtype_iolink *iolink;
>         int ret = 0;
>
> -       BUG_ON(!sub_type_hdr);
> -
>         switch (sub_type_hdr->type) {
>         case CRAT_SUBTYPE_COMPUTEUNIT_AFFINITY:
>                 cu = (struct crat_subtype_computeunit *)sub_type_hdr;
> @@ -368,8 +352,6 @@ static void kfd_release_topology_device(struct kfd_topology_device *dev)
>         struct kfd_cache_properties *cache;
>         struct kfd_iolink_properties *iolink;
>
> -       BUG_ON(!dev);
> -
>         list_del(&dev->list);
>
>         while (dev->mem_props.next != &dev->mem_props) {
> @@ -763,8 +745,6 @@ static void kfd_remove_sysfs_node_entry(struct kfd_topology_device *dev)
>         struct kfd_cache_properties *cache;
>         struct kfd_mem_properties *mem;
>
> -       BUG_ON(!dev);
> -
>         if (dev->kobj_iolink) {
>                 list_for_each_entry(iolink, &dev->io_link_props, list)
>                         if (iolink->kobj) {
> @@ -819,8 +799,6 @@ static int kfd_build_sysfs_node_entry(struct kfd_topology_device *dev,
>         int ret;
>         uint32_t i;
>
> -       BUG_ON(!dev);
> -
>         /*
>          * Creating the sysfs folders
>          */
> @@ -1117,8 +1095,6 @@ static struct kfd_topology_device *kfd_assign_gpu(struct kfd_dev *gpu)
>         struct kfd_topology_device *dev;
>         struct kfd_topology_device *out_dev = NULL;
>
> -       BUG_ON(!gpu);
> -
>         list_for_each_entry(dev, &topology_device_list, list)
>                 if (!dev->gpu && (dev->node_props.simd_count > 0)) {
>                         dev->gpu = gpu;
> @@ -1143,8 +1119,6 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
>         struct kfd_topology_device *dev;
>         int res;
>
> -       BUG_ON(!gpu);
> -
>         gpu_id = kfd_generate_gpu_id(gpu);
>
>         pr_debug("Adding new GPU (ID: 0x%x) to topology\n", gpu_id);
> @@ -1210,8 +1184,6 @@ int kfd_topology_remove_device(struct kfd_dev *gpu)
>         uint32_t gpu_id;
>         int res = -ENODEV;
>
> -       BUG_ON(!gpu);
> -
>         down_write(&topology_lock);
>
>         list_for_each_entry(dev, &topology_device_list, list)
> --
> 2.7.4
>

Good patch!
This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 11/19] drm/amdkfd: Fix doorbell initialization and finalization
       [not found]     ` <1502488589-30272-12-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 14:21       ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 14:21 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Handle errors in doorbell aperture initialization instead of BUG_ON.
> iounmap doorbell aperture during finalization.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  9 ++++++++-
>  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c | 13 +++++++++++--
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h     |  3 ++-
>  3 files changed, 21 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index e28e818..cb7ed02 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -260,7 +260,11 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>                 goto kfd_gtt_sa_init_error;
>         }
>
> -       kfd_doorbell_init(kfd);
> +       if (kfd_doorbell_init(kfd)) {
> +               dev_err(kfd_device,
> +                       "Error initializing doorbell aperture\n");
> +               goto kfd_doorbell_error;
> +       }
>
>         if (kfd_topology_add_device(kfd)) {
>                 dev_err(kfd_device, "Error adding device to topology\n");
> @@ -315,6 +319,8 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>  kfd_interrupt_error:
>         kfd_topology_remove_device(kfd);
>  kfd_topology_add_device_error:
> +       kfd_doorbell_fini(kfd);
> +kfd_doorbell_error:
>         kfd_gtt_sa_fini(kfd);
>  kfd_gtt_sa_init_error:
>         kfd->kfd2kgd->free_gtt_mem(kfd->kgd, kfd->gtt_mem);
> @@ -332,6 +338,7 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
>                 amd_iommu_free_device(kfd->pdev);
>                 kfd_interrupt_exit(kfd);
>                 kfd_topology_remove_device(kfd);
> +               kfd_doorbell_fini(kfd);
>                 kfd_gtt_sa_fini(kfd);
>                 kfd->kfd2kgd->free_gtt_mem(kfd->kgd, kfd->gtt_mem);
>         }
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> index 0055270..acf4d2a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
> @@ -59,7 +59,7 @@ static inline size_t doorbell_process_allocation(void)
>  }
>
>  /* Doorbell calculations for device init. */
> -void kfd_doorbell_init(struct kfd_dev *kfd)
> +int kfd_doorbell_init(struct kfd_dev *kfd)
>  {
>         size_t doorbell_start_offset;
>         size_t doorbell_aperture_size;
> @@ -95,7 +95,8 @@ void kfd_doorbell_init(struct kfd_dev *kfd)
>         kfd->doorbell_kernel_ptr = ioremap(kfd->doorbell_base,
>                                                 doorbell_process_allocation());
>
> -       BUG_ON(!kfd->doorbell_kernel_ptr);
> +       if (!kfd->doorbell_kernel_ptr)
> +               return -ENOMEM;
>
>         pr_debug("Doorbell initialization:\n");
>         pr_debug("doorbell base           == 0x%08lX\n",
> @@ -115,6 +116,14 @@ void kfd_doorbell_init(struct kfd_dev *kfd)
>
>         pr_debug("doorbell kernel address == 0x%08lX\n",
>                         (uintptr_t)kfd->doorbell_kernel_ptr);
> +
> +       return 0;
> +}
> +
> +void kfd_doorbell_fini(struct kfd_dev *kfd)
> +{
> +       if (kfd->doorbell_kernel_ptr)
> +               iounmap(kfd->doorbell_kernel_ptr);
>  }
>
>  int kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma)
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index 469b7ea..f0d55cc0 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -576,7 +576,8 @@ unsigned int kfd_pasid_alloc(void);
>  void kfd_pasid_free(unsigned int pasid);
>
>  /* Doorbells */
> -void kfd_doorbell_init(struct kfd_dev *kfd);
> +int kfd_doorbell_init(struct kfd_dev *kfd);
> +void kfd_doorbell_fini(struct kfd_dev *kfd);
>  int kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma);
>  u32 __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
>                                         unsigned int *doorbell_off);
> --
> 2.7.4
>

This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 12/19] drm/amdkfd: Allocate gtt_sa_bitmap in long units
       [not found]     ` <1502488589-30272-13-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 14:26       ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 14:26 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> gtt_sa_bitmap is accessed by bitmap functions, which operate on longs.
> Therefore the array should be allocated in long units. Also round up
> in case the number of bits is not a multiple of BITS_PER_LONG.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index cb7ed02..416955f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -395,7 +395,7 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
>  static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
>                                 unsigned int chunk_size)
>  {
> -       unsigned int num_of_bits;
> +       unsigned int num_of_longs;
>
>         BUG_ON(buf_size < chunk_size);
>         BUG_ON(buf_size == 0);
> @@ -404,10 +404,10 @@ static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
>         kfd->gtt_sa_chunk_size = chunk_size;
>         kfd->gtt_sa_num_of_chunks = buf_size / chunk_size;
>
> -       num_of_bits = kfd->gtt_sa_num_of_chunks / BITS_PER_BYTE;
> -       BUG_ON(num_of_bits == 0);
> +       num_of_longs = (kfd->gtt_sa_num_of_chunks + BITS_PER_LONG - 1) /
> +               BITS_PER_LONG;
>
> -       kfd->gtt_sa_bitmap = kzalloc(num_of_bits, GFP_KERNEL);
> +       kfd->gtt_sa_bitmap = kcalloc(num_of_longs, sizeof(long), GFP_KERNEL);
>
>         if (!kfd->gtt_sa_bitmap)
>                 return -ENOMEM;
> --
> 2.7.4
>
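A small side note, not a blocker: BITS_TO_LONGS() from linux/bitops.h
already does this round-up, so the same allocation could also be written
as (untested suggestion):

	num_of_longs = BITS_TO_LONGS(kfd->gtt_sa_num_of_chunks);
	kfd->gtt_sa_bitmap = kcalloc(num_of_longs, sizeof(long), GFP_KERNEL);

Either spelling is fine with me.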
This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 13/19] drm/amdkfd: Handle remaining BUG_ONs more gracefully
       [not found]     ` <1502488589-30272-14-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 14:39       ` Oded Gabbay
       [not found]         ` <CAFCwf129vQLO5owYBQt6S-V5WcxSrO7tu+v62-HhH2eOzATS1A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 14:39 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> In most cases, BUG_ONs can be replaced with a WARN_ON and an error
> return. In some void functions, just turn them into a WARN_ON and
> possibly an early exit.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            |  3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 16 ++++----
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 19 ++++-----
>  .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  2 +-
>  .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      | 20 +++++++---
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 45 +++++++++++++---------
>  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c             |  4 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  9 ++---
>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  7 ++--
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  4 +-
>  14 files changed, 84 insertions(+), 56 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> index 3841cad..0aa021a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> @@ -60,7 +60,8 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
>         unsigned int *ib_packet_buff;
>         int status;
>
> -       BUG_ON(!size_in_bytes);
> +       if (WARN_ON(!size_in_bytes))
> +               return -EINVAL;
>
>         kq = dbgdev->kq;
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> index 2d5555c..3da25f7 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
> @@ -64,7 +64,8 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
>         enum DBGDEV_TYPE type = DBGDEV_TYPE_DIQ;
>         struct kfd_dbgmgr *new_buff;
>
> -       BUG_ON(!pdev->init_complete);
> +       if (WARN_ON(!pdev->init_complete))
> +               return false;
>
>         new_buff = kfd_alloc_struct(new_buff);
>         if (!new_buff) {
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 416955f..f628ac3 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -98,7 +98,7 @@ static const struct kfd_device_info *lookup_device_info(unsigned short did)
>
>         for (i = 0; i < ARRAY_SIZE(supported_devices); i++) {
>                 if (supported_devices[i].did == did) {
> -                       BUG_ON(!supported_devices[i].device_info);
> +                       WARN_ON(!supported_devices[i].device_info);
>                         return supported_devices[i].device_info;
>                 }
>         }
> @@ -212,9 +212,8 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
>                         flags);
>
>         dev = kfd_device_by_pci_dev(pdev);
> -       BUG_ON(!dev);
> -
> -       kfd_signal_iommu_event(dev, pasid, address,
> +       if (!WARN_ON(!dev))
> +               kfd_signal_iommu_event(dev, pasid, address,
>                         flags & PPR_FAULT_WRITE, flags & PPR_FAULT_EXEC);
>
>         return AMD_IOMMU_INV_PRI_RSP_INVALID;
> @@ -397,9 +396,12 @@ static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
>  {
>         unsigned int num_of_longs;
>
> -       BUG_ON(buf_size < chunk_size);
> -       BUG_ON(buf_size == 0);
> -       BUG_ON(chunk_size == 0);
> +       if (WARN_ON(buf_size < chunk_size))
> +               return -EINVAL;
> +       if (WARN_ON(buf_size == 0))
> +               return -EINVAL;
> +       if (WARN_ON(chunk_size == 0))
> +               return -EINVAL;
>
>         kfd->gtt_sa_chunk_size = chunk_size;
>         kfd->gtt_sa_num_of_chunks = buf_size / chunk_size;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 43bc1b5..5dac29d 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -386,7 +386,8 @@ static struct mqd_manager *get_mqd_manager_nocpsch(
>  {
>         struct mqd_manager *mqd;
>
> -       BUG_ON(type >= KFD_MQD_TYPE_MAX);
> +       if (WARN_ON(type >= KFD_MQD_TYPE_MAX))
> +               return NULL;
>
>         pr_debug("mqd type %d\n", type);
>
> @@ -511,7 +512,7 @@ static void uninitialize_nocpsch(struct device_queue_manager *dqm)
>  {
>         int i;
>
> -       BUG_ON(dqm->queue_count > 0 || dqm->processes_count > 0);
> +       WARN_ON(dqm->queue_count > 0 || dqm->processes_count > 0);
>
>         kfree(dqm->allocated_queues);
>         for (i = 0 ; i < KFD_MQD_TYPE_MAX ; i++)
> @@ -1127,8 +1128,8 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
>                 dqm->ops.set_cache_memory_policy = set_cache_memory_policy;
>                 break;
>         default:
> -               BUG();
> -               break;
> +               pr_err("Invalid scheduling policy %d\n", sched_policy);
> +               goto out_free;
>         }
>
>         switch (dev->device_info->asic_family) {
> @@ -1141,12 +1142,12 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
>                 break;
>         }
>
> -       if (dqm->ops.initialize(dqm)) {
> -               kfree(dqm);
> -               return NULL;
> -       }
> +       if (!dqm->ops.initialize(dqm))
> +               return dqm;
>
> -       return dqm;
> +out_free:
> +       kfree(dqm);
> +       return NULL;
>  }
>
>  void device_queue_manager_uninit(struct device_queue_manager *dqm)
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
> index 43194b4..fadc56a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
> @@ -65,7 +65,7 @@ static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
>          * for LDS/Scratch and GPUVM.
>          */
>
> -       BUG_ON((top_address_nybble & 1) || top_address_nybble > 0xE ||
> +       WARN_ON((top_address_nybble & 1) || top_address_nybble > 0xE ||
>                 top_address_nybble == 0);
>
>         return PRIVATE_BASE(top_address_nybble << 12) |
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
> index 47ef910..15e81ae 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
> @@ -67,7 +67,7 @@ static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
>          * for LDS/Scratch and GPUVM.
>          */
>
> -       BUG_ON((top_address_nybble & 1) || top_address_nybble > 0xE ||
> +       WARN_ON((top_address_nybble & 1) || top_address_nybble > 0xE ||
>                 top_address_nybble == 0);
>
>         return top_address_nybble << 12 |
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> index 970bc07..0e4d4a9 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> @@ -41,7 +41,8 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
>         int retval;
>         union PM4_MES_TYPE_3_HEADER nop;
>
> -       BUG_ON(type != KFD_QUEUE_TYPE_DIQ && type != KFD_QUEUE_TYPE_HIQ);
> +       if (WARN_ON(type != KFD_QUEUE_TYPE_DIQ && type != KFD_QUEUE_TYPE_HIQ))
> +               return false;
>
>         pr_debug("Initializing queue type %d size %d\n", KFD_QUEUE_TYPE_HIQ,
>                         queue_size);
> @@ -62,8 +63,8 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
>                                                 KFD_MQD_TYPE_HIQ);
>                 break;
>         default:
> -               BUG();
> -               break;
> +               pr_err("Invalid queue type %d\n", type);
> +               return false;
>         }
>
>         if (!kq->mqd)
> @@ -305,6 +306,7 @@ void kernel_queue_uninit(struct kernel_queue *kq)
>         kfree(kq);
>  }
>
> +/* FIXME: Can this test be removed? */
>  static __attribute__((unused)) void test_kq(struct kfd_dev *dev)
>  {
>         struct kernel_queue *kq;
> @@ -314,10 +316,18 @@ static __attribute__((unused)) void test_kq(struct kfd_dev *dev)
>         pr_err("Starting kernel queue test\n");
>
>         kq = kernel_queue_init(dev, KFD_QUEUE_TYPE_HIQ);
> -       BUG_ON(!kq);
> +       if (unlikely(!kq)) {
> +               pr_err("  Failed to initialize HIQ\n");
> +               pr_err("Kernel queue test failed\n");
> +               return;
> +       }
>
>         retval = kq->ops.acquire_packet_buffer(kq, 5, &buffer);
> -       BUG_ON(retval != 0);
> +       if (unlikely(retval != 0)) {
> +               pr_err("  Failed to acquire packet buffer\n");
> +               pr_err("Kernel queue test failed\n");
> +               return;
> +       }
>         for (i = 0; i < 5; i++)
>                 buffer[i] = kq->nop_packet;
>         kq->ops.submit_packet(kq);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> index a11477d..7e0ec6b 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> @@ -387,7 +387,8 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
>  {
>         struct mqd_manager *mqd;
>
> -       BUG_ON(type >= KFD_MQD_TYPE_MAX);
> +       if (WARN_ON(type >= KFD_MQD_TYPE_MAX))
> +               return NULL;
>
>         mqd = kzalloc(sizeof(*mqd), GFP_KERNEL);
>         if (!mqd)
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> index d638c2c..f4c8c23 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> @@ -233,7 +233,8 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
>  {
>         struct mqd_manager *mqd;
>
> -       BUG_ON(type >= KFD_MQD_TYPE_MAX);
> +       if (WARN_ON(type >= KFD_MQD_TYPE_MAX))
> +               return NULL;
>
>         mqd = kzalloc(sizeof(*mqd), GFP_KERNEL);
>         if (!mqd)
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> index aacd5a3..77a6f2b 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> @@ -35,7 +35,8 @@ static inline void inc_wptr(unsigned int *wptr, unsigned int increment_bytes,
>  {
>         unsigned int temp = *wptr + increment_bytes / sizeof(uint32_t);
>
> -       BUG_ON((temp * sizeof(uint32_t)) > buffer_size_bytes);
> +       WARN((temp * sizeof(uint32_t)) > buffer_size_bytes,
> +            "Runlist IB overflow");
>         *wptr = temp;
>  }
>
> @@ -94,7 +95,8 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
>  {
>         int retval;
>
> -       BUG_ON(pm->allocated);
> +       if (WARN_ON(pm->allocated))
> +               return -EINVAL;
>
>         pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
>
> @@ -119,7 +121,8 @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
>  {
>         struct pm4_runlist *packet;
>
> -       BUG_ON(!ib);
> +       if (WARN_ON(!ib))
> +               return -EFAULT;
>
>         packet = (struct pm4_runlist *)buffer;
>
> @@ -211,9 +214,8 @@ static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
>                 use_static = false; /* no static queues under SDMA */
>                 break;
>         default:
> -               pr_err("queue type %d\n", q->properties.type);
> -               BUG();
> -               break;
> +               WARN(1, "queue type %d", q->properties.type);
> +               return -EINVAL;
>         }
>         packet->bitfields3.doorbell_offset =
>                         q->properties.doorbell_off;
> @@ -266,8 +268,8 @@ static int pm_create_map_queue(struct packet_manager *pm, uint32_t *buffer,
>                 use_static = false; /* no static queues under SDMA */
>                 break;
>         default:
> -               BUG();
> -               break;
> +               WARN(1, "queue type %d", q->properties.type);
> +               return -EINVAL;
>         }
>
>         packet->mes_map_queues_ordinals[0].bitfields3.doorbell_offset =
> @@ -392,14 +394,16 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>         pr_debug("Finished map process and queues to runlist\n");
>
>         if (is_over_subscription)
> -               pm_create_runlist(pm, &rl_buffer[rl_wptr], *rl_gpu_addr,
> -                               alloc_size_bytes / sizeof(uint32_t), true);
> +               retval = pm_create_runlist(pm, &rl_buffer[rl_wptr],
> +                                       *rl_gpu_addr,
> +                                       alloc_size_bytes / sizeof(uint32_t),
> +                                       true);
>
>         for (i = 0; i < alloc_size_bytes / sizeof(uint32_t); i++)
>                 pr_debug("0x%2X ", rl_buffer[i]);
>         pr_debug("\n");
>
> -       return 0;
> +       return retval;
>  }
>
>  int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm)
> @@ -512,7 +516,8 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
>         int retval;
>         struct pm4_query_status *packet;
>
> -       BUG_ON(!fence_address);
> +       if (WARN_ON(!fence_address))
> +               return -EFAULT;
>
>         mutex_lock(&pm->lock);
>         retval = pm->priv_queue->ops.acquire_packet_buffer(
> @@ -577,8 +582,9 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>                         engine_sel__mes_unmap_queues__sdma0 + sdma_engine;
>                 break;
>         default:
> -               BUG();
> -               break;
> +               WARN(1, "queue type %d", type);
> +               retval = -EINVAL;
> +               goto err_invalid;
>         }
>
>         if (reset)
> @@ -610,12 +616,15 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>                                 queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only;
>                 break;
>         default:
> -               BUG();
> -               break;
> +               WARN(1, "filter %d", mode);
> +               retval = -EINVAL;
>         }
>
> -       pm->priv_queue->ops.submit_packet(pm->priv_queue);
> -
> +err_invalid:
> +       if (!retval)
> +               pm->priv_queue->ops.submit_packet(pm->priv_queue);
> +       else
> +               pm->priv_queue->ops.rollback_packet(pm->priv_queue);

I don't feel comfortable putting a valid code path under an "err_invalid" label.
It defeats the purpose of the goto statement and common cleanup code, and
makes the code harder to read.
Also, the rollback_packet call was not in the original code. Why did you
add it here?
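If the rollback really is needed for the invalid-filter case, I'd rather
keep the error handling strictly on the error path, something like this
(rough sketch only, untested):

	default:
		WARN(1, "filter %d", mode);
		retval = -EINVAL;
		goto err_invalid;
	}

	/* normal path: send the packet and return success */
	pm->priv_queue->ops.submit_packet(pm->priv_queue);
	mutex_unlock(&pm->lock);
	return 0;

err_invalid:
	pm->priv_queue->ops.rollback_packet(pm->priv_queue);
err_acquire_packet_buffer:
	mutex_unlock(&pm->lock);
	return retval;

That way err_invalid stays a pure error label and the submit_packet call
doesn't end up hidden behind an "if (!retval)" after it.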

>  err_acquire_packet_buffer:
>         mutex_unlock(&pm->lock);
>         return retval;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
> index b3f7d43..1e06de0 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
> @@ -92,6 +92,6 @@ unsigned int kfd_pasid_alloc(void)
>
>  void kfd_pasid_free(unsigned int pasid)
>  {
> -       BUG_ON(pasid == 0 || pasid >= pasid_limit);
> -       clear_bit(pasid, pasid_bitmap);
> +       if (!WARN_ON(pasid == 0 || pasid >= pasid_limit))
> +               clear_bit(pasid, pasid_bitmap);
>  }
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index d030d76..41a9976 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -79,8 +79,6 @@ struct kfd_process *kfd_create_process(const struct task_struct *thread)
>  {
>         struct kfd_process *process;
>
> -       BUG_ON(!kfd_process_wq);
> -
>         if (!thread->mm)
>                 return ERR_PTR(-EINVAL);
>
> @@ -202,10 +200,10 @@ static void kfd_process_destroy_delayed(struct rcu_head *rcu)
>         struct kfd_process_release_work *work;
>         struct kfd_process *p;
>
> -       BUG_ON(!kfd_process_wq);
> +       WARN_ON(!kfd_process_wq);
I think this is redundant, as kfd_process_wq is later dereferenced
inside queue_work (as *wq), so we will get a violation there anyway.

>
>         p = container_of(rcu, struct kfd_process, rcu);
> -       BUG_ON(atomic_read(&p->mm->mm_count) <= 0);
> +       WARN_ON(atomic_read(&p->mm->mm_count) <= 0);
>
>         mmdrop(p->mm);
>
> @@ -229,7 +227,8 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
>          * mmu_notifier srcu is read locked
>          */
>         p = container_of(mn, struct kfd_process, mmu_notifier);
> -       BUG_ON(p->mm != mm);
> +       if (WARN_ON(p->mm != mm))
> +               return;
>
>         mutex_lock(&kfd_processes_mutex);
>         hash_del_rcu(&p->kfd_processes);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> index f6ecdff..1cae95e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> @@ -218,8 +218,8 @@ int pqm_create_queue(struct process_queue_manager *pqm,
>                                                         kq, &pdd->qpd);
>                 break;
>         default:
> -               BUG();
> -               break;
> +               WARN(1, "Invalid queue type %d", type);
> +               retval = -EINVAL;
>         }
>
>         if (retval != 0) {
> @@ -272,7 +272,8 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
>                 dev = pqn->kq->dev;
>         if (pqn->q)
>                 dev = pqn->q->device;
> -       BUG_ON(!dev);
> +       if (WARN_ON(!dev))
> +               return -ENODEV;
>
>         pdd = kfd_get_process_device_data(dev, pqm->process);
>         if (!pdd) {
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index e5486f4..19ce590 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -799,10 +799,12 @@ static int kfd_build_sysfs_node_entry(struct kfd_topology_device *dev,
>         int ret;
>         uint32_t i;
>
> +       if (WARN_ON(dev->kobj_node))
> +               return -EEXIST;
> +
>         /*
>          * Creating the sysfs folders
>          */
> -       BUG_ON(dev->kobj_node);
>         dev->kobj_node = kfd_alloc_struct(dev->kobj_node);
>         if (!dev->kobj_node)
>                 return -ENOMEM;
> --
> 2.7.4
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 14/19] drm/amdkfd: Add more error printing to help bringup
       [not found]     ` <1502488589-30272-15-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 14:54       ` Oded Gabbay
       [not found]         ` <CAFCwf12LtX8Me-DSVvnf72eZr=UQm6sWnBoSuB2DM8jbqk3nOQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 14:54 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Yong Zhao, amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Yong Zhao <Yong.Zhao@amd.com>
>
> Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index f628ac3..e1c2ad2 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -103,6 +103,8 @@ static const struct kfd_device_info *lookup_device_info(unsigned short did)
>                 }
>         }
>
> +       WARN(1, "device is not added to supported_devices\n");
> +
I think WARN is a bit excessive here. It's not actually a warning - an
AMD GPU device is present, it is just not supported by amdkfd.
Maybe a dev_info is more appropriate here.
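Something along these lines should be enough (untested):

	dev_info(kfd_device, "DID %04x is not supported in kfd\n", did);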

Oded

>         return NULL;
>  }
>
> @@ -114,8 +116,10 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>         const struct kfd_device_info *device_info =
>                                         lookup_device_info(pdev->device);
>
> -       if (!device_info)
> +       if (!device_info) {
> +               dev_err(kfd_device, "kgd2kfd_probe failed\n");
>                 return NULL;
> +       }
>
>         kfd = kzalloc(sizeof(*kfd), GFP_KERNEL);
>         if (!kfd)
> @@ -364,8 +368,11 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
>
>         if (kfd->init_complete) {
>                 err = amd_iommu_init_device(kfd->pdev, pasid_limit);
> -               if (err < 0)
> +               if (err < 0) {
> +                       dev_err(kfd_device, "failed to initialize iommu\n");
>                         return -ENXIO;
> +               }
> +
>                 amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>                                                 iommu_pasid_shutdown_callback);
>                 amd_iommu_set_invalid_ppr_cb(kfd->pdev, iommu_invalid_ppr_cb);
> --
> 2.7.4
>
With the above fixed, this patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 15/19] drm/amdkfd: Clamp EOP queue size correctly on Gfx8
       [not found]     ` <1502488589-30272-16-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 15:04       ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 15:04 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Jay Cornwall, amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Jay Cornwall <Jay.Cornwall@amd.com>
>
> Gfx8 HW incorrectly clamps CP_HQD_EOP_CONTROL.EOP_SIZE, which can
> lead to scheduling deadlock due to SE EOP done counter overflow.
>
> Enforce an EOP queue size limit which prevents the CP from sending
> more than 0xFF events at a time.
>
> Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> index f4c8c23..98a930e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> @@ -135,8 +135,15 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
>                         3 << CP_HQD_IB_CONTROL__MIN_IB_AVAIL_SIZE__SHIFT |
>                         mtype << CP_HQD_IB_CONTROL__MTYPE__SHIFT;
>
> -       m->cp_hqd_eop_control |=
> -               ffs(q->eop_ring_buffer_size / sizeof(unsigned int)) - 1 - 1;
> +       /*
> +        * HW does not clamp this field correctly. Maximum EOP queue size
> +        * is constrained by per-SE EOP done signal count, which is 8-bit.
> +        * Limit is 0xFF EOP entries (= 0x7F8 dwords). CP will not submit
> +        * more than (EOP entry count - 1) so a queue size of 0x800 dwords
> +        * is safe, giving a maximum field value of 0xA.
> +        */
> +       m->cp_hqd_eop_control |= min(0xA,
> +               ffs(q->eop_ring_buffer_size / sizeof(unsigned int)) - 1 - 1);
>         m->cp_hqd_eop_base_addr_lo =
>                         lower_32_bits(q->eop_ring_buffer_address >> 8);
>         m->cp_hqd_eop_base_addr_hi =
> --
> 2.7.4
>
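
For anyone checking the arithmetic in the comment above: each EOP entry is 8
dwords, so 0xFF entries take 0x7F8 dwords; rounding up to the next power of
two gives 0x800 dwords (0x2000 bytes), and the ffs() expression then encodes
that as 0xA. A small stand-alone sketch of the calculation (plain user-space
C with a hypothetical helper name, not part of the patch):

	#include <stdio.h>
	#include <strings.h>	/* ffs() */

	/* Mirrors the kernel expression for a queue size given in bytes. */
	static int eop_size_field(unsigned int eop_bytes)
	{
		int field = ffs((int)(eop_bytes / sizeof(unsigned int))) - 1 - 1;

		return field < 0xA ? field : 0xA;	/* min(0xA, ...) */
	}

	int main(void)
	{
		/* 0x2000 bytes = 0x800 dwords, ffs(0x800) = 12, field = 0xA */
		printf("0x%X\n", eop_size_field(0x2000));
		return 0;
	}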

This patch is:
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
       [not found]     ` <1502488589-30272-17-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-12 15:10       ` Oded Gabbay
       [not found]         ` <CAFCwf12mAxpYF0-AWA=4hJiEe093KkakUYO28-+VxV=Uo+X4Tw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 15:10 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> To match current firmware. The map process packet has been extended
> to support scratch. This is a non-backwards compatible change and
> it's about two years old. So no point keeping the old version around
> conditionally.

Do you mean that it won't work with Kaveri anymore?
I believe we aren't allowed to break older H/W support without some
serious justification.

Oded

>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c         |   8 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 161 ++++--------
>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h    | 314 +++---------------------
>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h | 130 +++++++++-
>  4 files changed, 199 insertions(+), 414 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index e1c2ad2..e790e7f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -26,7 +26,7 @@
>  #include <linux/slab.h>
>  #include "kfd_priv.h"
>  #include "kfd_device_queue_manager.h"
> -#include "kfd_pm4_headers.h"
> +#include "kfd_pm4_headers_vi.h"
>
>  #define MQD_SIZE_ALIGNED 768
>
> @@ -238,9 +238,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>          * calculate max size of runlist packet.
>          * There can be only 2 packets at once
>          */
> -       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct pm4_map_process) +
> -               max_num_of_queues_per_device *
> -               sizeof(struct pm4_map_queues) + sizeof(struct pm4_runlist)) * 2;
> +       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct pm4_mes_map_process) +
> +               max_num_of_queues_per_device * sizeof(struct pm4_mes_map_queues)
> +               + sizeof(struct pm4_mes_runlist)) * 2;
>
>         /* Add size of HIQ & DIQ */
>         size += KFD_KERNEL_QUEUE_SIZE * 2;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> index 77a6f2b..3141e05 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> @@ -26,7 +26,6 @@
>  #include "kfd_device_queue_manager.h"
>  #include "kfd_kernel_queue.h"
>  #include "kfd_priv.h"
> -#include "kfd_pm4_headers.h"
>  #include "kfd_pm4_headers_vi.h"
>  #include "kfd_pm4_opcodes.h"
>
> @@ -44,12 +43,12 @@ static unsigned int build_pm4_header(unsigned int opcode, size_t packet_size)
>  {
>         union PM4_MES_TYPE_3_HEADER header;
>
> -       header.u32all = 0;
> +       header.u32All = 0;
>         header.opcode = opcode;
>         header.count = packet_size/sizeof(uint32_t) - 2;
>         header.type = PM4_TYPE_3;
>
> -       return header.u32all;
> +       return header.u32All;
>  }
>
>  static void pm_calc_rlib_size(struct packet_manager *pm,
> @@ -69,12 +68,9 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>                 pr_debug("Over subscribed runlist\n");
>         }
>
> -       map_queue_size =
> -               (pm->dqm->dev->device_info->asic_family == CHIP_CARRIZO) ?
> -               sizeof(struct pm4_mes_map_queues) :
> -               sizeof(struct pm4_map_queues);
> +       map_queue_size = sizeof(struct pm4_mes_map_queues);
>         /* calculate run list ib allocation size */
> -       *rlib_size = process_count * sizeof(struct pm4_map_process) +
> +       *rlib_size = process_count * sizeof(struct pm4_mes_map_process) +
>                      queue_count * map_queue_size;
>
>         /*
> @@ -82,7 +78,7 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>          * when over subscription
>          */
>         if (*over_subscription)
> -               *rlib_size += sizeof(struct pm4_runlist);
> +               *rlib_size += sizeof(struct pm4_mes_runlist);
>
>         pr_debug("runlist ib size %d\n", *rlib_size);
>  }
> @@ -119,16 +115,16 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
>  static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
>                         uint64_t ib, size_t ib_size_in_dwords, bool chain)
>  {
> -       struct pm4_runlist *packet;
> +       struct pm4_mes_runlist *packet;
>
>         if (WARN_ON(!ib))
>                 return -EFAULT;
>
> -       packet = (struct pm4_runlist *)buffer;
> +       packet = (struct pm4_mes_runlist *)buffer;
>
> -       memset(buffer, 0, sizeof(struct pm4_runlist));
> -       packet->header.u32all = build_pm4_header(IT_RUN_LIST,
> -                                               sizeof(struct pm4_runlist));
> +       memset(buffer, 0, sizeof(struct pm4_mes_runlist));
> +       packet->header.u32All = build_pm4_header(IT_RUN_LIST,
> +                                               sizeof(struct pm4_mes_runlist));
>
>         packet->bitfields4.ib_size = ib_size_in_dwords;
>         packet->bitfields4.chain = chain ? 1 : 0;
> @@ -143,16 +139,16 @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
>  static int pm_create_map_process(struct packet_manager *pm, uint32_t *buffer,
>                                 struct qcm_process_device *qpd)
>  {
> -       struct pm4_map_process *packet;
> +       struct pm4_mes_map_process *packet;
>         struct queue *cur;
>         uint32_t num_queues;
>
> -       packet = (struct pm4_map_process *)buffer;
> +       packet = (struct pm4_mes_map_process *)buffer;
>
> -       memset(buffer, 0, sizeof(struct pm4_map_process));
> +       memset(buffer, 0, sizeof(struct pm4_mes_map_process));
>
> -       packet->header.u32all = build_pm4_header(IT_MAP_PROCESS,
> -                                       sizeof(struct pm4_map_process));
> +       packet->header.u32All = build_pm4_header(IT_MAP_PROCESS,
> +                                       sizeof(struct pm4_mes_map_process));
>         packet->bitfields2.diq_enable = (qpd->is_debug) ? 1 : 0;
>         packet->bitfields2.process_quantum = 1;
>         packet->bitfields2.pasid = qpd->pqm->process->pasid;
> @@ -170,23 +166,26 @@ static int pm_create_map_process(struct packet_manager *pm, uint32_t *buffer,
>         packet->sh_mem_ape1_base = qpd->sh_mem_ape1_base;
>         packet->sh_mem_ape1_limit = qpd->sh_mem_ape1_limit;
>
> +       /* TODO: scratch support */
> +       packet->sh_hidden_private_base_vmid = 0;
> +
>         packet->gds_addr_lo = lower_32_bits(qpd->gds_context_area);
>         packet->gds_addr_hi = upper_32_bits(qpd->gds_context_area);
>
>         return 0;
>  }
>
> -static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
> +static int pm_create_map_queue(struct packet_manager *pm, uint32_t *buffer,
>                 struct queue *q, bool is_static)
>  {
>         struct pm4_mes_map_queues *packet;
>         bool use_static = is_static;
>
>         packet = (struct pm4_mes_map_queues *)buffer;
> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
> +       memset(buffer, 0, sizeof(struct pm4_mes_map_queues));
>
> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
> -                                               sizeof(struct pm4_map_queues));
> +       packet->header.u32All = build_pm4_header(IT_MAP_QUEUES,
> +                                               sizeof(struct pm4_mes_map_queues));
>         packet->bitfields2.alloc_format =
>                 alloc_format__mes_map_queues__one_per_pipe_vi;
>         packet->bitfields2.num_queues = 1;
> @@ -235,64 +234,6 @@ static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
>         return 0;
>  }
>
> -static int pm_create_map_queue(struct packet_manager *pm, uint32_t *buffer,
> -                               struct queue *q, bool is_static)
> -{
> -       struct pm4_map_queues *packet;
> -       bool use_static = is_static;
> -
> -       packet = (struct pm4_map_queues *)buffer;
> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
> -
> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
> -                                               sizeof(struct pm4_map_queues));
> -       packet->bitfields2.alloc_format =
> -                               alloc_format__mes_map_queues__one_per_pipe;
> -       packet->bitfields2.num_queues = 1;
> -       packet->bitfields2.queue_sel =
> -               queue_sel__mes_map_queues__map_to_hws_determined_queue_slots;
> -
> -       packet->bitfields2.vidmem = (q->properties.is_interop) ?
> -                       vidmem__mes_map_queues__uses_video_memory :
> -                       vidmem__mes_map_queues__uses_no_video_memory;
> -
> -       switch (q->properties.type) {
> -       case KFD_QUEUE_TYPE_COMPUTE:
> -       case KFD_QUEUE_TYPE_DIQ:
> -               packet->bitfields2.engine_sel =
> -                               engine_sel__mes_map_queues__compute;
> -               break;
> -       case KFD_QUEUE_TYPE_SDMA:
> -               packet->bitfields2.engine_sel =
> -                               engine_sel__mes_map_queues__sdma0;
> -               use_static = false; /* no static queues under SDMA */
> -               break;
> -       default:
> -               WARN(1, "queue type %d", q->properties.type);
> -               return -EINVAL;
> -       }
> -
> -       packet->mes_map_queues_ordinals[0].bitfields3.doorbell_offset =
> -                       q->properties.doorbell_off;
> -
> -       packet->mes_map_queues_ordinals[0].bitfields3.is_static =
> -                       (use_static) ? 1 : 0;
> -
> -       packet->mes_map_queues_ordinals[0].mqd_addr_lo =
> -                       lower_32_bits(q->gart_mqd_addr);
> -
> -       packet->mes_map_queues_ordinals[0].mqd_addr_hi =
> -                       upper_32_bits(q->gart_mqd_addr);
> -
> -       packet->mes_map_queues_ordinals[0].wptr_addr_lo =
> -                       lower_32_bits((uint64_t)q->properties.write_ptr);
> -
> -       packet->mes_map_queues_ordinals[0].wptr_addr_hi =
> -                       upper_32_bits((uint64_t)q->properties.write_ptr);
> -
> -       return 0;
> -}
> -
>  static int pm_create_runlist_ib(struct packet_manager *pm,
>                                 struct list_head *queues,
>                                 uint64_t *rl_gpu_addr,
> @@ -334,7 +275,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                         return retval;
>
>                 proccesses_mapped++;
> -               inc_wptr(&rl_wptr, sizeof(struct pm4_map_process),
> +               inc_wptr(&rl_wptr, sizeof(struct pm4_mes_map_process),
>                                 alloc_size_bytes);
>
>                 list_for_each_entry(kq, &qpd->priv_queue_list, list) {
> @@ -344,14 +285,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                         pr_debug("static_queue, mapping kernel q %d, is debug status %d\n",
>                                 kq->queue->queue, qpd->is_debug);
>
> -                       if (pm->dqm->dev->device_info->asic_family ==
> -                                       CHIP_CARRIZO)
> -                               retval = pm_create_map_queue_vi(pm,
> -                                               &rl_buffer[rl_wptr],
> -                                               kq->queue,
> -                                               qpd->is_debug);
> -                       else
> -                               retval = pm_create_map_queue(pm,
> +                       retval = pm_create_map_queue(pm,
>                                                 &rl_buffer[rl_wptr],
>                                                 kq->queue,
>                                                 qpd->is_debug);
> @@ -359,7 +293,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                                 return retval;
>
>                         inc_wptr(&rl_wptr,
> -                               sizeof(struct pm4_map_queues),
> +                               sizeof(struct pm4_mes_map_queues),
>                                 alloc_size_bytes);
>                 }
>
> @@ -370,14 +304,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                         pr_debug("static_queue, mapping user queue %d, is debug status %d\n",
>                                 q->queue, qpd->is_debug);
>
> -                       if (pm->dqm->dev->device_info->asic_family ==
> -                                       CHIP_CARRIZO)
> -                               retval = pm_create_map_queue_vi(pm,
> -                                               &rl_buffer[rl_wptr],
> -                                               q,
> -                                               qpd->is_debug);
> -                       else
> -                               retval = pm_create_map_queue(pm,
> +                       retval = pm_create_map_queue(pm,
>                                                 &rl_buffer[rl_wptr],
>                                                 q,
>                                                 qpd->is_debug);
> @@ -386,7 +313,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                                 return retval;
>
>                         inc_wptr(&rl_wptr,
> -                               sizeof(struct pm4_map_queues),
> +                               sizeof(struct pm4_mes_map_queues),
>                                 alloc_size_bytes);
>                 }
>         }
> @@ -429,7 +356,7 @@ void pm_uninit(struct packet_manager *pm)
>  int pm_send_set_resources(struct packet_manager *pm,
>                                 struct scheduling_resources *res)
>  {
> -       struct pm4_set_resources *packet;
> +       struct pm4_mes_set_resources *packet;
>         int retval = 0;
>
>         mutex_lock(&pm->lock);
> @@ -442,9 +369,9 @@ int pm_send_set_resources(struct packet_manager *pm,
>                 goto out;
>         }
>
> -       memset(packet, 0, sizeof(struct pm4_set_resources));
> -       packet->header.u32all = build_pm4_header(IT_SET_RESOURCES,
> -                                       sizeof(struct pm4_set_resources));
> +       memset(packet, 0, sizeof(struct pm4_mes_set_resources));
> +       packet->header.u32All = build_pm4_header(IT_SET_RESOURCES,
> +                                       sizeof(struct pm4_mes_set_resources));
>
>         packet->bitfields2.queue_type =
>                         queue_type__mes_set_resources__hsa_interface_queue_hiq;
> @@ -482,7 +409,7 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
>
>         pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
>
> -       packet_size_dwords = sizeof(struct pm4_runlist) / sizeof(uint32_t);
> +       packet_size_dwords = sizeof(struct pm4_mes_runlist) / sizeof(uint32_t);
>         mutex_lock(&pm->lock);
>
>         retval = pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
> @@ -514,7 +441,7 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
>                         uint32_t fence_value)
>  {
>         int retval;
> -       struct pm4_query_status *packet;
> +       struct pm4_mes_query_status *packet;
>
>         if (WARN_ON(!fence_address))
>                 return -EFAULT;
> @@ -522,13 +449,13 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
>         mutex_lock(&pm->lock);
>         retval = pm->priv_queue->ops.acquire_packet_buffer(
>                         pm->priv_queue,
> -                       sizeof(struct pm4_query_status) / sizeof(uint32_t),
> +                       sizeof(struct pm4_mes_query_status) / sizeof(uint32_t),
>                         (unsigned int **)&packet);
>         if (retval)
>                 goto fail_acquire_packet_buffer;
>
> -       packet->header.u32all = build_pm4_header(IT_QUERY_STATUS,
> -                                       sizeof(struct pm4_query_status));
> +       packet->header.u32All = build_pm4_header(IT_QUERY_STATUS,
> +                                       sizeof(struct pm4_mes_query_status));
>
>         packet->bitfields2.context_id = 0;
>         packet->bitfields2.interrupt_sel =
> @@ -555,22 +482,22 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>  {
>         int retval;
>         uint32_t *buffer;
> -       struct pm4_unmap_queues *packet;
> +       struct pm4_mes_unmap_queues *packet;
>
>         mutex_lock(&pm->lock);
>         retval = pm->priv_queue->ops.acquire_packet_buffer(
>                         pm->priv_queue,
> -                       sizeof(struct pm4_unmap_queues) / sizeof(uint32_t),
> +                       sizeof(struct pm4_mes_unmap_queues) / sizeof(uint32_t),
>                         &buffer);
>         if (retval)
>                 goto err_acquire_packet_buffer;
>
> -       packet = (struct pm4_unmap_queues *)buffer;
> -       memset(buffer, 0, sizeof(struct pm4_unmap_queues));
> +       packet = (struct pm4_mes_unmap_queues *)buffer;
> +       memset(buffer, 0, sizeof(struct pm4_mes_unmap_queues));
>         pr_debug("static_queue: unmapping queues: mode is %d , reset is %d , type is %d\n",
>                 mode, reset, type);
> -       packet->header.u32all = build_pm4_header(IT_UNMAP_QUEUES,
> -                                       sizeof(struct pm4_unmap_queues));
> +       packet->header.u32All = build_pm4_header(IT_UNMAP_QUEUES,
> +                                       sizeof(struct pm4_mes_unmap_queues));
>         switch (type) {
>         case KFD_QUEUE_TYPE_COMPUTE:
>         case KFD_QUEUE_TYPE_DIQ:
> @@ -608,12 +535,12 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>                 break;
>         case KFD_PREEMPT_TYPE_FILTER_ALL_QUEUES:
>                 packet->bitfields2.queue_sel =
> -                               queue_sel__mes_unmap_queues__perform_request_on_all_active_queues;
> +                               queue_sel__mes_unmap_queues__unmap_all_queues;
>                 break;
>         case KFD_PREEMPT_TYPE_FILTER_DYNAMIC_QUEUES:
>                 /* in this case, we do not preempt static queues */
>                 packet->bitfields2.queue_sel =
> -                               queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only;
> +                               queue_sel__mes_unmap_queues__unmap_all_non_static_queues;
>                 break;
>         default:
>                 WARN(1, "filter %d", mode);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
> index 97e5442..e50f73d 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
> @@ -41,99 +41,6 @@ union PM4_MES_TYPE_3_HEADER {
>  };
>  #endif /* PM4_MES_HEADER_DEFINED */
>
> -/* --------------------MES_SET_RESOURCES-------------------- */
> -
> -#ifndef PM4_MES_SET_RESOURCES_DEFINED
> -#define PM4_MES_SET_RESOURCES_DEFINED
> -enum set_resources_queue_type_enum {
> -       queue_type__mes_set_resources__kernel_interface_queue_kiq = 0,
> -       queue_type__mes_set_resources__hsa_interface_queue_hiq = 1,
> -       queue_type__mes_set_resources__hsa_debug_interface_queue = 4
> -};
> -
> -struct pm4_set_resources {
> -       union {
> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> -               uint32_t ordinal1;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t vmid_mask:16;
> -                       uint32_t unmap_latency:8;
> -                       uint32_t reserved1:5;
> -                       enum set_resources_queue_type_enum queue_type:3;
> -               } bitfields2;
> -               uint32_t ordinal2;
> -       };
> -
> -       uint32_t queue_mask_lo;
> -       uint32_t queue_mask_hi;
> -       uint32_t gws_mask_lo;
> -       uint32_t gws_mask_hi;
> -
> -       union {
> -               struct {
> -                       uint32_t oac_mask:16;
> -                       uint32_t reserved2:16;
> -               } bitfields7;
> -               uint32_t ordinal7;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t gds_heap_base:6;
> -                       uint32_t reserved3:5;
> -                       uint32_t gds_heap_size:6;
> -                       uint32_t reserved4:15;
> -               } bitfields8;
> -               uint32_t ordinal8;
> -       };
> -
> -};
> -#endif
> -
> -/*--------------------MES_RUN_LIST-------------------- */
> -
> -#ifndef PM4_MES_RUN_LIST_DEFINED
> -#define PM4_MES_RUN_LIST_DEFINED
> -
> -struct pm4_runlist {
> -       union {
> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> -               uint32_t ordinal1;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t reserved1:2;
> -                       uint32_t ib_base_lo:30;
> -               } bitfields2;
> -               uint32_t ordinal2;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t ib_base_hi:16;
> -                       uint32_t reserved2:16;
> -               } bitfields3;
> -               uint32_t ordinal3;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t ib_size:20;
> -                       uint32_t chain:1;
> -                       uint32_t offload_polling:1;
> -                       uint32_t reserved3:1;
> -                       uint32_t valid:1;
> -                       uint32_t reserved4:8;
> -               } bitfields4;
> -               uint32_t ordinal4;
> -       };
> -
> -};
> -#endif
>
>  /*--------------------MES_MAP_PROCESS-------------------- */
>
> @@ -186,217 +93,58 @@ struct pm4_map_process {
>  };
>  #endif
>
> -/*--------------------MES_MAP_QUEUES--------------------*/
> -
> -#ifndef PM4_MES_MAP_QUEUES_DEFINED
> -#define PM4_MES_MAP_QUEUES_DEFINED
> -enum map_queues_queue_sel_enum {
> -       queue_sel__mes_map_queues__map_to_specified_queue_slots = 0,
> -       queue_sel__mes_map_queues__map_to_hws_determined_queue_slots = 1,
> -       queue_sel__mes_map_queues__enable_process_queues = 2
> -};
> +#ifndef PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
> +#define PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
>
> -enum map_queues_vidmem_enum {
> -       vidmem__mes_map_queues__uses_no_video_memory = 0,
> -       vidmem__mes_map_queues__uses_video_memory = 1
> -};
> -
> -enum map_queues_alloc_format_enum {
> -       alloc_format__mes_map_queues__one_per_pipe = 0,
> -       alloc_format__mes_map_queues__all_on_one_pipe = 1
> -};
> -
> -enum map_queues_engine_sel_enum {
> -       engine_sel__mes_map_queues__compute = 0,
> -       engine_sel__mes_map_queues__sdma0 = 2,
> -       engine_sel__mes_map_queues__sdma1 = 3
> -};
> -
> -struct pm4_map_queues {
> +struct pm4_map_process_scratch_kv {
>         union {
> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> -               uint32_t ordinal1;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t reserved1:4;
> -                       enum map_queues_queue_sel_enum queue_sel:2;
> -                       uint32_t reserved2:2;
> -                       uint32_t vmid:4;
> -                       uint32_t reserved3:4;
> -                       enum map_queues_vidmem_enum vidmem:2;
> -                       uint32_t reserved4:6;
> -                       enum map_queues_alloc_format_enum alloc_format:2;
> -                       enum map_queues_engine_sel_enum engine_sel:3;
> -                       uint32_t num_queues:3;
> -               } bitfields2;
> -               uint32_t ordinal2;
> -       };
> -
> -       struct {
> -               union {
> -                       struct {
> -                               uint32_t is_static:1;
> -                               uint32_t reserved5:1;
> -                               uint32_t doorbell_offset:21;
> -                               uint32_t reserved6:3;
> -                               uint32_t queue:6;
> -                       } bitfields3;
> -                       uint32_t ordinal3;
> -               };
> -
> -               uint32_t mqd_addr_lo;
> -               uint32_t mqd_addr_hi;
> -               uint32_t wptr_addr_lo;
> -               uint32_t wptr_addr_hi;
> -
> -       } mes_map_queues_ordinals[1];   /* 1..N of these ordinal groups */
> -
> -};
> -#endif
> -
> -/*--------------------MES_QUERY_STATUS--------------------*/
> -
> -#ifndef PM4_MES_QUERY_STATUS_DEFINED
> -#define PM4_MES_QUERY_STATUS_DEFINED
> -enum query_status_interrupt_sel_enum {
> -       interrupt_sel__mes_query_status__completion_status = 0,
> -       interrupt_sel__mes_query_status__process_status = 1,
> -       interrupt_sel__mes_query_status__queue_status = 2
> -};
> -
> -enum query_status_command_enum {
> -       command__mes_query_status__interrupt_only = 0,
> -       command__mes_query_status__fence_only_immediate = 1,
> -       command__mes_query_status__fence_only_after_write_ack = 2,
> -       command__mes_query_status__fence_wait_for_write_ack_send_interrupt = 3
> -};
> -
> -enum query_status_engine_sel_enum {
> -       engine_sel__mes_query_status__compute = 0,
> -       engine_sel__mes_query_status__sdma0_queue = 2,
> -       engine_sel__mes_query_status__sdma1_queue = 3
> -};
> -
> -struct pm4_query_status {
> -       union {
> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> -               uint32_t ordinal1;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t context_id:28;
> -                       enum query_status_interrupt_sel_enum interrupt_sel:2;
> -                       enum query_status_command_enum command:2;
> -               } bitfields2;
> -               uint32_t ordinal2;
> +               union PM4_MES_TYPE_3_HEADER   header; /* header */
> +               uint32_t            ordinal1;
>         };
>
>         union {
>                 struct {
>                         uint32_t pasid:16;
> -                       uint32_t reserved1:16;
> -               } bitfields3a;
> -               struct {
> -                       uint32_t reserved2:2;
> -                       uint32_t doorbell_offset:21;
> -                       uint32_t reserved3:3;
> -                       enum query_status_engine_sel_enum engine_sel:3;
> -                       uint32_t reserved4:3;
> -               } bitfields3b;
> -               uint32_t ordinal3;
> -       };
> -
> -       uint32_t addr_lo;
> -       uint32_t addr_hi;
> -       uint32_t data_lo;
> -       uint32_t data_hi;
> -};
> -#endif
> -
> -/*--------------------MES_UNMAP_QUEUES--------------------*/
> -
> -#ifndef PM4_MES_UNMAP_QUEUES_DEFINED
> -#define PM4_MES_UNMAP_QUEUES_DEFINED
> -enum unmap_queues_action_enum {
> -       action__mes_unmap_queues__preempt_queues = 0,
> -       action__mes_unmap_queues__reset_queues = 1,
> -       action__mes_unmap_queues__disable_process_queues = 2
> -};
> -
> -enum unmap_queues_queue_sel_enum {
> -       queue_sel__mes_unmap_queues__perform_request_on_specified_queues = 0,
> -       queue_sel__mes_unmap_queues__perform_request_on_pasid_queues = 1,
> -       queue_sel__mes_unmap_queues__perform_request_on_all_active_queues = 2,
> -       queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only = 3
> -};
> -
> -enum unmap_queues_engine_sel_enum {
> -       engine_sel__mes_unmap_queues__compute = 0,
> -       engine_sel__mes_unmap_queues__sdma0 = 2,
> -       engine_sel__mes_unmap_queues__sdma1 = 3
> -};
> -
> -struct pm4_unmap_queues {
> -       union {
> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> -               uint32_t ordinal1;
> -       };
> -
> -       union {
> -               struct {
> -                       enum unmap_queues_action_enum action:2;
> -                       uint32_t reserved1:2;
> -                       enum unmap_queues_queue_sel_enum queue_sel:2;
> -                       uint32_t reserved2:20;
> -                       enum unmap_queues_engine_sel_enum engine_sel:3;
> -                       uint32_t num_queues:3;
> +                       uint32_t reserved1:8;
> +                       uint32_t diq_enable:1;
> +                       uint32_t process_quantum:7;
>                 } bitfields2;
>                 uint32_t ordinal2;
>         };
>
>         union {
>                 struct {
> -                       uint32_t pasid:16;
> -                       uint32_t reserved3:16;
> -               } bitfields3a;
> -               struct {
> -                       uint32_t reserved4:2;
> -                       uint32_t doorbell_offset0:21;
> -                       uint32_t reserved5:9;
> -               } bitfields3b;
> +                       uint32_t page_table_base:28;
> +                       uint32_t reserved2:4;
> +               } bitfields3;
>                 uint32_t ordinal3;
>         };
>
> -       union {
> -               struct {
> -                       uint32_t reserved6:2;
> -                       uint32_t doorbell_offset1:21;
> -                       uint32_t reserved7:9;
> -               } bitfields4;
> -               uint32_t ordinal4;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t reserved8:2;
> -                       uint32_t doorbell_offset2:21;
> -                       uint32_t reserved9:9;
> -               } bitfields5;
> -               uint32_t ordinal5;
> -       };
> +       uint32_t reserved3;
> +       uint32_t sh_mem_bases;
> +       uint32_t sh_mem_config;
> +       uint32_t sh_mem_ape1_base;
> +       uint32_t sh_mem_ape1_limit;
> +       uint32_t sh_hidden_private_base_vmid;
> +       uint32_t reserved4;
> +       uint32_t reserved5;
> +       uint32_t gds_addr_lo;
> +       uint32_t gds_addr_hi;
>
>         union {
>                 struct {
> -                       uint32_t reserved10:2;
> -                       uint32_t doorbell_offset3:21;
> -                       uint32_t reserved11:9;
> -               } bitfields6;
> -               uint32_t ordinal6;
> +                       uint32_t num_gws:6;
> +                       uint32_t reserved6:2;
> +                       uint32_t num_oac:4;
> +                       uint32_t reserved7:4;
> +                       uint32_t gds_size:6;
> +                       uint32_t num_queues:10;
> +               } bitfields14;
> +               uint32_t ordinal14;
>         };
>
> +       uint32_t completion_signal_lo32;
> +uint32_t completion_signal_hi32;
>  };
>  #endif
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
> index c4eda6f..7c8d9b3 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
> @@ -126,9 +126,10 @@ struct pm4_mes_runlist {
>                         uint32_t ib_size:20;
>                         uint32_t chain:1;
>                         uint32_t offload_polling:1;
> -                       uint32_t reserved3:1;
> +                       uint32_t reserved2:1;
>                         uint32_t valid:1;
> -                       uint32_t reserved4:8;
> +                       uint32_t process_cnt:4;
> +                       uint32_t reserved3:4;
>                 } bitfields4;
>                 uint32_t ordinal4;
>         };
> @@ -143,8 +144,8 @@ struct pm4_mes_runlist {
>
>  struct pm4_mes_map_process {
>         union {
> -               union PM4_MES_TYPE_3_HEADER   header;            /* header */
> -               uint32_t            ordinal1;
> +               union PM4_MES_TYPE_3_HEADER header;     /* header */
> +               uint32_t ordinal1;
>         };
>
>         union {
> @@ -155,36 +156,48 @@ struct pm4_mes_map_process {
>                         uint32_t process_quantum:7;
>                 } bitfields2;
>                 uint32_t ordinal2;
> -};
> +       };
>
>         union {
>                 struct {
>                         uint32_t page_table_base:28;
> -                       uint32_t reserved2:4;
> +                       uint32_t reserved3:4;
>                 } bitfields3;
>                 uint32_t ordinal3;
>         };
>
> +       uint32_t reserved;
> +
>         uint32_t sh_mem_bases;
> +       uint32_t sh_mem_config;
>         uint32_t sh_mem_ape1_base;
>         uint32_t sh_mem_ape1_limit;
> -       uint32_t sh_mem_config;
> +
> +       uint32_t sh_hidden_private_base_vmid;
> +
> +       uint32_t reserved2;
> +       uint32_t reserved3;
> +
>         uint32_t gds_addr_lo;
>         uint32_t gds_addr_hi;
>
>         union {
>                 struct {
>                         uint32_t num_gws:6;
> -                       uint32_t reserved3:2;
> +                       uint32_t reserved4:2;
>                         uint32_t num_oac:4;
> -                       uint32_t reserved4:4;
> +                       uint32_t reserved5:4;
>                         uint32_t gds_size:6;
>                         uint32_t num_queues:10;
>                 } bitfields10;
>                 uint32_t ordinal10;
>         };
>
> +       uint32_t completion_signal_lo;
> +       uint32_t completion_signal_hi;
> +
>  };
> +
>  #endif
>
>  /*--------------------MES_MAP_QUEUES--------------------*/
> @@ -337,7 +350,7 @@ enum mes_unmap_queues_engine_sel_enum {
>         engine_sel__mes_unmap_queues__sdmal = 3
>  };
>
> -struct PM4_MES_UNMAP_QUEUES {
> +struct pm4_mes_unmap_queues {
>         union {
>                 union PM4_MES_TYPE_3_HEADER   header;            /* header */
>                 uint32_t            ordinal1;
> @@ -397,4 +410,101 @@ struct PM4_MES_UNMAP_QUEUES {
>  };
>  #endif
>
> +#ifndef PM4_MEC_RELEASE_MEM_DEFINED
> +#define PM4_MEC_RELEASE_MEM_DEFINED
> +enum RELEASE_MEM_event_index_enum {
> +       event_index___release_mem__end_of_pipe = 5,
> +       event_index___release_mem__shader_done = 6
> +};
> +
> +enum RELEASE_MEM_cache_policy_enum {
> +       cache_policy___release_mem__lru = 0,
> +       cache_policy___release_mem__stream = 1,
> +       cache_policy___release_mem__bypass = 2
> +};
> +
> +enum RELEASE_MEM_dst_sel_enum {
> +       dst_sel___release_mem__memory_controller = 0,
> +       dst_sel___release_mem__tc_l2 = 1,
> +       dst_sel___release_mem__queue_write_pointer_register = 2,
> +       dst_sel___release_mem__queue_write_pointer_poll_mask_bit = 3
> +};
> +
> +enum RELEASE_MEM_int_sel_enum {
> +       int_sel___release_mem__none = 0,
> +       int_sel___release_mem__send_interrupt_only = 1,
> +       int_sel___release_mem__send_interrupt_after_write_confirm = 2,
> +       int_sel___release_mem__send_data_after_write_confirm = 3
> +};
> +
> +enum RELEASE_MEM_data_sel_enum {
> +       data_sel___release_mem__none = 0,
> +       data_sel___release_mem__send_32_bit_low = 1,
> +       data_sel___release_mem__send_64_bit_data = 2,
> +       data_sel___release_mem__send_gpu_clock_counter = 3,
> +       data_sel___release_mem__send_cp_perfcounter_hi_lo = 4,
> +       data_sel___release_mem__store_gds_data_to_memory = 5
> +};
> +
> +struct pm4_mec_release_mem {
> +       union {
> +               union PM4_MES_TYPE_3_HEADER header;     /*header */
> +               unsigned int ordinal1;
> +       };
> +
> +       union {
> +               struct {
> +                       unsigned int event_type:6;
> +                       unsigned int reserved1:2;
> +                       enum RELEASE_MEM_event_index_enum event_index:4;
> +                       unsigned int tcl1_vol_action_ena:1;
> +                       unsigned int tc_vol_action_ena:1;
> +                       unsigned int reserved2:1;
> +                       unsigned int tc_wb_action_ena:1;
> +                       unsigned int tcl1_action_ena:1;
> +                       unsigned int tc_action_ena:1;
> +                       unsigned int reserved3:6;
> +                       unsigned int atc:1;
> +                       enum RELEASE_MEM_cache_policy_enum cache_policy:2;
> +                       unsigned int reserved4:5;
> +               } bitfields2;
> +               unsigned int ordinal2;
> +       };
> +
> +       union {
> +               struct {
> +                       unsigned int reserved5:16;
> +                       enum RELEASE_MEM_dst_sel_enum dst_sel:2;
> +                       unsigned int reserved6:6;
> +                       enum RELEASE_MEM_int_sel_enum int_sel:3;
> +                       unsigned int reserved7:2;
> +                       enum RELEASE_MEM_data_sel_enum data_sel:3;
> +               } bitfields3;
> +               unsigned int ordinal3;
> +       };
> +
> +       union {
> +               struct {
> +                       unsigned int reserved8:2;
> +                       unsigned int address_lo_32b:30;
> +               } bitfields4;
> +               struct {
> +                       unsigned int reserved9:3;
> +                       unsigned int address_lo_64b:29;
> +               } bitfields5;
> +               unsigned int ordinal4;
> +       };
> +
> +       unsigned int address_hi;
> +
> +       unsigned int data_lo;
> +
> +       unsigned int data_hi;
> +};
> +#endif
> +
> +enum {
> +       CACHE_FLUSH_AND_INV_TS_EVENT = 0x00000014
> +};
> +
>  #endif
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 03/19] drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t)
       [not found]         ` <CAFCwf11Bg41FNg2sChh6EZkczb6quSzxdFXoJ-qhoE8JwqgGJw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-12 18:02           ` Kuehling, Felix
       [not found]             ` <DM5PR1201MB02356DFA5A4EAE4CC747F12F928E0-grEf7a3NxMBd8L2jMOIKKmrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Kuehling, Felix @ 2017-08-12 18:02 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list

Hi Oded,

Most of our recent work has been on the amdgpu driver. We basically just keep radeon compiling these days. The amdgpu driver can support all the GPUs that KFD supports. Between the amdgpu developers and the KFD team we've also talked about merging KFD into amdgpu at some point and replacing the KFD2KGD function pointer interfaces with mostly direct function calls. This would remove KFD support from radeon.

What's your position on radeon KFD support going forward? Do you insist on maintaining it, just for Kaveri?

Regards,
  Felix
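
To make the function-pointer-vs-direct-call point concrete, a small,
self-contained illustration (the names are made up and heavily simplified;
the real kfd2kgd interface has many more callbacks with different
signatures):

	#include <stdio.h>

	struct kgd_dev;				/* opaque GPU device handle */

	/* Today: amdkfd reaches into the GPU driver through a table of
	 * function pointers registered by that driver.
	 */
	struct kfd2kgd_calls {
		int (*hqd_load)(struct kgd_dev *kgd, void *mqd,
				unsigned int pipe, unsigned int queue);
	};

	static int gfx_v8_hqd_load(struct kgd_dev *kgd, void *mqd,
				   unsigned int pipe, unsigned int queue)
	{
		printf("loading HQD on pipe %u, queue %u\n", pipe, queue);
		return 0;
	}

	static const struct kfd2kgd_calls gfx_v8_kfd2kgd = {
		.hqd_load = gfx_v8_hqd_load,
	};

	int main(void)
	{
		/* Indirect call through the registered table (today)... */
		gfx_v8_kfd2kgd.hqd_load(NULL, NULL, 0, 0);

		/* ...versus a direct call once KFD lives inside amdgpu. */
		gfx_v8_hqd_load(NULL, NULL, 0, 0);
		return 0;
	}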


From: Oded Gabbay <oded.gabbay@gmail.com>
Sent: Saturday, August 12, 2017 8:37 AM
To: Kuehling, Felix
Cc: amd-gfx list
Subject: Re: [PATCH 03/19] drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t)
    
On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> kfd2kgd->address_watch_get_offset returns dword register offsets.
> The divide-by-sizeof(uint32_t) is incorrect.

In amdgpu that's true, but in radeon it's incorrect. If you look at
cik_reg.h in the radeon driver, you will see that the addresses of all
the TCP_WATCH_* registers are multiplied by 4, which is why Yair
originally divided the offset by sizeof(uint32_t).
I think this patch should move the divide-by-sizeof operation into the
radeon function instead of just deleting it from kfd_dbgdev.c.

Oded
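
A rough sketch of what that could look like on the radeon side (hypothetical:
the callback name matches the kfd2kgd interface, but the watchRegs[] table,
ADDRESS_WATCH_REG_MAX and the exact signature are assumptions based on the
description above, not copied from radeon_kfd.c):

	/* cik_reg.h stores the TCP_WATCH_* addresses as byte offsets
	 * (already multiplied by 4), so convert to a dword offset here
	 * and let kfd_dbgdev.c treat both drivers the same way.
	 */
	static uint32_t kgd_address_watch_get_offset(struct kgd_dev *kgd,
					unsigned int watch_point_id,
					unsigned int reg_offset)
	{
		return watchRegs[watch_point_id * ADDRESS_WATCH_REG_MAX +
				 reg_offset] / sizeof(uint32_t);
	}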

>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 8 --------
>  1 file changed, 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> index 8b14a4e..faa0790 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
> @@ -442,8 +442,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>                                         i,
>                                         ADDRESS_WATCH_REG_CNTL);
>
> -               aw_reg_add_dword /= sizeof(uint32_t);
> -
>                 packets_vec[0].bitfields2.reg_offset =
>                                         aw_reg_add_dword - AMD_CONFIG_REG_BASE;
>
> @@ -455,8 +453,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>                                         i,
>                                         ADDRESS_WATCH_REG_ADDR_HI);
>
> -               aw_reg_add_dword /= sizeof(uint32_t);
> -
>                 packets_vec[1].bitfields2.reg_offset =
>                                         aw_reg_add_dword - AMD_CONFIG_REG_BASE;
>                 packets_vec[1].reg_data[0] = addrHi.u32All;
> @@ -467,8 +463,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>                                         i,
>                                         ADDRESS_WATCH_REG_ADDR_LO);
>
> -               aw_reg_add_dword /= sizeof(uint32_t);
> -
>                 packets_vec[2].bitfields2.reg_offset =
>                                 aw_reg_add_dword - AMD_CONFIG_REG_BASE;
>                 packets_vec[2].reg_data[0] = addrLo.u32All;
> @@ -485,8 +479,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>                                         i,
>                                         ADDRESS_WATCH_REG_CNTL);
>
> -               aw_reg_add_dword /= sizeof(uint32_t);
> -
>                 packets_vec[3].bitfields2.reg_offset =
>                                         aw_reg_add_dword - AMD_CONFIG_REG_BASE;
>                 packets_vec[3].reg_data[0] = cntl.u32All;
> --
> 2.7.4
>
    

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 00/19] KFD fixes and cleanups
       [not found]     ` <CAFCwf10L0sMCWnPxOi=zLuXLor4X90--m-a6UnervTmEguGL9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-12 18:07       ` Kuehling, Felix
       [not found]         ` <DM5PR1201MB0235D9AA9FAEF6F2635942C8928E0-grEf7a3NxMBd8L2jMOIKKmrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Kuehling, Felix @ 2017-08-12 18:07 UTC (permalink / raw)
  To: Oded Gabbay, Deucher, Alexander; +Cc: amd-gfx list

[+Alex]

I'll rebase this on drm-next-4.14. Alex, is this the branch that will become the new default development branch for the amdgpu team? This should make coordination of dependent AMDGPU and KFD changes easier.

Regards,
  Felix



From: Oded Gabbay <oded.gabbay@gmail.com>
Sent: Saturday, August 12, 2017 8:28 AM
To: Kuehling, Felix
Cc: amd-gfx list
Subject: Re: [PATCH 00/19] KFD fixes and cleanups
    
Hi Felix,
Thanks for all the patches.
I have started to review them, but I have a small request from you
while I'm doing the review.
Could you please rebase them over my amdkfd-next branch, or
alternatively, over Alex's drm-next-4.14 or Dave Airlie's drm-next
(which amdkfd-next currently points to) branches?
I tried to apply this patch-set on amdkfd-next, but it fails on patch
5. I can't upstream them to Dave when they don't apply to his upstream
branch.

Thanks,
Oded

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> This is the first round of changes preparing for upstreaming KFD
> changes made internally in the last 2 years at AMD. A big part of it
> is coding style and messaging cleanup. I have tried to avoid making
> gratuitous formatting changes. All coding style changes should have a
> justification based on the Linux style guide.
>
> The last few patches (15-19) enable running pieces of the current ROCm
> user mode stack (with minor Thunk fixes for backwards compatibility)
> on this soon-to-be upstream kernel on CZ. At this time I can run some
> KFDTest unit tests, which are currently not open source. I'm trying to
> find other more substantial tests using a real compute API as a
> baseline for testing further KFD upstreaming patches.
>
> This patch series is freshly rebased on amd-staging-4.12.
>
> Felix Kuehling (11):
>   drm/amdkfd: Fix typo in dbgdev_wave_reset_wavefronts
>   drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t)
>   drm/amdkfd: Fix allocated_queues bitmap initialization
>   drm/amdkfd: Remove BUG_ONs for NULL pointer arguments
>   drm/amdkfd: Fix doorbell initialization and finalization
>   drm/amdkfd: Allocate gtt_sa_bitmap in long units
>   drm/amdkfd: Handle remaining BUG_ONs more gracefully
>   drm/amdkfd: Update PM4 packet headers
>   drm/amdgpu: Remove hard-coded assumptions about compute pipes
>   drm/amdgpu: Disable GFX PG on CZ
>   drm/amd: Update MEC HQD loading code for KFD
>
> Jay Cornwall (1):
>   drm/amdkfd: Clamp EOP queue size correctly on Gfx8
>
> Kent Russell (5):
>   drm/amdkfd: Clean up KFD style errors and warnings
>   drm/amdkfd: Consolidate and clean up log commands
>   drm/amdkfd: Change x==NULL/false references to !x
>   drm/amdkfd: Fix goto usage
>   drm/amdkfd: Remove usage of alloc(sizeof(struct...
>
> Yair Shachar (1):
>   drm/amdkfd: Fix double Mutex lock order
>
> Yong Zhao (1):
>   drm/amdkfd: Add more error printing to help bringup
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |   4 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h         |  16 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  | 156 +++++++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  | 185 ++++++++++--
>  drivers/gpu/drm/amd/amdgpu/vi.c                    |   3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           | 107 +++----
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            | 102 +++----
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  21 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h            |  27 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 122 ++++----
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 313 ++++++++-----------
>  .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |   6 +-
>  .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |   6 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          |  40 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  33 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c       |   2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c         |   2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |  63 ++--
>  drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  10 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h       |   3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  62 ++--
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  46 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 301 +++++++------------
>  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c             |   7 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h       | 330 +++------------------
>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h    | 140 ++++++++-
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  31 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  25 +-
>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  71 ++---
>  drivers/gpu/drm/amd/amdkfd/kfd_queue.c             |  12 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  46 +--
>  drivers/gpu/drm/amd/include/kgd_kfd_interface.h    |  11 +-
>  drivers/gpu/drm/radeon/radeon_kfd.c                |  12 +-
>  33 files changed, 1054 insertions(+), 1261 deletions(-)
>
> --
> 2.7.4
>
    

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
       [not found]         ` <CAFCwf12mAxpYF0-AWA=4hJiEe093KkakUYO28-+VxV=Uo+X4Tw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-12 18:16           ` Kuehling, Felix
       [not found]             ` <DM5PR1201MB023536CBDE40370D9703EBD6928E0-grEf7a3NxMBd8L2jMOIKKmrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Kuehling, Felix @ 2017-08-12 18:16 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list

> Do you mean that it won't work with Kaveri anymore ?

Kaveri got the same firmware changes, mostly for scratch memory support. The Kaveri firmware headers name the structures and fields a bit differently, but they should be binary compatible. So we simplified the code to use only one set of headers. I'll grab a Kaveri system to confirm that it works.

Regards,
  Felix
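
Since both layouts appear in this series (pm4_map_process_scratch_kv in
kfd_pm4_headers.h and pm4_mes_map_process in kfd_pm4_headers_vi.h), the
claimed binary compatibility could also be sanity-checked at build time.
A minimal sketch, not part of the patches, to be placed somewhere both
headers are visible:

	#include <linux/bug.h>
	#include "kfd_pm4_headers.h"
	#include "kfd_pm4_headers_vi.h"

	/* If one set of headers is to serve both Kaveri and VI, the two
	 * map-process packet layouts must at least agree on their size.
	 */
	static inline void pm4_map_process_layout_check(void)
	{
		BUILD_BUG_ON(sizeof(struct pm4_map_process_scratch_kv) !=
			     sizeof(struct pm4_mes_map_process));
	}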

From: Oded Gabbay <oded.gabbay@gmail.com>
Sent: Saturday, August 12, 2017 11:10 AM
To: Kuehling, Felix
Cc: amd-gfx list
Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
    
On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> To match current firmware. The map process packet has been extended
> to support scratch. This is a non-backwards compatible change and
> it's about two years old. So no point keeping the old version around
> conditionally.

Do you mean that it won't work with Kaveri anymore?
I believe we aren't allowed to break older H/W support without some
serious justification.

Oded

>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c         |   8 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 161 ++++--------
>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h    | 314 +++---------------------
>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h | 130 +++++++++-
>  4 files changed, 199 insertions(+), 414 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index e1c2ad2..e790e7f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -26,7 +26,7 @@
>  #include <linux/slab.h>
>  #include "kfd_priv.h"
>  #include "kfd_device_queue_manager.h"
> -#include "kfd_pm4_headers.h"
> +#include "kfd_pm4_headers_vi.h"
>
>  #define MQD_SIZE_ALIGNED 768
>
> @@ -238,9 +238,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>          * calculate max size of runlist packet.
>          * There can be only 2 packets at once
>          */
> -       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct pm4_map_process) +
> -               max_num_of_queues_per_device *
> -               sizeof(struct pm4_map_queues) + sizeof(struct pm4_runlist)) * 2;
> +       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct pm4_mes_map_process) +
> +               max_num_of_queues_per_device * sizeof(struct pm4_mes_map_queues)
> +               + sizeof(struct pm4_mes_runlist)) * 2;
>
>         /* Add size of HIQ & DIQ */
>         size += KFD_KERNEL_QUEUE_SIZE * 2;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> index 77a6f2b..3141e05 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> @@ -26,7 +26,6 @@
>  #include "kfd_device_queue_manager.h"
>  #include "kfd_kernel_queue.h"
>  #include "kfd_priv.h"
> -#include "kfd_pm4_headers.h"
>  #include "kfd_pm4_headers_vi.h"
>  #include "kfd_pm4_opcodes.h"
>
> @@ -44,12 +43,12 @@ static unsigned int build_pm4_header(unsigned int opcode, size_t packet_size)
>  {
>         union PM4_MES_TYPE_3_HEADER header;
>
> -       header.u32all = 0;
> +       header.u32All = 0;
>         header.opcode = opcode;
>         header.count = packet_size/sizeof(uint32_t) - 2;
>         header.type = PM4_TYPE_3;
>
> -       return header.u32all;
> +       return header.u32All;
>  }
>
>  static void pm_calc_rlib_size(struct packet_manager *pm,
> @@ -69,12 +68,9 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>                 pr_debug("Over subscribed runlist\n");
>         }
>
> -       map_queue_size =
> -               (pm->dqm->dev->device_info->asic_family == CHIP_CARRIZO) ?
> -               sizeof(struct pm4_mes_map_queues) :
> -               sizeof(struct pm4_map_queues);
> +       map_queue_size = sizeof(struct pm4_mes_map_queues);
>         /* calculate run list ib allocation size */
> -       *rlib_size = process_count * sizeof(struct pm4_map_process) +
> +       *rlib_size = process_count * sizeof(struct pm4_mes_map_process) +
>                      queue_count * map_queue_size;
>
>         /*
> @@ -82,7 +78,7 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>          * when over subscription
>          */
>         if (*over_subscription)
> -               *rlib_size += sizeof(struct pm4_runlist);
> +               *rlib_size += sizeof(struct pm4_mes_runlist);
>
>         pr_debug("runlist ib size %d\n", *rlib_size);
>  }
> @@ -119,16 +115,16 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
>  static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
>                         uint64_t ib, size_t ib_size_in_dwords, bool chain)
>  {
> -       struct pm4_runlist *packet;
> +       struct pm4_mes_runlist *packet;
>
>         if (WARN_ON(!ib))
>                 return -EFAULT;
>
> -       packet = (struct pm4_runlist *)buffer;
> +       packet = (struct pm4_mes_runlist *)buffer;
>
> -       memset(buffer, 0, sizeof(struct pm4_runlist));
> -       packet->header.u32all = build_pm4_header(IT_RUN_LIST,
> -                                               sizeof(struct pm4_runlist));
> +       memset(buffer, 0, sizeof(struct pm4_mes_runlist));
> +       packet->header.u32All = build_pm4_header(IT_RUN_LIST,
> +                                               sizeof(struct pm4_mes_runlist));
>
>         packet->bitfields4.ib_size = ib_size_in_dwords;
>         packet->bitfields4.chain = chain ? 1 : 0;
> @@ -143,16 +139,16 @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
>  static int pm_create_map_process(struct packet_manager *pm, uint32_t *buffer,
>                                 struct qcm_process_device *qpd)
>  {
> -       struct pm4_map_process *packet;
> +       struct pm4_mes_map_process *packet;
>         struct queue *cur;
>         uint32_t num_queues;
>
> -       packet = (struct pm4_map_process *)buffer;
> +       packet = (struct pm4_mes_map_process *)buffer;
>
> -       memset(buffer, 0, sizeof(struct pm4_map_process));
> +       memset(buffer, 0, sizeof(struct pm4_mes_map_process));
>
> -       packet->header.u32all = build_pm4_header(IT_MAP_PROCESS,
> -                                       sizeof(struct pm4_map_process));
> +       packet->header.u32All = build_pm4_header(IT_MAP_PROCESS,
> +                                       sizeof(struct pm4_mes_map_process));
>         packet->bitfields2.diq_enable = (qpd->is_debug) ? 1 : 0;
>         packet->bitfields2.process_quantum = 1;
>         packet->bitfields2.pasid = qpd->pqm->process->pasid;
> @@ -170,23 +166,26 @@ static int pm_create_map_process(struct packet_manager *pm, uint32_t *buffer,
>         packet->sh_mem_ape1_base = qpd->sh_mem_ape1_base;
>         packet->sh_mem_ape1_limit = qpd->sh_mem_ape1_limit;
>
> +       /* TODO: scratch support */
> +       packet->sh_hidden_private_base_vmid = 0;
> +
>         packet->gds_addr_lo = lower_32_bits(qpd->gds_context_area);
>         packet->gds_addr_hi = upper_32_bits(qpd->gds_context_area);
>
>         return 0;
>  }
>
> -static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
> +static int pm_create_map_queue(struct packet_manager *pm, uint32_t *buffer,
>                 struct queue *q, bool is_static)
>  {
>         struct pm4_mes_map_queues *packet;
>         bool use_static = is_static;
>
>         packet = (struct pm4_mes_map_queues *)buffer;
> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
> +       memset(buffer, 0, sizeof(struct pm4_mes_map_queues));
>
> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
> -                                               sizeof(struct pm4_map_queues));
> +       packet->header.u32All = build_pm4_header(IT_MAP_QUEUES,
> +                                               sizeof(struct pm4_mes_map_queues));
>         packet->bitfields2.alloc_format =
>                 alloc_format__mes_map_queues__one_per_pipe_vi;
>         packet->bitfields2.num_queues = 1;
> @@ -235,64 +234,6 @@ static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
>         return 0;
>  }
>
> -static int pm_create_map_queue(struct packet_manager *pm, uint32_t *buffer,
> -                               struct queue *q, bool is_static)
> -{
> -       struct pm4_map_queues *packet;
> -       bool use_static = is_static;
> -
> -       packet = (struct pm4_map_queues *)buffer;
> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
> -
> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
> -                                               sizeof(struct pm4_map_queues));
> -       packet->bitfields2.alloc_format =
> -                               alloc_format__mes_map_queues__one_per_pipe;
> -       packet->bitfields2.num_queues = 1;
> -       packet->bitfields2.queue_sel =
> -               queue_sel__mes_map_queues__map_to_hws_determined_queue_slots;
> -
> -       packet->bitfields2.vidmem = (q->properties.is_interop) ?
> -                       vidmem__mes_map_queues__uses_video_memory :
> -                       vidmem__mes_map_queues__uses_no_video_memory;
> -
> -       switch (q->properties.type) {
> -       case KFD_QUEUE_TYPE_COMPUTE:
> -       case KFD_QUEUE_TYPE_DIQ:
> -               packet->bitfields2.engine_sel =
> -                               engine_sel__mes_map_queues__compute;
> -               break;
> -       case KFD_QUEUE_TYPE_SDMA:
> -               packet->bitfields2.engine_sel =
> -                               engine_sel__mes_map_queues__sdma0;
> -               use_static = false; /* no static queues under SDMA */
> -               break;
> -       default:
> -               WARN(1, "queue type %d", q->properties.type);
> -               return -EINVAL;
> -       }
> -
> -       packet->mes_map_queues_ordinals[0].bitfields3.doorbell_offset =
> -                       q->properties.doorbell_off;
> -
> -       packet->mes_map_queues_ordinals[0].bitfields3.is_static =
> -                       (use_static) ? 1 : 0;
> -
> -       packet->mes_map_queues_ordinals[0].mqd_addr_lo =
> -                       lower_32_bits(q->gart_mqd_addr);
> -
> -       packet->mes_map_queues_ordinals[0].mqd_addr_hi =
> -                       upper_32_bits(q->gart_mqd_addr);
> -
> -       packet->mes_map_queues_ordinals[0].wptr_addr_lo =
> -                       lower_32_bits((uint64_t)q->properties.write_ptr);
> -
> -       packet->mes_map_queues_ordinals[0].wptr_addr_hi =
> -                       upper_32_bits((uint64_t)q->properties.write_ptr);
> -
> -       return 0;
> -}
> -
>  static int pm_create_runlist_ib(struct packet_manager *pm,
>                                 struct list_head *queues,
>                                 uint64_t *rl_gpu_addr,
> @@ -334,7 +275,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                         return retval;
>
>                 proccesses_mapped++;
> -               inc_wptr(&rl_wptr, sizeof(struct pm4_map_process),
> +               inc_wptr(&rl_wptr, sizeof(struct pm4_mes_map_process),
>                                 alloc_size_bytes);
>
>                 list_for_each_entry(kq, &qpd->priv_queue_list, list) {
> @@ -344,14 +285,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                         pr_debug("static_queue, mapping kernel q %d, is debug status %d\n",
>                                 kq->queue->queue, qpd->is_debug);
>
> -                       if (pm->dqm->dev->device_info->asic_family ==
> -                                       CHIP_CARRIZO)
> -                               retval = pm_create_map_queue_vi(pm,
> -                                               &rl_buffer[rl_wptr],
> -                                               kq->queue,
> -                                               qpd->is_debug);
> -                       else
> -                               retval = pm_create_map_queue(pm,
> +                       retval = pm_create_map_queue(pm,
>                                                 &rl_buffer[rl_wptr],
>                                                 kq->queue,
>                                                 qpd->is_debug);
> @@ -359,7 +293,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                                 return retval;
>
>                         inc_wptr(&rl_wptr,
> -                               sizeof(struct pm4_map_queues),
> +                               sizeof(struct pm4_mes_map_queues),
>                                 alloc_size_bytes);
>                 }
>
> @@ -370,14 +304,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                         pr_debug("static_queue, mapping user queue %d, is debug status %d\n",
>                                 q->queue, qpd->is_debug);
>
> -                       if (pm->dqm->dev->device_info->asic_family ==
> -                                       CHIP_CARRIZO)
> -                               retval = pm_create_map_queue_vi(pm,
> -                                               &rl_buffer[rl_wptr],
> -                                               q,
> -                                               qpd->is_debug);
> -                       else
> -                               retval = pm_create_map_queue(pm,
> +                       retval = pm_create_map_queue(pm,
>                                                 &rl_buffer[rl_wptr],
>                                                 q,
>                                                 qpd->is_debug);
> @@ -386,7 +313,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>                                 return retval;
>
>                         inc_wptr(&rl_wptr,
> -                               sizeof(struct pm4_map_queues),
> +                               sizeof(struct pm4_mes_map_queues),
>                                 alloc_size_bytes);
>                 }
>         }
> @@ -429,7 +356,7 @@ void pm_uninit(struct packet_manager *pm)
>  int pm_send_set_resources(struct packet_manager *pm,
>                                 struct scheduling_resources *res)
>  {
> -       struct pm4_set_resources *packet;
> +       struct pm4_mes_set_resources *packet;
>         int retval = 0;
>
>         mutex_lock(&pm->lock);
> @@ -442,9 +369,9 @@ int pm_send_set_resources(struct packet_manager *pm,
>                 goto out;
>         }
>
> -       memset(packet, 0, sizeof(struct pm4_set_resources));
> -       packet->header.u32all = build_pm4_header(IT_SET_RESOURCES,
> -                                       sizeof(struct pm4_set_resources));
> +       memset(packet, 0, sizeof(struct pm4_mes_set_resources));
> +       packet->header.u32All = build_pm4_header(IT_SET_RESOURCES,
> +                                       sizeof(struct pm4_mes_set_resources));
>
>         packet->bitfields2.queue_type =
>                         queue_type__mes_set_resources__hsa_interface_queue_hiq;
> @@ -482,7 +409,7 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
>
>         pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
>
> -       packet_size_dwords = sizeof(struct pm4_runlist) / sizeof(uint32_t);
> +       packet_size_dwords = sizeof(struct pm4_mes_runlist) / sizeof(uint32_t);
>         mutex_lock(&pm->lock);
>
>         retval = pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
> @@ -514,7 +441,7 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
>                         uint32_t fence_value)
>  {
>         int retval;
> -       struct pm4_query_status *packet;
> +       struct pm4_mes_query_status *packet;
>
>         if (WARN_ON(!fence_address))
>                 return -EFAULT;
> @@ -522,13 +449,13 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
>         mutex_lock(&pm->lock);
>         retval = pm->priv_queue->ops.acquire_packet_buffer(
>                         pm->priv_queue,
> -                       sizeof(struct pm4_query_status) / sizeof(uint32_t),
> +                       sizeof(struct pm4_mes_query_status) / sizeof(uint32_t),
>                         (unsigned int **)&packet);
>         if (retval)
>                 goto fail_acquire_packet_buffer;
>
> -       packet->header.u32all = build_pm4_header(IT_QUERY_STATUS,
> -                                       sizeof(struct pm4_query_status));
> +       packet->header.u32All = build_pm4_header(IT_QUERY_STATUS,
> +                                       sizeof(struct pm4_mes_query_status));
>
>         packet->bitfields2.context_id = 0;
>         packet->bitfields2.interrupt_sel =
> @@ -555,22 +482,22 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>  {
>         int retval;
>         uint32_t *buffer;
> -       struct pm4_unmap_queues *packet;
> +       struct pm4_mes_unmap_queues *packet;
>
>         mutex_lock(&pm->lock);
>         retval = pm->priv_queue->ops.acquire_packet_buffer(
>                         pm->priv_queue,
> -                       sizeof(struct pm4_unmap_queues) / sizeof(uint32_t),
> +                       sizeof(struct pm4_mes_unmap_queues) / sizeof(uint32_t),
>                         &buffer);
>         if (retval)
>                 goto err_acquire_packet_buffer;
>
> -       packet = (struct pm4_unmap_queues *)buffer;
> -       memset(buffer, 0, sizeof(struct pm4_unmap_queues));
> +       packet = (struct pm4_mes_unmap_queues *)buffer;
> +       memset(buffer, 0, sizeof(struct pm4_mes_unmap_queues));
>         pr_debug("static_queue: unmapping queues: mode is %d , reset is %d , type is %d\n",
>                 mode, reset, type);
> -       packet->header.u32all = build_pm4_header(IT_UNMAP_QUEUES,
> -                                       sizeof(struct pm4_unmap_queues));
> +       packet->header.u32All = build_pm4_header(IT_UNMAP_QUEUES,
> +                                       sizeof(struct pm4_mes_unmap_queues));
>         switch (type) {
>         case KFD_QUEUE_TYPE_COMPUTE:
>         case KFD_QUEUE_TYPE_DIQ:
> @@ -608,12 +535,12 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>                 break;
>         case KFD_PREEMPT_TYPE_FILTER_ALL_QUEUES:
>                 packet->bitfields2.queue_sel =
> -                               queue_sel__mes_unmap_queues__perform_request_on_all_active_queues;
> +                               queue_sel__mes_unmap_queues__unmap_all_queues;
>                 break;
>         case KFD_PREEMPT_TYPE_FILTER_DYNAMIC_QUEUES:
>                 /* in this case, we do not preempt static queues */
>                 packet->bitfields2.queue_sel =
> -                               queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only;
> +                               queue_sel__mes_unmap_queues__unmap_all_non_static_queues;
>                 break;
>         default:
>                 WARN(1, "filter %d", mode);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
> index 97e5442..e50f73d 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
> @@ -41,99 +41,6 @@ union PM4_MES_TYPE_3_HEADER {
>  };
>  #endif /* PM4_MES_HEADER_DEFINED */
>
> -/* --------------------MES_SET_RESOURCES-------------------- */
> -
> -#ifndef PM4_MES_SET_RESOURCES_DEFINED
> -#define PM4_MES_SET_RESOURCES_DEFINED
> -enum set_resources_queue_type_enum {
> -       queue_type__mes_set_resources__kernel_interface_queue_kiq = 0,
> -       queue_type__mes_set_resources__hsa_interface_queue_hiq = 1,
> -       queue_type__mes_set_resources__hsa_debug_interface_queue = 4
> -};
> -
> -struct pm4_set_resources {
> -       union {
> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> -               uint32_t ordinal1;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t vmid_mask:16;
> -                       uint32_t unmap_latency:8;
> -                       uint32_t reserved1:5;
> -                       enum set_resources_queue_type_enum queue_type:3;
> -               } bitfields2;
> -               uint32_t ordinal2;
> -       };
> -
> -       uint32_t queue_mask_lo;
> -       uint32_t queue_mask_hi;
> -       uint32_t gws_mask_lo;
> -       uint32_t gws_mask_hi;
> -
> -       union {
> -               struct {
> -                       uint32_t oac_mask:16;
> -                       uint32_t reserved2:16;
> -               } bitfields7;
> -               uint32_t ordinal7;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t gds_heap_base:6;
> -                       uint32_t reserved3:5;
> -                       uint32_t gds_heap_size:6;
> -                       uint32_t reserved4:15;
> -               } bitfields8;
> -               uint32_t ordinal8;
> -       };
> -
> -};
> -#endif
> -
> -/*--------------------MES_RUN_LIST-------------------- */
> -
> -#ifndef PM4_MES_RUN_LIST_DEFINED
> -#define PM4_MES_RUN_LIST_DEFINED
> -
> -struct pm4_runlist {
> -       union {
> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> -               uint32_t ordinal1;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t reserved1:2;
> -                       uint32_t ib_base_lo:30;
> -               } bitfields2;
> -               uint32_t ordinal2;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t ib_base_hi:16;
> -                       uint32_t reserved2:16;
> -               } bitfields3;
> -               uint32_t ordinal3;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t ib_size:20;
> -                       uint32_t chain:1;
> -                       uint32_t offload_polling:1;
> -                       uint32_t reserved3:1;
> -                       uint32_t valid:1;
> -                       uint32_t reserved4:8;
> -               } bitfields4;
> -               uint32_t ordinal4;
> -       };
> -
> -};
> -#endif
>
>  /*--------------------MES_MAP_PROCESS-------------------- */
>
> @@ -186,217 +93,58 @@ struct pm4_map_process {
>  };
>  #endif
>
> -/*--------------------MES_MAP_QUEUES--------------------*/
> -
> -#ifndef PM4_MES_MAP_QUEUES_DEFINED
> -#define PM4_MES_MAP_QUEUES_DEFINED
> -enum map_queues_queue_sel_enum {
> -       queue_sel__mes_map_queues__map_to_specified_queue_slots = 0,
> -       queue_sel__mes_map_queues__map_to_hws_determined_queue_slots = 1,
> -       queue_sel__mes_map_queues__enable_process_queues = 2
> -};
> +#ifndef PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
> +#define PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
>
> -enum map_queues_vidmem_enum {
> -       vidmem__mes_map_queues__uses_no_video_memory = 0,
> -       vidmem__mes_map_queues__uses_video_memory = 1
> -};
> -
> -enum map_queues_alloc_format_enum {
> -       alloc_format__mes_map_queues__one_per_pipe = 0,
> -       alloc_format__mes_map_queues__all_on_one_pipe = 1
> -};
> -
> -enum map_queues_engine_sel_enum {
> -       engine_sel__mes_map_queues__compute = 0,
> -       engine_sel__mes_map_queues__sdma0 = 2,
> -       engine_sel__mes_map_queues__sdma1 = 3
> -};
> -
> -struct pm4_map_queues {
> +struct pm4_map_process_scratch_kv {
>         union {
> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> -               uint32_t ordinal1;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t reserved1:4;
> -                       enum map_queues_queue_sel_enum queue_sel:2;
> -                       uint32_t reserved2:2;
> -                       uint32_t vmid:4;
> -                       uint32_t reserved3:4;
> -                       enum map_queues_vidmem_enum vidmem:2;
> -                       uint32_t reserved4:6;
> -                       enum map_queues_alloc_format_enum alloc_format:2;
> -                       enum map_queues_engine_sel_enum engine_sel:3;
> -                       uint32_t num_queues:3;
> -               } bitfields2;
> -               uint32_t ordinal2;
> -       };
> -
> -       struct {
> -               union {
> -                       struct {
> -                               uint32_t is_static:1;
> -                               uint32_t reserved5:1;
> -                               uint32_t doorbell_offset:21;
> -                               uint32_t reserved6:3;
> -                               uint32_t queue:6;
> -                       } bitfields3;
> -                       uint32_t ordinal3;
> -               };
> -
> -               uint32_t mqd_addr_lo;
> -               uint32_t mqd_addr_hi;
> -               uint32_t wptr_addr_lo;
> -               uint32_t wptr_addr_hi;
> -
> -       } mes_map_queues_ordinals[1];   /* 1..N of these ordinal groups */
> -
> -};
> -#endif
> -
> -/*--------------------MES_QUERY_STATUS--------------------*/
> -
> -#ifndef PM4_MES_QUERY_STATUS_DEFINED
> -#define PM4_MES_QUERY_STATUS_DEFINED
> -enum query_status_interrupt_sel_enum {
> -       interrupt_sel__mes_query_status__completion_status = 0,
> -       interrupt_sel__mes_query_status__process_status = 1,
> -       interrupt_sel__mes_query_status__queue_status = 2
> -};
> -
> -enum query_status_command_enum {
> -       command__mes_query_status__interrupt_only = 0,
> -       command__mes_query_status__fence_only_immediate = 1,
> -       command__mes_query_status__fence_only_after_write_ack = 2,
> -       command__mes_query_status__fence_wait_for_write_ack_send_interrupt = 3
> -};
> -
> -enum query_status_engine_sel_enum {
> -       engine_sel__mes_query_status__compute = 0,
> -       engine_sel__mes_query_status__sdma0_queue = 2,
> -       engine_sel__mes_query_status__sdma1_queue = 3
> -};
> -
> -struct pm4_query_status {
> -       union {
> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> -               uint32_t ordinal1;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t context_id:28;
> -                       enum query_status_interrupt_sel_enum interrupt_sel:2;
> -                       enum query_status_command_enum command:2;
> -               } bitfields2;
> -               uint32_t ordinal2;
> +               union PM4_MES_TYPE_3_HEADER   header; /* header */
> +               uint32_t            ordinal1;
>         };
>
>         union {
>                 struct {
>                         uint32_t pasid:16;
> -                       uint32_t reserved1:16;
> -               } bitfields3a;
> -               struct {
> -                       uint32_t reserved2:2;
> -                       uint32_t doorbell_offset:21;
> -                       uint32_t reserved3:3;
> -                       enum query_status_engine_sel_enum engine_sel:3;
> -                       uint32_t reserved4:3;
> -               } bitfields3b;
> -               uint32_t ordinal3;
> -       };
> -
> -       uint32_t addr_lo;
> -       uint32_t addr_hi;
> -       uint32_t data_lo;
> -       uint32_t data_hi;
> -};
> -#endif
> -
> -/*--------------------MES_UNMAP_QUEUES--------------------*/
> -
> -#ifndef PM4_MES_UNMAP_QUEUES_DEFINED
> -#define PM4_MES_UNMAP_QUEUES_DEFINED
> -enum unmap_queues_action_enum {
> -       action__mes_unmap_queues__preempt_queues = 0,
> -       action__mes_unmap_queues__reset_queues = 1,
> -       action__mes_unmap_queues__disable_process_queues = 2
> -};
> -
> -enum unmap_queues_queue_sel_enum {
> -       queue_sel__mes_unmap_queues__perform_request_on_specified_queues = 0,
> -       queue_sel__mes_unmap_queues__perform_request_on_pasid_queues = 1,
> -       queue_sel__mes_unmap_queues__perform_request_on_all_active_queues = 2,
> -       queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only = 3
> -};
> -
> -enum unmap_queues_engine_sel_enum {
> -       engine_sel__mes_unmap_queues__compute = 0,
> -       engine_sel__mes_unmap_queues__sdma0 = 2,
> -       engine_sel__mes_unmap_queues__sdma1 = 3
> -};
> -
> -struct pm4_unmap_queues {
> -       union {
> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> -               uint32_t ordinal1;
> -       };
> -
> -       union {
> -               struct {
> -                       enum unmap_queues_action_enum action:2;
> -                       uint32_t reserved1:2;
> -                       enum unmap_queues_queue_sel_enum queue_sel:2;
> -                       uint32_t reserved2:20;
> -                       enum unmap_queues_engine_sel_enum engine_sel:3;
> -                       uint32_t num_queues:3;
> +                       uint32_t reserved1:8;
> +                       uint32_t diq_enable:1;
> +                       uint32_t process_quantum:7;
>                 } bitfields2;
>                 uint32_t ordinal2;
>         };
>
>         union {
>                 struct {
> -                       uint32_t pasid:16;
> -                       uint32_t reserved3:16;
> -               } bitfields3a;
> -               struct {
> -                       uint32_t reserved4:2;
> -                       uint32_t doorbell_offset0:21;
> -                       uint32_t reserved5:9;
> -               } bitfields3b;
> +                       uint32_t page_table_base:28;
> +                       uint32_t reserved2:4;
> +               } bitfields3;
>                 uint32_t ordinal3;
>         };
>
> -       union {
> -               struct {
> -                       uint32_t reserved6:2;
> -                       uint32_t doorbell_offset1:21;
> -                       uint32_t reserved7:9;
> -               } bitfields4;
> -               uint32_t ordinal4;
> -       };
> -
> -       union {
> -               struct {
> -                       uint32_t reserved8:2;
> -                       uint32_t doorbell_offset2:21;
> -                       uint32_t reserved9:9;
> -               } bitfields5;
> -               uint32_t ordinal5;
> -       };
> +       uint32_t reserved3;
> +       uint32_t sh_mem_bases;
> +       uint32_t sh_mem_config;
> +       uint32_t sh_mem_ape1_base;
> +       uint32_t sh_mem_ape1_limit;
> +       uint32_t sh_hidden_private_base_vmid;
> +       uint32_t reserved4;
> +       uint32_t reserved5;
> +       uint32_t gds_addr_lo;
> +       uint32_t gds_addr_hi;
>
>         union {
>                 struct {
> -                       uint32_t reserved10:2;
> -                       uint32_t doorbell_offset3:21;
> -                       uint32_t reserved11:9;
> -               } bitfields6;
> -               uint32_t ordinal6;
> +                       uint32_t num_gws:6;
> +                       uint32_t reserved6:2;
> +                       uint32_t num_oac:4;
> +                       uint32_t reserved7:4;
> +                       uint32_t gds_size:6;
> +                       uint32_t num_queues:10;
> +               } bitfields14;
> +               uint32_t ordinal14;
>         };
>
> +       uint32_t completion_signal_lo32;
> +       uint32_t completion_signal_hi32;
>  };
>  #endif
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
> index c4eda6f..7c8d9b3 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
> @@ -126,9 +126,10 @@ struct pm4_mes_runlist {
>                         uint32_t ib_size:20;
>                         uint32_t chain:1;
>                         uint32_t offload_polling:1;
> -                       uint32_t reserved3:1;
> +                       uint32_t reserved2:1;
>                         uint32_t valid:1;
> -                       uint32_t reserved4:8;
> +                       uint32_t process_cnt:4;
> +                       uint32_t reserved3:4;
>                 } bitfields4;
>                 uint32_t ordinal4;
>         };
> @@ -143,8 +144,8 @@ struct pm4_mes_runlist {
>
>  struct pm4_mes_map_process {
>         union {
> -               union PM4_MES_TYPE_3_HEADER   header;            /* header */
> -               uint32_t            ordinal1;
> +               union PM4_MES_TYPE_3_HEADER header;     /* header */
> +               uint32_t ordinal1;
>         };
>
>         union {
> @@ -155,36 +156,48 @@ struct pm4_mes_map_process {
>                         uint32_t process_quantum:7;
>                 } bitfields2;
>                 uint32_t ordinal2;
> -};
> +       };
>
>         union {
>                 struct {
>                         uint32_t page_table_base:28;
> -                       uint32_t reserved2:4;
> +                       uint32_t reserved3:4;
>                 } bitfields3;
>                 uint32_t ordinal3;
>         };
>
> +       uint32_t reserved;
> +
>         uint32_t sh_mem_bases;
> +       uint32_t sh_mem_config;
>         uint32_t sh_mem_ape1_base;
>         uint32_t sh_mem_ape1_limit;
> -       uint32_t sh_mem_config;
> +
> +       uint32_t sh_hidden_private_base_vmid;
> +
> +       uint32_t reserved2;
> +       uint32_t reserved3;
> +
>         uint32_t gds_addr_lo;
>         uint32_t gds_addr_hi;
>
>         union {
>                 struct {
>                         uint32_t num_gws:6;
> -                       uint32_t reserved3:2;
> +                       uint32_t reserved4:2;
>                         uint32_t num_oac:4;
> -                       uint32_t reserved4:4;
> +                       uint32_t reserved5:4;
>                         uint32_t gds_size:6;
>                         uint32_t num_queues:10;
>                 } bitfields10;
>                 uint32_t ordinal10;
>         };
>
> +       uint32_t completion_signal_lo;
> +       uint32_t completion_signal_hi;
> +
>  };
> +
>  #endif
>
>  /*--------------------MES_MAP_QUEUES--------------------*/
> @@ -337,7 +350,7 @@ enum mes_unmap_queues_engine_sel_enum {
>         engine_sel__mes_unmap_queues__sdmal = 3
>  };
>
> -struct PM4_MES_UNMAP_QUEUES {
> +struct pm4_mes_unmap_queues {
>         union {
>                 union PM4_MES_TYPE_3_HEADER   header;            /* header */
>                 uint32_t            ordinal1;
> @@ -397,4 +410,101 @@ struct PM4_MES_UNMAP_QUEUES {
>  };
>  #endif
>
> +#ifndef PM4_MEC_RELEASE_MEM_DEFINED
> +#define PM4_MEC_RELEASE_MEM_DEFINED
> +enum RELEASE_MEM_event_index_enum {
> +       event_index___release_mem__end_of_pipe = 5,
> +       event_index___release_mem__shader_done = 6
> +};
> +
> +enum RELEASE_MEM_cache_policy_enum {
> +       cache_policy___release_mem__lru = 0,
> +       cache_policy___release_mem__stream = 1,
> +       cache_policy___release_mem__bypass = 2
> +};
> +
> +enum RELEASE_MEM_dst_sel_enum {
> +       dst_sel___release_mem__memory_controller = 0,
> +       dst_sel___release_mem__tc_l2 = 1,
> +       dst_sel___release_mem__queue_write_pointer_register = 2,
> +       dst_sel___release_mem__queue_write_pointer_poll_mask_bit = 3
> +};
> +
> +enum RELEASE_MEM_int_sel_enum {
> +       int_sel___release_mem__none = 0,
> +       int_sel___release_mem__send_interrupt_only = 1,
> +       int_sel___release_mem__send_interrupt_after_write_confirm = 2,
> +       int_sel___release_mem__send_data_after_write_confirm = 3
> +};
> +
> +enum RELEASE_MEM_data_sel_enum {
> +       data_sel___release_mem__none = 0,
> +       data_sel___release_mem__send_32_bit_low = 1,
> +       data_sel___release_mem__send_64_bit_data = 2,
> +       data_sel___release_mem__send_gpu_clock_counter = 3,
> +       data_sel___release_mem__send_cp_perfcounter_hi_lo = 4,
> +       data_sel___release_mem__store_gds_data_to_memory = 5
> +};
> +
> +struct pm4_mec_release_mem {
> +       union {
> +               union PM4_MES_TYPE_3_HEADER header;     /*header */
> +               unsigned int ordinal1;
> +       };
> +
> +       union {
> +               struct {
> +                       unsigned int event_type:6;
> +                       unsigned int reserved1:2;
> +                       enum RELEASE_MEM_event_index_enum event_index:4;
> +                       unsigned int tcl1_vol_action_ena:1;
> +                       unsigned int tc_vol_action_ena:1;
> +                       unsigned int reserved2:1;
> +                       unsigned int tc_wb_action_ena:1;
> +                       unsigned int tcl1_action_ena:1;
> +                       unsigned int tc_action_ena:1;
> +                       unsigned int reserved3:6;
> +                       unsigned int atc:1;
> +                       enum RELEASE_MEM_cache_policy_enum cache_policy:2;
> +                       unsigned int reserved4:5;
> +               } bitfields2;
> +               unsigned int ordinal2;
> +       };
> +
> +       union {
> +               struct {
> +                       unsigned int reserved5:16;
> +                       enum RELEASE_MEM_dst_sel_enum dst_sel:2;
> +                       unsigned int reserved6:6;
> +                       enum RELEASE_MEM_int_sel_enum int_sel:3;
> +                       unsigned int reserved7:2;
> +                       enum RELEASE_MEM_data_sel_enum data_sel:3;
> +               } bitfields3;
> +               unsigned int ordinal3;
> +       };
> +
> +       union {
> +               struct {
> +                       unsigned int reserved8:2;
> +                       unsigned int address_lo_32b:30;
> +               } bitfields4;
> +               struct {
> +                       unsigned int reserved9:3;
> +                       unsigned int address_lo_64b:29;
> +               } bitfields5;
> +               unsigned int ordinal4;
> +       };
> +
> +       unsigned int address_hi;
> +
> +       unsigned int data_lo;
> +
> +       unsigned int data_hi;
> +};
> +#endif
> +
> +enum {
> +       CACHE_FLUSH_AND_INV_TS_EVENT = 0x00000014
> +};
> +
>  #endif
> --
> 2.7.4
>
    
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 13/19] drm/amdkfd: Handle remaining BUG_ONs more gracefully
       [not found]         ` <CAFCwf129vQLO5owYBQt6S-V5WcxSrO7tu+v62-HhH2eOzATS1A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-12 18:37           ` Kuehling, Felix
       [not found]             ` <DM5PR1201MB0235FD5550E28485BCF31EB1928E0-grEf7a3NxMBd8L2jMOIKKmrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Kuehling, Felix @ 2017-08-12 18:37 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list

Sorry about the weird quoting format. I'm using Outlook Web Access from home. Comments inline. [FK]

________________________________________
From: Oded Gabbay <oded.gabbay@gmail.com>
Sent: Saturday, August 12, 2017 10:39 AM
To: Kuehling, Felix
Cc: amd-gfx list
Subject: Re: [PATCH 13/19] drm/amdkfd: Handle remaining BUG_ONs more gracefully

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> In most cases, BUG_ONs can be replaced with WARN_ON with an error
> return. In some void functions just turn them into a WARN_ON and
> possibly an early exit.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            |  3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 16 ++++----
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 19 ++++-----
>  .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  2 +-
>  .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      | 20 +++++++---
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 45 +++++++++++++---------
>  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c             |  4 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  9 ++---
>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  7 ++--
>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  4 +-
>  14 files changed, 84 insertions(+), 56 deletions(-)
[snip]

>> @@ -610,12 +616,15 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>                                 queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only;
>                 break;
>         default:
> -               BUG();
> -               break;
> +               WARN(1, "filter %d", mode);
> +               retval = -EINVAL;
>         }
>
> -       pm->priv_queue->ops.submit_packet(pm->priv_queue);
> -
> +err_invalid:
> +       if (!retval)
> +               pm->priv_queue->ops.submit_packet(pm->priv_queue);
> +       else
> +               pm->priv_queue->ops.rollback_packet(pm->priv_queue);

I don't feel comfortable putting a valid code path under an "err_invalid" label.
This defeats the purpose of the goto statement and common cleanup code,
making the code unreadable.

[FK] It is common cleanup code that is needed in both the error and success cases. If you prefer, I can separate the error cleanup from the normal cleanup, but that will result in some duplication.

Also, the rollback packet function was not in the original code. Why
did you add it here?

[FK] With the BUG(), this function would not return in case of an error. Without the BUG it returns with an error code, so it needs to clean up after itself by releasing the space it allocated on the queue. Otherwise the next user of the queue would submit garbage.
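
Roughly, the shape I have in mind is something like this (just a sketch, with a hypothetical build_packet() standing in for the real packet construction, not code from the patch):

static int pm_send_example(struct packet_manager *pm, size_t size_dwords)
{
        uint32_t *buffer;
        int retval;

        mutex_lock(&pm->lock);
        retval = pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
                        size_dwords, &buffer);
        if (retval)
                goto out_unlock;

        retval = build_packet(buffer);  /* hypothetical packet builder */
        if (retval) {
                /* On failure, give back the space we reserved so the next
                 * user of the queue doesn't submit garbage.
                 */
                pm->priv_queue->ops.rollback_packet(pm->priv_queue);
        } else {
                pm->priv_queue->ops.submit_packet(pm->priv_queue);
        }

out_unlock:
        mutex_unlock(&pm->lock);
        return retval;
}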

>  err_acquire_packet_buffer:
>         mutex_unlock(&pm->lock);
>         return retval;
[snip]

> @@ -202,10 +200,10 @@ static void kfd_process_destroy_delayed(struct rcu_head *rcu)
>         struct kfd_process_release_work *work;
>         struct kfd_process *p;
>
> -       BUG_ON(!kfd_process_wq);
> +       WARN_ON(!kfd_process_wq);
I think this is redundant, as kfd_process_wq is later dereferenced
inside queue_work (as *wq). So we will get a violation there anyway.

[FK] OK.

Regards,
  Felix

>
>         p = container_of(rcu, struct kfd_process, rcu);
> -       BUG_ON(atomic_read(&p->mm->mm_count) <= 0);
> +       WARN_ON(atomic_read(&p->mm->mm_count) <= 0);
>
>         mmdrop(p->mm);
>
> @@ -229,7 +227,8 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
>          * mmu_notifier srcu is read locked
>          */
>         p = container_of(mn, struct kfd_process, mmu_notifier);
> -       BUG_ON(p->mm != mm);
> +       if (WARN_ON(p->mm != mm))
> +               return;
>
>         mutex_lock(&kfd_processes_mutex);
>         hash_del_rcu(&p->kfd_processes);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> index f6ecdff..1cae95e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> @@ -218,8 +218,8 @@ int pqm_create_queue(struct process_queue_manager *pqm,
>                                                         kq, &pdd->qpd);
>                 break;
>         default:
> -               BUG();
> -               break;
> +               WARN(1, "Invalid queue type %d", type);
> +               retval = -EINVAL;
>         }
>
>         if (retval != 0) {
> @@ -272,7 +272,8 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
>                 dev = pqn->kq->dev;
>         if (pqn->q)
>                 dev = pqn->q->device;
> -       BUG_ON(!dev);
> +       if (WARN_ON(!dev))
> +               return -ENODEV;
>
>         pdd = kfd_get_process_device_data(dev, pqm->process);
>         if (!pdd) {
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index e5486f4..19ce590 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -799,10 +799,12 @@ static int kfd_build_sysfs_node_entry(struct kfd_topology_device *dev,
>         int ret;
>         uint32_t i;
>
> +       if (WARN_ON(dev->kobj_node))
> +               return -EEXIST;
> +
>         /*
>          * Creating the sysfs folders
>          */
> -       BUG_ON(dev->kobj_node);
>         dev->kobj_node = kfd_alloc_struct(dev->kobj_node);
>         if (!dev->kobj_node)
>                 return -ENOMEM;
> --
> 2.7.4
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
       [not found]             ` <DM5PR1201MB023536CBDE40370D9703EBD6928E0-grEf7a3NxMBd8L2jMOIKKmrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2017-08-12 19:09               ` Bridgman, John
       [not found]                 ` <BN6PR12MB13489181F8A90932135C1CBBE88E0-/b2+HYfkarQX0pEhCR5T8QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Bridgman, John @ 2017-08-12 19:09 UTC (permalink / raw)
  To: Kuehling, Felix, Oded Gabbay; +Cc: amd-gfx list

IIRC the amdgpu devs had been holding back on publishing the updated MEC microcode (with scratch support) because that WOULD have broken Kaveri. With this change from Felix we should be able to publish the newest microcode for both amdgpu and amdkfd WITHOUT breaking Kaveri.

IOW this is the "scratch fix for Kaveri KFD" you have wanted for a couple of years :)
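
Felix's point below about the KV and VI headers being binary compatible could also be made explicit with a build-time size check (just a sketch, assuming both kfd_pm4_headers.h and kfd_pm4_headers_vi.h end up visible in the same file, e.g. somewhere in kgd2kfd_device_init()):

        /* Sketch only: assert that the Kaveri scratch layout matches the
         * VI map-process packet that the code now uses everywhere.
         */
        BUILD_BUG_ON(sizeof(struct pm4_map_process_scratch_kv) !=
                     sizeof(struct pm4_mes_map_process));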

>-----Original Message-----
>From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
>Of Kuehling, Felix
>Sent: Saturday, August 12, 2017 2:16 PM
>To: Oded Gabbay
>Cc: amd-gfx list
>Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
>
>> Do you mean that it won't work with Kaveri anymore?
>
>Kaveri got the same firmware changes, mostly for scratch memory support.
>The Kaveri firmware headers name the structures and fields a bit differently
>but they should be binary compatible. So we simplified the code to use only
>one set of headers. I'll grab a Kaveri system to confirm that it works.
>
>Regards,
>  Felix
>
>From: Oded Gabbay <oded.gabbay@gmail.com>
>Sent: Saturday, August 12, 2017 11:10 AM
>To: Kuehling, Felix
>Cc: amd-gfx list
>Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
>
>On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com>
>wrote:
>> To match current firmware. The map process packet has been extended to
>> support scratch. This is a non-backwards compatible change and it's
>> about two years old. So no point keeping the old version around
>> conditionally.
>
>Do you mean that it won't work with Kaveri anymore?
>I believe we aren't allowed to break older H/W support without some
>serious justification.
>
>Oded
>
>>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c         |   8 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 161 ++++--------
>>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h    | 314
>>+++---------------------
>>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h | 130 +++++++++-
>>  4 files changed, 199 insertions(+), 414 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> index e1c2ad2..e790e7f 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> @@ -26,7 +26,7 @@
>>  #include <linux/slab.h>
>>  #include "kfd_priv.h"
>>  #include "kfd_device_queue_manager.h"
>> -#include "kfd_pm4_headers.h"
>> +#include "kfd_pm4_headers_vi.h"
>>
>>  #define MQD_SIZE_ALIGNED 768
>>
>> @@ -238,9 +238,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>          * calculate max size of runlist packet.
>>          * There can be only 2 packets at once
>>          */
>> -       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct
>>pm4_map_process) +
>> -               max_num_of_queues_per_device *
>> -               sizeof(struct pm4_map_queues) + sizeof(struct
>>pm4_runlist)) * 2;
>> +       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct
>> +pm4_mes_map_process) +
>> +               max_num_of_queues_per_device * sizeof(struct
>> +pm4_mes_map_queues)
>> +               + sizeof(struct pm4_mes_runlist)) * 2;
>>
>>         /* Add size of HIQ & DIQ */
>>         size += KFD_KERNEL_QUEUE_SIZE * 2;  diff --git
>>a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> index 77a6f2b..3141e05 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>> @@ -26,7 +26,6 @@
>>  #include "kfd_device_queue_manager.h"
>>  #include "kfd_kernel_queue.h"
>>  #include "kfd_priv.h"
>> -#include "kfd_pm4_headers.h"
>>  #include "kfd_pm4_headers_vi.h"
>>  #include "kfd_pm4_opcodes.h"
>>
>> @@ -44,12 +43,12 @@ static unsigned int build_pm4_header(unsigned int
>>opcode, size_t packet_size)
>>  {
>>         union PM4_MES_TYPE_3_HEADER header;
>>
>> -       header.u32all = 0;
>> +       header.u32All = 0;
>>         header.opcode = opcode;
>>         header.count = packet_size/sizeof(uint32_t) - 2;
>>         header.type = PM4_TYPE_3;
>>
>> -       return header.u32all;
>> +       return header.u32All;
>>  }
>>
>>  static void pm_calc_rlib_size(struct packet_manager *pm,  @@ -69,12
>>+68,9 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>>                 pr_debug("Over subscribed runlist\n");
>>         }
>>
>> -       map_queue_size =
>> -               (pm->dqm->dev->device_info->asic_family == CHIP_CARRIZO) ?
>> -               sizeof(struct pm4_mes_map_queues) :
>> -               sizeof(struct pm4_map_queues);
>> +       map_queue_size = sizeof(struct pm4_mes_map_queues);
>>         /* calculate run list ib allocation size */
>> -       *rlib_size = process_count * sizeof(struct pm4_map_process) +
>> +       *rlib_size = process_count * sizeof(struct
>> +pm4_mes_map_process) +
>>                      queue_count * map_queue_size;
>>
>>         /*
>> @@ -82,7 +78,7 @@ static void pm_calc_rlib_size(struct packet_manager
>>*pm,
>>          * when over subscription
>>          */
>>         if (*over_subscription)
>> -               *rlib_size += sizeof(struct pm4_runlist);
>> +               *rlib_size += sizeof(struct pm4_mes_runlist);
>>
>>         pr_debug("runlist ib size %d\n", *rlib_size);
>>  }
>> @@ -119,16 +115,16 @@ static int pm_allocate_runlist_ib(struct
>>packet_manager *pm,
>>  static int pm_create_runlist(struct packet_manager *pm, uint32_t
>>*buffer,
>>                         uint64_t ib, size_t ib_size_in_dwords, bool
>>chain)
>>  {
>> -       struct pm4_runlist *packet;
>> +       struct pm4_mes_runlist *packet;
>>
>>         if (WARN_ON(!ib))
>>                 return -EFAULT;
>>
>> -       packet = (struct pm4_runlist *)buffer;
>> +       packet = (struct pm4_mes_runlist *)buffer;
>>
>> -       memset(buffer, 0, sizeof(struct pm4_runlist));
>> -       packet->header.u32all = build_pm4_header(IT_RUN_LIST,
>> -                                               sizeof(struct
>> pm4_runlist));
>> +       memset(buffer, 0, sizeof(struct pm4_mes_runlist));
>> +       packet->header.u32All = build_pm4_header(IT_RUN_LIST,
>> +                                               sizeof(struct
>> +pm4_mes_runlist));
>>
>>         packet->bitfields4.ib_size = ib_size_in_dwords;
>>         packet->bitfields4.chain = chain ? 1 : 0;  @@ -143,16 +139,16
>>@@ static int pm_create_runlist(struct packet_manager *pm, uint32_t
>>*buffer,
>>  static int pm_create_map_process(struct packet_manager *pm, uint32_t
>>*buffer,
>>                                 struct qcm_process_device *qpd)
>>  {
>> -       struct pm4_map_process *packet;
>> +       struct pm4_mes_map_process *packet;
>>         struct queue *cur;
>>         uint32_t num_queues;
>>
>> -       packet = (struct pm4_map_process *)buffer;
>> +       packet = (struct pm4_mes_map_process *)buffer;
>>
>> -       memset(buffer, 0, sizeof(struct pm4_map_process));
>> +       memset(buffer, 0, sizeof(struct pm4_mes_map_process));
>>
>> -       packet->header.u32all = build_pm4_header(IT_MAP_PROCESS,
>> -                                       sizeof(struct
>> pm4_map_process));
>> +       packet->header.u32All = build_pm4_header(IT_MAP_PROCESS,
>> +                                       sizeof(struct
>> +pm4_mes_map_process));
>>         packet->bitfields2.diq_enable = (qpd->is_debug) ? 1 : 0;
>>         packet->bitfields2.process_quantum = 1;
>>         packet->bitfields2.pasid = qpd->pqm->process->pasid;  @@
>>-170,23 +166,26 @@ static int pm_create_map_process(struct
>>packet_manager *pm, uint32_t *buffer,
>>         packet->sh_mem_ape1_base = qpd->sh_mem_ape1_base;
>>         packet->sh_mem_ape1_limit = qpd->sh_mem_ape1_limit;
>>
>> +       /* TODO: scratch support */
>> +       packet->sh_hidden_private_base_vmid = 0;
>> +
>>         packet->gds_addr_lo = lower_32_bits(qpd->gds_context_area);
>>         packet->gds_addr_hi = upper_32_bits(qpd->gds_context_area);
>>
>>         return 0;
>>  }
>>
>> -static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t
>> *buffer,
>> +static int pm_create_map_queue(struct packet_manager *pm, uint32_t
>> +*buffer,
>>                 struct queue *q, bool is_static)
>>  {
>>         struct pm4_mes_map_queues *packet;
>>         bool use_static = is_static;
>>
>>         packet = (struct pm4_mes_map_queues *)buffer;
>> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
>> +       memset(buffer, 0, sizeof(struct pm4_mes_map_queues));
>>
>> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
>> -                                               sizeof(struct
>> pm4_map_queues));
>> +       packet->header.u32All = build_pm4_header(IT_MAP_QUEUES,
>> +                                               sizeof(struct
>> +pm4_mes_map_queues));
>>         packet->bitfields2.alloc_format =
>>                 alloc_format__mes_map_queues__one_per_pipe_vi;
>>         packet->bitfields2.num_queues = 1;  @@ -235,64 +234,6 @@
>>static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t
>>*buffer,
>>         return 0;
>>  }
>>
>> -static int pm_create_map_queue(struct packet_manager *pm, uint32_t
>>*buffer,
>> -                               struct queue *q, bool is_static)  -{
>> -       struct pm4_map_queues *packet;
>> -       bool use_static = is_static;
>> -
>> -       packet = (struct pm4_map_queues *)buffer;
>> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
>> -
>> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
>> -                                               sizeof(struct
>>pm4_map_queues));
>> -       packet->bitfields2.alloc_format =
>> -
>>alloc_format__mes_map_queues__one_per_pipe;
>> -       packet->bitfields2.num_queues = 1;
>> -       packet->bitfields2.queue_sel =
>> -
>>queue_sel__mes_map_queues__map_to_hws_determined_queue_slots;
>> -
>> -       packet->bitfields2.vidmem = (q->properties.is_interop) ?
>> -                       vidmem__mes_map_queues__uses_video_memory :
>> -                       vidmem__mes_map_queues__uses_no_video_memory;
>> -
>> -       switch (q->properties.type) {
>> -       case KFD_QUEUE_TYPE_COMPUTE:
>> -       case KFD_QUEUE_TYPE_DIQ:
>> -               packet->bitfields2.engine_sel =
>> -                               engine_sel__mes_map_queues__compute;
>> -               break;
>> -       case KFD_QUEUE_TYPE_SDMA:
>> -               packet->bitfields2.engine_sel =
>> -                               engine_sel__mes_map_queues__sdma0;
>> -               use_static = false; /* no static queues under SDMA */
>> -               break;
>> -       default:
>> -               WARN(1, "queue type %d", q->properties.type);
>> -               return -EINVAL;
>> -       }
>> -
>> -       packet->mes_map_queues_ordinals[0].bitfields3.doorbell_offset
>>=
>> -                       q->properties.doorbell_off;
>> -
>> -       packet->mes_map_queues_ordinals[0].bitfields3.is_static =
>> -                       (use_static) ? 1 : 0;
>> -
>> -       packet->mes_map_queues_ordinals[0].mqd_addr_lo =
>> -                       lower_32_bits(q->gart_mqd_addr);
>> -
>> -       packet->mes_map_queues_ordinals[0].mqd_addr_hi =
>> -                       upper_32_bits(q->gart_mqd_addr);
>> -
>> -       packet->mes_map_queues_ordinals[0].wptr_addr_lo =
>> -
>>lower_32_bits((uint64_t)q->properties.write_ptr);
>> -
>> -       packet->mes_map_queues_ordinals[0].wptr_addr_hi =
>> -
>>upper_32_bits((uint64_t)q->properties.write_ptr);
>> -
>> -       return 0;
>> -}
>> -
>>  static int pm_create_runlist_ib(struct packet_manager *pm,
>>                                 struct list_head *queues,
>>                                 uint64_t *rl_gpu_addr,  @@ -334,7
>>+275,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>>                         return retval;
>>
>>                 proccesses_mapped++;
>> -               inc_wptr(&rl_wptr, sizeof(struct pm4_map_process),
>> +               inc_wptr(&rl_wptr, sizeof(struct pm4_mes_map_process),
>>                                 alloc_size_bytes);
>>
>>                 list_for_each_entry(kq, &qpd->priv_queue_list, list) {
>>@@ -344,14 +285,7 @@ static int pm_create_runlist_ib(struct
>>packet_manager *pm,
>>                         pr_debug("static_queue, mapping kernel q %d,
>>is debug status %d\n",
>>                                 kq->queue->queue, qpd->is_debug);
>>
>> -                       if (pm->dqm->dev->device_info->asic_family ==
>> -                                       CHIP_CARRIZO)
>> -                               retval = pm_create_map_queue_vi(pm,
>> -                                               &rl_buffer[rl_wptr],
>> -                                               kq->queue,
>> -                                               qpd->is_debug);
>> -                       else
>> -                               retval = pm_create_map_queue(pm,
>> +                       retval = pm_create_map_queue(pm,
>>                                                 &rl_buffer[rl_wptr],
>>                                                 kq->queue,
>>                                                 qpd->is_debug);  @@
>>-359,7 +293,7 @@ static int pm_create_runlist_ib(struct packet_manager
>>*pm,
>>                                 return retval;
>>
>>                         inc_wptr(&rl_wptr,
>> -                               sizeof(struct pm4_map_queues),
>> +                               sizeof(struct pm4_mes_map_queues),
>>                                 alloc_size_bytes);
>>                 }
>>
>> @@ -370,14 +304,7 @@ static int pm_create_runlist_ib(struct
>>packet_manager *pm,
>>                         pr_debug("static_queue, mapping user queue %d,
>>is debug status %d\n",
>>                                 q->queue, qpd->is_debug);
>>
>> -                       if (pm->dqm->dev->device_info->asic_family ==
>> -                                       CHIP_CARRIZO)
>> -                               retval = pm_create_map_queue_vi(pm,
>> -                                               &rl_buffer[rl_wptr],
>> -                                               q,
>> -                                               qpd->is_debug);
>> -                       else
>> -                               retval = pm_create_map_queue(pm,
>> +                       retval = pm_create_map_queue(pm,
>>                                                 &rl_buffer[rl_wptr],
>>                                                 q,
>>                                                 qpd->is_debug);  @@
>>-386,7 +313,7 @@ static int pm_create_runlist_ib(struct packet_manager
>>*pm,
>>                                 return retval;
>>
>>                         inc_wptr(&rl_wptr,
>> -                               sizeof(struct pm4_map_queues),
>> +                               sizeof(struct pm4_mes_map_queues),
>>                                 alloc_size_bytes);
>>                 }
>>         }
>> @@ -429,7 +356,7 @@ void pm_uninit(struct packet_manager *pm)
>>  int pm_send_set_resources(struct packet_manager *pm,
>>                                 struct scheduling_resources *res)
>>  {
>> -       struct pm4_set_resources *packet;
>> +       struct pm4_mes_set_resources *packet;
>>         int retval = 0;
>>
>>         mutex_lock(&pm->lock);
>> @@ -442,9 +369,9 @@ int pm_send_set_resources(struct packet_manager
>>*pm,
>>                 goto out;
>>         }
>>
>> -       memset(packet, 0, sizeof(struct pm4_set_resources));
>> -       packet->header.u32all = build_pm4_header(IT_SET_RESOURCES,
>> -                                       sizeof(struct
>> pm4_set_resources));
>> +       memset(packet, 0, sizeof(struct pm4_mes_set_resources));
>> +       packet->header.u32All = build_pm4_header(IT_SET_RESOURCES,
>> +                                       sizeof(struct
>> +pm4_mes_set_resources));
>>
>>         packet->bitfields2.queue_type =
>>
>>queue_type__mes_set_resources__hsa_interface_queue_hiq;
>> @@ -482,7 +409,7 @@ int pm_send_runlist(struct packet_manager *pm,
>>struct list_head *dqm_queues)
>>
>>         pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
>>
>> -       packet_size_dwords = sizeof(struct pm4_runlist) /
>> sizeof(uint32_t);
>> +       packet_size_dwords = sizeof(struct pm4_mes_runlist) /
>> +sizeof(uint32_t);
>>         mutex_lock(&pm->lock);
>>
>>         retval =
>>pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
>> @@ -514,7 +441,7 @@ int pm_send_query_status(struct packet_manager
>>*pm, uint64_t fence_address,
>>                         uint32_t fence_value)
>>  {
>>         int retval;
>> -       struct pm4_query_status *packet;
>> +       struct pm4_mes_query_status *packet;
>>
>>         if (WARN_ON(!fence_address))
>>                 return -EFAULT;
>> @@ -522,13 +449,13 @@ int pm_send_query_status(struct packet_manager
>>*pm, uint64_t fence_address,
>>         mutex_lock(&pm->lock);
>>         retval = pm->priv_queue->ops.acquire_packet_buffer(
>>                         pm->priv_queue,
>> -                       sizeof(struct pm4_query_status) /
>>sizeof(uint32_t),
>> +                       sizeof(struct pm4_mes_query_status) /
>> +sizeof(uint32_t),
>>                         (unsigned int **)&packet);
>>         if (retval)
>>                 goto fail_acquire_packet_buffer;
>>
>> -       packet->header.u32all = build_pm4_header(IT_QUERY_STATUS,
>> -                                       sizeof(struct
>> pm4_query_status));
>> +       packet->header.u32All = build_pm4_header(IT_QUERY_STATUS,
>> +                                       sizeof(struct
>> +pm4_mes_query_status));
>>
>>         packet->bitfields2.context_id = 0;
>>         packet->bitfields2.interrupt_sel =  @@ -555,22 +482,22 @@ int
>>pm_send_unmap_queue(struct packet_manager *pm, enum
>kfd_queue_type
>>type,
>>  {
>>         int retval;
>>         uint32_t *buffer;
>> -       struct pm4_unmap_queues *packet;
>> +       struct pm4_mes_unmap_queues *packet;
>>
>>         mutex_lock(&pm->lock);
>>         retval = pm->priv_queue->ops.acquire_packet_buffer(
>>                         pm->priv_queue,
>> -                       sizeof(struct pm4_unmap_queues) /
>>sizeof(uint32_t),
>> +                       sizeof(struct pm4_mes_unmap_queues) /
>> +sizeof(uint32_t),
>>                         &buffer);
>>         if (retval)
>>                 goto err_acquire_packet_buffer;
>>
>> -       packet = (struct pm4_unmap_queues *)buffer;
>> -       memset(buffer, 0, sizeof(struct pm4_unmap_queues));
>> +       packet = (struct pm4_mes_unmap_queues *)buffer;
>> +       memset(buffer, 0, sizeof(struct pm4_mes_unmap_queues));
>>         pr_debug("static_queue: unmapping queues: mode is %d , reset is %d , type is %d\n",
>>                 mode, reset, type);
>> -       packet->header.u32all = build_pm4_header(IT_UNMAP_QUEUES,
>> -                                       sizeof(struct pm4_unmap_queues));
>> +       packet->header.u32All = build_pm4_header(IT_UNMAP_QUEUES,
>> +                                       sizeof(struct pm4_mes_unmap_queues));
>>         switch (type) {
>>         case KFD_QUEUE_TYPE_COMPUTE:
>>         case KFD_QUEUE_TYPE_DIQ:
>> @@ -608,12 +535,12 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>>                 break;
>>         case KFD_PREEMPT_TYPE_FILTER_ALL_QUEUES:
>>                 packet->bitfields2.queue_sel =
>> -                       queue_sel__mes_unmap_queues__perform_request_on_all_active_queues;
>> +                       queue_sel__mes_unmap_queues__unmap_all_queues;
>>                 break;
>>         case KFD_PREEMPT_TYPE_FILTER_DYNAMIC_QUEUES:
>>                 /* in this case, we do not preempt static queues */
>>                 packet->bitfields2.queue_sel =
>> -                       queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only;
>> +                       queue_sel__mes_unmap_queues__unmap_all_non_static_queues;
>>                 break;
>>         default:
>>                 WARN(1, "filter %d", mode);
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>> index 97e5442..e50f73d 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>> @@ -41,99 +41,6 @@ union PM4_MES_TYPE_3_HEADER {
>>  };
>>  #endif /* PM4_MES_HEADER_DEFINED */
>>
>> -/* --------------------MES_SET_RESOURCES-------------------- */
>> -
>> -#ifndef PM4_MES_SET_RESOURCES_DEFINED
>> -#define PM4_MES_SET_RESOURCES_DEFINED
>> -enum set_resources_queue_type_enum {
>> -       queue_type__mes_set_resources__kernel_interface_queue_kiq = 0,
>> -       queue_type__mes_set_resources__hsa_interface_queue_hiq = 1,
>> -       queue_type__mes_set_resources__hsa_debug_interface_queue = 4
>> -};
>> -
>> -struct pm4_set_resources {
>> -       union {
>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>> -               uint32_t ordinal1;
>> -       };
>> -
>> -       union {
>> -               struct {
>> -                       uint32_t vmid_mask:16;
>> -                       uint32_t unmap_latency:8;
>> -                       uint32_t reserved1:5;
>> -                       enum set_resources_queue_type_enum queue_type:3;
>> -               } bitfields2;
>> -               uint32_t ordinal2;
>> -       };
>> -
>> -       uint32_t queue_mask_lo;
>> -       uint32_t queue_mask_hi;
>> -       uint32_t gws_mask_lo;
>> -       uint32_t gws_mask_hi;
>> -
>> -       union {
>> -               struct {
>> -                       uint32_t oac_mask:16;
>> -                       uint32_t reserved2:16;
>> -               } bitfields7;
>> -               uint32_t ordinal7;
>> -       };
>> -
>> -       union {
>> -               struct {
>> -                       uint32_t gds_heap_base:6;
>> -                       uint32_t reserved3:5;
>> -                       uint32_t gds_heap_size:6;
>> -                       uint32_t reserved4:15;
>> -               } bitfields8;
>> -               uint32_t ordinal8;
>> -       };
>> -
>> -};
>> -#endif
>> -
>> -/*--------------------MES_RUN_LIST-------------------- */
>> -
>> -#ifndef PM4_MES_RUN_LIST_DEFINED
>> -#define PM4_MES_RUN_LIST_DEFINED
>> -
>> -struct pm4_runlist {
>> -       union {
>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>> -               uint32_t ordinal1;
>> -       };
>> -
>> -       union {
>> -               struct {
>> -                       uint32_t reserved1:2;
>> -                       uint32_t ib_base_lo:30;
>> -               } bitfields2;
>> -               uint32_t ordinal2;
>> -       };
>> -
>> -       union {
>> -               struct {
>> -                       uint32_t ib_base_hi:16;
>> -                       uint32_t reserved2:16;
>> -               } bitfields3;
>> -               uint32_t ordinal3;
>> -       };
>> -
>> -       union {
>> -               struct {
>> -                       uint32_t ib_size:20;
>> -                       uint32_t chain:1;
>> -                       uint32_t offload_polling:1;
>> -                       uint32_t reserved3:1;
>> -                       uint32_t valid:1;
>> -                       uint32_t reserved4:8;
>> -               } bitfields4;
>> -               uint32_t ordinal4;
>> -       };
>> -
>> -};
>> -#endif
>>
>>  /*--------------------MES_MAP_PROCESS-------------------- */
>>
>> @@ -186,217 +93,58 @@ struct pm4_map_process {
>>  };
>>  #endif
>>
>> -/*--------------------MES_MAP_QUEUES--------------------*/
>> -
>> -#ifndef PM4_MES_MAP_QUEUES_DEFINED
>> -#define PM4_MES_MAP_QUEUES_DEFINED
>> -enum map_queues_queue_sel_enum {
>> -       queue_sel__mes_map_queues__map_to_specified_queue_slots = 0,
>> -       queue_sel__mes_map_queues__map_to_hws_determined_queue_slots = 1,
>> -       queue_sel__mes_map_queues__enable_process_queues = 2
>> -};
>> +#ifndef PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
>> +#define PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
>>
>> -enum map_queues_vidmem_enum {
>> -       vidmem__mes_map_queues__uses_no_video_memory = 0,
>> -       vidmem__mes_map_queues__uses_video_memory = 1
>> -};
>> -
>> -enum map_queues_alloc_format_enum {
>> -       alloc_format__mes_map_queues__one_per_pipe = 0,
>> -       alloc_format__mes_map_queues__all_on_one_pipe = 1
>> -};
>> -
>> -enum map_queues_engine_sel_enum {
>> -       engine_sel__mes_map_queues__compute = 0,
>> -       engine_sel__mes_map_queues__sdma0 = 2,
>> -       engine_sel__mes_map_queues__sdma1 = 3
>> -};
>> -
>> -struct pm4_map_queues {
>> +struct pm4_map_process_scratch_kv {
>>         union {
>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>> -               uint32_t ordinal1;
>> -       };
>> -
>> -       union {
>> -               struct {
>> -                       uint32_t reserved1:4;
>> -                       enum map_queues_queue_sel_enum queue_sel:2;
>> -                       uint32_t reserved2:2;
>> -                       uint32_t vmid:4;
>> -                       uint32_t reserved3:4;
>> -                       enum map_queues_vidmem_enum vidmem:2;
>> -                       uint32_t reserved4:6;
>> -                       enum map_queues_alloc_format_enum alloc_format:2;
>> -                       enum map_queues_engine_sel_enum engine_sel:3;
>> -                       uint32_t num_queues:3;
>> -               } bitfields2;
>> -               uint32_t ordinal2;
>> -       };
>> -
>> -       struct {
>> -               union {
>> -                       struct {
>> -                               uint32_t is_static:1;
>> -                               uint32_t reserved5:1;
>> -                               uint32_t doorbell_offset:21;
>> -                               uint32_t reserved6:3;
>> -                               uint32_t queue:6;
>> -                       } bitfields3;
>> -                       uint32_t ordinal3;
>> -               };
>> -
>> -               uint32_t mqd_addr_lo;
>> -               uint32_t mqd_addr_hi;
>> -               uint32_t wptr_addr_lo;
>> -               uint32_t wptr_addr_hi;
>> -
>> -       } mes_map_queues_ordinals[1];   /* 1..N of these ordinal groups */
>> -
>> -};
>> -#endif
>> -
>> -/*--------------------MES_QUERY_STATUS--------------------*/
>> -
>> -#ifndef PM4_MES_QUERY_STATUS_DEFINED
>> -#define PM4_MES_QUERY_STATUS_DEFINED
>> -enum query_status_interrupt_sel_enum {
>> -       interrupt_sel__mes_query_status__completion_status = 0,
>> -       interrupt_sel__mes_query_status__process_status = 1,
>> -       interrupt_sel__mes_query_status__queue_status = 2
>> -};
>> -
>> -enum query_status_command_enum {
>> -       command__mes_query_status__interrupt_only = 0,
>> -       command__mes_query_status__fence_only_immediate = 1,
>> -       command__mes_query_status__fence_only_after_write_ack = 2,
>> -       command__mes_query_status__fence_wait_for_write_ack_send_interrupt = 3
>> -};
>> -
>> -enum query_status_engine_sel_enum {
>> -       engine_sel__mes_query_status__compute = 0,
>> -       engine_sel__mes_query_status__sdma0_queue = 2,
>> -       engine_sel__mes_query_status__sdma1_queue = 3
>> -};
>> -
>> -struct pm4_query_status {
>> -       union {
>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>> -               uint32_t ordinal1;
>> -       };
>> -
>> -       union {
>> -               struct {
>> -                       uint32_t context_id:28;
>> -                       enum query_status_interrupt_sel_enum interrupt_sel:2;
>> -                       enum query_status_command_enum command:2;
>> -               } bitfields2;
>> -               uint32_t ordinal2;
>> +               union PM4_MES_TYPE_3_HEADER   header; /* header */
>> +               uint32_t            ordinal1;
>>         };
>>
>>         union {
>>                 struct {
>>                         uint32_t pasid:16;
>> -                       uint32_t reserved1:16;
>> -               } bitfields3a;
>> -               struct {
>> -                       uint32_t reserved2:2;
>> -                       uint32_t doorbell_offset:21;
>> -                       uint32_t reserved3:3;
>> -                       enum query_status_engine_sel_enum engine_sel:3;
>> -                       uint32_t reserved4:3;
>> -               } bitfields3b;
>> -               uint32_t ordinal3;
>> -       };
>> -
>> -       uint32_t addr_lo;
>> -       uint32_t addr_hi;
>> -       uint32_t data_lo;
>> -       uint32_t data_hi;
>> -};
>> -#endif
>> -
>> -/*--------------------MES_UNMAP_QUEUES--------------------*/
>> -
>> -#ifndef PM4_MES_UNMAP_QUEUES_DEFINED
>> -#define PM4_MES_UNMAP_QUEUES_DEFINED
>> -enum unmap_queues_action_enum {
>> -       action__mes_unmap_queues__preempt_queues = 0,
>> -       action__mes_unmap_queues__reset_queues = 1,
>> -       action__mes_unmap_queues__disable_process_queues = 2
>> -};
>> -
>> -enum unmap_queues_queue_sel_enum {
>> -       queue_sel__mes_unmap_queues__perform_request_on_specified_queues = 0,
>> -       queue_sel__mes_unmap_queues__perform_request_on_pasid_queues = 1,
>> -       queue_sel__mes_unmap_queues__perform_request_on_all_active_queues = 2,
>> -       queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only = 3
>> -};
>> -
>> -enum unmap_queues_engine_sel_enum {
>> -       engine_sel__mes_unmap_queues__compute = 0,
>> -       engine_sel__mes_unmap_queues__sdma0 = 2,
>> -       engine_sel__mes_unmap_queues__sdma1 = 3
>> -};
>> -
>> -struct pm4_unmap_queues {
>> -       union {
>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>> -               uint32_t ordinal1;
>> -       };
>> -
>> -       union {
>> -               struct {
>> -                       enum unmap_queues_action_enum action:2;
>> -                       uint32_t reserved1:2;
>> -                       enum unmap_queues_queue_sel_enum queue_sel:2;
>> -                       uint32_t reserved2:20;
>> -                       enum unmap_queues_engine_sel_enum engine_sel:3;
>> -                       uint32_t num_queues:3;
>> +                       uint32_t reserved1:8;
>> +                       uint32_t diq_enable:1;
>> +                       uint32_t process_quantum:7;
>>                 } bitfields2;
>>                 uint32_t ordinal2;
>>         };
>>
>>         union {
>>                 struct {
>> -                       uint32_t pasid:16;
>> -                       uint32_t reserved3:16;
>> -               } bitfields3a;
>> -               struct {
>> -                       uint32_t reserved4:2;
>> -                       uint32_t doorbell_offset0:21;
>> -                       uint32_t reserved5:9;
>> -               } bitfields3b;
>> +                       uint32_t page_table_base:28;
>> +                       uint32_t reserved2:4;
>> +               } bitfields3;
>>                 uint32_t ordinal3;
>>         };
>>
>> -       union {
>> -               struct {
>> -                       uint32_t reserved6:2;
>> -                       uint32_t doorbell_offset1:21;
>> -                       uint32_t reserved7:9;
>> -               } bitfields4;
>> -               uint32_t ordinal4;
>> -       };
>> -
>> -       union {
>> -               struct {
>> -                       uint32_t reserved8:2;
>> -                       uint32_t doorbell_offset2:21;
>> -                       uint32_t reserved9:9;
>> -               } bitfields5;
>> -               uint32_t ordinal5;
>> -       };
>> +       uint32_t reserved3;
>> +       uint32_t sh_mem_bases;
>> +       uint32_t sh_mem_config;
>> +       uint32_t sh_mem_ape1_base;
>> +       uint32_t sh_mem_ape1_limit;
>> +       uint32_t sh_hidden_private_base_vmid;
>> +       uint32_t reserved4;
>> +       uint32_t reserved5;
>> +       uint32_t gds_addr_lo;
>> +       uint32_t gds_addr_hi;
>>
>>         union {
>>                 struct {
>> -                       uint32_t reserved10:2;
>> -                       uint32_t doorbell_offset3:21;
>> -                       uint32_t reserved11:9;
>> -               } bitfields6;
>> -               uint32_t ordinal6;
>> +                       uint32_t num_gws:6;
>> +                       uint32_t reserved6:2;
>> +                       uint32_t num_oac:4;
>> +                       uint32_t reserved7:4;
>> +                       uint32_t gds_size:6;
>> +                       uint32_t num_queues:10;
>> +               } bitfields14;
>> +               uint32_t ordinal14;
>>         };
>>
>> +       uint32_t completion_signal_lo32;
>> +       uint32_t completion_signal_hi32;
>>  };
>>  #endif
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>> b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>> index c4eda6f..7c8d9b3 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>> @@ -126,9 +126,10 @@ struct pm4_mes_runlist {
>>                         uint32_t ib_size:20;
>>                         uint32_t chain:1;
>>                         uint32_t offload_polling:1;
>> -                       uint32_t reserved3:1;
>> +                       uint32_t reserved2:1;
>>                         uint32_t valid:1;
>> -                       uint32_t reserved4:8;
>> +                       uint32_t process_cnt:4;
>> +                       uint32_t reserved3:4;
>>                 } bitfields4;
>>                 uint32_t ordinal4;
>>         };
>> @@ -143,8 +144,8 @@ struct pm4_mes_runlist {
>>
>>  struct pm4_mes_map_process {
>>         union {
>> -               union PM4_MES_TYPE_3_HEADER   header;            /* header */
>> -               uint32_t            ordinal1;
>> +               union PM4_MES_TYPE_3_HEADER header;     /* header */
>> +               uint32_t ordinal1;
>>         };
>>
>>         union {
>> @@ -155,36 +156,48 @@ struct pm4_mes_map_process {
>>                         uint32_t process_quantum:7;
>>                 } bitfields2;
>>                 uint32_t ordinal2;
>> -};
>> +       };
>>
>>         union {
>>                 struct {
>>                         uint32_t page_table_base:28;
>> -                       uint32_t reserved2:4;
>> +                       uint32_t reserved3:4;
>>                 } bitfields3;
>>                 uint32_t ordinal3;
>>         };
>>
>> +       uint32_t reserved;
>> +
>>         uint32_t sh_mem_bases;
>> +       uint32_t sh_mem_config;
>>         uint32_t sh_mem_ape1_base;
>>         uint32_t sh_mem_ape1_limit;
>> -       uint32_t sh_mem_config;
>> +
>> +       uint32_t sh_hidden_private_base_vmid;
>> +
>> +       uint32_t reserved2;
>> +       uint32_t reserved3;
>> +
>>         uint32_t gds_addr_lo;
>>         uint32_t gds_addr_hi;
>>
>>         union {
>>                 struct {
>>                         uint32_t num_gws:6;
>> -                       uint32_t reserved3:2;
>> +                       uint32_t reserved4:2;
>>                         uint32_t num_oac:4;
>> -                       uint32_t reserved4:4;
>> +                       uint32_t reserved5:4;
>>                         uint32_t gds_size:6;
>>                         uint32_t num_queues:10;
>>                 } bitfields10;
>>                 uint32_t ordinal10;
>>         };
>>
>> +       uint32_t completion_signal_lo;
>> +       uint32_t completion_signal_hi;
>> +
>>  };
>> +
>>  #endif
>>
>>  /*--------------------MES_MAP_QUEUES--------------------*/
>> @@ -337,7 +350,7 @@ enum mes_unmap_queues_engine_sel_enum {
>>         engine_sel__mes_unmap_queues__sdmal = 3
>>  };
>>
>> -struct PM4_MES_UNMAP_QUEUES {
>> +struct pm4_mes_unmap_queues {
>>         union {
>>                 union PM4_MES_TYPE_3_HEADER   header;            /* header */
>>                 uint32_t            ordinal1;
>> @@ -397,4 +410,101 @@ struct PM4_MES_UNMAP_QUEUES {
>>  };
>>  #endif
>>
>> +#ifndef PM4_MEC_RELEASE_MEM_DEFINED
>> +#define PM4_MEC_RELEASE_MEM_DEFINED
>> +enum RELEASE_MEM_event_index_enum {
>> +       event_index___release_mem__end_of_pipe = 5,
>> +       event_index___release_mem__shader_done = 6
>> +};
>> +
>> +enum RELEASE_MEM_cache_policy_enum {
>> +       cache_policy___release_mem__lru = 0,
>> +       cache_policy___release_mem__stream = 1,
>> +       cache_policy___release_mem__bypass = 2
>> +};
>> +
>> +enum RELEASE_MEM_dst_sel_enum {
>> +       dst_sel___release_mem__memory_controller = 0,
>> +       dst_sel___release_mem__tc_l2 = 1,
>> +       dst_sel___release_mem__queue_write_pointer_register = 2,
>> +       dst_sel___release_mem__queue_write_pointer_poll_mask_bit = 3
>> +};
>> +
>> +enum RELEASE_MEM_int_sel_enum {
>> +       int_sel___release_mem__none = 0,
>> +       int_sel___release_mem__send_interrupt_only = 1,
>> +       int_sel___release_mem__send_interrupt_after_write_confirm = 2,
>> +       int_sel___release_mem__send_data_after_write_confirm = 3
>> +};
>> +
>> +enum RELEASE_MEM_data_sel_enum {
>> +       data_sel___release_mem__none = 0,
>> +       data_sel___release_mem__send_32_bit_low = 1,
>> +       data_sel___release_mem__send_64_bit_data = 2,
>> +       data_sel___release_mem__send_gpu_clock_counter = 3,
>> +       data_sel___release_mem__send_cp_perfcounter_hi_lo = 4,
>> +       data_sel___release_mem__store_gds_data_to_memory = 5
>> +};
>> +
>> +struct pm4_mec_release_mem {
>> +       union {
>> +               union PM4_MES_TYPE_3_HEADER header;     /*header */
>> +               unsigned int ordinal1;
>> +       };
>> +
>> +       union {
>> +               struct {
>> +                       unsigned int event_type:6;
>> +                       unsigned int reserved1:2;
>> +                       enum RELEASE_MEM_event_index_enum event_index:4;
>> +                       unsigned int tcl1_vol_action_ena:1;
>> +                       unsigned int tc_vol_action_ena:1;
>> +                       unsigned int reserved2:1;
>> +                       unsigned int tc_wb_action_ena:1;
>> +                       unsigned int tcl1_action_ena:1;
>> +                       unsigned int tc_action_ena:1;
>> +                       unsigned int reserved3:6;
>> +                       unsigned int atc:1;
>> +                       enum RELEASE_MEM_cache_policy_enum cache_policy:2;
>> +                       unsigned int reserved4:5;
>> +               } bitfields2;
>> +               unsigned int ordinal2;
>> +       };
>> +
>> +       union {
>> +               struct {
>> +                       unsigned int reserved5:16;
>> +                       enum RELEASE_MEM_dst_sel_enum dst_sel:2;
>> +                       unsigned int reserved6:6;
>> +                       enum RELEASE_MEM_int_sel_enum int_sel:3;
>> +                       unsigned int reserved7:2;
>> +                       enum RELEASE_MEM_data_sel_enum data_sel:3;
>> +               } bitfields3;
>> +               unsigned int ordinal3;
>> +       };
>> +
>> +       union {
>> +               struct {
>> +                       unsigned int reserved8:2;
>> +                       unsigned int address_lo_32b:30;
>> +               } bitfields4;
>> +               struct {
>> +                       unsigned int reserved9:3;
>> +                       unsigned int address_lo_64b:29;
>> +               } bitfields5;
>> +               unsigned int ordinal4;
>> +       };
>> +
>> +       unsigned int address_hi;
>> +
>> +       unsigned int data_lo;
>> +
>> +       unsigned int data_hi;
>> +};
>> +#endif
>> +
>> +enum {
>> +       CACHE_FLUSH_AND_INV_TS_EVENT = 0x00000014
>> +};
>> +
>>  #endif
>> --
>> 2.7.4
>>
>
>_______________________________________________
>amd-gfx mailing list
>amd-gfx@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 01/19] drm/amdkfd: Fix double Mutex lock order
       [not found]         ` <CAFCwf12A9Qr-HCyQFR2eDN_TEExzxHEBVK9XQ9_xuwPKErHg3w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-12 19:28           ` Kuehling, Felix
  0 siblings, 0 replies; 70+ messages in thread
From: Kuehling, Felix @ 2017-08-12 19:28 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list

I guess in the current upstream code this patch doesn't make sense, because there is only one function that takes both locks. I included it in this patch series just because I stumbled over it while reducing cosmetic differences between amd-kfd-staging and upstream.

In our latest code we have other places that take both locks, so they all have to take them in the same order. I think we can reverse the order and take the dbgmgr_mutex first where it's needed. I'll drop this patch for now and include it later in a context where it makes more sense.
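
As a minimal sketch of the consistent ordering being talked about here
(all names below are hypothetical, not the actual KFD code paths):

#include <linux/mutex.h>

/* Illustration only: every path that needs both locks takes the
 * global debug-manager mutex first and the per-process mutex second,
 * so no two such paths can deadlock against each other.
 */
static DEFINE_MUTEX(example_dbgmgr_mutex);      /* global */

struct example_process {
        struct mutex mutex;                     /* per process */
};

static void example_take_both(struct example_process *p)
{
        mutex_lock(&example_dbgmgr_mutex);      /* global lock first */
        mutex_lock(&p->mutex);                  /* process lock second */
        /* ... work that touches the debug manager and the process ... */
        mutex_unlock(&p->mutex);
        mutex_unlock(&example_dbgmgr_mutex);
}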

Regards,
  Felix
________________________________________
From: Oded Gabbay <oded.gabbay@gmail.com>
Sent: Saturday, August 12, 2017 8:23 AM
To: Kuehling, Felix
Cc: amd-gfx list
Subject: Re: [PATCH 01/19] drm/amdkfd: Fix double Mutex lock order
[snip]

Hi Felix,
Could you please explain why this change is necessary ?

It seems to me this actually makes things a bit worse in a
multi-process environment, because the p->mutex is per process while
the dbgmgr mutex is global. Therefore, if process A first takes its
process mutex, and process B takes the dbgmgr mutex (in this function
or some other function, such as kfd_ioctl_dbg_address_watch) *before*
process A manages to take the dbgmgr mutex, then process A will be
blocked from doing other, totally unrelated operations, such as
kfd_ioctl_create_queue.

Whereas if we keep things as they are now, process A will first take
the dbgmgr mutex, making process B wait on it, but still allowing B to
do other unrelated ioctls because it hasn't taken its process mutex.
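
To make the scenario concrete, a sketch of the reversed order under the
same kind of hypothetical names (not the real ioctl code):

#include <linux/mutex.h>

static DEFINE_MUTEX(example_dbgmgr_mutex);      /* global, shared by all processes */

struct example_process {
        struct mutex mutex;                     /* private to one process */
};

/* With the per-process lock taken first, process A can hold its own
 * p->mutex while waiting for the global mutex held by process B, so
 * A's unrelated ioctls that only need p->mutex stall as well.
 */
static long example_dbg_ioctl(struct example_process *p)
{
        mutex_lock(&p->mutex);
        mutex_lock(&example_dbgmgr_mutex);      /* may block on another process */
        /* ... debug work ... */
        mutex_unlock(&example_dbgmgr_mutex);
        mutex_unlock(&p->mutex);
        return 0;
}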

Thanks,
Oded
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 03/19] drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t)
       [not found]             ` <DM5PR1201MB02356DFA5A4EAE4CC747F12F928E0-grEf7a3NxMBd8L2jMOIKKmrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2017-08-12 20:00               ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-12 20:00 UTC (permalink / raw)
  To: Kuehling, Felix; +Cc: amd-gfx list

On Sat, Aug 12, 2017 at 9:02 PM, Kuehling, Felix <Felix.Kuehling@amd.com> wrote:
> Hi Oded,
>
> Most of our recent work has been on the amdgpu driver. We basically just keep radeon compiling these days. Amdgpu can support all the GPUs that KFD supports. Between the amdgpu developers and the KFD team we've also talked about merging KFD into amdgpu at some point and replacing the KFD2KGD function pointer interfaces with mostly direct function calls. This would remove KFD support from radeon.
>
> What's your position on radeon KFD support going forward? Do you insist on maintaining it, just for Kaveri?
>
> Regards,
>   Felix
>
If amdgpu supports Kaveri, then there is no need to maintain radeon
support forever. As long as we provide a migration option, we can say
that kfd won't support radeon starting from kernel version x.y.
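
For readers unfamiliar with the interface mentioned above, a much
reduced sketch of the kfd2kgd-style indirection (member names and
signatures are simplified; the real table in kgd_kfd_interface.h is far
larger):

#include <linux/types.h>

struct kgd_dev;                         /* opaque GPU device handle */

/* KFD reaches the GPU driver only through a table of function
 * pointers like this; folding KFD into amdgpu would let most of
 * them become plain direct calls.
 */
struct example_kfd2kgd_calls {
        int (*hqd_load)(struct kgd_dev *kgd, void *mqd,
                        uint32_t pipe_id, uint32_t queue_id);
        int (*hqd_destroy)(struct kgd_dev *kgd, uint32_t pipe_id,
                           uint32_t queue_id);
};

static int example_load(const struct example_kfd2kgd_calls *f,
                        struct kgd_dev *kgd, void *mqd)
{
        return f->hqd_load(kgd, mqd, 0, 0);     /* indirect call today */
}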




>
> From: Oded Gabbay <oded.gabbay@gmail.com>
> Sent: Saturday, August 12, 2017 8:37 AM
> To: Kuehling, Felix
> Cc: amd-gfx list
> Subject: Re: [PATCH 03/19] drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t)
>
> On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>> kfd2kgd->address_watch_get_offset returns dword register offsets.
>> The divide-by-sizeof(uint32_t) is incorrect.
>
> In amdgpu that's true, but in radeon that's incorrect.
> If you look at cik_reg.h in the radeon driver, you will see the addresses of
> all TCP_WATCH_* registers are multiplied by 4, and that's why Yair
> originally divided the offset by sizeof(uint32_t).
> I think this patch should move the divide-by-sizeof operation into the
> radeon function instead of just deleting it from kfd_dbgdev.c.
>
> Oded
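
As a small illustration of the unit mismatch discussed above (the
offset value is made up, not a real register address): a register at
byte offset 0x3408 corresponds to dword offset 0x3408 / 4 = 0xd02.
Whichever layer hands out byte-based offsets should do this conversion,
and it should happen exactly once:

#include <linux/types.h>

/* Illustrative helper: convert a byte-based MMIO offset to the
 * dword-based register offset expected by the PM4 packet fields.
 */
static inline u32 example_byte_to_dword_offset(u32 byte_offset)
{
        return byte_offset / sizeof(u32);
}
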
>
>>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 8 --------
>>  1 file changed, 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
>> index 8b14a4e..faa0790 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
>> @@ -442,8 +442,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>>                                         i,
>>                                         ADDRESS_WATCH_REG_CNTL);
>>
>> -               aw_reg_add_dword /= sizeof(uint32_t);
>> -
>>                 packets_vec[0].bitfields2.reg_offset =
>>                                         aw_reg_add_dword - AMD_CONFIG_REG_BASE;
>>
>> @@ -455,8 +453,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>>                                         i,
>>                                         ADDRESS_WATCH_REG_ADDR_HI);
>>
>> -               aw_reg_add_dword /= sizeof(uint32_t);
>> -
>>                 packets_vec[1].bitfields2.reg_offset =
>>                                         aw_reg_add_dword - AMD_CONFIG_REG_BASE;
>>                 packets_vec[1].reg_data[0] = addrHi.u32All;
>> @@ -467,8 +463,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>>                                         i,
>>                                         ADDRESS_WATCH_REG_ADDR_LO);
>>
>> -               aw_reg_add_dword /= sizeof(uint32_t);
>> -
>>                 packets_vec[2].bitfields2.reg_offset =
>>                                 aw_reg_add_dword - AMD_CONFIG_REG_BASE;
>>                 packets_vec[2].reg_data[0] = addrLo.u32All;
>> @@ -485,8 +479,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
>>                                         i,
>>                                         ADDRESS_WATCH_REG_CNTL);
>>
>> -               aw_reg_add_dword /= sizeof(uint32_t);
>> -
>>                 packets_vec[3].bitfields2.reg_offset =
>>                                         aw_reg_add_dword - AMD_CONFIG_REG_BASE;
>>                 packets_vec[3].reg_data[0] = cntl.u32All;
>> --
>> 2.7.4
>>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 17/19] drm/amdgpu: Remove hard-coded assumptions about compute pipes
       [not found]     ` <1502488589-30272-18-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-13  8:29       ` Oded Gabbay
       [not found]         ` <CAFCwf108X+f6+jehRvykPy0NPCnYa6uHjoVAXWDvoN+35h-N5A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Oded Gabbay @ 2017-08-13  8:29 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>
> Remove hard-coded assumption that the first compute pipe is
> reserved for amdgpu. Pipe 0 actually means pipe 0 now.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> index 5936222..dfb8c74 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> @@ -186,7 +186,7 @@ static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
>  {
>         struct amdgpu_device *adev = get_amdgpu_device(kgd);
>
> -       uint32_t mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
> +       uint32_t mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
>         uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
>
>         lock_srbm(kgd, mec, pipe, queue_id, 0);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> index 90271f6..0fccd30 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> @@ -147,7 +147,7 @@ static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
>  {
>         struct amdgpu_device *adev = get_amdgpu_device(kgd);
>
> -       uint32_t mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
> +       uint32_t mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
>         uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
>
>         lock_srbm(kgd, mec, pipe, queue_id, 0);
> @@ -216,7 +216,7 @@ static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id)
>         uint32_t mec;
>         uint32_t pipe;
>
> -       mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
> +       mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
>         pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
>
>         lock_srbm(kgd, mec, pipe, 0, 0);
> --
> 2.7.4
>

Looking at amdgpu_gfx_compute_queue_acquire(), amdgpu takes the
first pipe, so won't this change collide with that code?
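
For reference, a small sketch of the flat pipe_id decomposition this
patch leaves in place, assuming num_pipe_per_mec == 4 (the values in
the comment are just the result of that assumption):

#include <linux/types.h>

/* pipe_id 0..3 -> mec 1, pipe 0..3
 * pipe_id 4..7 -> mec 2, pipe 0..3
 * The removed "++pipe_id" shifted everything by one, so pipe_id 0
 * was silently treated as pipe 1.
 */
static void example_decode_pipe(u32 pipe_id, u32 num_pipe_per_mec,
                                u32 *mec, u32 *pipe)
{
        *mec  = (pipe_id / num_pipe_per_mec) + 1;
        *pipe = pipe_id % num_pipe_per_mec;
}
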
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 19/19] drm/amd: Update MEC HQD loading code for KFD
       [not found]     ` <1502488589-30272-20-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-13  8:33       ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-13  8:33 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Various bug fixes and improvements that accumulated over the last two
> years.
>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h         |  16 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  | 130 +++++++++++++---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  | 165 ++++++++++++++++++---
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |   7 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |   3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h       |   3 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  23 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  16 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |   5 -
>  drivers/gpu/drm/amd/include/kgd_kfd_interface.h    |  11 +-
>  drivers/gpu/drm/radeon/radeon_kfd.c                |  12 +-
>  11 files changed, 322 insertions(+), 69 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index b8802a5..8d689ab 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -26,6 +26,7 @@
>  #define AMDGPU_AMDKFD_H_INCLUDED
>
>  #include <linux/types.h>
> +#include <linux/mmu_context.h>
>  #include <kgd_kfd_interface.h>
>
>  struct amdgpu_device;
> @@ -60,4 +61,19 @@ uint64_t get_gpu_clock_counter(struct kgd_dev *kgd);
>
>  uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);
>
> +#define read_user_wptr(mmptr, wptr, dst)                               \
> +       ({                                                              \
> +               bool valid = false;                                     \
> +               if ((mmptr) && (wptr)) {                                \
> +                       if ((mmptr) == current->mm) {                   \
> +                               valid = !get_user((dst), (wptr));       \
> +                       } else if (current->mm == NULL) {               \
> +                               use_mm(mmptr);                          \
> +                               valid = !get_user((dst), (wptr));       \
> +                               unuse_mm(mmptr);                        \
> +                       }                                               \
> +               }                                                       \
> +               valid;                                                  \
> +       })
> +
>  #endif /* AMDGPU_AMDKFD_H_INCLUDED */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> index dfb8c74..994d262 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> @@ -39,6 +39,12 @@
>  #include "gmc/gmc_7_1_sh_mask.h"
>  #include "cik_structs.h"
>
> +enum hqd_dequeue_request_type {
> +       NO_ACTION = 0,
> +       DRAIN_PIPE,
> +       RESET_WAVES
> +};
> +
>  enum {
>         MAX_TRAPID = 8,         /* 3 bits in the bitfield. */
>         MAX_WATCH_ADDRESSES = 4
> @@ -96,12 +102,15 @@ static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id,
>                                 uint32_t hpd_size, uint64_t hpd_gpu_addr);
>  static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id);
>  static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
> -                       uint32_t queue_id, uint32_t __user *wptr);
> +                       uint32_t queue_id, uint32_t __user *wptr,
> +                       uint32_t wptr_shift, uint32_t wptr_mask,
> +                       struct mm_struct *mm);
>  static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd);
>  static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
>                                 uint32_t pipe_id, uint32_t queue_id);
>
> -static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
> +static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd,
> +                               enum kfd_preempt_type reset_type,
>                                 unsigned int utimeout, uint32_t pipe_id,
>                                 uint32_t queue_id);
>  static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd);
> @@ -290,20 +299,38 @@ static inline struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd)
>  }
>
>  static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
> -                       uint32_t queue_id, uint32_t __user *wptr)
> +                       uint32_t queue_id, uint32_t __user *wptr,
> +                       uint32_t wptr_shift, uint32_t wptr_mask,
> +                       struct mm_struct *mm)
>  {
>         struct amdgpu_device *adev = get_amdgpu_device(kgd);
> -       uint32_t wptr_shadow, is_wptr_shadow_valid;
>         struct cik_mqd *m;
> +       uint32_t *mqd_hqd;
> +       uint32_t reg, wptr_val, data;
>
>         m = get_mqd(mqd);
>
> -       is_wptr_shadow_valid = !get_user(wptr_shadow, wptr);
> -       if (is_wptr_shadow_valid)
> -               m->cp_hqd_pq_wptr = wptr_shadow;
> -
>         acquire_queue(kgd, pipe_id, queue_id);
> -       gfx_v7_0_mqd_commit(adev, m);
> +
> +       /* HQD registers extend from CP_MQD_BASE_ADDR to CP_MQD_CONTROL. */
> +       mqd_hqd = &m->cp_mqd_base_addr_lo;
> +
> +       for (reg = mmCP_MQD_BASE_ADDR; reg <= mmCP_MQD_CONTROL; reg++)
> +               WREG32(reg, mqd_hqd[reg - mmCP_MQD_BASE_ADDR]);
> +
> +       /* Copy userspace write pointer value to register.
> +        * Activate doorbell logic to monitor subsequent changes.
> +        */
> +       data = REG_SET_FIELD(m->cp_hqd_pq_doorbell_control,
> +                            CP_HQD_PQ_DOORBELL_CONTROL, DOORBELL_EN, 1);
> +       WREG32(mmCP_HQD_PQ_DOORBELL_CONTROL, data);
> +
> +       if (read_user_wptr(mm, wptr, wptr_val))
> +               WREG32(mmCP_HQD_PQ_WPTR, (wptr_val << wptr_shift) & wptr_mask);
> +
> +       data = REG_SET_FIELD(m->cp_hqd_active, CP_HQD_ACTIVE, ACTIVE, 1);
> +       WREG32(mmCP_HQD_ACTIVE, data);
> +
>         release_queue(kgd);
>
>         return 0;
> @@ -382,30 +409,99 @@ static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd)
>         return false;
>  }
>
> -static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
> +static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd,
> +                               enum kfd_preempt_type reset_type,
>                                 unsigned int utimeout, uint32_t pipe_id,
>                                 uint32_t queue_id)
>  {
>         struct amdgpu_device *adev = get_amdgpu_device(kgd);
>         uint32_t temp;
> -       int timeout = utimeout;
> +       enum hqd_dequeue_request_type type;
> +       unsigned long flags, end_jiffies;
> +       int retry;
>
>         acquire_queue(kgd, pipe_id, queue_id);
>         WREG32(mmCP_HQD_PQ_DOORBELL_CONTROL, 0);
>
> -       WREG32(mmCP_HQD_DEQUEUE_REQUEST, reset_type);
> +       switch (reset_type) {
> +       case KFD_PREEMPT_TYPE_WAVEFRONT_DRAIN:
> +               type = DRAIN_PIPE;
> +               break;
> +       case KFD_PREEMPT_TYPE_WAVEFRONT_RESET:
> +               type = RESET_WAVES;
> +               break;
> +       default:
> +               type = DRAIN_PIPE;
> +               break;
> +       }
>
> +       /* Workaround: If IQ timer is active and the wait time is close to or
> +        * equal to 0, dequeueing is not safe. Wait until either the wait time
> +        * is larger or timer is cleared. Also, ensure that IQ_REQ_PEND is
> +        * cleared before continuing. Also, ensure wait times are set to at
> +        * least 0x3.
> +        */
> +       local_irq_save(flags);
> +       preempt_disable();
> +       retry = 5000; /* wait for 500 usecs at maximum */
> +       while (true) {
> +               temp = RREG32(mmCP_HQD_IQ_TIMER);
> +               if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, PROCESSING_IQ)) {
> +                       pr_debug("HW is processing IQ\n");
> +                       goto loop;
> +               }
> +               if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, ACTIVE)) {
> +                       if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, RETRY_TYPE)
> +                                       == 3) /* SEM-rearm is safe */
> +                               break;
> +                       /* Wait time 3 is safe for CP, but our MMIO read/write
> +                        * time is close to 1 microsecond, so check for 10 to
> +                        * leave more buffer room
> +                        */
> +                       if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, WAIT_TIME)
> +                                       >= 10)
> +                               break;
> +                       pr_debug("IQ timer is active\n");
> +               } else
> +                       break;
> +loop:
> +               if (!retry) {
> +                       pr_err("CP HQD IQ timer status time out\n");
> +                       break;
> +               }
> +               ndelay(100);
> +               --retry;
> +       }
> +       retry = 1000;
> +       while (true) {
> +               temp = RREG32(mmCP_HQD_DEQUEUE_REQUEST);
> +               if (!(temp & CP_HQD_DEQUEUE_REQUEST__IQ_REQ_PEND_MASK))
> +                       break;
> +               pr_debug("Dequeue request is pending\n");
> +
> +               if (!retry) {
> +                       pr_err("CP HQD dequeue request time out\n");
> +                       break;
> +               }
> +               ndelay(100);
> +               --retry;
> +       }
> +       local_irq_restore(flags);
> +       preempt_enable();
> +
> +       WREG32(mmCP_HQD_DEQUEUE_REQUEST, type);
> +
> +       end_jiffies = (utimeout * HZ / 1000) + jiffies;
>         while (true) {
>                 temp = RREG32(mmCP_HQD_ACTIVE);
> -               if (temp & CP_HQD_ACTIVE__ACTIVE_MASK)
> +               if (!(temp & CP_HQD_ACTIVE__ACTIVE_MASK))
>                         break;
> -               if (timeout <= 0) {
> -                       pr_err("kfd: cp queue preemption time out.\n");
> +               if (time_after(jiffies, end_jiffies)) {
> +                       pr_err("cp queue preemption time out\n");
>                         release_queue(kgd);
>                         return -ETIME;
>                 }
> -               msleep(20);
> -               timeout -= 20;
> +               usleep_range(500, 1000);
>         }
>
>         release_queue(kgd);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> index 0fccd30..29a6f5d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> @@ -39,6 +39,12 @@
>  #include "vi_structs.h"
>  #include "vid.h"
>
> +enum hqd_dequeue_request_type {
> +       NO_ACTION = 0,
> +       DRAIN_PIPE,
> +       RESET_WAVES
> +};
> +
>  struct cik_sdma_rlc_registers;
>
>  /*
> @@ -55,12 +61,15 @@ static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id,
>                 uint32_t hpd_size, uint64_t hpd_gpu_addr);
>  static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id);
>  static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
> -               uint32_t queue_id, uint32_t __user *wptr);
> +                       uint32_t queue_id, uint32_t __user *wptr,
> +                       uint32_t wptr_shift, uint32_t wptr_mask,
> +                       struct mm_struct *mm);
>  static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd);
>  static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
>                 uint32_t pipe_id, uint32_t queue_id);
>  static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd);
> -static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
> +static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd,
> +                               enum kfd_preempt_type reset_type,
>                                 unsigned int utimeout, uint32_t pipe_id,
>                                 uint32_t queue_id);
>  static int kgd_hqd_sdma_destroy(struct kgd_dev *kgd, void *mqd,
> @@ -244,20 +253,67 @@ static inline struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd)
>  }
>
>  static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
> -                       uint32_t queue_id, uint32_t __user *wptr)
> +                       uint32_t queue_id, uint32_t __user *wptr,
> +                       uint32_t wptr_shift, uint32_t wptr_mask,
> +                       struct mm_struct *mm)
>  {
> -       struct vi_mqd *m;
> -       uint32_t shadow_wptr, valid_wptr;
>         struct amdgpu_device *adev = get_amdgpu_device(kgd);
> +       struct vi_mqd *m;
> +       uint32_t *mqd_hqd;
> +       uint32_t reg, wptr_val, data;
>
>         m = get_mqd(mqd);
>
> -       valid_wptr = copy_from_user(&shadow_wptr, wptr, sizeof(shadow_wptr));
> -       if (valid_wptr == 0)
> -               m->cp_hqd_pq_wptr = shadow_wptr;
> -
>         acquire_queue(kgd, pipe_id, queue_id);
> -       gfx_v8_0_mqd_commit(adev, mqd);
> +
> +       /* HIQ is set during driver init period with vmid set to 0*/
> +       if (m->cp_hqd_vmid == 0) {
> +               uint32_t value, mec, pipe;
> +
> +               mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
> +               pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
> +
> +               pr_debug("kfd: set HIQ, mec:%d, pipe:%d, queue:%d.\n",
> +                       mec, pipe, queue_id);
> +               value = RREG32(mmRLC_CP_SCHEDULERS);
> +               value = REG_SET_FIELD(value, RLC_CP_SCHEDULERS, scheduler1,
> +                       ((mec << 5) | (pipe << 3) | queue_id | 0x80));
> +               WREG32(mmRLC_CP_SCHEDULERS, value);
> +       }
> +
> +       /* HQD registers extend from CP_MQD_BASE_ADDR to CP_HQD_EOP_WPTR_MEM. */
> +       mqd_hqd = &m->cp_mqd_base_addr_lo;
> +
> +       for (reg = mmCP_MQD_BASE_ADDR; reg <= mmCP_HQD_EOP_CONTROL; reg++)
> +               WREG32(reg, mqd_hqd[reg - mmCP_MQD_BASE_ADDR]);
> +
> +       /* Tonga errata: EOP RPTR/WPTR should be left unmodified.
> +        * This is safe since EOP RPTR==WPTR for any inactive HQD
> +        * on ASICs that do not support context-save.
> +        * EOP writes/reads can start anywhere in the ring.
> +        */
> +       if (get_amdgpu_device(kgd)->asic_type != CHIP_TONGA) {
> +               WREG32(mmCP_HQD_EOP_RPTR, m->cp_hqd_eop_rptr);
> +               WREG32(mmCP_HQD_EOP_WPTR, m->cp_hqd_eop_wptr);
> +               WREG32(mmCP_HQD_EOP_WPTR_MEM, m->cp_hqd_eop_wptr_mem);
> +       }
> +
> +       for (reg = mmCP_HQD_EOP_EVENTS; reg <= mmCP_HQD_ERROR; reg++)
> +               WREG32(reg, mqd_hqd[reg - mmCP_MQD_BASE_ADDR]);
> +
> +       /* Copy userspace write pointer value to register.
> +        * Activate doorbell logic to monitor subsequent changes.
> +        */
> +       data = REG_SET_FIELD(m->cp_hqd_pq_doorbell_control,
> +                            CP_HQD_PQ_DOORBELL_CONTROL, DOORBELL_EN, 1);
> +       WREG32(mmCP_HQD_PQ_DOORBELL_CONTROL, data);
> +
> +       if (read_user_wptr(mm, wptr, wptr_val))
> +               WREG32(mmCP_HQD_PQ_WPTR, (wptr_val << wptr_shift) & wptr_mask);
> +
> +       data = REG_SET_FIELD(m->cp_hqd_active, CP_HQD_ACTIVE, ACTIVE, 1);
> +       WREG32(mmCP_HQD_ACTIVE, data);
> +
>         release_queue(kgd);
>
>         return 0;
> @@ -308,29 +364,102 @@ static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd)
>         return false;
>  }
>
> -static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
> +static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd,
> +                               enum kfd_preempt_type reset_type,
>                                 unsigned int utimeout, uint32_t pipe_id,
>                                 uint32_t queue_id)
>  {
>         struct amdgpu_device *adev = get_amdgpu_device(kgd);
>         uint32_t temp;
> -       int timeout = utimeout;
> +       enum hqd_dequeue_request_type type;
> +       unsigned long flags, end_jiffies;
> +       int retry;
> +       struct vi_mqd *m = get_mqd(mqd);
>
>         acquire_queue(kgd, pipe_id, queue_id);
>
> -       WREG32(mmCP_HQD_DEQUEUE_REQUEST, reset_type);
> +       if (m->cp_hqd_vmid == 0)
> +               WREG32_FIELD(RLC_CP_SCHEDULERS, scheduler1, 0);
> +
> +       switch (reset_type) {
> +       case KFD_PREEMPT_TYPE_WAVEFRONT_DRAIN:
> +               type = DRAIN_PIPE;
> +               break;
> +       case KFD_PREEMPT_TYPE_WAVEFRONT_RESET:
> +               type = RESET_WAVES;
> +               break;
> +       default:
> +               type = DRAIN_PIPE;
> +               break;
> +       }
> +
> +       /* Workaround: If IQ timer is active and the wait time is close to or
> +        * equal to 0, dequeueing is not safe. Wait until either the wait time
> +        * is larger or timer is cleared. Also, ensure that IQ_REQ_PEND is
> +        * cleared before continuing. Also, ensure wait times are set to at
> +        * least 0x3.
> +        */
> +       local_irq_save(flags);
> +       preempt_disable();
> +       retry = 5000; /* wait for 500 usecs at maximum */
> +       while (true) {
> +               temp = RREG32(mmCP_HQD_IQ_TIMER);
> +               if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, PROCESSING_IQ)) {
> +                       pr_debug("HW is processing IQ\n");
> +                       goto loop;
> +               }
> +               if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, ACTIVE)) {
> +                       if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, RETRY_TYPE)
> +                                       == 3) /* SEM-rearm is safe */
> +                               break;
> +                       /* Wait time 3 is safe for CP, but our MMIO read/write
> +                        * time is close to 1 microsecond, so check for 10 to
> +                        * leave more buffer room
> +                        */
> +                       if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, WAIT_TIME)
> +                                       >= 10)
> +                               break;
> +                       pr_debug("IQ timer is active\n");
> +               } else
> +                       break;
> +loop:
> +               if (!retry) {
> +                       pr_err("CP HQD IQ timer status time out\n");
> +                       break;
> +               }
> +               ndelay(100);
> +               --retry;
> +       }
> +       retry = 1000;
> +       while (true) {
> +               temp = RREG32(mmCP_HQD_DEQUEUE_REQUEST);
> +               if (!(temp & CP_HQD_DEQUEUE_REQUEST__IQ_REQ_PEND_MASK))
> +                       break;
> +               pr_debug("Dequeue request is pending\n");
>
> +               if (!retry) {
> +                       pr_err("CP HQD dequeue request time out\n");
> +                       break;
> +               }
> +               ndelay(100);
> +               --retry;
> +       }
> +       local_irq_restore(flags);
> +       preempt_enable();
> +
> +       WREG32(mmCP_HQD_DEQUEUE_REQUEST, type);
> +
> +       end_jiffies = (utimeout * HZ / 1000) + jiffies;
>         while (true) {
>                 temp = RREG32(mmCP_HQD_ACTIVE);
> -               if (temp & CP_HQD_ACTIVE__ACTIVE_MASK)
> +               if (!(temp & CP_HQD_ACTIVE__ACTIVE_MASK))
>                         break;
> -               if (timeout <= 0) {
> -                       pr_err("kfd: cp queue preemption time out.\n");
> +               if (time_after(jiffies, end_jiffies)) {
> +                       pr_err("cp queue preemption time out.\n");
>                         release_queue(kgd);
>                         return -ETIME;
>                 }
> -               msleep(20);
> -               timeout -= 20;
> +               usleep_range(500, 1000);
>         }
>
>         release_queue(kgd);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 5dac29d..3891fe5 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -268,8 +268,8 @@ static int create_compute_queue_nocpsch(struct device_queue_manager *dqm,
>         pr_debug("Loading mqd to hqd on pipe %d, queue %d\n",
>                         q->pipe, q->queue);
>
> -       retval = mqd->load_mqd(mqd, q->mqd, q->pipe,
> -                       q->queue, (uint32_t __user *) q->properties.write_ptr);
> +       retval = mqd->load_mqd(mqd, q->mqd, q->pipe, q->queue, &q->properties,
> +                              q->process->mm);
>         if (retval)
>                 goto out_uninit_mqd;
>
> @@ -585,8 +585,7 @@ static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
>         if (retval)
>                 goto out_deallocate_sdma_queue;
>
> -       retval = mqd->load_mqd(mqd, q->mqd, 0,
> -                               0, NULL);
> +       retval = mqd->load_mqd(mqd, q->mqd, 0, 0, &q->properties, NULL);
>         if (retval)
>                 goto out_uninit_mqd;
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> index 0e4d4a9..681b639 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
> @@ -143,7 +143,8 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
>                 kq->queue->pipe = KFD_CIK_HIQ_PIPE;
>                 kq->queue->queue = KFD_CIK_HIQ_QUEUE;
>                 kq->mqd->load_mqd(kq->mqd, kq->queue->mqd, kq->queue->pipe,
> -                                       kq->queue->queue, NULL);
> +                                 kq->queue->queue, &kq->queue->properties,
> +                                 NULL);
>         } else {
>                 /* allocate fence for DIQ */
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
> index 213a71e..1f3a6ba 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
> @@ -67,7 +67,8 @@ struct mqd_manager {
>
>         int     (*load_mqd)(struct mqd_manager *mm, void *mqd,
>                                 uint32_t pipe_id, uint32_t queue_id,
> -                               uint32_t __user *wptr);
> +                               struct queue_properties *p,
> +                               struct mm_struct *mms);
>
>         int     (*update_mqd)(struct mqd_manager *mm, void *mqd,
>                                 struct queue_properties *q);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> index 7e0ec6b..44ffd23 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
> @@ -144,15 +144,21 @@ static void uninit_mqd_sdma(struct mqd_manager *mm, void *mqd,
>  }
>
>  static int load_mqd(struct mqd_manager *mm, void *mqd, uint32_t pipe_id,
> -                       uint32_t queue_id, uint32_t __user *wptr)
> +                   uint32_t queue_id, struct queue_properties *p,
> +                   struct mm_struct *mms)
>  {
> -       return mm->dev->kfd2kgd->hqd_load
> -               (mm->dev->kgd, mqd, pipe_id, queue_id, wptr);
> +       /* AQL write pointer counts in 64B packets, PM4/CP counts in dwords. */
> +       uint32_t wptr_shift = (p->format == KFD_QUEUE_FORMAT_AQL ? 4 : 0);
> +       uint32_t wptr_mask = (uint32_t)((p->queue_size / sizeof(uint32_t)) - 1);
> +
> +       return mm->dev->kfd2kgd->hqd_load(mm->dev->kgd, mqd, pipe_id, queue_id,
> +                                         (uint32_t __user *)p->write_ptr,
> +                                         wptr_shift, wptr_mask, mms);
>  }
>
>  static int load_mqd_sdma(struct mqd_manager *mm, void *mqd,
> -                       uint32_t pipe_id, uint32_t queue_id,
> -                       uint32_t __user *wptr)
> +                        uint32_t pipe_id, uint32_t queue_id,
> +                        struct queue_properties *p, struct mm_struct *mms)
>  {
>         return mm->dev->kfd2kgd->hqd_sdma_load(mm->dev->kgd, mqd);
>  }
> @@ -176,20 +182,17 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
>         m->cp_hqd_pq_base_hi = upper_32_bits((uint64_t)q->queue_address >> 8);
>         m->cp_hqd_pq_rptr_report_addr_lo = lower_32_bits((uint64_t)q->read_ptr);
>         m->cp_hqd_pq_rptr_report_addr_hi = upper_32_bits((uint64_t)q->read_ptr);
> -       m->cp_hqd_pq_doorbell_control = DOORBELL_EN |
> -                                       DOORBELL_OFFSET(q->doorbell_off);
> +       m->cp_hqd_pq_doorbell_control = DOORBELL_OFFSET(q->doorbell_off);
>
>         m->cp_hqd_vmid = q->vmid;
>
>         if (q->format == KFD_QUEUE_FORMAT_AQL)
>                 m->cp_hqd_pq_control |= NO_UPDATE_RPTR;
>
> -       m->cp_hqd_active = 0;
>         q->is_active = false;
>         if (q->queue_size > 0 &&
>                         q->queue_address != 0 &&
>                         q->queue_percent > 0) {
> -               m->cp_hqd_active = 1;
>                 q->is_active = true;
>         }
>
> @@ -239,7 +242,7 @@ static int destroy_mqd(struct mqd_manager *mm, void *mqd,
>                         unsigned int timeout, uint32_t pipe_id,
>                         uint32_t queue_id)
>  {
> -       return mm->dev->kfd2kgd->hqd_destroy(mm->dev->kgd, type, timeout,
> +       return mm->dev->kfd2kgd->hqd_destroy(mm->dev->kgd, mqd, type, timeout,
>                                         pipe_id, queue_id);
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> index 98a930e..73cbfe1 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
> @@ -94,10 +94,15 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
>
>  static int load_mqd(struct mqd_manager *mm, void *mqd,
>                         uint32_t pipe_id, uint32_t queue_id,
> -                       uint32_t __user *wptr)
> +                       struct queue_properties *p, struct mm_struct *mms)
>  {
> -       return mm->dev->kfd2kgd->hqd_load
> -               (mm->dev->kgd, mqd, pipe_id, queue_id, wptr);
> +       /* AQL write pointer counts in 64B packets, PM4/CP counts in dwords. */
> +       uint32_t wptr_shift = (p->format == KFD_QUEUE_FORMAT_AQL ? 4 : 0);
> +       uint32_t wptr_mask = (uint32_t)((p->queue_size / sizeof(uint32_t)) - 1);
> +
> +       return mm->dev->kfd2kgd->hqd_load(mm->dev->kgd, mqd, pipe_id, queue_id,
> +                                         (uint32_t __user *)p->write_ptr,
> +                                         wptr_shift, wptr_mask, mms);
>  }
>
>  static int __update_mqd(struct mqd_manager *mm, void *mqd,
> @@ -122,7 +127,6 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
>         m->cp_hqd_pq_rptr_report_addr_hi = upper_32_bits((uint64_t)q->read_ptr);
>
>         m->cp_hqd_pq_doorbell_control =
> -               1 << CP_HQD_PQ_DOORBELL_CONTROL__DOORBELL_EN__SHIFT |
>                 q->doorbell_off <<
>                         CP_HQD_PQ_DOORBELL_CONTROL__DOORBELL_OFFSET__SHIFT;
>         pr_debug("cp_hqd_pq_doorbell_control 0x%x\n",
> @@ -159,12 +163,10 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
>                                 2 << CP_HQD_PQ_CONTROL__SLOT_BASED_WPTR__SHIFT;
>         }
>
> -       m->cp_hqd_active = 0;
>         q->is_active = false;
>         if (q->queue_size > 0 &&
>                         q->queue_address != 0 &&
>                         q->queue_percent > 0) {
> -               m->cp_hqd_active = 1;
>                 q->is_active = true;
>         }
>
> @@ -184,7 +186,7 @@ static int destroy_mqd(struct mqd_manager *mm, void *mqd,
>                         uint32_t queue_id)
>  {
>         return mm->dev->kfd2kgd->hqd_destroy
> -               (mm->dev->kgd, type, timeout,
> +               (mm->dev->kgd, mqd, type, timeout,
>                 pipe_id, queue_id);
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index f0d55cc0..30ce92c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -239,11 +239,6 @@ enum kfd_preempt_type_filter {
>         KFD_PREEMPT_TYPE_FILTER_BY_PASID
>  };
>
> -enum kfd_preempt_type {
> -       KFD_PREEMPT_TYPE_WAVEFRONT,
> -       KFD_PREEMPT_TYPE_WAVEFRONT_RESET
> -};
> -
>  /**
>   * enum kfd_queue_type
>   *
> diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> index 36f3766..ffafda0 100644
> --- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> +++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> @@ -41,6 +41,11 @@ struct kgd_dev;
>
>  struct kgd_mem;
>
> +enum kfd_preempt_type {
> +       KFD_PREEMPT_TYPE_WAVEFRONT_DRAIN = 0,
> +       KFD_PREEMPT_TYPE_WAVEFRONT_RESET,
> +};
> +
>  enum kgd_memory_pool {
>         KGD_POOL_SYSTEM_CACHEABLE = 1,
>         KGD_POOL_SYSTEM_WRITECOMBINE = 2,
> @@ -153,14 +158,16 @@ struct kfd2kgd_calls {
>         int (*init_interrupts)(struct kgd_dev *kgd, uint32_t pipe_id);
>
>         int (*hqd_load)(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
> -                       uint32_t queue_id, uint32_t __user *wptr);
> +                       uint32_t queue_id, uint32_t __user *wptr,
> +                       uint32_t wptr_shift, uint32_t wptr_mask,
> +                       struct mm_struct *mm);
>
>         int (*hqd_sdma_load)(struct kgd_dev *kgd, void *mqd);
>
>         bool (*hqd_is_occupied)(struct kgd_dev *kgd, uint64_t queue_address,
>                                 uint32_t pipe_id, uint32_t queue_id);
>
> -       int (*hqd_destroy)(struct kgd_dev *kgd, uint32_t reset_type,
> +       int (*hqd_destroy)(struct kgd_dev *kgd, void *mqd, uint32_t reset_type,
>                                 unsigned int timeout, uint32_t pipe_id,
>                                 uint32_t queue_id);
>
> diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
> index a2ab6dc..695117a 100644
> --- a/drivers/gpu/drm/radeon/radeon_kfd.c
> +++ b/drivers/gpu/drm/radeon/radeon_kfd.c
> @@ -75,12 +75,14 @@ static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id,
>                                 uint32_t hpd_size, uint64_t hpd_gpu_addr);
>  static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id);
>  static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
> -                       uint32_t queue_id, uint32_t __user *wptr);
> +                       uint32_t queue_id, uint32_t __user *wptr,
> +                       uint32_t wptr_shift, uint32_t wptr_mask,
> +                       struct mm_struct *mm);
>  static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd);
>  static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
>                                 uint32_t pipe_id, uint32_t queue_id);
>
> -static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
> +static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd, uint32_t reset_type,
>                                 unsigned int timeout, uint32_t pipe_id,
>                                 uint32_t queue_id);
>  static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd);
> @@ -482,7 +484,9 @@ static inline struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd)
>  }
>
>  static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
> -                       uint32_t queue_id, uint32_t __user *wptr)
> +                       uint32_t queue_id, uint32_t __user *wptr,
> +                       uint32_t wptr_shift, uint32_t wptr_mask,
> +                       struct mm_struct *mm)
>  {
>         uint32_t wptr_shadow, is_wptr_shadow_valid;
>         struct cik_mqd *m;
> @@ -636,7 +640,7 @@ static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd)
>         return false;
>  }
>
> -static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
> +static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd, uint32_t reset_type,
>                                 unsigned int timeout, uint32_t pipe_id,
>                                 uint32_t queue_id)
>  {
> --
> 2.7.4
>


This patch is:
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
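
As an aside for anyone following the new wptr handling above: the
wptr_shift/wptr_mask arguments are only a unit conversion for the
user-mode write pointer. A rough illustration with made-up values (not
taken from the patch):

        /* AQL write pointers count 64-byte packets; the CP wants dwords,
         * so AQL queues use wptr_shift = 4 (64 bytes == 16 dwords). */
        uint32_t wptr = 3;                        /* 3 AQL packets written */
        uint32_t wptr_in_dwords = wptr << 4;      /* 48 dwords */

        /* PM4 write pointers already count dwords, so wptr_shift = 0 and
         * only the ring-size mask is applied to wrap the value. */
        uint32_t queue_size_dwords = 1024;        /* hypothetical ring size */
        uint32_t hw_wptr = wptr_in_dwords & (queue_size_dwords - 1);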

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 13/19] drm/amdkfd: Handle remaining BUG_ONs more gracefully
       [not found]             ` <DM5PR1201MB0235FD5550E28485BCF31EB1928E0-grEf7a3NxMBd8L2jMOIKKmrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2017-08-13  8:48               ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-13  8:48 UTC (permalink / raw)
  To: Kuehling, Felix; +Cc: amd-gfx list

On Sat, Aug 12, 2017 at 9:37 PM, Kuehling, Felix <Felix.Kuehling@amd.com> wrote:
> Sorry about the weird quoting format. I'm using Outlook Web Access from home. Comments inline, marked [FK].
>
> ________________________________________
> From: Oded Gabbay <oded.gabbay@gmail.com>
> Sent: Saturday, August 12, 2017 10:39 AM
> To: Kuehling, Felix
> Cc: amd-gfx list
> Subject: Re: [PATCH 13/19] drm/amdkfd: Handle remaining BUG_ONs more gracefully
>
> On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>> In most cases, BUG_ONs can be replaced with a WARN_ON and an error
>> return. In some void functions, just turn them into a WARN_ON and
>> possibly an early exit.
>>
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            |  3 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  3 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 16 ++++----
>>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 19 ++++-----
>>  .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |  2 +-
>>  .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |  2 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      | 20 +++++++---
>>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  3 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  3 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 45 +++++++++++++---------
>>  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c             |  4 +-
>>  drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  9 ++---
>>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  7 ++--
>>  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  4 +-
>>  14 files changed, 84 insertions(+), 56 deletions(-)
> [snip]
>
>>> @@ -610,12 +616,15 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>>                                 queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only;
>>                 break;
>>         default:
>> -               BUG();
>> -               break;
>> +               WARN(1, "filter %d", mode);
>> +               retval = -EINVAL;
>>         }
>>
>> -       pm->priv_queue->ops.submit_packet(pm->priv_queue);
>> -
>> +err_invalid:
>> +       if (!retval)
>> +               pm->priv_queue->ops.submit_packet(pm->priv_queue);
>> +       else
>> +               pm->priv_queue->ops.rollback_packet(pm->priv_queue);
>
> I don't feel comfortable putting a valid code path under an "err_invalid" label.
> This defeats the purpose of the goto statement and common cleanup code,
> making the code unreadable.
>
> [FK] It is common clean-up code that is needed in both the error and the success case. If you prefer, I can separate the error cleanup from the normal cleanup, but that will result in some duplication.

I believe the correct way is to do the following, because then you
don't need to check retval in the common cleanup code, which to me
looks strange.

<start> pm_send_unmap_queue()

        if (retval != 0)
-                goto err_acquire_packet_buffer;
+               goto unlock_mutex;

<...>

+               WARN(1, "filter %d", mode);
+               retval = -EINVAL;
+               goto err_invalid;
        }

        pm->priv_queue->ops.submit_packet(pm->priv_queue);
+      goto unlock_mutex;

+err_invalid:
+      pm->priv_queue->ops.rollback_packet(pm->priv_queue);
-err_acquire_packet_buffer:
+unlock_mutex:
        mutex_unlock(&pm->lock);
        return retval;
}

<end> pm_send_unmap_queue()
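
Or, spelled out as the resulting shape of pm_send_unmap_queue() (untested
sketch; the elided parts are exactly what the patch already has):

        mutex_lock(&pm->lock);
        retval = pm->priv_queue->ops.acquire_packet_buffer(
                        pm->priv_queue,
                        sizeof(struct pm4_unmap_queues) / sizeof(uint32_t),
                        &buffer);
        if (retval)
                goto unlock_mutex;

        /* ... fill in the unmap_queues packet ... */

        switch (mode) {
        /* ... the valid filter cases, each ending with break ... */
        default:
                WARN(1, "filter %d", mode);
                retval = -EINVAL;
                goto err_invalid;
        }

        pm->priv_queue->ops.submit_packet(pm->priv_queue);
        goto unlock_mutex;

err_invalid:
        pm->priv_queue->ops.rollback_packet(pm->priv_queue);
unlock_mutex:
        mutex_unlock(&pm->lock);
        return retval;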

>
> Also, the rollback packet function was not in the original code. Why
> did you add it here ?
>
> [FK] With the BUG(), this function would not return in case of an error. Without the BUG() it returns with an error code, so it needs to clean up after itself by releasing the space it allocated on the queue. Otherwise the next user of the queue will submit garbage.
>
>>  err_acquire_packet_buffer:
>>         mutex_unlock(&pm->lock);
>>         return retval;
> [snip]
>
>> @@ -202,10 +200,10 @@ static void kfd_process_destroy_delayed(struct rcu_head *rcu)
>>         struct kfd_process_release_work *work;
>>         struct kfd_process *p;
>>
>> -       BUG_ON(!kfd_process_wq);
>> +       WARN_ON(!kfd_process_wq);
> I think this is redundant, as kfd_process_wq is later dereferenced
> inside queue_work (as *wq). So we will get a violation there anyway.
>
> [FK] OK.
>
> Regards,
>   Felix
>
>>
>>         p = container_of(rcu, struct kfd_process, rcu);
>> -       BUG_ON(atomic_read(&p->mm->mm_count) <= 0);
>> +       WARN_ON(atomic_read(&p->mm->mm_count) <= 0);
>>
>>         mmdrop(p->mm);
>>
>> @@ -229,7 +227,8 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
>>          * mmu_notifier srcu is read locked
>>          */
>>         p = container_of(mn, struct kfd_process, mmu_notifier);
>> -       BUG_ON(p->mm != mm);
>> +       if (WARN_ON(p->mm != mm))
>> +               return;
>>
>>         mutex_lock(&kfd_processes_mutex);
>>         hash_del_rcu(&p->kfd_processes);
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
>> index f6ecdff..1cae95e 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
>> @@ -218,8 +218,8 @@ int pqm_create_queue(struct process_queue_manager *pqm,
>>                                                         kq, &pdd->qpd);
>>                 break;
>>         default:
>> -               BUG();
>> -               break;
>> +               WARN(1, "Invalid queue type %d", type);
>> +               retval = -EINVAL;
>>         }
>>
>>         if (retval != 0) {
>> @@ -272,7 +272,8 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
>>                 dev = pqn->kq->dev;
>>         if (pqn->q)
>>                 dev = pqn->q->device;
>> -       BUG_ON(!dev);
>> +       if (WARN_ON(!dev))
>> +               return -ENODEV;
>>
>>         pdd = kfd_get_process_device_data(dev, pqm->process);
>>         if (!pdd) {
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> index e5486f4..19ce590 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> @@ -799,10 +799,12 @@ static int kfd_build_sysfs_node_entry(struct kfd_topology_device *dev,
>>         int ret;
>>         uint32_t i;
>>
>> +       if (WARN_ON(dev->kobj_node))
>> +               return -EEXIST;
>> +
>>         /*
>>          * Creating the sysfs folders
>>          */
>> -       BUG_ON(dev->kobj_node);
>>         dev->kobj_node = kfd_alloc_struct(dev->kobj_node);
>>         if (!dev->kobj_node)
>>                 return -ENOMEM;
>> --
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
       [not found]                 ` <BN6PR12MB13489181F8A90932135C1CBBE88E0-/b2+HYfkarQX0pEhCR5T8QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2017-08-13  8:49                   ` Oded Gabbay
       [not found]                     ` <CAFCwf12edu4VXLP8UTTJk+x9uu9D1bkgO23FpiJbuz3BEreYzg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Oded Gabbay @ 2017-08-13  8:49 UTC (permalink / raw)
  To: Bridgman, John; +Cc: Kuehling, Felix, amd-gfx list

On Sat, Aug 12, 2017 at 10:09 PM, Bridgman, John <John.Bridgman@amd.com> wrote:
> IIRC the amdgpu devs had been holding back on publishing the updated MEC microcode (with scratch support) because that WOULD have broken Kaveri. With this change from Felix we should be able to publish the newest microcode for both amdgpu and amdkfd WITHOUT breaking Kaveri.
>
> IOW this is the "scratch fix for Kaveri KFD" you have wanted for a couple of years :)

ah, ok.

In that case, this patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>


>
>>-----Original Message-----
>>From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
>>Of Kuehling, Felix
>>Sent: Saturday, August 12, 2017 2:16 PM
>>To: Oded Gabbay
>>Cc: amd-gfx list
>>Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
>>
>>> Do you mean that it won't work with Kaveri anymore ?
>>
>>Kaveri got the same firmware changes, mostly for scratch memory support.
>>The Kaveri firmware headers name the structures and fields a bit differently
>>but they should be binary compatible. So we simplified the code to use only
>>one set of headers. I'll grab a Kaveri system to confirm that it works.
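>>
>>Just to make "binary compatible" concrete, a rough, hypothetical sketch
>>(assumed bit widths; the point is only that the layout is identical and
>>just the identifiers differ, e.g. u32all vs. u32All):
>>
>>        union pm4_type3_header_kv {             /* hypothetical KV-style */
>>                struct {
>>                        uint32_t reserved1:8;
>>                        uint32_t opcode:8;
>>                        uint32_t count:14;
>>                        uint32_t type:2;
>>                };
>>                uint32_t u32all;
>>        };
>>
>>        union pm4_type3_header_vi {             /* hypothetical VI-style */
>>                struct {
>>                        uint32_t reserved1:8;
>>                        uint32_t opcode:8;
>>                        uint32_t count:14;
>>                        uint32_t type:2;
>>                };
>>                uint32_t u32All;                /* only the spelling differs */
>>        };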
>>
>>Regards,
>>  Felix
>>
>>From: Oded Gabbay <oded.gabbay@gmail.com>
>>Sent: Saturday, August 12, 2017 11:10 AM
>>To: Kuehling, Felix
>>Cc: amd-gfx list
>>Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
>>
>>On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com>
>>wrote:
>>> To match current firmware. The map process packet has been extended to
>>> support scratch. This is a non-backwards compatible change and it's
>>> about two years old. So no point keeping the old version around
>>> conditionally.
>>
>>Do you mean that it won't work with Kaveri anymore ?
>>I believe we aren't allowed to break older H/W support without some
>>serious justification.
>>
>>Oded
>>
>>>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>> ---
>>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c         |   8 +-
>>>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 161 ++++--------
>>>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h    | 314
>>>+++---------------------
>>>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h | 130 +++++++++-
>>>  4 files changed, 199 insertions(+), 414 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> index e1c2ad2..e790e7f 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>> @@ -26,7 +26,7 @@
>>>  #include <linux/slab.h>
>>>  #include "kfd_priv.h"
>>>  #include "kfd_device_queue_manager.h"
>>> -#include "kfd_pm4_headers.h"
>>> +#include "kfd_pm4_headers_vi.h"
>>>
>>>  #define MQD_SIZE_ALIGNED 768
>>>
>>> @@ -238,9 +238,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>>          * calculate max size of runlist packet.
>>>          * There can be only 2 packets at once
>>>          */
>>> -       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct
>>>pm4_map_process) +
>>> -               max_num_of_queues_per_device *
>>> -               sizeof(struct pm4_map_queues) + sizeof(struct
>>>pm4_runlist)) * 2;
>>> +       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct
>>> +pm4_mes_map_process) +
>>> +               max_num_of_queues_per_device * sizeof(struct
>>> +pm4_mes_map_queues)
>>> +               + sizeof(struct pm4_mes_runlist)) * 2;
>>>
>>>         /* Add size of HIQ & DIQ */
>>>         size += KFD_KERNEL_QUEUE_SIZE * 2;  diff --git
>>>a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>> index 77a6f2b..3141e05 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>> @@ -26,7 +26,6 @@
>>>  #include "kfd_device_queue_manager.h"
>>>  #include "kfd_kernel_queue.h"
>>>  #include "kfd_priv.h"
>>> -#include "kfd_pm4_headers.h"
>>>  #include "kfd_pm4_headers_vi.h"
>>>  #include "kfd_pm4_opcodes.h"
>>>
>>> @@ -44,12 +43,12 @@ static unsigned int build_pm4_header(unsigned int
>>>opcode, size_t packet_size)
>>>  {
>>>         union PM4_MES_TYPE_3_HEADER header;
>>>
>>> -       header.u32all = 0;
>>> +       header.u32All = 0;
>>>         header.opcode = opcode;
>>>         header.count = packet_size/sizeof(uint32_t) - 2;
>>>         header.type = PM4_TYPE_3;
>>>
>>> -       return header.u32all;
>>> +       return header.u32All;
>>>  }
>>>
>>>  static void pm_calc_rlib_size(struct packet_manager *pm,  @@ -69,12
>>>+68,9 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>>>                 pr_debug("Over subscribed runlist\n");
>>>         }
>>>
>>> -       map_queue_size =
>>> -               (pm->dqm->dev->device_info->asic_family == CHIP_CARRIZO) ?
>>> -               sizeof(struct pm4_mes_map_queues) :
>>> -               sizeof(struct pm4_map_queues);
>>> +       map_queue_size = sizeof(struct pm4_mes_map_queues);
>>>         /* calculate run list ib allocation size */
>>> -       *rlib_size = process_count * sizeof(struct pm4_map_process) +
>>> +       *rlib_size = process_count * sizeof(struct
>>> +pm4_mes_map_process) +
>>>                      queue_count * map_queue_size;
>>>
>>>         /*
>>> @@ -82,7 +78,7 @@ static void pm_calc_rlib_size(struct packet_manager
>>>*pm,
>>>          * when over subscription
>>>          */
>>>         if (*over_subscription)
>>> -               *rlib_size += sizeof(struct pm4_runlist);
>>> +               *rlib_size += sizeof(struct pm4_mes_runlist);
>>>
>>>         pr_debug("runlist ib size %d\n", *rlib_size);
>>>  }
>>> @@ -119,16 +115,16 @@ static int pm_allocate_runlist_ib(struct
>>>packet_manager *pm,
>>>  static int pm_create_runlist(struct packet_manager *pm, uint32_t
>>>*buffer,
>>>                         uint64_t ib, size_t ib_size_in_dwords, bool
>>>chain)
>>>  {
>>> -       struct pm4_runlist *packet;
>>> +       struct pm4_mes_runlist *packet;
>>>
>>>         if (WARN_ON(!ib))
>>>                 return -EFAULT;
>>>
>>> -       packet = (struct pm4_runlist *)buffer;
>>> +       packet = (struct pm4_mes_runlist *)buffer;
>>>
>>> -       memset(buffer, 0, sizeof(struct pm4_runlist));
>>> -       packet->header.u32all = build_pm4_header(IT_RUN_LIST,
>>> -                                               sizeof(struct
>>> pm4_runlist));
>>> +       memset(buffer, 0, sizeof(struct pm4_mes_runlist));
>>> +       packet->header.u32All = build_pm4_header(IT_RUN_LIST,
>>> +                                               sizeof(struct
>>> +pm4_mes_runlist));
>>>
>>>         packet->bitfields4.ib_size = ib_size_in_dwords;
>>>         packet->bitfields4.chain = chain ? 1 : 0;  @@ -143,16 +139,16
>>>@@ static int pm_create_runlist(struct packet_manager *pm, uint32_t
>>>*buffer,
>>>  static int pm_create_map_process(struct packet_manager *pm, uint32_t
>>>*buffer,
>>>                                 struct qcm_process_device *qpd)
>>>  {
>>> -       struct pm4_map_process *packet;
>>> +       struct pm4_mes_map_process *packet;
>>>         struct queue *cur;
>>>         uint32_t num_queues;
>>>
>>> -       packet = (struct pm4_map_process *)buffer;
>>> +       packet = (struct pm4_mes_map_process *)buffer;
>>>
>>> -       memset(buffer, 0, sizeof(struct pm4_map_process));
>>> +       memset(buffer, 0, sizeof(struct pm4_mes_map_process));
>>>
>>> -       packet->header.u32all = build_pm4_header(IT_MAP_PROCESS,
>>> -                                       sizeof(struct
>>> pm4_map_process));
>>> +       packet->header.u32All = build_pm4_header(IT_MAP_PROCESS,
>>> +                                       sizeof(struct
>>> +pm4_mes_map_process));
>>>         packet->bitfields2.diq_enable = (qpd->is_debug) ? 1 : 0;
>>>         packet->bitfields2.process_quantum = 1;
>>>         packet->bitfields2.pasid = qpd->pqm->process->pasid;  @@
>>>-170,23 +166,26 @@ static int pm_create_map_process(struct
>>>packet_manager *pm, uint32_t *buffer,
>>>         packet->sh_mem_ape1_base = qpd->sh_mem_ape1_base;
>>>         packet->sh_mem_ape1_limit = qpd->sh_mem_ape1_limit;
>>>
>>> +       /* TODO: scratch support */
>>> +       packet->sh_hidden_private_base_vmid = 0;
>>> +
>>>         packet->gds_addr_lo = lower_32_bits(qpd->gds_context_area);
>>>         packet->gds_addr_hi = upper_32_bits(qpd->gds_context_area);
>>>
>>>         return 0;
>>>  }
>>>
>>> -static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t
>>> *buffer,
>>> +static int pm_create_map_queue(struct packet_manager *pm, uint32_t
>>> +*buffer,
>>>                 struct queue *q, bool is_static)
>>>  {
>>>         struct pm4_mes_map_queues *packet;
>>>         bool use_static = is_static;
>>>
>>>         packet = (struct pm4_mes_map_queues *)buffer;
>>> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
>>> +       memset(buffer, 0, sizeof(struct pm4_mes_map_queues));
>>>
>>> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
>>> -                                               sizeof(struct
>>> pm4_map_queues));
>>> +       packet->header.u32All = build_pm4_header(IT_MAP_QUEUES,
>>> +                                               sizeof(struct
>>> +pm4_mes_map_queues));
>>>         packet->bitfields2.alloc_format =
>>>                 alloc_format__mes_map_queues__one_per_pipe_vi;
>>>         packet->bitfields2.num_queues = 1;  @@ -235,64 +234,6 @@
>>>static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t
>>>*buffer,
>>>         return 0;
>>>  }
>>>
>>> -static int pm_create_map_queue(struct packet_manager *pm, uint32_t
>>>*buffer,
>>> -                               struct queue *q, bool is_static)  -{
>>> -       struct pm4_map_queues *packet;
>>> -       bool use_static = is_static;
>>> -
>>> -       packet = (struct pm4_map_queues *)buffer;
>>> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
>>> -
>>> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
>>> -                                               sizeof(struct
>>>pm4_map_queues));
>>> -       packet->bitfields2.alloc_format =
>>> -
>>>alloc_format__mes_map_queues__one_per_pipe;
>>> -       packet->bitfields2.num_queues = 1;
>>> -       packet->bitfields2.queue_sel =
>>> -
>>>queue_sel__mes_map_queues__map_to_hws_determined_queue_slots;
>>> -
>>> -       packet->bitfields2.vidmem = (q->properties.is_interop) ?
>>> -                       vidmem__mes_map_queues__uses_video_memory :
>>> -                       vidmem__mes_map_queues__uses_no_video_memory;
>>> -
>>> -       switch (q->properties.type) {
>>> -       case KFD_QUEUE_TYPE_COMPUTE:
>>> -       case KFD_QUEUE_TYPE_DIQ:
>>> -               packet->bitfields2.engine_sel =
>>> -                               engine_sel__mes_map_queues__compute;
>>> -               break;
>>> -       case KFD_QUEUE_TYPE_SDMA:
>>> -               packet->bitfields2.engine_sel =
>>> -                               engine_sel__mes_map_queues__sdma0;
>>> -               use_static = false; /* no static queues under SDMA */
>>> -               break;
>>> -       default:
>>> -               WARN(1, "queue type %d", q->properties.type);
>>> -               return -EINVAL;
>>> -       }
>>> -
>>> -       packet->mes_map_queues_ordinals[0].bitfields3.doorbell_offset
>>>=
>>> -                       q->properties.doorbell_off;
>>> -
>>> -       packet->mes_map_queues_ordinals[0].bitfields3.is_static =
>>> -                       (use_static) ? 1 : 0;
>>> -
>>> -       packet->mes_map_queues_ordinals[0].mqd_addr_lo =
>>> -                       lower_32_bits(q->gart_mqd_addr);
>>> -
>>> -       packet->mes_map_queues_ordinals[0].mqd_addr_hi =
>>> -                       upper_32_bits(q->gart_mqd_addr);
>>> -
>>> -       packet->mes_map_queues_ordinals[0].wptr_addr_lo =
>>> -
>>>lower_32_bits((uint64_t)q->properties.write_ptr);
>>> -
>>> -       packet->mes_map_queues_ordinals[0].wptr_addr_hi =
>>> -
>>>upper_32_bits((uint64_t)q->properties.write_ptr);
>>> -
>>> -       return 0;
>>> -}
>>> -
>>>  static int pm_create_runlist_ib(struct packet_manager *pm,
>>>                                 struct list_head *queues,
>>>                                 uint64_t *rl_gpu_addr,  @@ -334,7
>>>+275,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>>>                         return retval;
>>>
>>>                 proccesses_mapped++;
>>> -               inc_wptr(&rl_wptr, sizeof(struct pm4_map_process),
>>> +               inc_wptr(&rl_wptr, sizeof(struct pm4_mes_map_process),
>>>                                 alloc_size_bytes);
>>>
>>>                 list_for_each_entry(kq, &qpd->priv_queue_list, list) {
>>>@@ -344,14 +285,7 @@ static int pm_create_runlist_ib(struct
>>>packet_manager *pm,
>>>                         pr_debug("static_queue, mapping kernel q %d,
>>>is debug status %d\n",
>>>                                 kq->queue->queue, qpd->is_debug);
>>>
>>> -                       if (pm->dqm->dev->device_info->asic_family ==
>>> -                                       CHIP_CARRIZO)
>>> -                               retval = pm_create_map_queue_vi(pm,
>>> -                                               &rl_buffer[rl_wptr],
>>> -                                               kq->queue,
>>> -                                               qpd->is_debug);
>>> -                       else
>>> -                               retval = pm_create_map_queue(pm,
>>> +                       retval = pm_create_map_queue(pm,
>>>                                                 &rl_buffer[rl_wptr],
>>>                                                 kq->queue,
>>>                                                 qpd->is_debug);  @@
>>>-359,7 +293,7 @@ static int pm_create_runlist_ib(struct packet_manager
>>>*pm,
>>>                                 return retval;
>>>
>>>                         inc_wptr(&rl_wptr,
>>> -                               sizeof(struct pm4_map_queues),
>>> +                               sizeof(struct pm4_mes_map_queues),
>>>                                 alloc_size_bytes);
>>>                 }
>>>
>>> @@ -370,14 +304,7 @@ static int pm_create_runlist_ib(struct
>>>packet_manager *pm,
>>>                         pr_debug("static_queue, mapping user queue %d,
>>>is debug status %d\n",
>>>                                 q->queue, qpd->is_debug);
>>>
>>> -                       if (pm->dqm->dev->device_info->asic_family ==
>>> -                                       CHIP_CARRIZO)
>>> -                               retval = pm_create_map_queue_vi(pm,
>>> -                                               &rl_buffer[rl_wptr],
>>> -                                               q,
>>> -                                               qpd->is_debug);
>>> -                       else
>>> -                               retval = pm_create_map_queue(pm,
>>> +                       retval = pm_create_map_queue(pm,
>>>                                                 &rl_buffer[rl_wptr],
>>>                                                 q,
>>>                                                 qpd->is_debug);  @@
>>>-386,7 +313,7 @@ static int pm_create_runlist_ib(struct packet_manager
>>>*pm,
>>>                                 return retval;
>>>
>>>                         inc_wptr(&rl_wptr,
>>> -                               sizeof(struct pm4_map_queues),
>>> +                               sizeof(struct pm4_mes_map_queues),
>>>                                 alloc_size_bytes);
>>>                 }
>>>         }
>>> @@ -429,7 +356,7 @@ void pm_uninit(struct packet_manager *pm)
>>>  int pm_send_set_resources(struct packet_manager *pm,
>>>                                 struct scheduling_resources *res)
>>>  {
>>> -       struct pm4_set_resources *packet;
>>> +       struct pm4_mes_set_resources *packet;
>>>         int retval = 0;
>>>
>>>         mutex_lock(&pm->lock);
>>> @@ -442,9 +369,9 @@ int pm_send_set_resources(struct packet_manager
>>>*pm,
>>>                 goto out;
>>>         }
>>>
>>> -       memset(packet, 0, sizeof(struct pm4_set_resources));
>>> -       packet->header.u32all = build_pm4_header(IT_SET_RESOURCES,
>>> -                                       sizeof(struct
>>> pm4_set_resources));
>>> +       memset(packet, 0, sizeof(struct pm4_mes_set_resources));
>>> +       packet->header.u32All = build_pm4_header(IT_SET_RESOURCES,
>>> +                                       sizeof(struct
>>> +pm4_mes_set_resources));
>>>
>>>         packet->bitfields2.queue_type =
>>>
>>>queue_type__mes_set_resources__hsa_interface_queue_hiq;
>>> @@ -482,7 +409,7 @@ int pm_send_runlist(struct packet_manager *pm,
>>>struct list_head *dqm_queues)
>>>
>>>         pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
>>>
>>> -       packet_size_dwords = sizeof(struct pm4_runlist) /
>>> sizeof(uint32_t);
>>> +       packet_size_dwords = sizeof(struct pm4_mes_runlist) /
>>> +sizeof(uint32_t);
>>>         mutex_lock(&pm->lock);
>>>
>>>         retval =
>>>pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
>>> @@ -514,7 +441,7 @@ int pm_send_query_status(struct packet_manager
>>>*pm, uint64_t fence_address,
>>>                         uint32_t fence_value)
>>>  {
>>>         int retval;
>>> -       struct pm4_query_status *packet;
>>> +       struct pm4_mes_query_status *packet;
>>>
>>>         if (WARN_ON(!fence_address))
>>>                 return -EFAULT;
>>> @@ -522,13 +449,13 @@ int pm_send_query_status(struct packet_manager
>>>*pm, uint64_t fence_address,
>>>         mutex_lock(&pm->lock);
>>>         retval = pm->priv_queue->ops.acquire_packet_buffer(
>>>                         pm->priv_queue,
>>> -                       sizeof(struct pm4_query_status) /
>>>sizeof(uint32_t),
>>> +                       sizeof(struct pm4_mes_query_status) /
>>> +sizeof(uint32_t),
>>>                         (unsigned int **)&packet);
>>>         if (retval)
>>>                 goto fail_acquire_packet_buffer;
>>>
>>> -       packet->header.u32all = build_pm4_header(IT_QUERY_STATUS,
>>> -                                       sizeof(struct
>>> pm4_query_status));
>>> +       packet->header.u32All = build_pm4_header(IT_QUERY_STATUS,
>>> +                                       sizeof(struct
>>> +pm4_mes_query_status));
>>>
>>>         packet->bitfields2.context_id = 0;
>>>         packet->bitfields2.interrupt_sel =  @@ -555,22 +482,22 @@ int
>>>pm_send_unmap_queue(struct packet_manager *pm, enum
>>kfd_queue_type
>>>type,
>>>  {
>>>         int retval;
>>>         uint32_t *buffer;
>>> -       struct pm4_unmap_queues *packet;
>>> +       struct pm4_mes_unmap_queues *packet;
>>>
>>>         mutex_lock(&pm->lock);
>>>         retval = pm->priv_queue->ops.acquire_packet_buffer(
>>>                         pm->priv_queue,
>>> -                       sizeof(struct pm4_unmap_queues) /
>>>sizeof(uint32_t),
>>> +                       sizeof(struct pm4_mes_unmap_queues) /
>>> +sizeof(uint32_t),
>>>                         &buffer);
>>>         if (retval)
>>>                 goto err_acquire_packet_buffer;
>>>
>>> -       packet = (struct pm4_unmap_queues *)buffer;
>>> -       memset(buffer, 0, sizeof(struct pm4_unmap_queues));
>>> +       packet = (struct pm4_mes_unmap_queues *)buffer;
>>> +       memset(buffer, 0, sizeof(struct pm4_mes_unmap_queues));
>>>         pr_debug("static_queue: unmapping queues: mode is %d , reset
>>>is %d , type is %d\n",
>>>                 mode, reset, type);
>>> -       packet->header.u32all = build_pm4_header(IT_UNMAP_QUEUES,
>>> -                                       sizeof(struct
>>>pm4_unmap_queues));
>>> +       packet->header.u32All = build_pm4_header(IT_UNMAP_QUEUES,
>>> +                                       sizeof(struct
>>> +pm4_mes_unmap_queues));
>>>         switch (type) {
>>>         case KFD_QUEUE_TYPE_COMPUTE:
>>>         case KFD_QUEUE_TYPE_DIQ:
>>> @@ -608,12 +535,12 @@ int pm_send_unmap_queue(struct
>>packet_manager
>>>*pm, enum kfd_queue_type type,
>>>                 break;
>>>         case KFD_PREEMPT_TYPE_FILTER_ALL_QUEUES:
>>>                 packet->bitfields2.queue_sel =
>>> -
>>>queue_sel__mes_unmap_queues__perform_request_on_all_active_queue
>>s;
>>> +
>>> +queue_sel__mes_unmap_queues__unmap_all_queues;
>>>                 break;
>>>         case KFD_PREEMPT_TYPE_FILTER_DYNAMIC_QUEUES:
>>>                 /* in this case, we do not preempt static queues */
>>>                 packet->bitfields2.queue_sel =
>>> -
>>>queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues
>>_only;
>>> +
>>> +queue_sel__mes_unmap_queues__unmap_all_non_static_queues;
>>>                 break;
>>>         default:
>>>                 WARN(1, "filter %d", mode);  diff --git
>>>a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>> index 97e5442..e50f73d 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>> @@ -41,99 +41,6 @@ union PM4_MES_TYPE_3_HEADER {
>>>  };
>>>  #endif /* PM4_MES_HEADER_DEFINED */
>>>
>>> -/* --------------------MES_SET_RESOURCES-------------------- */
>>> -
>>> -#ifndef PM4_MES_SET_RESOURCES_DEFINED -#define
>>> PM4_MES_SET_RESOURCES_DEFINED -enum
>>set_resources_queue_type_enum {
>>> -       queue_type__mes_set_resources__kernel_interface_queue_kiq = 0,
>>> -       queue_type__mes_set_resources__hsa_interface_queue_hiq = 1,
>>> -       queue_type__mes_set_resources__hsa_debug_interface_queue = 4
>>> -};
>>> -
>>> -struct pm4_set_resources {
>>> -       union {
>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>> -               uint32_t ordinal1;
>>> -       };
>>> -
>>> -       union {
>>> -               struct {
>>> -                       uint32_t vmid_mask:16;
>>> -                       uint32_t unmap_latency:8;
>>> -                       uint32_t reserved1:5;
>>> -                       enum set_resources_queue_type_enum
>>> queue_type:3;
>>> -               } bitfields2;
>>> -               uint32_t ordinal2;
>>> -       };
>>> -
>>> -       uint32_t queue_mask_lo;
>>> -       uint32_t queue_mask_hi;
>>> -       uint32_t gws_mask_lo;
>>> -       uint32_t gws_mask_hi;
>>> -
>>> -       union {
>>> -               struct {
>>> -                       uint32_t oac_mask:16;
>>> -                       uint32_t reserved2:16;
>>> -               } bitfields7;
>>> -               uint32_t ordinal7;
>>> -       };
>>> -
>>> -       union {
>>> -               struct {
>>> -                       uint32_t gds_heap_base:6;
>>> -                       uint32_t reserved3:5;
>>> -                       uint32_t gds_heap_size:6;
>>> -                       uint32_t reserved4:15;
>>> -               } bitfields8;
>>> -               uint32_t ordinal8;
>>> -       };
>>> -
>>> -};
>>> -#endif
>>> -
>>> -/*--------------------MES_RUN_LIST-------------------- */
>>> -
>>> -#ifndef PM4_MES_RUN_LIST_DEFINED
>>> -#define PM4_MES_RUN_LIST_DEFINED
>>> -
>>> -struct pm4_runlist {
>>> -       union {
>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>> -               uint32_t ordinal1;
>>> -       };
>>> -
>>> -       union {
>>> -               struct {
>>> -                       uint32_t reserved1:2;
>>> -                       uint32_t ib_base_lo:30;
>>> -               } bitfields2;
>>> -               uint32_t ordinal2;
>>> -       };
>>> -
>>> -       union {
>>> -               struct {
>>> -                       uint32_t ib_base_hi:16;
>>> -                       uint32_t reserved2:16;
>>> -               } bitfields3;
>>> -               uint32_t ordinal3;
>>> -       };
>>> -
>>> -       union {
>>> -               struct {
>>> -                       uint32_t ib_size:20;
>>> -                       uint32_t chain:1;
>>> -                       uint32_t offload_polling:1;
>>> -                       uint32_t reserved3:1;
>>> -                       uint32_t valid:1;
>>> -                       uint32_t reserved4:8;
>>> -               } bitfields4;
>>> -               uint32_t ordinal4;
>>> -       };
>>> -
>>> -};
>>> -#endif
>>>
>>>  /*--------------------MES_MAP_PROCESS-------------------- */
>>>
>>> @@ -186,217 +93,58 @@ struct pm4_map_process {
>>>  };
>>>  #endif
>>>
>>> -/*--------------------MES_MAP_QUEUES--------------------*/
>>> -
>>> -#ifndef PM4_MES_MAP_QUEUES_DEFINED
>>> -#define PM4_MES_MAP_QUEUES_DEFINED
>>> -enum map_queues_queue_sel_enum {
>>> -       queue_sel__mes_map_queues__map_to_specified_queue_slots = 0,
>>> -
>>       queue_sel__mes_map_queues__map_to_hws_determined_queue_slots
>>=
>>> 1,
>>> -       queue_sel__mes_map_queues__enable_process_queues = 2 -};
>>> +#ifndef PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
>>> +#define PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
>>>
>>> -enum map_queues_vidmem_enum {
>>> -       vidmem__mes_map_queues__uses_no_video_memory = 0,
>>> -       vidmem__mes_map_queues__uses_video_memory = 1 -};
>>> -
>>> -enum map_queues_alloc_format_enum {
>>> -       alloc_format__mes_map_queues__one_per_pipe = 0,
>>> -       alloc_format__mes_map_queues__all_on_one_pipe = 1 -};
>>> -
>>> -enum map_queues_engine_sel_enum {
>>> -       engine_sel__mes_map_queues__compute = 0,
>>> -       engine_sel__mes_map_queues__sdma0 = 2,
>>> -       engine_sel__mes_map_queues__sdma1 = 3 -};
>>> -
>>> -struct pm4_map_queues {
>>> +struct pm4_map_process_scratch_kv {
>>>         union {
>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>> -               uint32_t ordinal1;
>>> -       };
>>> -
>>> -       union {
>>> -               struct {
>>> -                       uint32_t reserved1:4;
>>> -                       enum map_queues_queue_sel_enum queue_sel:2;
>>> -                       uint32_t reserved2:2;
>>> -                       uint32_t vmid:4;
>>> -                       uint32_t reserved3:4;
>>> -                       enum map_queues_vidmem_enum vidmem:2;
>>> -                       uint32_t reserved4:6;
>>> -                       enum map_queues_alloc_format_enum
>>>alloc_format:2;
>>> -                       enum map_queues_engine_sel_enum engine_sel:3;
>>> -                       uint32_t num_queues:3;
>>> -               } bitfields2;
>>> -               uint32_t ordinal2;
>>> -       };
>>> -
>>> -       struct {
>>> -               union {
>>> -                       struct {
>>> -                               uint32_t is_static:1;
>>> -                               uint32_t reserved5:1;
>>> -                               uint32_t doorbell_offset:21;
>>> -                               uint32_t reserved6:3;
>>> -                               uint32_t queue:6;
>>> -                       } bitfields3;
>>> -                       uint32_t ordinal3;
>>> -               };
>>> -
>>> -               uint32_t mqd_addr_lo;
>>> -               uint32_t mqd_addr_hi;
>>> -               uint32_t wptr_addr_lo;
>>> -               uint32_t wptr_addr_hi;
>>> -
>>> -       } mes_map_queues_ordinals[1];   /* 1..N of these ordinal
>>>groups */
>>> -
>>> -};
>>> -#endif
>>> -
>>> -/*--------------------MES_QUERY_STATUS--------------------*/
>>> -
>>> -#ifndef PM4_MES_QUERY_STATUS_DEFINED
>>> -#define PM4_MES_QUERY_STATUS_DEFINED
>>> -enum query_status_interrupt_sel_enum {
>>> -       interrupt_sel__mes_query_status__completion_status = 0,
>>> -       interrupt_sel__mes_query_status__process_status = 1,
>>> -       interrupt_sel__mes_query_status__queue_status = 2  -};
>>> -
>>> -enum query_status_command_enum {
>>> -       command__mes_query_status__interrupt_only = 0,
>>> -       command__mes_query_status__fence_only_immediate = 1,
>>> -       command__mes_query_status__fence_only_after_write_ack = 2,
>>> -
>>>command__mes_query_status__fence_wait_for_write_ack_send_interrupt
>>= 3
>>>-};
>>> -
>>> -enum query_status_engine_sel_enum {
>>> -       engine_sel__mes_query_status__compute = 0,
>>> -       engine_sel__mes_query_status__sdma0_queue = 2,
>>> -       engine_sel__mes_query_status__sdma1_queue = 3  -};
>>> -
>>> -struct pm4_query_status {
>>> -       union {
>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>> -               uint32_t ordinal1;
>>> -       };
>>> -
>>> -       union {
>>> -               struct {
>>> -                       uint32_t context_id:28;
>>> -                       enum query_status_interrupt_sel_enum
>>>interrupt_sel:2;
>>> -                       enum query_status_command_enum command:2;
>>> -               } bitfields2;
>>> -               uint32_t ordinal2;
>>> +               union PM4_MES_TYPE_3_HEADER   header; /* header */
>>> +               uint32_t            ordinal1;
>>>         };
>>>
>>>         union {
>>>                 struct {
>>>                         uint32_t pasid:16;
>>> -                       uint32_t reserved1:16;
>>> -               } bitfields3a;
>>> -               struct {
>>> -                       uint32_t reserved2:2;
>>> -                       uint32_t doorbell_offset:21;
>>> -                       uint32_t reserved3:3;
>>> -                       enum query_status_engine_sel_enum
>>>engine_sel:3;
>>> -                       uint32_t reserved4:3;
>>> -               } bitfields3b;
>>> -               uint32_t ordinal3;
>>> -       };
>>> -
>>> -       uint32_t addr_lo;
>>> -       uint32_t addr_hi;
>>> -       uint32_t data_lo;
>>> -       uint32_t data_hi;
>>> -};
>>> -#endif
>>> -
>>> -/*--------------------MES_UNMAP_QUEUES--------------------*/
>>> -
>>> -#ifndef PM4_MES_UNMAP_QUEUES_DEFINED
>>> -#define PM4_MES_UNMAP_QUEUES_DEFINED
>>> -enum unmap_queues_action_enum {
>>> -       action__mes_unmap_queues__preempt_queues = 0,
>>> -       action__mes_unmap_queues__reset_queues = 1,
>>> -       action__mes_unmap_queues__disable_process_queues = 2  -};
>>> -
>>> -enum unmap_queues_queue_sel_enum {
>>> -
>>>queue_sel__mes_unmap_queues__perform_request_on_specified_queues
>>= 0,
>>> -
>>       queue_sel__mes_unmap_queues__perform_request_on_pasid_queues =
>>>1,
>>> -
>>>queue_sel__mes_unmap_queues__perform_request_on_all_active_queue
>>s = 2,
>>> -
>>>queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues
>>_only = 3
>>>-};
>>> -
>>> -enum unmap_queues_engine_sel_enum {
>>> -       engine_sel__mes_unmap_queues__compute = 0,
>>> -       engine_sel__mes_unmap_queues__sdma0 = 2,
>>> -       engine_sel__mes_unmap_queues__sdma1 = 3  -};
>>> -
>>> -struct pm4_unmap_queues {
>>> -       union {
>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>> -               uint32_t ordinal1;
>>> -       };
>>> -
>>> -       union {
>>> -               struct {
>>> -                       enum unmap_queues_action_enum action:2;
>>> -                       uint32_t reserved1:2;
>>> -                       enum unmap_queues_queue_sel_enum queue_sel:2;
>>> -                       uint32_t reserved2:20;
>>> -                       enum unmap_queues_engine_sel_enum
>>>engine_sel:3;
>>> -                       uint32_t num_queues:3;
>>> +                       uint32_t reserved1:8;
>>> +                       uint32_t diq_enable:1;
>>> +                       uint32_t process_quantum:7;
>>>                 } bitfields2;
>>>                 uint32_t ordinal2;
>>>         };
>>>
>>>         union {
>>>                 struct {
>>> -                       uint32_t pasid:16;
>>> -                       uint32_t reserved3:16;
>>> -               } bitfields3a;
>>> -               struct {
>>> -                       uint32_t reserved4:2;
>>> -                       uint32_t doorbell_offset0:21;
>>> -                       uint32_t reserved5:9;
>>> -               } bitfields3b;
>>> +                       uint32_t page_table_base:28;
>>> +                       uint32_t reserved2:4;
>>> +               } bitfields3;
>>>                 uint32_t ordinal3;
>>>         };
>>>
>>> -       union {
>>> -               struct {
>>> -                       uint32_t reserved6:2;
>>> -                       uint32_t doorbell_offset1:21;
>>> -                       uint32_t reserved7:9;
>>> -               } bitfields4;
>>> -               uint32_t ordinal4;
>>> -       };
>>> -
>>> -       union {
>>> -               struct {
>>> -                       uint32_t reserved8:2;
>>> -                       uint32_t doorbell_offset2:21;
>>> -                       uint32_t reserved9:9;
>>> -               } bitfields5;
>>> -               uint32_t ordinal5;
>>> -       };
>>> +       uint32_t reserved3;
>>> +       uint32_t sh_mem_bases;
>>> +       uint32_t sh_mem_config;
>>> +       uint32_t sh_mem_ape1_base;
>>> +       uint32_t sh_mem_ape1_limit;
>>> +       uint32_t sh_hidden_private_base_vmid;
>>> +       uint32_t reserved4;
>>> +       uint32_t reserved5;
>>> +       uint32_t gds_addr_lo;
>>> +       uint32_t gds_addr_hi;
>>>
>>>         union {
>>>                 struct {
>>> -                       uint32_t reserved10:2;
>>> -                       uint32_t doorbell_offset3:21;
>>> -                       uint32_t reserved11:9;
>>> -               } bitfields6;
>>> -               uint32_t ordinal6;
>>> +                       uint32_t num_gws:6;
>>> +                       uint32_t reserved6:2;
>>> +                       uint32_t num_oac:4;
>>> +                       uint32_t reserved7:4;
>>> +                       uint32_t gds_size:6;
>>> +                       uint32_t num_queues:10;
>>> +               } bitfields14;
>>> +               uint32_t ordinal14;
>>>         };
>>>
>>> +       uint32_t completion_signal_lo32; uint32_t
>>> +completion_signal_hi32;
>>>  };
>>>  #endif
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>> index c4eda6f..7c8d9b3 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>> @@ -126,9 +126,10 @@ struct pm4_mes_runlist {
>>>                         uint32_t ib_size:20;
>>>                         uint32_t chain:1;
>>>                         uint32_t offload_polling:1;
>>> -                       uint32_t reserved3:1;
>>> +                       uint32_t reserved2:1;
>>>                         uint32_t valid:1;
>>> -                       uint32_t reserved4:8;
>>> +                       uint32_t process_cnt:4;
>>> +                       uint32_t reserved3:4;
>>>                 } bitfields4;
>>>                 uint32_t ordinal4;
>>>         };
>>> @@ -143,8 +144,8 @@ struct pm4_mes_runlist {
>>>
>>>  struct pm4_mes_map_process {
>>>         union {
>>> -               union PM4_MES_TYPE_3_HEADER   header;            /*
>>>header */
>>> -               uint32_t            ordinal1;
>>> +               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>> +               uint32_t ordinal1;
>>>         };
>>>
>>>         union {
>>> @@ -155,36 +156,48 @@ struct pm4_mes_map_process {
>>>                         uint32_t process_quantum:7;
>>>                 } bitfields2;
>>>                 uint32_t ordinal2;
>>> -};
>>> +       };
>>>
>>>         union {
>>>                 struct {
>>>                         uint32_t page_table_base:28;
>>> -                       uint32_t reserved2:4;
>>> +                       uint32_t reserved3:4;
>>>                 } bitfields3;
>>>                 uint32_t ordinal3;
>>>         };
>>>
>>> +       uint32_t reserved;
>>> +
>>>         uint32_t sh_mem_bases;
>>> +       uint32_t sh_mem_config;
>>>         uint32_t sh_mem_ape1_base;
>>>         uint32_t sh_mem_ape1_limit;
>>> -       uint32_t sh_mem_config;
>>> +
>>> +       uint32_t sh_hidden_private_base_vmid;
>>> +
>>> +       uint32_t reserved2;
>>> +       uint32_t reserved3;
>>> +
>>>         uint32_t gds_addr_lo;
>>>         uint32_t gds_addr_hi;
>>>
>>>         union {
>>>                 struct {
>>>                         uint32_t num_gws:6;
>>> -                       uint32_t reserved3:2;
>>> +                       uint32_t reserved4:2;
>>>                         uint32_t num_oac:4;
>>> -                       uint32_t reserved4:4;
>>> +                       uint32_t reserved5:4;
>>>                         uint32_t gds_size:6;
>>>                         uint32_t num_queues:10;
>>>                 } bitfields10;
>>>                 uint32_t ordinal10;
>>>         };
>>>
>>> +       uint32_t completion_signal_lo;
>>> +       uint32_t completion_signal_hi;
>>> +
>>>  };
>>> +
>>>  #endif
>>>
>>>  /*--------------------MES_MAP_QUEUES--------------------*/
>>> @@ -337,7 +350,7 @@ enum mes_unmap_queues_engine_sel_enum {
>>>         engine_sel__mes_unmap_queues__sdmal = 3
>>>  };
>>>
>>> -struct PM4_MES_UNMAP_QUEUES {
>>> +struct pm4_mes_unmap_queues {
>>>         union {
>>>                 union PM4_MES_TYPE_3_HEADER   header;            /*
>>>header */
>>>                 uint32_t            ordinal1;  @@ -397,4 +410,101 @@
>>>struct PM4_MES_UNMAP_QUEUES {
>>>  };
>>>  #endif
>>>
>>> +#ifndef PM4_MEC_RELEASE_MEM_DEFINED
>>> +#define PM4_MEC_RELEASE_MEM_DEFINED
>>> +enum RELEASE_MEM_event_index_enum {
>>> +       event_index___release_mem__end_of_pipe = 5,
>>> +       event_index___release_mem__shader_done = 6 };
>>> +
>>> +enum RELEASE_MEM_cache_policy_enum {
>>> +       cache_policy___release_mem__lru = 0,
>>> +       cache_policy___release_mem__stream = 1,
>>> +       cache_policy___release_mem__bypass = 2 };
>>> +
>>> +enum RELEASE_MEM_dst_sel_enum {
>>> +       dst_sel___release_mem__memory_controller = 0,
>>> +       dst_sel___release_mem__tc_l2 = 1,
>>> +       dst_sel___release_mem__queue_write_pointer_register = 2,
>>> +       dst_sel___release_mem__queue_write_pointer_poll_mask_bit = 3
>>> +};
>>> +
>>> +enum RELEASE_MEM_int_sel_enum {
>>> +       int_sel___release_mem__none = 0,
>>> +       int_sel___release_mem__send_interrupt_only = 1,
>>> +       int_sel___release_mem__send_interrupt_after_write_confirm = 2,
>>> +       int_sel___release_mem__send_data_after_write_confirm = 3 };
>>> +
>>> +enum RELEASE_MEM_data_sel_enum {
>>> +       data_sel___release_mem__none = 0,
>>> +       data_sel___release_mem__send_32_bit_low = 1,
>>> +       data_sel___release_mem__send_64_bit_data = 2,
>>> +       data_sel___release_mem__send_gpu_clock_counter = 3,
>>> +       data_sel___release_mem__send_cp_perfcounter_hi_lo = 4,
>>> +       data_sel___release_mem__store_gds_data_to_memory = 5 };
>>> +
>>> +struct pm4_mec_release_mem {
>>> +       union {
>>> +               union PM4_MES_TYPE_3_HEADER header;     /*header */
>>> +               unsigned int ordinal1;
>>> +       };
>>> +
>>> +       union {
>>> +               struct {
>>> +                       unsigned int event_type:6;
>>> +                       unsigned int reserved1:2;
>>> +                       enum RELEASE_MEM_event_index_enum
>>> +event_index:4;
>>> +                       unsigned int tcl1_vol_action_ena:1;
>>> +                       unsigned int tc_vol_action_ena:1;
>>> +                       unsigned int reserved2:1;
>>> +                       unsigned int tc_wb_action_ena:1;
>>> +                       unsigned int tcl1_action_ena:1;
>>> +                       unsigned int tc_action_ena:1;
>>> +                       unsigned int reserved3:6;
>>> +                       unsigned int atc:1;
>>> +                       enum RELEASE_MEM_cache_policy_enum
>>> +cache_policy:2;
>>> +                       unsigned int reserved4:5;
>>> +               } bitfields2;
>>> +               unsigned int ordinal2;
>>> +       };
>>> +
>>> +       union {
>>> +               struct {
>>> +                       unsigned int reserved5:16;
>>> +                       enum RELEASE_MEM_dst_sel_enum dst_sel:2;
>>> +                       unsigned int reserved6:6;
>>> +                       enum RELEASE_MEM_int_sel_enum int_sel:3;
>>> +                       unsigned int reserved7:2;
>>> +                       enum RELEASE_MEM_data_sel_enum data_sel:3;
>>> +               } bitfields3;
>>> +               unsigned int ordinal3;
>>> +       };
>>> +
>>> +       union {
>>> +               struct {
>>> +                       unsigned int reserved8:2;
>>> +                       unsigned int address_lo_32b:30;
>>> +               } bitfields4;
>>> +               struct {
>>> +                       unsigned int reserved9:3;
>>> +                       unsigned int address_lo_64b:29;
>>> +               } bitfields5;
>>> +               unsigned int ordinal4;
>>> +       };
>>> +
>>> +       unsigned int address_hi;
>>> +
>>> +       unsigned int data_lo;
>>> +
>>> +       unsigned int data_hi;
>>> +};
>>> +#endif
>>> +
>>> +enum {
>>> +       CACHE_FLUSH_AND_INV_TS_EVENT = 0x00000014 };
>>> +
>>>  #endif
>>> --
>>> 2.7.4
>>>
>>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 17/19] drm/amdgpu: Remove hard-coded assumptions about compute pipes
       [not found]         ` <CAFCwf108X+f6+jehRvykPy0NPCnYa6uHjoVAXWDvoN+35h-N5A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-14 14:53           ` Felix Kuehling
       [not found]             ` <0eec4793-c9fa-4dfb-c8d7-41995d20e738-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-14 14:53 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list

On 2017-08-13 04:29 AM, Oded Gabbay wrote:
> On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
>> Remove hard-coded assumption that the first compute pipe is
>> reserved for amdgpu. Pipe 0 actually means pipe 0 now.
>>
> If I'm looking at amdgpu_gfx_compute_queue_acquire(), amdgpu takes the
> first pipe, so won't this change collide with that code ?

amdgpu_gfx_compute_queue_acquire marks the queues it's using in
adev->gfx.mec.queue_bitmap. amdgpu_amdkfd_device_init uses that to
initialize the queue_bitmap in kgd2kfd_shared_resources. So KFD is not
going to use the compute queues reserved by amdgpu.
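
In other words, the split is driven entirely by the bitmap hand-off, not
by a reserved pipe number. Here is a minimal standalone sketch of that
idea (illustrative only, not the actual driver code; the queue count and
the one-queue-per-pipe reservation below are made up for the example):

#include <stdio.h>

#define MAX_QUEUES 32u

int main(void)
{
        unsigned int amdgpu_owned = 0;  /* stands in for adev->gfx.mec.queue_bitmap */
        unsigned int kfd_usable = 0;    /* stands in for the bitmap handed to KFD */
        unsigned int q;

        /* pretend amdgpu acquired the first queue of each of four pipes */
        for (q = 0; q < MAX_QUEUES; q += 8)
                amdgpu_owned |= 1u << q;

        /* KFD only gets the queues amdgpu did not reserve */
        for (q = 0; q < MAX_QUEUES; q++)
                if (!(amdgpu_owned & (1u << q)))
                        kfd_usable |= 1u << q;

        printf("amdgpu: 0x%08x\nKFD:    0x%08x\n", amdgpu_owned, kfd_usable);
        return 0;
}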

Andres Rodriguez had a series of commits a few months ago to generalize
the compute queue assignment between amdgpu and amdkfd. The hard coded
assumption that pipe0 is used by amdgpu goes back to before Andres'
patch series. I think it was overlooked when Andres made his changes.

Regards,
  Felix
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 17/19] drm/amdgpu: Remove hard-coded assumptions about compute pipes
       [not found]             ` <0eec4793-c9fa-4dfb-c8d7-41995d20e738-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-14 15:06               ` Oded Gabbay
  0 siblings, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-14 15:06 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Mon, Aug 14, 2017 at 5:53 PM, Felix Kuehling <felix.kuehling@amd.com> wrote:
>
> On 2017-08-13 04:29 AM, Oded Gabbay wrote:
> > On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> >> Remove hard-coded assumption that the first compute pipe is
> >> reserved for amdgpu. Pipe 0 actually means pipe 0 now.
> >>
> > If I'm looking at amdgpu_gfx_compute_queue_acquire(), amdgpu takes the
> > first pipe, so won't this change collide with that code ?
>
> amdgpu_gfx_compute_queue_acquire marks the queues it's using in
> adev->gfx.mec.queue_bitmap. amdgpu_amdkfd_device_init uses that to
> initialize the queue_bitmap in kgd2kfd_shared_resources. So KFD is not
> going to use the compute queues reserved by amdgpu.
>
> Andres Rodriguez had a series of commits a few months ago to generalize
> the compute queue assignment between amdgpu and amdkfd. The hard coded
> assumption that pipe0 is used by amdgpu goes back to before Andres'
> patch series. I think it was overlooked when Andres made his changes.
>
> Regards,
>   Felix

ok, thanks.

This patch is:
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
       [not found]                     ` <1d263ee5-e1e3-aa3a-a8aa-c1fbfbcbb8ab-5C7GfCeVMHo@public.gmane.org>
  2017-08-12  0:54                       ` StDenis, Tom
@ 2017-08-14 15:28                       ` Deucher, Alexander
  1 sibling, 0 replies; 70+ messages in thread
From: Deucher, Alexander @ 2017-08-14 15:28 UTC (permalink / raw)
  To: Kuehling, Felix, StDenis, Tom,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w

> -----Original Message-----
> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
> Of Felix Kuehling
> Sent: Friday, August 11, 2017 8:40 PM
> To: StDenis, Tom; amd-gfx@lists.freedesktop.org; oded.gabbay@gmail.com
> Subject: Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
> 
> With the next change that adds programming of RLC_CP_SCHEDULERS it's a
> VM fault and hard hang during boot, just after HWS initialization.
> Without that change it's only a MEC hang when the first application
> tries to create a user mode queue.

I wonder if there is an issue with changing RLC_CP_SCHEDULERS?  Maybe we need to set up KIQ and HIQ at the same time?  I'm not sure to what extent HIQ was validated with GFX PG.  I don't think HIQ is used on other OSes.  We may need some KCL options for this depending on what we are delivering (ROCm vs. OEM preload), or, longer term, maybe we can add a callback to disable PG while KFD is active.  Anyway, the patch is:
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

Alex

> 
> Regards,
>   Felix
> 
> On 2017-08-11 08:08 PM, StDenis, Tom wrote:
> > Hmm, I'd still be careful about disabling GFX PG since we may fail to meet
> energy star requirements.
> >
> > Does the system hard hang or simply GPU hang?
> >
> > Tom
> >
> > ________________________________________
> > From: Kuehling, Felix
> > Sent: Friday, August 11, 2017 19:56
> > To: StDenis, Tom; amd-gfx@lists.freedesktop.org;
> oded.gabbay@gmail.com
> > Subject: Re: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
> >
> > Yes, I'm up-to-date. KFD doesn't use the KIQ to map the HIQ. And HIQ
> > maps all our other queues (unless we're disabling the hardware scheduler).
> >
> > Regards,
> >   Felix
> >
> >
> > On 2017-08-11 07:45 PM, StDenis, Tom wrote:
> >> Hi Felix,
> >>
> >> I'm assuming your tree is up to date with amd-staging-4.11 or 4.12 but we
> did previously have issues with compute rings if PG was enabled (specifically
> CGCG + PG) on Carrizo.  Then David committed some KIQ upgrades and it
> started working properly.
> >>
> >> Could that be related?  Because GFX PG "should work" on Carrizo is the
> official line last I heard from the GFX IP team.
> >>
> >> Cheers,
> >> Tom
> >> ________________________________________
> >> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of
> Felix Kuehling <Felix.Kuehling@amd.com>
> >> Sent: Friday, August 11, 2017 17:56
> >> To: amd-gfx@lists.freedesktop.org; oded.gabbay@gmail.com
> >> Cc: Kuehling, Felix
> >> Subject: [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ
> >>
> >> It's causing problems with user mode queues and the HIQ, and can
> >> lead to hard hangs during boot after programming RLC_CP_SCHEDULERS.
> >>
> >> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> >> ---
> >>  drivers/gpu/drm/amd/amdgpu/vi.c | 3 +--
> >>  1 file changed, 1 insertion(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c
> b/drivers/gpu/drm/amd/amdgpu/vi.c
> >> index 18bb3cb..495c8a3 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
> >> @@ -1029,8 +1029,7 @@ static int vi_common_early_init(void *handle)
> >>                 /* rev0 hardware requires workarounds to support PG */
> >>                 adev->pg_flags = 0;
> >>                 if (adev->rev_id != 0x00 || CZ_REV_BRISTOL(adev->pdev-
> >revision)) {
> >> -                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_PG |
> >> -                               AMD_PG_SUPPORT_GFX_SMG |
> >> +                       adev->pg_flags |= AMD_PG_SUPPORT_GFX_SMG |
> >>                                 AMD_PG_SUPPORT_GFX_PIPELINE |
> >>                                 AMD_PG_SUPPORT_CP |
> >>                                 AMD_PG_SUPPORT_UVD |
> >> --
> >> 2.7.4
> >>
> >> _______________________________________________
> >> amd-gfx mailing list
> >> amd-gfx@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> 
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH 00/19] KFD fixes and cleanups
       [not found]         ` <DM5PR1201MB0235D9AA9FAEF6F2635942C8928E0-grEf7a3NxMBd8L2jMOIKKmrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2017-08-14 16:25           ` Deucher, Alexander
       [not found]             ` <CY4PR12MB1653ACDE8226C3E06245F8EEF78C0-rpdhrqHFk06apTa93KjAaQdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Deucher, Alexander @ 2017-08-14 16:25 UTC (permalink / raw)
  To: Kuehling, Felix, Oded Gabbay; +Cc: amd-gfx list

> -----Original Message-----
> From: Kuehling, Felix
> Sent: Saturday, August 12, 2017 2:08 PM
> To: Oded Gabbay; Deucher, Alexander
> Cc: amd-gfx list
> Subject: Re: [PATCH 00/19] KFD fixes and cleanups
> 
> [+Alex]
> 
> I'll rebase this on drm-next-4.14. Alex, is this the branch that will become the
> new default development branch for the amdgpu team? This should make
> coordination of dependent AMDGPU and KFD changes easier.

Yes.  drm-next-4.14-wip is my latest patch queue for upstream.  drm-next-4.14 is the latest code that Dave has pulled.  amd-staging-drm-next is drm-next-4.14-wip with DC and a few other things from amd-staging rebased on top.  amd-staging-drm-next will become the new amd-staging.

Alex

> 
> Regards,
>   Felix
> 
> 
> 
> From: Oded Gabbay <oded.gabbay@gmail.com>
> Sent: Saturday, August 12, 2017 8:28 AM
> To: Kuehling, Felix
> Cc: amd-gfx list
> Subject: Re: [PATCH 00/19] KFD fixes and cleanups
> 
> Hi Felix,
> Thanks for all the patches.
> I have started to review them, but I have a small request from you
> while I'm doing the review.
> Could you please rebase them over my amdkfd-next branch, or
> alternatively, over Alex's drm-next-4.14  or Dave Airlie's drm-next
> (which amdkfd-next currently points to) branches ?
> I tried to apply this patch-set on amdkfd-next, but it fails on patch
> 5. I can't upstream them to Dave when they don't apply to his upstream
> branch.
> 
> Thanks,
> Oded
> 
> On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com>
> wrote:
> > This is the first round of changes preparing for upstreaming KFD
> > changes made internally in the last 2 years at AMD. A big part of it
> > is coding style and messaging cleanup. I have tried to avoid making
> > gratuitous formatting changes. All coding style changes should have a
> > justification based on the Linux style guide.
> >
> > The last few patches (15-19) enable running pieces of the current ROCm
> > user mode stack (with minor Thunk fixes for backwards compatibility)
> > on this soon-to-be upstream kernel on CZ. At this time I can run some
> > KFDTest unit tests, which are currently not open source. I'm trying to
> > find other more substantial tests using a real compute API as a
> > baseline for testing further KFD upstreaming patches.
> >
> > This patch series is freshly rebased on amd-staging-4.12.
> >
> > Felix Kuehling (11):
> >   drm/amdkfd: Fix typo in dbgdev_wave_reset_wavefronts
> >   drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t)
> >   drm/amdkfd: Fix allocated_queues bitmap initialization
> >   drm/amdkfd: Remove BUG_ONs for NULL pointer arguments
> >   drm/amdkfd: Fix doorbell initialization and finalization
> >   drm/amdkfd: Allocate gtt_sa_bitmap in long units
> >   drm/amdkfd: Handle remaining BUG_ONs more gracefully
> >   drm/amdkfd: Update PM4 packet headers
> >   drm/amdgpu: Remove hard-coded assumptions about compute pipes
> >   drm/amdgpu: Disable GFX PG on CZ
> >   drm/amd: Update MEC HQD loading code for KFD
> >
> > Jay Cornwall (1):
> >   drm/amdkfd: Clamp EOP queue size correctly on Gfx8
> >
> > Kent Russell (5):
> >   drm/amdkfd: Clean up KFD style errors and warnings
> >   drm/amdkfd: Consolidate and clean up log commands
> >   drm/amdkfd: Change x==NULL/false references to !x
> >   drm/amdkfd: Fix goto usage
> >   drm/amdkfd: Remove usage of alloc(sizeof(struct...
> >
> > Yair Shachar (1):
> >   drm/amdkfd: Fix double Mutex lock order
> >
> > Yong Zhao (1):
> >   drm/amdkfd: Add more error printing to help bringup
> >
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |   4 +-
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h         |  16 +
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  | 156
> +++++++---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  | 185
> ++++++++++--
> >  drivers/gpu/drm/amd/amdgpu/vi.c                    |   3 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           | 107 +++----
> >  drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            | 102 +++----
> >  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  21 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h            |  27 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 122 ++++----
> >  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 313
> ++++++++-----------
> >  .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |   6 +-
> >  .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |   6 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          |  40 +--
> >  drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  33 +--
> >  drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c       |   2 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c         |   2 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |  63 ++--
> >  drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  10 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h       |   3 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  62 ++--
> >  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  46 +--
> >  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 301 +++++++---
> ---------
> >  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c             |   7 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h       | 330 +++-----------
> -------
> >  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h    | 140 ++++++++-
> >  drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  31 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  25 +-
> >  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  71 ++---
> >  drivers/gpu/drm/amd/amdkfd/kfd_queue.c             |  12 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  46 +--
> >  drivers/gpu/drm/amd/include/kgd_kfd_interface.h    |  11 +-
> >  drivers/gpu/drm/radeon/radeon_kfd.c                |  12 +-
> >  33 files changed, 1054 insertions(+), 1261 deletions(-)
> >
> > --
> > 2.7.4
> >
> 
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 00/19] KFD fixes and cleanups
       [not found]             ` <CY4PR12MB1653ACDE8226C3E06245F8EEF78C0-rpdhrqHFk06apTa93KjAaQdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2017-08-14 22:23               ` Felix Kuehling
  0 siblings, 0 replies; 70+ messages in thread
From: Felix Kuehling @ 2017-08-14 22:23 UTC (permalink / raw)
  To: Deucher, Alexander, Oded Gabbay; +Cc: amd-gfx list

Looking at the top-level Makefile, drm-next-4.14 still seems to be based
on 4.12. Dave's drm-next and Alex's drm-next-4.14-wip are currently at
4.13-rc2. I was just about to start rebasing on drm-next-4.14.

But now I'm leaning more towards drm-next-4.14-wip, since that's what
Alex would push to Dave.

Regards,
  Felix


On 2017-08-14 12:25 PM, Deucher, Alexander wrote:
>> -----Original Message-----
>> From: Kuehling, Felix
>> Sent: Saturday, August 12, 2017 2:08 PM
>> To: Oded Gabbay; Deucher, Alexander
>> Cc: amd-gfx list
>> Subject: Re: [PATCH 00/19] KFD fixes and cleanups
>>
>> [+Alex]
>>
>> I'll rebase this on drm-next-4.14. Alex, is this the branch that will become the
>> new default development branch for the amdgpu team? This should make
>> coordination of dependent AMDGPU and KFD changes easier.
> Yes.  drm-next-4.14-wip is my latest patch queue for upstream.  drm-next-4.14 is the latest code that Dave has pulled.  amd-staging-drm-next is drm-next-4.14-wip with DC and a few other things from amd-staging rebased on top.  amd-staging-drm-next will become the new amd-staging.
>
> Alex
>
>> Regards,
>>   Felix
>>
>>
>>
>> From: Oded Gabbay <oded.gabbay@gmail.com>
>> Sent: Saturday, August 12, 2017 8:28 AM
>> To: Kuehling, Felix
>> Cc: amd-gfx list
>> Subject: Re: [PATCH 00/19] KFD fixes and cleanups
>>
>> Hi Felix,
>> Thanks for all the patches.
>> I have started to review them, but I have a small request from you
>> while I'm doing the review.
>> Could you please rebase them over my amdkfd-next branch, or
>> alternatively, over Alex's drm-next-4.14  or Dave Airlie's drm-next
>> (which amdkfd-next currently points to) branches ?
>> I tried to apply this patch-set on amdkfd-next, but it fails on patch
>> 5. I can't upstream them to Dave when they don't apply to his upstream
>> branch.
>>
>> Thanks,
>> Oded
>>
>> On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com>
>> wrote:
>>> This is the first round of changes preparing for upstreaming KFD
>>> changes made internally in the last 2 years at AMD. A big part of it
>>> is coding style and messaging cleanup. I have tried to avoid making
>>> gratuitous formatting changes. All coding style changes should have a
>>> justification based on the Linux style guide.
>>>
>>> The last few patches (15-19) enable running pieces of the current ROCm
>>> user mode stack (with minor Thunk fixes for backwards compatibility)
>>> on this soon-to-be upstream kernel on CZ. At this time I can run some
>>> KFDTest unit tests, which are currently not open source. I'm trying to
>>> find other more substantial tests using a real compute API as a
>>> baseline for testing further KFD upstreaming patches.
>>>
>>> This patch series is freshly rebased on amd-staging-4.12.
>>>
>>> Felix Kuehling (11):
>>>    drm/amdkfd: Fix typo in dbgdev_wave_reset_wavefronts
>>>    drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t)
>>>    drm/amdkfd: Fix allocated_queues bitmap initialization
>>>    drm/amdkfd: Remove BUG_ONs for NULL pointer arguments
>>>    drm/amdkfd: Fix doorbell initialization and finalization
>>>    drm/amdkfd: Allocate gtt_sa_bitmap in long units
>>>    drm/amdkfd: Handle remaining BUG_ONs more gracefully
>>>    drm/amdkfd: Update PM4 packet headers
>>>    drm/amdgpu: Remove hard-coded assumptions about compute pipes
>>>    drm/amdgpu: Disable GFX PG on CZ
>>>    drm/amd: Update MEC HQD loading code for KFD
>>>
>>> Jay Cornwall (1):
>>>    drm/amdkfd: Clamp EOP queue size correctly on Gfx8
>>>
>>> Kent Russell (5):
>>>    drm/amdkfd: Clean up KFD style errors and warnings
>>>    drm/amdkfd: Consolidate and clean up log commands
>>>    drm/amdkfd: Change x==NULL/false references to !x
>>>    drm/amdkfd: Fix goto usage
>>>    drm/amdkfd: Remove usage of alloc(sizeof(struct...
>>>
>>> Yair Shachar (1):
>>>    drm/amdkfd: Fix double Mutex lock order
>>>
>>> Yong Zhao (1):
>>>    drm/amdkfd: Add more error printing to help bringup
>>>
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c         |   4 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h         |  16 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c  | 156
>> +++++++---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c  | 185
>> ++++++++++--
>>>   drivers/gpu/drm/amd/amdgpu/vi.c                    |   3 +-
>>>   drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           | 107 +++----
>>>   drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c            | 102 +++----
>>>   drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c            |  21 +-
>>>   drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h            |  27 +-
>>>   drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 122 ++++----
>>>   .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 313
>> ++++++++-----------
>>>   .../drm/amd/amdkfd/kfd_device_queue_manager_cik.c  |   6 +-
>>>   .../drm/amd/amdkfd/kfd_device_queue_manager_vi.c   |   6 +-
>>>   drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          |  40 +--
>>>   drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  33 +--
>>>   drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c       |   2 +-
>>>   drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c         |   2 +-
>>>   drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c      |  63 ++--
>>>   drivers/gpu/drm/amd/amdkfd/kfd_module.c            |  10 +-
>>>   drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h       |   3 +-
>>>   drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c   |  62 ++--
>>>   drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c    |  46 +--
>>>   drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c    | 301 +++++++---
>> ---------
>>>   drivers/gpu/drm/amd/amdkfd/kfd_pasid.c             |   7 +-
>>>   drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h       | 330 +++-----------
>> -------
>>>   drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h    | 140 ++++++++-
>>>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  31 +-
>>>   drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  25 +-
>>>   .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  71 ++---
>>>   drivers/gpu/drm/amd/amdkfd/kfd_queue.c             |  12 +-
>>>   drivers/gpu/drm/amd/amdkfd/kfd_topology.c          |  46 +--
>>>   drivers/gpu/drm/amd/include/kgd_kfd_interface.h    |  11 +-
>>>   drivers/gpu/drm/radeon/radeon_kfd.c                |  12 +-
>>>   33 files changed, 1054 insertions(+), 1261 deletions(-)
>>>
>>> --
>>> 2.7.4
>>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 14/19] drm/amdkfd: Add more error printing to help bringup
       [not found]         ` <CAFCwf12LtX8Me-DSVvnf72eZr=UQm6sWnBoSuB2DM8jbqk3nOQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-15  3:50           ` Zhao, Yong
       [not found]             ` <CY1PR1201MB109725CE39D52B9E482824FAF08D0-JBJ/M6OpXY/YBI+VM8qCl2rFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Zhao, Yong @ 2017-08-15  3:50 UTC (permalink / raw)
  To: Oded Gabbay, Kuehling, Felix; +Cc: amd-gfx list


[-- Attachment #1.1: Type: text/plain, Size: 2845 bytes --]

Oded, I agree with you. When I made the change, there was WARN_ON already in the same function lookup_device_info(), so I followed suit and used WARN again. It is indeed a bit overkill.


Felix, do I need to fix it or can you fix it directly?


Yong

________________________________
From: Oded Gabbay <oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Sent: Saturday, August 12, 2017 10:54:41 AM
To: Kuehling, Felix
Cc: amd-gfx list; Zhao, Yong
Subject: Re: [PATCH 14/19] drm/amdkfd: Add more error printing to help bringup

On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org> wrote:
> From: Yong Zhao <Yong.Zhao-5C7GfCeVMHo@public.gmane.org>
>
> Signed-off-by: Yong Zhao <Yong.Zhao-5C7GfCeVMHo@public.gmane.org>
> Signed-off-by: Felix Kuehling <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index f628ac3..e1c2ad2 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -103,6 +103,8 @@ static const struct kfd_device_info *lookup_device_info(unsigned short did)
>                 }
>         }
>
> +       WARN(1, "device is not added to supported_devices\n");
> +
I think WARN is a bit excessive here. It's not actually a warning - an
AMD gpu device is present but not supported in amdkfd.
Maybe a dev_info is more appropriate here.

Oded

>         return NULL;
>  }
>
> @@ -114,8 +116,10 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
>         const struct kfd_device_info *device_info =
>                                         lookup_device_info(pdev->device);
>
> -       if (!device_info)
> +       if (!device_info) {
> +               dev_err(kfd_device, "kgd2kfd_probe failed\n");
>                 return NULL;
> +       }
>
>         kfd = kzalloc(sizeof(*kfd), GFP_KERNEL);
>         if (!kfd)
> @@ -364,8 +368,11 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
>
>         if (kfd->init_complete) {
>                 err = amd_iommu_init_device(kfd->pdev, pasid_limit);
> -               if (err < 0)
> +               if (err < 0) {
> +                       dev_err(kfd_device, "failed to initialize iommu\n");
>                         return -ENXIO;
> +               }
> +
>                 amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
>                                                 iommu_pasid_shutdown_callback);
>                 amd_iommu_set_invalid_ppr_cb(kfd->pdev, iommu_invalid_ppr_cb);
> --
> 2.7.4
>
With the above fixed, this patch is:
Reviewed-by: Oded Gabbay <oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

[-- Attachment #1.2: Type: text/html, Size: 6117 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 14/19] drm/amdkfd: Add more error printing to help bringup
       [not found]             ` <CY1PR1201MB109725CE39D52B9E482824FAF08D0-JBJ/M6OpXY/YBI+VM8qCl2rFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
@ 2017-08-15 18:39               ` Felix Kuehling
  2017-08-16  1:14               ` Felix Kuehling
  1 sibling, 0 replies; 70+ messages in thread
From: Felix Kuehling @ 2017-08-15 18:39 UTC (permalink / raw)
  To: Zhao, Yong, Oded Gabbay; +Cc: amd-gfx list

I'll fix it in my upstreaming branch. Thanks.


On 2017-08-14 11:50 PM, Zhao, Yong wrote:
>
> Oded, I agree with you. When I made the change, there was WARN_ON
> already in the same function lookup_device_info(), so I followed
> suit and used WARN again. It is indeed a bit overkill. 
>
>
> Felix, do I need to fix it or can you fix it directly?
>
>
> Yong
>
> ------------------------------------------------------------------------
> *From:* Oded Gabbay <oded.gabbay@gmail.com>
> *Sent:* Saturday, August 12, 2017 10:54:41 AM
> *To:* Kuehling, Felix
> *Cc:* amd-gfx list; Zhao, Yong
> *Subject:* Re: [PATCH 14/19] drm/amdkfd: Add more error printing to
> help bringup
>  
> On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling
> <Felix.Kuehling@amd.com> wrote:
> > From: Yong Zhao <Yong.Zhao@amd.com>
> >
> > Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
> > Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> > ---
> >  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +++++++++--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > index f628ac3..e1c2ad2 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > @@ -103,6 +103,8 @@ static const struct kfd_device_info
> *lookup_device_info(unsigned short did)
> >                 }
> >         }
> >
> > +       WARN(1, "device is not added to supported_devices\n");
> > +
> I think WARN is a bit excessive here. It's not actually a warning - an
> AMD gpu device is present but not supported in amdkfd.
> Maybe a dev_info is more appropriate here.
>
> Oded
>
> >         return NULL;
> >  }
> >
> > @@ -114,8 +116,10 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
> >         const struct kfd_device_info *device_info =
> >                                        
> lookup_device_info(pdev->device);
> >
> > -       if (!device_info)
> > +       if (!device_info) {
> > +               dev_err(kfd_device, "kgd2kfd_probe failed\n");
> >                 return NULL;
> > +       }
> >
> >         kfd = kzalloc(sizeof(*kfd), GFP_KERNEL);
> >         if (!kfd)
> > @@ -364,8 +368,11 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
> >
> >         if (kfd->init_complete) {
> >                 err = amd_iommu_init_device(kfd->pdev, pasid_limit);
> > -               if (err < 0)
> > +               if (err < 0) {
> > +                       dev_err(kfd_device, "failed to initialize
> iommu\n");
> >                         return -ENXIO;
> > +               }
> > +
> >                 amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
> >                                                
> iommu_pasid_shutdown_callback);
> >                 amd_iommu_set_invalid_ppr_cb(kfd->pdev,
> iommu_invalid_ppr_cb);
> > --
> > 2.7.4
> >
> With the above fixed, this patch is:
> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 14/19] drm/amdkfd: Add more error printing to help bringup
       [not found]             ` <CY1PR1201MB109725CE39D52B9E482824FAF08D0-JBJ/M6OpXY/YBI+VM8qCl2rFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
  2017-08-15 18:39               ` Felix Kuehling
@ 2017-08-16  1:14               ` Felix Kuehling
  1 sibling, 0 replies; 70+ messages in thread
From: Felix Kuehling @ 2017-08-16  1:14 UTC (permalink / raw)
  To: Zhao, Yong, Oded Gabbay; +Cc: amd-gfx list

I'll turn it into a dev_warn and I'll make the error message more
helpful. I think this happens when a new DID has been added to amdgpu,
but not to KFD yet. That's probably worth a warning.
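
Something along these lines, maybe (just a sketch of the intent, not the
final patch; the exact message text is made up):

        /* tail of lookup_device_info(), instead of the WARN() */
        dev_warn(kfd_device,
                 "DID 0x%04x is not in supported_devices\n", did);
        return NULL;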


Regards,
  Felix


On 2017-08-14 11:50 PM, Zhao, Yong wrote:
>
> Oded, I agree with you. When I made the change, there was WARN_ON
> already in the same function lookup_device_info(), so I followed
> suit and used WARN again. It is indeed a bit overkill. 
>
>
> Felix, do I need to fix it or can you fix it directly?
>
>
> Yong
>
> ------------------------------------------------------------------------
> *From:* Oded Gabbay <oded.gabbay@gmail.com>
> *Sent:* Saturday, August 12, 2017 10:54:41 AM
> *To:* Kuehling, Felix
> *Cc:* amd-gfx list; Zhao, Yong
> *Subject:* Re: [PATCH 14/19] drm/amdkfd: Add more error printing to
> help bringup
>  
> On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling
> <Felix.Kuehling@amd.com> wrote:
> > From: Yong Zhao <Yong.Zhao@amd.com>
> >
> > Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
> > Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> > ---
> >  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +++++++++--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > index f628ac3..e1c2ad2 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> > @@ -103,6 +103,8 @@ static const struct kfd_device_info
> *lookup_device_info(unsigned short did)
> >                 }
> >         }
> >
> > +       WARN(1, "device is not added to supported_devices\n");
> > +
> I think WARN is a bit excessive here. It's not actually a warning - an
> AMD gpu device is present but not supported in amdkfd.
> Maybe a dev_info is more appropriate here.
>
> Oded
>
> >         return NULL;
> >  }
> >
> > @@ -114,8 +116,10 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
> >         const struct kfd_device_info *device_info =
> >                                        
> lookup_device_info(pdev->device);
> >
> > -       if (!device_info)
> > +       if (!device_info) {
> > +               dev_err(kfd_device, "kgd2kfd_probe failed\n");
> >                 return NULL;
> > +       }
> >
> >         kfd = kzalloc(sizeof(*kfd), GFP_KERNEL);
> >         if (!kfd)
> > @@ -364,8 +368,11 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
> >
> >         if (kfd->init_complete) {
> >                 err = amd_iommu_init_device(kfd->pdev, pasid_limit);
> > -               if (err < 0)
> > +               if (err < 0) {
> > +                       dev_err(kfd_device, "failed to initialize
> iommu\n");
> >                         return -ENXIO;
> > +               }
> > +
> >                 amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
> >                                                
> iommu_pasid_shutdown_callback);
> >                 amd_iommu_set_invalid_ppr_cb(kfd->pdev,
> iommu_invalid_ppr_cb);
> > --
> > 2.7.4
> >
> With the above fixed, this patch is:
> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
       [not found]                     ` <CAFCwf12edu4VXLP8UTTJk+x9uu9D1bkgO23FpiJbuz3BEreYzg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-16  1:19                       ` Felix Kuehling
       [not found]                         ` <6137eb72-cb41-d65a-7863-71adf31a3506-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-16  1:19 UTC (permalink / raw)
  To: Oded Gabbay, Bridgman, John, Deucher, Alexander; +Cc: amd-gfx list

Hi Alex,

How does firmware get published for the upstream driver? Where can I
check the currently published version of both CZ and KV firmware for
upstream?

Do you publish firmware updates at the same time as patches that depend
on them?

Thanks,
  Felix


On 2017-08-13 04:49 AM, Oded Gabbay wrote:
> On Sat, Aug 12, 2017 at 10:09 PM, Bridgman, John <John.Bridgman@amd.com> wrote:
>> IIRC the amdgpu devs had been holding back on publishing the updated MEC microcode (with scratch support) because that WOULD have broken Kaveri. With this change from Felix we should be able to publish the newest microcode for both amdgpu and amdkfd WITHOUT breaking Kaveri.
>>
>> IOW this is the "scratch fix for Kaveri KFD" you have wanted for a couple of years :)
> ah, ok.
>
> In that case, this patch is:
> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
>
>
>>> -----Original Message-----
>>> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
>>> Of Kuehling, Felix
>>> Sent: Saturday, August 12, 2017 2:16 PM
>>> To: Oded Gabbay
>>> Cc: amd-gfx list
>>> Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
>>>
>>>> Do you mean that it won't work with Kaveri anymore ?
>>> Kaveri got the same firmware changes, mostly for scratch memory support.
>>> The Kaveri firmware headers name the structures and fields a bit differently
>>> but they should be binary compatible. So we simplified the code to use only
>>> one set of headers. I'll grab a Kaveri system to confirm that it works.
>>>
>>> Regards,
>>>  Felix
>>>
>>> From: Oded Gabbay <oded.gabbay@gmail.com>
>>> Sent: Saturday, August 12, 2017 11:10 AM
>>> To: Kuehling, Felix
>>> Cc: amd-gfx list
>>> Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
>>>
>>> On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com>
>>> wrote:
>>>> To match current firmware. The map process packet has been extended to
>>>> support scratch. This is a non-backwards compatible change and it's
>>>> about two years old. So no point keeping the old version around
>>>> conditionally.
>>> Do you mean that it won't work with Kaveri anymore ?
>>> I believe we aren't allowed to break older H/W support without some
>>> serious justification.
>>>
>>> Oded
>>>
>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>>> ---
>>>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c         |   8 +-
>>>>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 161 ++++--------
>>>>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h    | 314
>>>> +++---------------------
>>>>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h | 130 +++++++++-
>>>>  4 files changed, 199 insertions(+), 414 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>> index e1c2ad2..e790e7f 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>> @@ -26,7 +26,7 @@
>>>>  #include <linux/slab.h>
>>>>  #include "kfd_priv.h"
>>>>  #include "kfd_device_queue_manager.h"
>>>> -#include "kfd_pm4_headers.h"
>>>> +#include "kfd_pm4_headers_vi.h"
>>>>
>>>>  #define MQD_SIZE_ALIGNED 768
>>>>
>>>> @@ -238,9 +238,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>>>          * calculate max size of runlist packet.
>>>>          * There can be only 2 packets at once
>>>>          */
>>>> -       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct
>>>> pm4_map_process) +
>>>> -               max_num_of_queues_per_device *
>>>> -               sizeof(struct pm4_map_queues) + sizeof(struct
>>>> pm4_runlist)) * 2;
>>>> +       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct
>>>> +pm4_mes_map_process) +
>>>> +               max_num_of_queues_per_device * sizeof(struct
>>>> +pm4_mes_map_queues)
>>>> +               + sizeof(struct pm4_mes_runlist)) * 2;
>>>>
>>>>         /* Add size of HIQ & DIQ */
>>>>         size += KFD_KERNEL_QUEUE_SIZE * 2;  diff --git
>>>> a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>> index 77a6f2b..3141e05 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>> @@ -26,7 +26,6 @@
>>>>  #include "kfd_device_queue_manager.h"
>>>>  #include "kfd_kernel_queue.h"
>>>>  #include "kfd_priv.h"
>>>> -#include "kfd_pm4_headers.h"
>>>>  #include "kfd_pm4_headers_vi.h"
>>>>  #include "kfd_pm4_opcodes.h"
>>>>
>>>> @@ -44,12 +43,12 @@ static unsigned int build_pm4_header(unsigned int
>>>> opcode, size_t packet_size)
>>>>  {
>>>>         union PM4_MES_TYPE_3_HEADER header;
>>>>
>>>> -       header.u32all = 0;
>>>> +       header.u32All = 0;
>>>>         header.opcode = opcode;
>>>>         header.count = packet_size/sizeof(uint32_t) - 2;
>>>>         header.type = PM4_TYPE_3;
>>>>
>>>> -       return header.u32all;
>>>> +       return header.u32All;
>>>>  }
>>>>
>>>>  static void pm_calc_rlib_size(struct packet_manager *pm,  @@ -69,12
>>>> +68,9 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>>>>                 pr_debug("Over subscribed runlist\n");
>>>>         }
>>>>
>>>> -       map_queue_size =
>>>> -               (pm->dqm->dev->device_info->asic_family == CHIP_CARRIZO) ?
>>>> -               sizeof(struct pm4_mes_map_queues) :
>>>> -               sizeof(struct pm4_map_queues);
>>>> +       map_queue_size = sizeof(struct pm4_mes_map_queues);
>>>>         /* calculate run list ib allocation size */
>>>> -       *rlib_size = process_count * sizeof(struct pm4_map_process) +
>>>> +       *rlib_size = process_count * sizeof(struct
>>>> +pm4_mes_map_process) +
>>>>                      queue_count * map_queue_size;
>>>>
>>>>         /*
>>>> @@ -82,7 +78,7 @@ static void pm_calc_rlib_size(struct packet_manager
>>>> *pm,
>>>>          * when over subscription
>>>>          */
>>>>         if (*over_subscription)
>>>> -               *rlib_size += sizeof(struct pm4_runlist);
>>>> +               *rlib_size += sizeof(struct pm4_mes_runlist);
>>>>
>>>>         pr_debug("runlist ib size %d\n", *rlib_size);
>>>>  }
>>>> @@ -119,16 +115,16 @@ static int pm_allocate_runlist_ib(struct
>>>> packet_manager *pm,
>>>>  static int pm_create_runlist(struct packet_manager *pm, uint32_t
>>>> *buffer,
>>>>                         uint64_t ib, size_t ib_size_in_dwords, bool
>>>> chain)
>>>>  {
>>>> -       struct pm4_runlist *packet;
>>>> +       struct pm4_mes_runlist *packet;
>>>>
>>>>         if (WARN_ON(!ib))
>>>>                 return -EFAULT;
>>>>
>>>> -       packet = (struct pm4_runlist *)buffer;
>>>> +       packet = (struct pm4_mes_runlist *)buffer;
>>>>
>>>> -       memset(buffer, 0, sizeof(struct pm4_runlist));
>>>> -       packet->header.u32all = build_pm4_header(IT_RUN_LIST,
>>>> -                                               sizeof(struct
>>>> pm4_runlist));
>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_runlist));
>>>> +       packet->header.u32All = build_pm4_header(IT_RUN_LIST,
>>>> +                                               sizeof(struct
>>>> +pm4_mes_runlist));
>>>>
>>>>         packet->bitfields4.ib_size = ib_size_in_dwords;
>>>>         packet->bitfields4.chain = chain ? 1 : 0;  @@ -143,16 +139,16
>>>> @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t
>>>> *buffer,
>>>>  static int pm_create_map_process(struct packet_manager *pm, uint32_t
>>>> *buffer,
>>>>                                 struct qcm_process_device *qpd)
>>>>  {
>>>> -       struct pm4_map_process *packet;
>>>> +       struct pm4_mes_map_process *packet;
>>>>         struct queue *cur;
>>>>         uint32_t num_queues;
>>>>
>>>> -       packet = (struct pm4_map_process *)buffer;
>>>> +       packet = (struct pm4_mes_map_process *)buffer;
>>>>
>>>> -       memset(buffer, 0, sizeof(struct pm4_map_process));
>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_map_process));
>>>>
>>>> -       packet->header.u32all = build_pm4_header(IT_MAP_PROCESS,
>>>> -                                       sizeof(struct
>>>> pm4_map_process));
>>>> +       packet->header.u32All = build_pm4_header(IT_MAP_PROCESS,
>>>> +                                       sizeof(struct
>>>> +pm4_mes_map_process));
>>>>         packet->bitfields2.diq_enable = (qpd->is_debug) ? 1 : 0;
>>>>         packet->bitfields2.process_quantum = 1;
>>>>         packet->bitfields2.pasid = qpd->pqm->process->pasid;  @@
>>>> -170,23 +166,26 @@ static int pm_create_map_process(struct
>>>> packet_manager *pm, uint32_t *buffer,
>>>>         packet->sh_mem_ape1_base = qpd->sh_mem_ape1_base;
>>>>         packet->sh_mem_ape1_limit = qpd->sh_mem_ape1_limit;
>>>>
>>>> +       /* TODO: scratch support */
>>>> +       packet->sh_hidden_private_base_vmid = 0;
>>>> +
>>>>         packet->gds_addr_lo = lower_32_bits(qpd->gds_context_area);
>>>>         packet->gds_addr_hi = upper_32_bits(qpd->gds_context_area);
>>>>
>>>>         return 0;
>>>>  }
>>>>
>>>> -static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t
>>>> *buffer,
>>>> +static int pm_create_map_queue(struct packet_manager *pm, uint32_t
>>>> +*buffer,
>>>>                 struct queue *q, bool is_static)
>>>>  {
>>>>         struct pm4_mes_map_queues *packet;
>>>>         bool use_static = is_static;
>>>>
>>>>         packet = (struct pm4_mes_map_queues *)buffer;
>>>> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_map_queues));
>>>>
>>>> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
>>>> -                                               sizeof(struct
>>>> pm4_map_queues));
>>>> +       packet->header.u32All = build_pm4_header(IT_MAP_QUEUES,
>>>> +                                               sizeof(struct
>>>> +pm4_mes_map_queues));
>>>>         packet->bitfields2.alloc_format =
>>>>                 alloc_format__mes_map_queues__one_per_pipe_vi;
>>>>         packet->bitfields2.num_queues = 1;  @@ -235,64 +234,6 @@
>>>> static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t
>>>> *buffer,
>>>>         return 0;
>>>>  }
>>>>
>>>> -static int pm_create_map_queue(struct packet_manager *pm, uint32_t
>>>> *buffer,
>>>> -                               struct queue *q, bool is_static)  -{
>>>> -       struct pm4_map_queues *packet;
>>>> -       bool use_static = is_static;
>>>> -
>>>> -       packet = (struct pm4_map_queues *)buffer;
>>>> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
>>>> -
>>>> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
>>>> -                                               sizeof(struct
>>>> pm4_map_queues));
>>>> -       packet->bitfields2.alloc_format =
>>>> -
>>>> alloc_format__mes_map_queues__one_per_pipe;
>>>> -       packet->bitfields2.num_queues = 1;
>>>> -       packet->bitfields2.queue_sel =
>>>> -
>>>> queue_sel__mes_map_queues__map_to_hws_determined_queue_slots;
>>>> -
>>>> -       packet->bitfields2.vidmem = (q->properties.is_interop) ?
>>>> -                       vidmem__mes_map_queues__uses_video_memory :
>>>> -                       vidmem__mes_map_queues__uses_no_video_memory;
>>>> -
>>>> -       switch (q->properties.type) {
>>>> -       case KFD_QUEUE_TYPE_COMPUTE:
>>>> -       case KFD_QUEUE_TYPE_DIQ:
>>>> -               packet->bitfields2.engine_sel =
>>>> -                               engine_sel__mes_map_queues__compute;
>>>> -               break;
>>>> -       case KFD_QUEUE_TYPE_SDMA:
>>>> -               packet->bitfields2.engine_sel =
>>>> -                               engine_sel__mes_map_queues__sdma0;
>>>> -               use_static = false; /* no static queues under SDMA */
>>>> -               break;
>>>> -       default:
>>>> -               WARN(1, "queue type %d", q->properties.type);
>>>> -               return -EINVAL;
>>>> -       }
>>>> -
>>>> -       packet->mes_map_queues_ordinals[0].bitfields3.doorbell_offset
>>>> =
>>>> -                       q->properties.doorbell_off;
>>>> -
>>>> -       packet->mes_map_queues_ordinals[0].bitfields3.is_static =
>>>> -                       (use_static) ? 1 : 0;
>>>> -
>>>> -       packet->mes_map_queues_ordinals[0].mqd_addr_lo =
>>>> -                       lower_32_bits(q->gart_mqd_addr);
>>>> -
>>>> -       packet->mes_map_queues_ordinals[0].mqd_addr_hi =
>>>> -                       upper_32_bits(q->gart_mqd_addr);
>>>> -
>>>> -       packet->mes_map_queues_ordinals[0].wptr_addr_lo =
>>>> -
>>>> lower_32_bits((uint64_t)q->properties.write_ptr);
>>>> -
>>>> -       packet->mes_map_queues_ordinals[0].wptr_addr_hi =
>>>> -
>>>> upper_32_bits((uint64_t)q->properties.write_ptr);
>>>> -
>>>> -       return 0;
>>>> -}
>>>> -
>>>>  static int pm_create_runlist_ib(struct packet_manager *pm,
>>>>                                 struct list_head *queues,
>>>>                                 uint64_t *rl_gpu_addr,  @@ -334,7
>>>> +275,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>>>>                         return retval;
>>>>
>>>>                 proccesses_mapped++;
>>>> -               inc_wptr(&rl_wptr, sizeof(struct pm4_map_process),
>>>> +               inc_wptr(&rl_wptr, sizeof(struct pm4_mes_map_process),
>>>>                                 alloc_size_bytes);
>>>>
>>>>                 list_for_each_entry(kq, &qpd->priv_queue_list, list) {
>>>> @@ -344,14 +285,7 @@ static int pm_create_runlist_ib(struct
>>>> packet_manager *pm,
>>>>                         pr_debug("static_queue, mapping kernel q %d,
>>>> is debug status %d\n",
>>>>                                 kq->queue->queue, qpd->is_debug);
>>>>
>>>> -                       if (pm->dqm->dev->device_info->asic_family ==
>>>> -                                       CHIP_CARRIZO)
>>>> -                               retval = pm_create_map_queue_vi(pm,
>>>> -                                               &rl_buffer[rl_wptr],
>>>> -                                               kq->queue,
>>>> -                                               qpd->is_debug);
>>>> -                       else
>>>> -                               retval = pm_create_map_queue(pm,
>>>> +                       retval = pm_create_map_queue(pm,
>>>>                                                 &rl_buffer[rl_wptr],
>>>>                                                 kq->queue,
>>>>                                                 qpd->is_debug);  @@
>>>> -359,7 +293,7 @@ static int pm_create_runlist_ib(struct packet_manager
>>>> *pm,
>>>>                                 return retval;
>>>>
>>>>                         inc_wptr(&rl_wptr,
>>>> -                               sizeof(struct pm4_map_queues),
>>>> +                               sizeof(struct pm4_mes_map_queues),
>>>>                                 alloc_size_bytes);
>>>>                 }
>>>>
>>>> @@ -370,14 +304,7 @@ static int pm_create_runlist_ib(struct
>>>> packet_manager *pm,
>>>>                         pr_debug("static_queue, mapping user queue %d,
>>>> is debug status %d\n",
>>>>                                 q->queue, qpd->is_debug);
>>>>
>>>> -                       if (pm->dqm->dev->device_info->asic_family ==
>>>> -                                       CHIP_CARRIZO)
>>>> -                               retval = pm_create_map_queue_vi(pm,
>>>> -                                               &rl_buffer[rl_wptr],
>>>> -                                               q,
>>>> -                                               qpd->is_debug);
>>>> -                       else
>>>> -                               retval = pm_create_map_queue(pm,
>>>> +                       retval = pm_create_map_queue(pm,
>>>>                                                 &rl_buffer[rl_wptr],
>>>>                                                 q,
>>>>                                                 qpd->is_debug);  @@
>>>> -386,7 +313,7 @@ static int pm_create_runlist_ib(struct packet_manager
>>>> *pm,
>>>>                                 return retval;
>>>>
>>>>                         inc_wptr(&rl_wptr,
>>>> -                               sizeof(struct pm4_map_queues),
>>>> +                               sizeof(struct pm4_mes_map_queues),
>>>>                                 alloc_size_bytes);
>>>>                 }
>>>>         }
>>>> @@ -429,7 +356,7 @@ void pm_uninit(struct packet_manager *pm)
>>>>  int pm_send_set_resources(struct packet_manager *pm,
>>>>                                 struct scheduling_resources *res)
>>>>  {
>>>> -       struct pm4_set_resources *packet;
>>>> +       struct pm4_mes_set_resources *packet;
>>>>         int retval = 0;
>>>>
>>>>         mutex_lock(&pm->lock);
>>>> @@ -442,9 +369,9 @@ int pm_send_set_resources(struct packet_manager
>>>> *pm,
>>>>                 goto out;
>>>>         }
>>>>
>>>> -       memset(packet, 0, sizeof(struct pm4_set_resources));
>>>> -       packet->header.u32all = build_pm4_header(IT_SET_RESOURCES,
>>>> -                                       sizeof(struct
>>>> pm4_set_resources));
>>>> +       memset(packet, 0, sizeof(struct pm4_mes_set_resources));
>>>> +       packet->header.u32All = build_pm4_header(IT_SET_RESOURCES,
>>>> +                                       sizeof(struct
>>>> +pm4_mes_set_resources));
>>>>
>>>>         packet->bitfields2.queue_type =
>>>>
>>>> queue_type__mes_set_resources__hsa_interface_queue_hiq;
>>>> @@ -482,7 +409,7 @@ int pm_send_runlist(struct packet_manager *pm,
>>>> struct list_head *dqm_queues)
>>>>
>>>>         pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
>>>>
>>>> -       packet_size_dwords = sizeof(struct pm4_runlist) /
>>>> sizeof(uint32_t);
>>>> +       packet_size_dwords = sizeof(struct pm4_mes_runlist) /
>>>> +sizeof(uint32_t);
>>>>         mutex_lock(&pm->lock);
>>>>
>>>>         retval =
>>>> pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
>>>> @@ -514,7 +441,7 @@ int pm_send_query_status(struct packet_manager
>>>> *pm, uint64_t fence_address,
>>>>                         uint32_t fence_value)
>>>>  {
>>>>         int retval;
>>>> -       struct pm4_query_status *packet;
>>>> +       struct pm4_mes_query_status *packet;
>>>>
>>>>         if (WARN_ON(!fence_address))
>>>>                 return -EFAULT;
>>>> @@ -522,13 +449,13 @@ int pm_send_query_status(struct packet_manager
>>>> *pm, uint64_t fence_address,
>>>>         mutex_lock(&pm->lock);
>>>>         retval = pm->priv_queue->ops.acquire_packet_buffer(
>>>>                         pm->priv_queue,
>>>> -                       sizeof(struct pm4_query_status) /
>>>> sizeof(uint32_t),
>>>> +                       sizeof(struct pm4_mes_query_status) /
>>>> +sizeof(uint32_t),
>>>>                         (unsigned int **)&packet);
>>>>         if (retval)
>>>>                 goto fail_acquire_packet_buffer;
>>>>
>>>> -       packet->header.u32all = build_pm4_header(IT_QUERY_STATUS,
>>>> -                                       sizeof(struct
>>>> pm4_query_status));
>>>> +       packet->header.u32All = build_pm4_header(IT_QUERY_STATUS,
>>>> +                                       sizeof(struct
>>>> +pm4_mes_query_status));
>>>>
>>>>         packet->bitfields2.context_id = 0;
>>>>         packet->bitfields2.interrupt_sel =  @@ -555,22 +482,22 @@ int
>>>> pm_send_unmap_queue(struct packet_manager *pm, enum
>>> kfd_queue_type
>>>> type,
>>>>  {
>>>>         int retval;
>>>>         uint32_t *buffer;
>>>> -       struct pm4_unmap_queues *packet;
>>>> +       struct pm4_mes_unmap_queues *packet;
>>>>
>>>>         mutex_lock(&pm->lock);
>>>>         retval = pm->priv_queue->ops.acquire_packet_buffer(
>>>>                         pm->priv_queue,
>>>> -                       sizeof(struct pm4_unmap_queues) /
>>>> sizeof(uint32_t),
>>>> +                       sizeof(struct pm4_mes_unmap_queues) /
>>>> +sizeof(uint32_t),
>>>>                         &buffer);
>>>>         if (retval)
>>>>                 goto err_acquire_packet_buffer;
>>>>
>>>> -       packet = (struct pm4_unmap_queues *)buffer;
>>>> -       memset(buffer, 0, sizeof(struct pm4_unmap_queues));
>>>> +       packet = (struct pm4_mes_unmap_queues *)buffer;
>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_unmap_queues));
>>>>         pr_debug("static_queue: unmapping queues: mode is %d , reset
>>>> is %d , type is %d\n",
>>>>                 mode, reset, type);
>>>> -       packet->header.u32all = build_pm4_header(IT_UNMAP_QUEUES,
>>>> -                                       sizeof(struct
>>>> pm4_unmap_queues));
>>>> +       packet->header.u32All = build_pm4_header(IT_UNMAP_QUEUES,
>>>> +                                       sizeof(struct
>>>> +pm4_mes_unmap_queues));
>>>>         switch (type) {
>>>>         case KFD_QUEUE_TYPE_COMPUTE:
>>>>         case KFD_QUEUE_TYPE_DIQ:
>>>> @@ -608,12 +535,12 @@ int pm_send_unmap_queue(struct
>>> packet_manager
>>>> *pm, enum kfd_queue_type type,
>>>>                 break;
>>>>         case KFD_PREEMPT_TYPE_FILTER_ALL_QUEUES:
>>>>                 packet->bitfields2.queue_sel =
>>>> -
>>>> queue_sel__mes_unmap_queues__perform_request_on_all_active_queue
>>> s;
>>>> +
>>>> +queue_sel__mes_unmap_queues__unmap_all_queues;
>>>>                 break;
>>>>         case KFD_PREEMPT_TYPE_FILTER_DYNAMIC_QUEUES:
>>>>                 /* in this case, we do not preempt static queues */
>>>>                 packet->bitfields2.queue_sel =
>>>> -
>>>> queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues
>>> _only;
>>>> +
>>>> +queue_sel__mes_unmap_queues__unmap_all_non_static_queues;
>>>>                 break;
>>>>         default:
>>>>                 WARN(1, "filter %d", mode);  diff --git
>>>> a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>> index 97e5442..e50f73d 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>> @@ -41,99 +41,6 @@ union PM4_MES_TYPE_3_HEADER {
>>>>  };
>>>>  #endif /* PM4_MES_HEADER_DEFINED */
>>>>
>>>> -/* --------------------MES_SET_RESOURCES-------------------- */
>>>> -
>>>> -#ifndef PM4_MES_SET_RESOURCES_DEFINED -#define
>>>> PM4_MES_SET_RESOURCES_DEFINED -enum
>>> set_resources_queue_type_enum {
>>>> -       queue_type__mes_set_resources__kernel_interface_queue_kiq = 0,
>>>> -       queue_type__mes_set_resources__hsa_interface_queue_hiq = 1,
>>>> -       queue_type__mes_set_resources__hsa_debug_interface_queue = 4
>>>> -};
>>>> -
>>>> -struct pm4_set_resources {
>>>> -       union {
>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>> -               uint32_t ordinal1;
>>>> -       };
>>>> -
>>>> -       union {
>>>> -               struct {
>>>> -                       uint32_t vmid_mask:16;
>>>> -                       uint32_t unmap_latency:8;
>>>> -                       uint32_t reserved1:5;
>>>> -                       enum set_resources_queue_type_enum
>>>> queue_type:3;
>>>> -               } bitfields2;
>>>> -               uint32_t ordinal2;
>>>> -       };
>>>> -
>>>> -       uint32_t queue_mask_lo;
>>>> -       uint32_t queue_mask_hi;
>>>> -       uint32_t gws_mask_lo;
>>>> -       uint32_t gws_mask_hi;
>>>> -
>>>> -       union {
>>>> -               struct {
>>>> -                       uint32_t oac_mask:16;
>>>> -                       uint32_t reserved2:16;
>>>> -               } bitfields7;
>>>> -               uint32_t ordinal7;
>>>> -       };
>>>> -
>>>> -       union {
>>>> -               struct {
>>>> -                       uint32_t gds_heap_base:6;
>>>> -                       uint32_t reserved3:5;
>>>> -                       uint32_t gds_heap_size:6;
>>>> -                       uint32_t reserved4:15;
>>>> -               } bitfields8;
>>>> -               uint32_t ordinal8;
>>>> -       };
>>>> -
>>>> -};
>>>> -#endif
>>>> -
>>>> -/*--------------------MES_RUN_LIST-------------------- */
>>>> -
>>>> -#ifndef PM4_MES_RUN_LIST_DEFINED
>>>> -#define PM4_MES_RUN_LIST_DEFINED
>>>> -
>>>> -struct pm4_runlist {
>>>> -       union {
>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>> -               uint32_t ordinal1;
>>>> -       };
>>>> -
>>>> -       union {
>>>> -               struct {
>>>> -                       uint32_t reserved1:2;
>>>> -                       uint32_t ib_base_lo:30;
>>>> -               } bitfields2;
>>>> -               uint32_t ordinal2;
>>>> -       };
>>>> -
>>>> -       union {
>>>> -               struct {
>>>> -                       uint32_t ib_base_hi:16;
>>>> -                       uint32_t reserved2:16;
>>>> -               } bitfields3;
>>>> -               uint32_t ordinal3;
>>>> -       };
>>>> -
>>>> -       union {
>>>> -               struct {
>>>> -                       uint32_t ib_size:20;
>>>> -                       uint32_t chain:1;
>>>> -                       uint32_t offload_polling:1;
>>>> -                       uint32_t reserved3:1;
>>>> -                       uint32_t valid:1;
>>>> -                       uint32_t reserved4:8;
>>>> -               } bitfields4;
>>>> -               uint32_t ordinal4;
>>>> -       };
>>>> -
>>>> -};
>>>> -#endif
>>>>
>>>>  /*--------------------MES_MAP_PROCESS-------------------- */
>>>>
>>>> @@ -186,217 +93,58 @@ struct pm4_map_process {
>>>>  };
>>>>  #endif
>>>>
>>>> -/*--------------------MES_MAP_QUEUES--------------------*/
>>>> -
>>>> -#ifndef PM4_MES_MAP_QUEUES_DEFINED
>>>> -#define PM4_MES_MAP_QUEUES_DEFINED
>>>> -enum map_queues_queue_sel_enum {
>>>> -       queue_sel__mes_map_queues__map_to_specified_queue_slots = 0,
>>>> -
>>>       queue_sel__mes_map_queues__map_to_hws_determined_queue_slots
>>> =
>>>> 1,
>>>> -       queue_sel__mes_map_queues__enable_process_queues = 2 -};
>>>> +#ifndef PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
>>>> +#define PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
>>>>
>>>> -enum map_queues_vidmem_enum {
>>>> -       vidmem__mes_map_queues__uses_no_video_memory = 0,
>>>> -       vidmem__mes_map_queues__uses_video_memory = 1 -};
>>>> -
>>>> -enum map_queues_alloc_format_enum {
>>>> -       alloc_format__mes_map_queues__one_per_pipe = 0,
>>>> -       alloc_format__mes_map_queues__all_on_one_pipe = 1 -};
>>>> -
>>>> -enum map_queues_engine_sel_enum {
>>>> -       engine_sel__mes_map_queues__compute = 0,
>>>> -       engine_sel__mes_map_queues__sdma0 = 2,
>>>> -       engine_sel__mes_map_queues__sdma1 = 3 -};
>>>> -
>>>> -struct pm4_map_queues {
>>>> +struct pm4_map_process_scratch_kv {
>>>>         union {
>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>> -               uint32_t ordinal1;
>>>> -       };
>>>> -
>>>> -       union {
>>>> -               struct {
>>>> -                       uint32_t reserved1:4;
>>>> -                       enum map_queues_queue_sel_enum queue_sel:2;
>>>> -                       uint32_t reserved2:2;
>>>> -                       uint32_t vmid:4;
>>>> -                       uint32_t reserved3:4;
>>>> -                       enum map_queues_vidmem_enum vidmem:2;
>>>> -                       uint32_t reserved4:6;
>>>> -                       enum map_queues_alloc_format_enum
>>>> alloc_format:2;
>>>> -                       enum map_queues_engine_sel_enum engine_sel:3;
>>>> -                       uint32_t num_queues:3;
>>>> -               } bitfields2;
>>>> -               uint32_t ordinal2;
>>>> -       };
>>>> -
>>>> -       struct {
>>>> -               union {
>>>> -                       struct {
>>>> -                               uint32_t is_static:1;
>>>> -                               uint32_t reserved5:1;
>>>> -                               uint32_t doorbell_offset:21;
>>>> -                               uint32_t reserved6:3;
>>>> -                               uint32_t queue:6;
>>>> -                       } bitfields3;
>>>> -                       uint32_t ordinal3;
>>>> -               };
>>>> -
>>>> -               uint32_t mqd_addr_lo;
>>>> -               uint32_t mqd_addr_hi;
>>>> -               uint32_t wptr_addr_lo;
>>>> -               uint32_t wptr_addr_hi;
>>>> -
>>>> -       } mes_map_queues_ordinals[1];   /* 1..N of these ordinal
>>>> groups */
>>>> -
>>>> -};
>>>> -#endif
>>>> -
>>>> -/*--------------------MES_QUERY_STATUS--------------------*/
>>>> -
>>>> -#ifndef PM4_MES_QUERY_STATUS_DEFINED
>>>> -#define PM4_MES_QUERY_STATUS_DEFINED
>>>> -enum query_status_interrupt_sel_enum {
>>>> -       interrupt_sel__mes_query_status__completion_status = 0,
>>>> -       interrupt_sel__mes_query_status__process_status = 1,
>>>> -       interrupt_sel__mes_query_status__queue_status = 2  -};
>>>> -
>>>> -enum query_status_command_enum {
>>>> -       command__mes_query_status__interrupt_only = 0,
>>>> -       command__mes_query_status__fence_only_immediate = 1,
>>>> -       command__mes_query_status__fence_only_after_write_ack = 2,
>>>> -
>>>> command__mes_query_status__fence_wait_for_write_ack_send_interrupt
>>> = 3
>>>> -};
>>>> -
>>>> -enum query_status_engine_sel_enum {
>>>> -       engine_sel__mes_query_status__compute = 0,
>>>> -       engine_sel__mes_query_status__sdma0_queue = 2,
>>>> -       engine_sel__mes_query_status__sdma1_queue = 3  -};
>>>> -
>>>> -struct pm4_query_status {
>>>> -       union {
>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>> -               uint32_t ordinal1;
>>>> -       };
>>>> -
>>>> -       union {
>>>> -               struct {
>>>> -                       uint32_t context_id:28;
>>>> -                       enum query_status_interrupt_sel_enum
>>>> interrupt_sel:2;
>>>> -                       enum query_status_command_enum command:2;
>>>> -               } bitfields2;
>>>> -               uint32_t ordinal2;
>>>> +               union PM4_MES_TYPE_3_HEADER   header; /* header */
>>>> +               uint32_t            ordinal1;
>>>>         };
>>>>
>>>>         union {
>>>>                 struct {
>>>>                         uint32_t pasid:16;
>>>> -                       uint32_t reserved1:16;
>>>> -               } bitfields3a;
>>>> -               struct {
>>>> -                       uint32_t reserved2:2;
>>>> -                       uint32_t doorbell_offset:21;
>>>> -                       uint32_t reserved3:3;
>>>> -                       enum query_status_engine_sel_enum
>>>> engine_sel:3;
>>>> -                       uint32_t reserved4:3;
>>>> -               } bitfields3b;
>>>> -               uint32_t ordinal3;
>>>> -       };
>>>> -
>>>> -       uint32_t addr_lo;
>>>> -       uint32_t addr_hi;
>>>> -       uint32_t data_lo;
>>>> -       uint32_t data_hi;
>>>> -};
>>>> -#endif
>>>> -
>>>> -/*--------------------MES_UNMAP_QUEUES--------------------*/
>>>> -
>>>> -#ifndef PM4_MES_UNMAP_QUEUES_DEFINED
>>>> -#define PM4_MES_UNMAP_QUEUES_DEFINED
>>>> -enum unmap_queues_action_enum {
>>>> -       action__mes_unmap_queues__preempt_queues = 0,
>>>> -       action__mes_unmap_queues__reset_queues = 1,
>>>> -       action__mes_unmap_queues__disable_process_queues = 2  -};
>>>> -
>>>> -enum unmap_queues_queue_sel_enum {
>>>> -
>>>> queue_sel__mes_unmap_queues__perform_request_on_specified_queues
>>> = 0,
>>>> -
>>>       queue_sel__mes_unmap_queues__perform_request_on_pasid_queues =
>>>> 1,
>>>> -
>>>> queue_sel__mes_unmap_queues__perform_request_on_all_active_queue
>>> s = 2,
>>>> -
>>>> queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues
>>> _only = 3
>>>> -};
>>>> -
>>>> -enum unmap_queues_engine_sel_enum {
>>>> -       engine_sel__mes_unmap_queues__compute = 0,
>>>> -       engine_sel__mes_unmap_queues__sdma0 = 2,
>>>> -       engine_sel__mes_unmap_queues__sdma1 = 3  -};
>>>> -
>>>> -struct pm4_unmap_queues {
>>>> -       union {
>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>> -               uint32_t ordinal1;
>>>> -       };
>>>> -
>>>> -       union {
>>>> -               struct {
>>>> -                       enum unmap_queues_action_enum action:2;
>>>> -                       uint32_t reserved1:2;
>>>> -                       enum unmap_queues_queue_sel_enum queue_sel:2;
>>>> -                       uint32_t reserved2:20;
>>>> -                       enum unmap_queues_engine_sel_enum
>>>> engine_sel:3;
>>>> -                       uint32_t num_queues:3;
>>>> +                       uint32_t reserved1:8;
>>>> +                       uint32_t diq_enable:1;
>>>> +                       uint32_t process_quantum:7;
>>>>                 } bitfields2;
>>>>                 uint32_t ordinal2;
>>>>         };
>>>>
>>>>         union {
>>>>                 struct {
>>>> -                       uint32_t pasid:16;
>>>> -                       uint32_t reserved3:16;
>>>> -               } bitfields3a;
>>>> -               struct {
>>>> -                       uint32_t reserved4:2;
>>>> -                       uint32_t doorbell_offset0:21;
>>>> -                       uint32_t reserved5:9;
>>>> -               } bitfields3b;
>>>> +                       uint32_t page_table_base:28;
>>>> +                       uint32_t reserved2:4;
>>>> +               } bitfields3;
>>>>                 uint32_t ordinal3;
>>>>         };
>>>>
>>>> -       union {
>>>> -               struct {
>>>> -                       uint32_t reserved6:2;
>>>> -                       uint32_t doorbell_offset1:21;
>>>> -                       uint32_t reserved7:9;
>>>> -               } bitfields4;
>>>> -               uint32_t ordinal4;
>>>> -       };
>>>> -
>>>> -       union {
>>>> -               struct {
>>>> -                       uint32_t reserved8:2;
>>>> -                       uint32_t doorbell_offset2:21;
>>>> -                       uint32_t reserved9:9;
>>>> -               } bitfields5;
>>>> -               uint32_t ordinal5;
>>>> -       };
>>>> +       uint32_t reserved3;
>>>> +       uint32_t sh_mem_bases;
>>>> +       uint32_t sh_mem_config;
>>>> +       uint32_t sh_mem_ape1_base;
>>>> +       uint32_t sh_mem_ape1_limit;
>>>> +       uint32_t sh_hidden_private_base_vmid;
>>>> +       uint32_t reserved4;
>>>> +       uint32_t reserved5;
>>>> +       uint32_t gds_addr_lo;
>>>> +       uint32_t gds_addr_hi;
>>>>
>>>>         union {
>>>>                 struct {
>>>> -                       uint32_t reserved10:2;
>>>> -                       uint32_t doorbell_offset3:21;
>>>> -                       uint32_t reserved11:9;
>>>> -               } bitfields6;
>>>> -               uint32_t ordinal6;
>>>> +                       uint32_t num_gws:6;
>>>> +                       uint32_t reserved6:2;
>>>> +                       uint32_t num_oac:4;
>>>> +                       uint32_t reserved7:4;
>>>> +                       uint32_t gds_size:6;
>>>> +                       uint32_t num_queues:10;
>>>> +               } bitfields14;
>>>> +               uint32_t ordinal14;
>>>>         };
>>>>
>>>> +       uint32_t completion_signal_lo32; uint32_t
>>>> +completion_signal_hi32;
>>>>  };
>>>>  #endif
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>> index c4eda6f..7c8d9b3 100644
>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>> @@ -126,9 +126,10 @@ struct pm4_mes_runlist {
>>>>                         uint32_t ib_size:20;
>>>>                         uint32_t chain:1;
>>>>                         uint32_t offload_polling:1;
>>>> -                       uint32_t reserved3:1;
>>>> +                       uint32_t reserved2:1;
>>>>                         uint32_t valid:1;
>>>> -                       uint32_t reserved4:8;
>>>> +                       uint32_t process_cnt:4;
>>>> +                       uint32_t reserved3:4;
>>>>                 } bitfields4;
>>>>                 uint32_t ordinal4;
>>>>         };
>>>> @@ -143,8 +144,8 @@ struct pm4_mes_runlist {
>>>>
>>>>  struct pm4_mes_map_process {
>>>>         union {
>>>> -               union PM4_MES_TYPE_3_HEADER   header;            /*
>>>> header */
>>>> -               uint32_t            ordinal1;
>>>> +               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>> +               uint32_t ordinal1;
>>>>         };
>>>>
>>>>         union {
>>>> @@ -155,36 +156,48 @@ struct pm4_mes_map_process {
>>>>                         uint32_t process_quantum:7;
>>>>                 } bitfields2;
>>>>                 uint32_t ordinal2;
>>>> -};
>>>> +       };
>>>>
>>>>         union {
>>>>                 struct {
>>>>                         uint32_t page_table_base:28;
>>>> -                       uint32_t reserved2:4;
>>>> +                       uint32_t reserved3:4;
>>>>                 } bitfields3;
>>>>                 uint32_t ordinal3;
>>>>         };
>>>>
>>>> +       uint32_t reserved;
>>>> +
>>>>         uint32_t sh_mem_bases;
>>>> +       uint32_t sh_mem_config;
>>>>         uint32_t sh_mem_ape1_base;
>>>>         uint32_t sh_mem_ape1_limit;
>>>> -       uint32_t sh_mem_config;
>>>> +
>>>> +       uint32_t sh_hidden_private_base_vmid;
>>>> +
>>>> +       uint32_t reserved2;
>>>> +       uint32_t reserved3;
>>>> +
>>>>         uint32_t gds_addr_lo;
>>>>         uint32_t gds_addr_hi;
>>>>
>>>>         union {
>>>>                 struct {
>>>>                         uint32_t num_gws:6;
>>>> -                       uint32_t reserved3:2;
>>>> +                       uint32_t reserved4:2;
>>>>                         uint32_t num_oac:4;
>>>> -                       uint32_t reserved4:4;
>>>> +                       uint32_t reserved5:4;
>>>>                         uint32_t gds_size:6;
>>>>                         uint32_t num_queues:10;
>>>>                 } bitfields10;
>>>>                 uint32_t ordinal10;
>>>>         };
>>>>
>>>> +       uint32_t completion_signal_lo;
>>>> +       uint32_t completion_signal_hi;
>>>> +
>>>>  };
>>>> +
>>>>  #endif
>>>>
>>>>  /*--------------------MES_MAP_QUEUES--------------------*/
>>>> @@ -337,7 +350,7 @@ enum mes_unmap_queues_engine_sel_enum {
>>>>         engine_sel__mes_unmap_queues__sdmal = 3
>>>>  };
>>>>
>>>> -struct PM4_MES_UNMAP_QUEUES {
>>>> +struct pm4_mes_unmap_queues {
>>>>         union {
>>>>                 union PM4_MES_TYPE_3_HEADER   header;            /*
>>>> header */
>>>>                 uint32_t            ordinal1;  @@ -397,4 +410,101 @@
>>>> struct PM4_MES_UNMAP_QUEUES {
>>>>  };
>>>>  #endif
>>>>
>>>> +#ifndef PM4_MEC_RELEASE_MEM_DEFINED
>>>> +#define PM4_MEC_RELEASE_MEM_DEFINED
>>>> +enum RELEASE_MEM_event_index_enum {
>>>> +       event_index___release_mem__end_of_pipe = 5,
>>>> +       event_index___release_mem__shader_done = 6 };
>>>> +
>>>> +enum RELEASE_MEM_cache_policy_enum {
>>>> +       cache_policy___release_mem__lru = 0,
>>>> +       cache_policy___release_mem__stream = 1,
>>>> +       cache_policy___release_mem__bypass = 2 };
>>>> +
>>>> +enum RELEASE_MEM_dst_sel_enum {
>>>> +       dst_sel___release_mem__memory_controller = 0,
>>>> +       dst_sel___release_mem__tc_l2 = 1,
>>>> +       dst_sel___release_mem__queue_write_pointer_register = 2,
>>>> +       dst_sel___release_mem__queue_write_pointer_poll_mask_bit = 3
>>>> +};
>>>> +
>>>> +enum RELEASE_MEM_int_sel_enum {
>>>> +       int_sel___release_mem__none = 0,
>>>> +       int_sel___release_mem__send_interrupt_only = 1,
>>>> +       int_sel___release_mem__send_interrupt_after_write_confirm = 2,
>>>> +       int_sel___release_mem__send_data_after_write_confirm = 3 };
>>>> +
>>>> +enum RELEASE_MEM_data_sel_enum {
>>>> +       data_sel___release_mem__none = 0,
>>>> +       data_sel___release_mem__send_32_bit_low = 1,
>>>> +       data_sel___release_mem__send_64_bit_data = 2,
>>>> +       data_sel___release_mem__send_gpu_clock_counter = 3,
>>>> +       data_sel___release_mem__send_cp_perfcounter_hi_lo = 4,
>>>> +       data_sel___release_mem__store_gds_data_to_memory = 5 };
>>>> +
>>>> +struct pm4_mec_release_mem {
>>>> +       union {
>>>> +               union PM4_MES_TYPE_3_HEADER header;     /*header */
>>>> +               unsigned int ordinal1;
>>>> +       };
>>>> +
>>>> +       union {
>>>> +               struct {
>>>> +                       unsigned int event_type:6;
>>>> +                       unsigned int reserved1:2;
>>>> +                       enum RELEASE_MEM_event_index_enum
>>>> +event_index:4;
>>>> +                       unsigned int tcl1_vol_action_ena:1;
>>>> +                       unsigned int tc_vol_action_ena:1;
>>>> +                       unsigned int reserved2:1;
>>>> +                       unsigned int tc_wb_action_ena:1;
>>>> +                       unsigned int tcl1_action_ena:1;
>>>> +                       unsigned int tc_action_ena:1;
>>>> +                       unsigned int reserved3:6;
>>>> +                       unsigned int atc:1;
>>>> +                       enum RELEASE_MEM_cache_policy_enum
>>>> +cache_policy:2;
>>>> +                       unsigned int reserved4:5;
>>>> +               } bitfields2;
>>>> +               unsigned int ordinal2;
>>>> +       };
>>>> +
>>>> +       union {
>>>> +               struct {
>>>> +                       unsigned int reserved5:16;
>>>> +                       enum RELEASE_MEM_dst_sel_enum dst_sel:2;
>>>> +                       unsigned int reserved6:6;
>>>> +                       enum RELEASE_MEM_int_sel_enum int_sel:3;
>>>> +                       unsigned int reserved7:2;
>>>> +                       enum RELEASE_MEM_data_sel_enum data_sel:3;
>>>> +               } bitfields3;
>>>> +               unsigned int ordinal3;
>>>> +       };
>>>> +
>>>> +       union {
>>>> +               struct {
>>>> +                       unsigned int reserved8:2;
>>>> +                       unsigned int address_lo_32b:30;
>>>> +               } bitfields4;
>>>> +               struct {
>>>> +                       unsigned int reserved9:3;
>>>> +                       unsigned int address_lo_64b:29;
>>>> +               } bitfields5;
>>>> +               unsigned int ordinal4;
>>>> +       };
>>>> +
>>>> +       unsigned int address_hi;
>>>> +
>>>> +       unsigned int data_lo;
>>>> +
>>>> +       unsigned int data_hi;
>>>> +};
>>>> +#endif
>>>> +
>>>> +enum {
>>>> +       CACHE_FLUSH_AND_INV_TS_EVENT = 0x00000014 };
>>>> +
>>>>  #endif
>>>> --
>>>> 2.7.4
>>>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
       [not found]                         ` <6137eb72-cb41-d65a-7863-71adf31a3506-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-16  7:37                           ` Christian König
       [not found]                             ` <4a8512a4-21df-cf48-4500-0424b08cd357-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
  2017-08-16 16:10                           ` Deucher, Alexander
  1 sibling, 1 reply; 70+ messages in thread
From: Christian König @ 2017-08-16  7:37 UTC (permalink / raw)
  To: Felix Kuehling, Oded Gabbay, Bridgman, John, Deucher, Alexander
  Cc: amd-gfx list

Hi Felix,

In general, Alex handles that by pushing firmware to the linux-firmware tree.

> IIRC the amdgpu devs had been holding back on publishing the updated MEC microcode (with scratch support) because that WOULD have broken Kaveri. With this change from Felix we should be able to publish the newest microcode for both amdgpu and amdkfd WITHOUT breaking Kaveri.
Well, what does "breaking Kaveri" mean here?

Please note that firmware *MUST* always be backward compatible. In other 
words, new firmware must still work with older kernels that don't have this fix.

If that is not guaranteed, we need to take a step back and talk to the 
firmware team once more about implementing the new feature in a 
backward-compatible way.
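
As an illustration only (this is not how amdgpu actually handles it, and
every identifier below is made up), the usual pattern for coping with the
driver-side half of this is to key the packet layout off the firmware
feature version reported at init time instead of assuming a single format:

    /* Hypothetical sketch -- struct and constant names are illustrative,
     * not the real amdgpu/amdkfd identifiers.  The driver reads the MEC
     * firmware feature level once at init and then picks the matching
     * MAP_PROCESS layout, so older microcode keeps working unchanged.
     */
    #include <stddef.h>

    struct map_process_v1 { unsigned int dw[10]; }; /* pre-scratch layout   */
    struct map_process_v2 { unsigned int dw[15]; }; /* scratch-aware layout */

    #define MEC_FW_FEATURE_SCRATCH 42 /* assumed feature level for scratch */

    static size_t map_process_packet_size(unsigned int mec_fw_feature)
    {
            return mec_fw_feature >= MEC_FW_FEATURE_SCRATCH ?
                    sizeof(struct map_process_v2) :
                    sizeof(struct map_process_v1);
    }

A check like this only covers new kernels running on old microcode; the
reverse direction (new microcode on old kernels) still depends on the
firmware itself accepting the old packet layout, which is the point above.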

Regards,
Christian.

On 2017-08-16 03:19 AM, Felix Kuehling wrote:
> Hi Alex,
>
> How does firmware get published for the upstream driver? Where can I
> check the currently published version of both CZ and KV firmware for
> upstream?
>
> Do you publish firmware updates at the same time as patches that depend
> on them?
>
> Thanks,
>    Felix
>
>
> On 2017-08-13 04:49 AM, Oded Gabbay wrote:
>> On Sat, Aug 12, 2017 at 10:09 PM, Bridgman, John <John.Bridgman@amd.com> wrote:
>>> IIRC the amdgpu devs had been holding back on publishing the updated MEC microcode (with scratch support) because that WOULD have broken Kaveri. With this change from Felix we should be able to publish the newest microcode for both amdgpu and amdkfd WITHOUT breaking Kaveri.
>>>
>>> IOW this is the "scratch fix for Kaveri KFD" you have wanted for a couple of years :)
>> ah, ok.
>>
>> In that case, this patch is:
>> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
>>
>>
>>>> -----Original Message-----
>>>> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
>>>> Of Kuehling, Felix
>>>> Sent: Saturday, August 12, 2017 2:16 PM
>>>> To: Oded Gabbay
>>>> Cc: amd-gfx list
>>>> Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
>>>>
>>>>> Do you mean that it won't work with Kaveri anymore ?
>>>> Kaveri got the same firmware changes, mostly for scratch memory support.
>>>> The Kaveri firmware headers name the structures and fields a bit differently
>>>> but they should be binary compatible. So we simplified the code to use only
>>>> one set of headers. I'll grab a Kaveri system to confirm that it works.
>>>>
>>>> Regards,
>>>>   Felix
>>>>
>>>> From: Oded Gabbay <oded.gabbay@gmail.com>
>>>> Sent: Saturday, August 12, 2017 11:10 AM
>>>> To: Kuehling, Felix
>>>> Cc: amd-gfx list
>>>> Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
>>>>
>>>> On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling <Felix.Kuehling@amd.com>
>>>> wrote:
>>>>> To match current firmware. The map process packet has been extended to
>>>>> support scratch. This is a non-backwards compatible change and it's
>>>>> about two years old. So no point keeping the old version around
>>>>> conditionally.
>>>> Do you mean that it won't work with Kaveri anymore ?
>>>> I believe we aren't allowed to break older H/W support without some
>>>> serious justification.
>>>>
>>>> Oded
>>>>
>>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>>>> ---
>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_device.c         |   8 +-
>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 161 ++++--------
>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h    | 314
>>>>> +++---------------------
>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h | 130 +++++++++-
>>>>>   4 files changed, 199 insertions(+), 414 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>>> index e1c2ad2..e790e7f 100644
>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>>> @@ -26,7 +26,7 @@
>>>>>   #include <linux/slab.h>
>>>>>   #include "kfd_priv.h"
>>>>>   #include "kfd_device_queue_manager.h"
>>>>> -#include "kfd_pm4_headers.h"
>>>>> +#include "kfd_pm4_headers_vi.h"
>>>>>
>>>>>   #define MQD_SIZE_ALIGNED 768
>>>>>
>>>>> @@ -238,9 +238,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>>>>           * calculate max size of runlist packet.
>>>>>           * There can be only 2 packets at once
>>>>>           */
>>>>> -       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct
>>>>> pm4_map_process) +
>>>>> -               max_num_of_queues_per_device *
>>>>> -               sizeof(struct pm4_map_queues) + sizeof(struct
>>>>> pm4_runlist)) * 2;
>>>>> +       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct
>>>>> +pm4_mes_map_process) +
>>>>> +               max_num_of_queues_per_device * sizeof(struct
>>>>> +pm4_mes_map_queues)
>>>>> +               + sizeof(struct pm4_mes_runlist)) * 2;
>>>>>
>>>>>          /* Add size of HIQ & DIQ */
>>>>>          size += KFD_KERNEL_QUEUE_SIZE * 2;  diff --git
>>>>> a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>>> index 77a6f2b..3141e05 100644
>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>>> @@ -26,7 +26,6 @@
>>>>>   #include "kfd_device_queue_manager.h"
>>>>>   #include "kfd_kernel_queue.h"
>>>>>   #include "kfd_priv.h"
>>>>> -#include "kfd_pm4_headers.h"
>>>>>   #include "kfd_pm4_headers_vi.h"
>>>>>   #include "kfd_pm4_opcodes.h"
>>>>>
>>>>> @@ -44,12 +43,12 @@ static unsigned int build_pm4_header(unsigned int
>>>>> opcode, size_t packet_size)
>>>>>   {
>>>>>          union PM4_MES_TYPE_3_HEADER header;
>>>>>
>>>>> -       header.u32all = 0;
>>>>> +       header.u32All = 0;
>>>>>          header.opcode = opcode;
>>>>>          header.count = packet_size/sizeof(uint32_t) - 2;
>>>>>          header.type = PM4_TYPE_3;
>>>>>
>>>>> -       return header.u32all;
>>>>> +       return header.u32All;
>>>>>   }
>>>>>
>>>>>   static void pm_calc_rlib_size(struct packet_manager *pm,  @@ -69,12
>>>>> +68,9 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>>>>>                  pr_debug("Over subscribed runlist\n");
>>>>>          }
>>>>>
>>>>> -       map_queue_size =
>>>>> -               (pm->dqm->dev->device_info->asic_family == CHIP_CARRIZO) ?
>>>>> -               sizeof(struct pm4_mes_map_queues) :
>>>>> -               sizeof(struct pm4_map_queues);
>>>>> +       map_queue_size = sizeof(struct pm4_mes_map_queues);
>>>>>          /* calculate run list ib allocation size */
>>>>> -       *rlib_size = process_count * sizeof(struct pm4_map_process) +
>>>>> +       *rlib_size = process_count * sizeof(struct
>>>>> +pm4_mes_map_process) +
>>>>>                       queue_count * map_queue_size;
>>>>>
>>>>>          /*
>>>>> @@ -82,7 +78,7 @@ static void pm_calc_rlib_size(struct packet_manager
>>>>> *pm,
>>>>>           * when over subscription
>>>>>           */
>>>>>          if (*over_subscription)
>>>>> -               *rlib_size += sizeof(struct pm4_runlist);
>>>>> +               *rlib_size += sizeof(struct pm4_mes_runlist);
>>>>>
>>>>>          pr_debug("runlist ib size %d\n", *rlib_size);
>>>>>   }
>>>>> @@ -119,16 +115,16 @@ static int pm_allocate_runlist_ib(struct
>>>>> packet_manager *pm,
>>>>>   static int pm_create_runlist(struct packet_manager *pm, uint32_t
>>>>> *buffer,
>>>>>                          uint64_t ib, size_t ib_size_in_dwords, bool
>>>>> chain)
>>>>>   {
>>>>> -       struct pm4_runlist *packet;
>>>>> +       struct pm4_mes_runlist *packet;
>>>>>
>>>>>          if (WARN_ON(!ib))
>>>>>                  return -EFAULT;
>>>>>
>>>>> -       packet = (struct pm4_runlist *)buffer;
>>>>> +       packet = (struct pm4_mes_runlist *)buffer;
>>>>>
>>>>> -       memset(buffer, 0, sizeof(struct pm4_runlist));
>>>>> -       packet->header.u32all = build_pm4_header(IT_RUN_LIST,
>>>>> -                                               sizeof(struct
>>>>> pm4_runlist));
>>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_runlist));
>>>>> +       packet->header.u32All = build_pm4_header(IT_RUN_LIST,
>>>>> +                                               sizeof(struct
>>>>> +pm4_mes_runlist));
>>>>>
>>>>>          packet->bitfields4.ib_size = ib_size_in_dwords;
>>>>>          packet->bitfields4.chain = chain ? 1 : 0;  @@ -143,16 +139,16
>>>>> @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t
>>>>> *buffer,
>>>>>   static int pm_create_map_process(struct packet_manager *pm, uint32_t
>>>>> *buffer,
>>>>>                                  struct qcm_process_device *qpd)
>>>>>   {
>>>>> -       struct pm4_map_process *packet;
>>>>> +       struct pm4_mes_map_process *packet;
>>>>>          struct queue *cur;
>>>>>          uint32_t num_queues;
>>>>>
>>>>> -       packet = (struct pm4_map_process *)buffer;
>>>>> +       packet = (struct pm4_mes_map_process *)buffer;
>>>>>
>>>>> -       memset(buffer, 0, sizeof(struct pm4_map_process));
>>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_map_process));
>>>>>
>>>>> -       packet->header.u32all = build_pm4_header(IT_MAP_PROCESS,
>>>>> -                                       sizeof(struct
>>>>> pm4_map_process));
>>>>> +       packet->header.u32All = build_pm4_header(IT_MAP_PROCESS,
>>>>> +                                       sizeof(struct
>>>>> +pm4_mes_map_process));
>>>>>          packet->bitfields2.diq_enable = (qpd->is_debug) ? 1 : 0;
>>>>>          packet->bitfields2.process_quantum = 1;
>>>>>          packet->bitfields2.pasid = qpd->pqm->process->pasid;  @@
>>>>> -170,23 +166,26 @@ static int pm_create_map_process(struct
>>>>> packet_manager *pm, uint32_t *buffer,
>>>>>          packet->sh_mem_ape1_base = qpd->sh_mem_ape1_base;
>>>>>          packet->sh_mem_ape1_limit = qpd->sh_mem_ape1_limit;
>>>>>
>>>>> +       /* TODO: scratch support */
>>>>> +       packet->sh_hidden_private_base_vmid = 0;
>>>>> +
>>>>>          packet->gds_addr_lo = lower_32_bits(qpd->gds_context_area);
>>>>>          packet->gds_addr_hi = upper_32_bits(qpd->gds_context_area);
>>>>>
>>>>>          return 0;
>>>>>   }
>>>>>
>>>>> -static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t
>>>>> *buffer,
>>>>> +static int pm_create_map_queue(struct packet_manager *pm, uint32_t
>>>>> +*buffer,
>>>>>                  struct queue *q, bool is_static)
>>>>>   {
>>>>>          struct pm4_mes_map_queues *packet;
>>>>>          bool use_static = is_static;
>>>>>
>>>>>          packet = (struct pm4_mes_map_queues *)buffer;
>>>>> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
>>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_map_queues));
>>>>>
>>>>> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
>>>>> -                                               sizeof(struct
>>>>> pm4_map_queues));
>>>>> +       packet->header.u32All = build_pm4_header(IT_MAP_QUEUES,
>>>>> +                                               sizeof(struct
>>>>> +pm4_mes_map_queues));
>>>>>          packet->bitfields2.alloc_format =
>>>>>                  alloc_format__mes_map_queues__one_per_pipe_vi;
>>>>>          packet->bitfields2.num_queues = 1;  @@ -235,64 +234,6 @@
>>>>> static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t
>>>>> *buffer,
>>>>>          return 0;
>>>>>   }
>>>>>
>>>>> -static int pm_create_map_queue(struct packet_manager *pm, uint32_t
>>>>> *buffer,
>>>>> -                               struct queue *q, bool is_static)  -{
>>>>> -       struct pm4_map_queues *packet;
>>>>> -       bool use_static = is_static;
>>>>> -
>>>>> -       packet = (struct pm4_map_queues *)buffer;
>>>>> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
>>>>> -
>>>>> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
>>>>> -                                               sizeof(struct
>>>>> pm4_map_queues));
>>>>> -       packet->bitfields2.alloc_format =
>>>>> -
>>>>> alloc_format__mes_map_queues__one_per_pipe;
>>>>> -       packet->bitfields2.num_queues = 1;
>>>>> -       packet->bitfields2.queue_sel =
>>>>> -
>>>>> queue_sel__mes_map_queues__map_to_hws_determined_queue_slots;
>>>>> -
>>>>> -       packet->bitfields2.vidmem = (q->properties.is_interop) ?
>>>>> -                       vidmem__mes_map_queues__uses_video_memory :
>>>>> -                       vidmem__mes_map_queues__uses_no_video_memory;
>>>>> -
>>>>> -       switch (q->properties.type) {
>>>>> -       case KFD_QUEUE_TYPE_COMPUTE:
>>>>> -       case KFD_QUEUE_TYPE_DIQ:
>>>>> -               packet->bitfields2.engine_sel =
>>>>> -                               engine_sel__mes_map_queues__compute;
>>>>> -               break;
>>>>> -       case KFD_QUEUE_TYPE_SDMA:
>>>>> -               packet->bitfields2.engine_sel =
>>>>> -                               engine_sel__mes_map_queues__sdma0;
>>>>> -               use_static = false; /* no static queues under SDMA */
>>>>> -               break;
>>>>> -       default:
>>>>> -               WARN(1, "queue type %d", q->properties.type);
>>>>> -               return -EINVAL;
>>>>> -       }
>>>>> -
>>>>> -       packet->mes_map_queues_ordinals[0].bitfields3.doorbell_offset
>>>>> =
>>>>> -                       q->properties.doorbell_off;
>>>>> -
>>>>> -       packet->mes_map_queues_ordinals[0].bitfields3.is_static =
>>>>> -                       (use_static) ? 1 : 0;
>>>>> -
>>>>> -       packet->mes_map_queues_ordinals[0].mqd_addr_lo =
>>>>> -                       lower_32_bits(q->gart_mqd_addr);
>>>>> -
>>>>> -       packet->mes_map_queues_ordinals[0].mqd_addr_hi =
>>>>> -                       upper_32_bits(q->gart_mqd_addr);
>>>>> -
>>>>> -       packet->mes_map_queues_ordinals[0].wptr_addr_lo =
>>>>> -
>>>>> lower_32_bits((uint64_t)q->properties.write_ptr);
>>>>> -
>>>>> -       packet->mes_map_queues_ordinals[0].wptr_addr_hi =
>>>>> -
>>>>> upper_32_bits((uint64_t)q->properties.write_ptr);
>>>>> -
>>>>> -       return 0;
>>>>> -}
>>>>> -
>>>>>   static int pm_create_runlist_ib(struct packet_manager *pm,
>>>>>                                  struct list_head *queues,
>>>>>                                  uint64_t *rl_gpu_addr,  @@ -334,7
>>>>> +275,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>>>>>                          return retval;
>>>>>
>>>>>                  proccesses_mapped++;
>>>>> -               inc_wptr(&rl_wptr, sizeof(struct pm4_map_process),
>>>>> +               inc_wptr(&rl_wptr, sizeof(struct pm4_mes_map_process),
>>>>>                                  alloc_size_bytes);
>>>>>
>>>>>                  list_for_each_entry(kq, &qpd->priv_queue_list, list) {
>>>>> @@ -344,14 +285,7 @@ static int pm_create_runlist_ib(struct
>>>>> packet_manager *pm,
>>>>>                          pr_debug("static_queue, mapping kernel q %d,
>>>>> is debug status %d\n",
>>>>>                                  kq->queue->queue, qpd->is_debug);
>>>>>
>>>>> -                       if (pm->dqm->dev->device_info->asic_family ==
>>>>> -                                       CHIP_CARRIZO)
>>>>> -                               retval = pm_create_map_queue_vi(pm,
>>>>> -                                               &rl_buffer[rl_wptr],
>>>>> -                                               kq->queue,
>>>>> -                                               qpd->is_debug);
>>>>> -                       else
>>>>> -                               retval = pm_create_map_queue(pm,
>>>>> +                       retval = pm_create_map_queue(pm,
>>>>>                                                  &rl_buffer[rl_wptr],
>>>>>                                                  kq->queue,
>>>>>                                                  qpd->is_debug);  @@
>>>>> -359,7 +293,7 @@ static int pm_create_runlist_ib(struct packet_manager
>>>>> *pm,
>>>>>                                  return retval;
>>>>>
>>>>>                          inc_wptr(&rl_wptr,
>>>>> -                               sizeof(struct pm4_map_queues),
>>>>> +                               sizeof(struct pm4_mes_map_queues),
>>>>>                                  alloc_size_bytes);
>>>>>                  }
>>>>>
>>>>> @@ -370,14 +304,7 @@ static int pm_create_runlist_ib(struct
>>>>> packet_manager *pm,
>>>>>                          pr_debug("static_queue, mapping user queue %d,
>>>>> is debug status %d\n",
>>>>>                                  q->queue, qpd->is_debug);
>>>>>
>>>>> -                       if (pm->dqm->dev->device_info->asic_family ==
>>>>> -                                       CHIP_CARRIZO)
>>>>> -                               retval = pm_create_map_queue_vi(pm,
>>>>> -                                               &rl_buffer[rl_wptr],
>>>>> -                                               q,
>>>>> -                                               qpd->is_debug);
>>>>> -                       else
>>>>> -                               retval = pm_create_map_queue(pm,
>>>>> +                       retval = pm_create_map_queue(pm,
>>>>>                                                  &rl_buffer[rl_wptr],
>>>>>                                                  q,
>>>>>                                                  qpd->is_debug);  @@
>>>>> -386,7 +313,7 @@ static int pm_create_runlist_ib(struct packet_manager
>>>>> *pm,
>>>>>                                  return retval;
>>>>>
>>>>>                          inc_wptr(&rl_wptr,
>>>>> -                               sizeof(struct pm4_map_queues),
>>>>> +                               sizeof(struct pm4_mes_map_queues),
>>>>>                                  alloc_size_bytes);
>>>>>                  }
>>>>>          }
>>>>> @@ -429,7 +356,7 @@ void pm_uninit(struct packet_manager *pm)
>>>>>   int pm_send_set_resources(struct packet_manager *pm,
>>>>>                                  struct scheduling_resources *res)
>>>>>   {
>>>>> -       struct pm4_set_resources *packet;
>>>>> +       struct pm4_mes_set_resources *packet;
>>>>>          int retval = 0;
>>>>>
>>>>>          mutex_lock(&pm->lock);
>>>>> @@ -442,9 +369,9 @@ int pm_send_set_resources(struct packet_manager
>>>>> *pm,
>>>>>                  goto out;
>>>>>          }
>>>>>
>>>>> -       memset(packet, 0, sizeof(struct pm4_set_resources));
>>>>> -       packet->header.u32all = build_pm4_header(IT_SET_RESOURCES,
>>>>> -                                       sizeof(struct
>>>>> pm4_set_resources));
>>>>> +       memset(packet, 0, sizeof(struct pm4_mes_set_resources));
>>>>> +       packet->header.u32All = build_pm4_header(IT_SET_RESOURCES,
>>>>> +                                       sizeof(struct
>>>>> +pm4_mes_set_resources));
>>>>>
>>>>>          packet->bitfields2.queue_type =
>>>>>
>>>>> queue_type__mes_set_resources__hsa_interface_queue_hiq;
>>>>> @@ -482,7 +409,7 @@ int pm_send_runlist(struct packet_manager *pm,
>>>>> struct list_head *dqm_queues)
>>>>>
>>>>>          pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
>>>>>
>>>>> -       packet_size_dwords = sizeof(struct pm4_runlist) /
>>>>> sizeof(uint32_t);
>>>>> +       packet_size_dwords = sizeof(struct pm4_mes_runlist) /
>>>>> +sizeof(uint32_t);
>>>>>          mutex_lock(&pm->lock);
>>>>>
>>>>>          retval =
>>>>> pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
>>>>> @@ -514,7 +441,7 @@ int pm_send_query_status(struct packet_manager
>>>>> *pm, uint64_t fence_address,
>>>>>                          uint32_t fence_value)
>>>>>   {
>>>>>          int retval;
>>>>> -       struct pm4_query_status *packet;
>>>>> +       struct pm4_mes_query_status *packet;
>>>>>
>>>>>          if (WARN_ON(!fence_address))
>>>>>                  return -EFAULT;
>>>>> @@ -522,13 +449,13 @@ int pm_send_query_status(struct packet_manager
>>>>> *pm, uint64_t fence_address,
>>>>>          mutex_lock(&pm->lock);
>>>>>          retval = pm->priv_queue->ops.acquire_packet_buffer(
>>>>>                          pm->priv_queue,
>>>>> -                       sizeof(struct pm4_query_status) /
>>>>> sizeof(uint32_t),
>>>>> +                       sizeof(struct pm4_mes_query_status) /
>>>>> +sizeof(uint32_t),
>>>>>                          (unsigned int **)&packet);
>>>>>          if (retval)
>>>>>                  goto fail_acquire_packet_buffer;
>>>>>
>>>>> -       packet->header.u32all = build_pm4_header(IT_QUERY_STATUS,
>>>>> -                                       sizeof(struct
>>>>> pm4_query_status));
>>>>> +       packet->header.u32All = build_pm4_header(IT_QUERY_STATUS,
>>>>> +                                       sizeof(struct
>>>>> +pm4_mes_query_status));
>>>>>
>>>>>          packet->bitfields2.context_id = 0;
>>>>>          packet->bitfields2.interrupt_sel =  @@ -555,22 +482,22 @@ int
>>>>> pm_send_unmap_queue(struct packet_manager *pm, enum
>>>> kfd_queue_type
>>>>> type,
>>>>>   {
>>>>>          int retval;
>>>>>          uint32_t *buffer;
>>>>> -       struct pm4_unmap_queues *packet;
>>>>> +       struct pm4_mes_unmap_queues *packet;
>>>>>
>>>>>          mutex_lock(&pm->lock);
>>>>>          retval = pm->priv_queue->ops.acquire_packet_buffer(
>>>>>                          pm->priv_queue,
>>>>> -                       sizeof(struct pm4_unmap_queues) /
>>>>> sizeof(uint32_t),
>>>>> +                       sizeof(struct pm4_mes_unmap_queues) /
>>>>> +sizeof(uint32_t),
>>>>>                          &buffer);
>>>>>          if (retval)
>>>>>                  goto err_acquire_packet_buffer;
>>>>>
>>>>> -       packet = (struct pm4_unmap_queues *)buffer;
>>>>> -       memset(buffer, 0, sizeof(struct pm4_unmap_queues));
>>>>> +       packet = (struct pm4_mes_unmap_queues *)buffer;
>>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_unmap_queues));
>>>>>          pr_debug("static_queue: unmapping queues: mode is %d , reset
>>>>> is %d , type is %d\n",
>>>>>                  mode, reset, type);
>>>>> -       packet->header.u32all = build_pm4_header(IT_UNMAP_QUEUES,
>>>>> -                                       sizeof(struct
>>>>> pm4_unmap_queues));
>>>>> +       packet->header.u32All = build_pm4_header(IT_UNMAP_QUEUES,
>>>>> +                                       sizeof(struct
>>>>> +pm4_mes_unmap_queues));
>>>>>          switch (type) {
>>>>>          case KFD_QUEUE_TYPE_COMPUTE:
>>>>>          case KFD_QUEUE_TYPE_DIQ:
>>>>> @@ -608,12 +535,12 @@ int pm_send_unmap_queue(struct
>>>> packet_manager
>>>>> *pm, enum kfd_queue_type type,
>>>>>                  break;
>>>>>          case KFD_PREEMPT_TYPE_FILTER_ALL_QUEUES:
>>>>>                  packet->bitfields2.queue_sel =
>>>>> -
>>>>> queue_sel__mes_unmap_queues__perform_request_on_all_active_queue
>>>> s;
>>>>> +
>>>>> +queue_sel__mes_unmap_queues__unmap_all_queues;
>>>>>                  break;
>>>>>          case KFD_PREEMPT_TYPE_FILTER_DYNAMIC_QUEUES:
>>>>>                  /* in this case, we do not preempt static queues */
>>>>>                  packet->bitfields2.queue_sel =
>>>>> -
>>>>> queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues
>>>> _only;
>>>>> +
>>>>> +queue_sel__mes_unmap_queues__unmap_all_non_static_queues;
>>>>>                  break;
>>>>>          default:
>>>>>                  WARN(1, "filter %d", mode);  diff --git
>>>>> a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>>> index 97e5442..e50f73d 100644
>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>>> @@ -41,99 +41,6 @@ union PM4_MES_TYPE_3_HEADER {
>>>>>   };
>>>>>   #endif /* PM4_MES_HEADER_DEFINED */
>>>>>
>>>>> -/* --------------------MES_SET_RESOURCES-------------------- */
>>>>> -
>>>>> -#ifndef PM4_MES_SET_RESOURCES_DEFINED -#define
>>>>> PM4_MES_SET_RESOURCES_DEFINED -enum
>>>> set_resources_queue_type_enum {
>>>>> -       queue_type__mes_set_resources__kernel_interface_queue_kiq = 0,
>>>>> -       queue_type__mes_set_resources__hsa_interface_queue_hiq = 1,
>>>>> -       queue_type__mes_set_resources__hsa_debug_interface_queue = 4
>>>>> -};
>>>>> -
>>>>> -struct pm4_set_resources {
>>>>> -       union {
>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>> -               uint32_t ordinal1;
>>>>> -       };
>>>>> -
>>>>> -       union {
>>>>> -               struct {
>>>>> -                       uint32_t vmid_mask:16;
>>>>> -                       uint32_t unmap_latency:8;
>>>>> -                       uint32_t reserved1:5;
>>>>> -                       enum set_resources_queue_type_enum
>>>>> queue_type:3;
>>>>> -               } bitfields2;
>>>>> -               uint32_t ordinal2;
>>>>> -       };
>>>>> -
>>>>> -       uint32_t queue_mask_lo;
>>>>> -       uint32_t queue_mask_hi;
>>>>> -       uint32_t gws_mask_lo;
>>>>> -       uint32_t gws_mask_hi;
>>>>> -
>>>>> -       union {
>>>>> -               struct {
>>>>> -                       uint32_t oac_mask:16;
>>>>> -                       uint32_t reserved2:16;
>>>>> -               } bitfields7;
>>>>> -               uint32_t ordinal7;
>>>>> -       };
>>>>> -
>>>>> -       union {
>>>>> -               struct {
>>>>> -                       uint32_t gds_heap_base:6;
>>>>> -                       uint32_t reserved3:5;
>>>>> -                       uint32_t gds_heap_size:6;
>>>>> -                       uint32_t reserved4:15;
>>>>> -               } bitfields8;
>>>>> -               uint32_t ordinal8;
>>>>> -       };
>>>>> -
>>>>> -};
>>>>> -#endif
>>>>> -
>>>>> -/*--------------------MES_RUN_LIST-------------------- */
>>>>> -
>>>>> -#ifndef PM4_MES_RUN_LIST_DEFINED
>>>>> -#define PM4_MES_RUN_LIST_DEFINED
>>>>> -
>>>>> -struct pm4_runlist {
>>>>> -       union {
>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>> -               uint32_t ordinal1;
>>>>> -       };
>>>>> -
>>>>> -       union {
>>>>> -               struct {
>>>>> -                       uint32_t reserved1:2;
>>>>> -                       uint32_t ib_base_lo:30;
>>>>> -               } bitfields2;
>>>>> -               uint32_t ordinal2;
>>>>> -       };
>>>>> -
>>>>> -       union {
>>>>> -               struct {
>>>>> -                       uint32_t ib_base_hi:16;
>>>>> -                       uint32_t reserved2:16;
>>>>> -               } bitfields3;
>>>>> -               uint32_t ordinal3;
>>>>> -       };
>>>>> -
>>>>> -       union {
>>>>> -               struct {
>>>>> -                       uint32_t ib_size:20;
>>>>> -                       uint32_t chain:1;
>>>>> -                       uint32_t offload_polling:1;
>>>>> -                       uint32_t reserved3:1;
>>>>> -                       uint32_t valid:1;
>>>>> -                       uint32_t reserved4:8;
>>>>> -               } bitfields4;
>>>>> -               uint32_t ordinal4;
>>>>> -       };
>>>>> -
>>>>> -};
>>>>> -#endif
>>>>>
>>>>>   /*--------------------MES_MAP_PROCESS-------------------- */
>>>>>
>>>>> @@ -186,217 +93,58 @@ struct pm4_map_process {
>>>>>   };
>>>>>   #endif
>>>>>
>>>>> -/*--------------------MES_MAP_QUEUES--------------------*/
>>>>> -
>>>>> -#ifndef PM4_MES_MAP_QUEUES_DEFINED
>>>>> -#define PM4_MES_MAP_QUEUES_DEFINED
>>>>> -enum map_queues_queue_sel_enum {
>>>>> -       queue_sel__mes_map_queues__map_to_specified_queue_slots = 0,
>>>>> -
>>>>        queue_sel__mes_map_queues__map_to_hws_determined_queue_slots
>>>> =
>>>>> 1,
>>>>> -       queue_sel__mes_map_queues__enable_process_queues = 2 -};
>>>>> +#ifndef PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
>>>>> +#define PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
>>>>>
>>>>> -enum map_queues_vidmem_enum {
>>>>> -       vidmem__mes_map_queues__uses_no_video_memory = 0,
>>>>> -       vidmem__mes_map_queues__uses_video_memory = 1 -};
>>>>> -
>>>>> -enum map_queues_alloc_format_enum {
>>>>> -       alloc_format__mes_map_queues__one_per_pipe = 0,
>>>>> -       alloc_format__mes_map_queues__all_on_one_pipe = 1 -};
>>>>> -
>>>>> -enum map_queues_engine_sel_enum {
>>>>> -       engine_sel__mes_map_queues__compute = 0,
>>>>> -       engine_sel__mes_map_queues__sdma0 = 2,
>>>>> -       engine_sel__mes_map_queues__sdma1 = 3 -};
>>>>> -
>>>>> -struct pm4_map_queues {
>>>>> +struct pm4_map_process_scratch_kv {
>>>>>          union {
>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>> -               uint32_t ordinal1;
>>>>> -       };
>>>>> -
>>>>> -       union {
>>>>> -               struct {
>>>>> -                       uint32_t reserved1:4;
>>>>> -                       enum map_queues_queue_sel_enum queue_sel:2;
>>>>> -                       uint32_t reserved2:2;
>>>>> -                       uint32_t vmid:4;
>>>>> -                       uint32_t reserved3:4;
>>>>> -                       enum map_queues_vidmem_enum vidmem:2;
>>>>> -                       uint32_t reserved4:6;
>>>>> -                       enum map_queues_alloc_format_enum
>>>>> alloc_format:2;
>>>>> -                       enum map_queues_engine_sel_enum engine_sel:3;
>>>>> -                       uint32_t num_queues:3;
>>>>> -               } bitfields2;
>>>>> -               uint32_t ordinal2;
>>>>> -       };
>>>>> -
>>>>> -       struct {
>>>>> -               union {
>>>>> -                       struct {
>>>>> -                               uint32_t is_static:1;
>>>>> -                               uint32_t reserved5:1;
>>>>> -                               uint32_t doorbell_offset:21;
>>>>> -                               uint32_t reserved6:3;
>>>>> -                               uint32_t queue:6;
>>>>> -                       } bitfields3;
>>>>> -                       uint32_t ordinal3;
>>>>> -               };
>>>>> -
>>>>> -               uint32_t mqd_addr_lo;
>>>>> -               uint32_t mqd_addr_hi;
>>>>> -               uint32_t wptr_addr_lo;
>>>>> -               uint32_t wptr_addr_hi;
>>>>> -
>>>>> -       } mes_map_queues_ordinals[1];   /* 1..N of these ordinal
>>>>> groups */
>>>>> -
>>>>> -};
>>>>> -#endif
>>>>> -
>>>>> -/*--------------------MES_QUERY_STATUS--------------------*/
>>>>> -
>>>>> -#ifndef PM4_MES_QUERY_STATUS_DEFINED
>>>>> -#define PM4_MES_QUERY_STATUS_DEFINED
>>>>> -enum query_status_interrupt_sel_enum {
>>>>> -       interrupt_sel__mes_query_status__completion_status = 0,
>>>>> -       interrupt_sel__mes_query_status__process_status = 1,
>>>>> -       interrupt_sel__mes_query_status__queue_status = 2  -};
>>>>> -
>>>>> -enum query_status_command_enum {
>>>>> -       command__mes_query_status__interrupt_only = 0,
>>>>> -       command__mes_query_status__fence_only_immediate = 1,
>>>>> -       command__mes_query_status__fence_only_after_write_ack = 2,
>>>>> -
>>>>> command__mes_query_status__fence_wait_for_write_ack_send_interrupt
>>>> = 3
>>>>> -};
>>>>> -
>>>>> -enum query_status_engine_sel_enum {
>>>>> -       engine_sel__mes_query_status__compute = 0,
>>>>> -       engine_sel__mes_query_status__sdma0_queue = 2,
>>>>> -       engine_sel__mes_query_status__sdma1_queue = 3  -};
>>>>> -
>>>>> -struct pm4_query_status {
>>>>> -       union {
>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>> -               uint32_t ordinal1;
>>>>> -       };
>>>>> -
>>>>> -       union {
>>>>> -               struct {
>>>>> -                       uint32_t context_id:28;
>>>>> -                       enum query_status_interrupt_sel_enum
>>>>> interrupt_sel:2;
>>>>> -                       enum query_status_command_enum command:2;
>>>>> -               } bitfields2;
>>>>> -               uint32_t ordinal2;
>>>>> +               union PM4_MES_TYPE_3_HEADER   header; /* header */
>>>>> +               uint32_t            ordinal1;
>>>>>          };
>>>>>
>>>>>          union {
>>>>>                  struct {
>>>>>                          uint32_t pasid:16;
>>>>> -                       uint32_t reserved1:16;
>>>>> -               } bitfields3a;
>>>>> -               struct {
>>>>> -                       uint32_t reserved2:2;
>>>>> -                       uint32_t doorbell_offset:21;
>>>>> -                       uint32_t reserved3:3;
>>>>> -                       enum query_status_engine_sel_enum
>>>>> engine_sel:3;
>>>>> -                       uint32_t reserved4:3;
>>>>> -               } bitfields3b;
>>>>> -               uint32_t ordinal3;
>>>>> -       };
>>>>> -
>>>>> -       uint32_t addr_lo;
>>>>> -       uint32_t addr_hi;
>>>>> -       uint32_t data_lo;
>>>>> -       uint32_t data_hi;
>>>>> -};
>>>>> -#endif
>>>>> -
>>>>> -/*--------------------MES_UNMAP_QUEUES--------------------*/
>>>>> -
>>>>> -#ifndef PM4_MES_UNMAP_QUEUES_DEFINED
>>>>> -#define PM4_MES_UNMAP_QUEUES_DEFINED
>>>>> -enum unmap_queues_action_enum {
>>>>> -       action__mes_unmap_queues__preempt_queues = 0,
>>>>> -       action__mes_unmap_queues__reset_queues = 1,
>>>>> -       action__mes_unmap_queues__disable_process_queues = 2  -};
>>>>> -
>>>>> -enum unmap_queues_queue_sel_enum {
>>>>> -
>>>>> queue_sel__mes_unmap_queues__perform_request_on_specified_queues
>>>> = 0,
>>>>> -
>>>>        queue_sel__mes_unmap_queues__perform_request_on_pasid_queues =
>>>>> 1,
>>>>> -
>>>>> queue_sel__mes_unmap_queues__perform_request_on_all_active_queue
>>>> s = 2,
>>>>> -
>>>>> queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues
>>>> _only = 3
>>>>> -};
>>>>> -
>>>>> -enum unmap_queues_engine_sel_enum {
>>>>> -       engine_sel__mes_unmap_queues__compute = 0,
>>>>> -       engine_sel__mes_unmap_queues__sdma0 = 2,
>>>>> -       engine_sel__mes_unmap_queues__sdma1 = 3  -};
>>>>> -
>>>>> -struct pm4_unmap_queues {
>>>>> -       union {
>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>> -               uint32_t ordinal1;
>>>>> -       };
>>>>> -
>>>>> -       union {
>>>>> -               struct {
>>>>> -                       enum unmap_queues_action_enum action:2;
>>>>> -                       uint32_t reserved1:2;
>>>>> -                       enum unmap_queues_queue_sel_enum queue_sel:2;
>>>>> -                       uint32_t reserved2:20;
>>>>> -                       enum unmap_queues_engine_sel_enum
>>>>> engine_sel:3;
>>>>> -                       uint32_t num_queues:3;
>>>>> +                       uint32_t reserved1:8;
>>>>> +                       uint32_t diq_enable:1;
>>>>> +                       uint32_t process_quantum:7;
>>>>>                  } bitfields2;
>>>>>                  uint32_t ordinal2;
>>>>>          };
>>>>>
>>>>>          union {
>>>>>                  struct {
>>>>> -                       uint32_t pasid:16;
>>>>> -                       uint32_t reserved3:16;
>>>>> -               } bitfields3a;
>>>>> -               struct {
>>>>> -                       uint32_t reserved4:2;
>>>>> -                       uint32_t doorbell_offset0:21;
>>>>> -                       uint32_t reserved5:9;
>>>>> -               } bitfields3b;
>>>>> +                       uint32_t page_table_base:28;
>>>>> +                       uint32_t reserved2:4;
>>>>> +               } bitfields3;
>>>>>                  uint32_t ordinal3;
>>>>>          };
>>>>>
>>>>> -       union {
>>>>> -               struct {
>>>>> -                       uint32_t reserved6:2;
>>>>> -                       uint32_t doorbell_offset1:21;
>>>>> -                       uint32_t reserved7:9;
>>>>> -               } bitfields4;
>>>>> -               uint32_t ordinal4;
>>>>> -       };
>>>>> -
>>>>> -       union {
>>>>> -               struct {
>>>>> -                       uint32_t reserved8:2;
>>>>> -                       uint32_t doorbell_offset2:21;
>>>>> -                       uint32_t reserved9:9;
>>>>> -               } bitfields5;
>>>>> -               uint32_t ordinal5;
>>>>> -       };
>>>>> +       uint32_t reserved3;
>>>>> +       uint32_t sh_mem_bases;
>>>>> +       uint32_t sh_mem_config;
>>>>> +       uint32_t sh_mem_ape1_base;
>>>>> +       uint32_t sh_mem_ape1_limit;
>>>>> +       uint32_t sh_hidden_private_base_vmid;
>>>>> +       uint32_t reserved4;
>>>>> +       uint32_t reserved5;
>>>>> +       uint32_t gds_addr_lo;
>>>>> +       uint32_t gds_addr_hi;
>>>>>
>>>>>          union {
>>>>>                  struct {
>>>>> -                       uint32_t reserved10:2;
>>>>> -                       uint32_t doorbell_offset3:21;
>>>>> -                       uint32_t reserved11:9;
>>>>> -               } bitfields6;
>>>>> -               uint32_t ordinal6;
>>>>> +                       uint32_t num_gws:6;
>>>>> +                       uint32_t reserved6:2;
>>>>> +                       uint32_t num_oac:4;
>>>>> +                       uint32_t reserved7:4;
>>>>> +                       uint32_t gds_size:6;
>>>>> +                       uint32_t num_queues:10;
>>>>> +               } bitfields14;
>>>>> +               uint32_t ordinal14;
>>>>>          };
>>>>>
>>>>> +       uint32_t completion_signal_lo32; uint32_t
>>>>> +completion_signal_hi32;
>>>>>   };
>>>>>   #endif
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>>> index c4eda6f..7c8d9b3 100644
>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>>> @@ -126,9 +126,10 @@ struct pm4_mes_runlist {
>>>>>                          uint32_t ib_size:20;
>>>>>                          uint32_t chain:1;
>>>>>                          uint32_t offload_polling:1;
>>>>> -                       uint32_t reserved3:1;
>>>>> +                       uint32_t reserved2:1;
>>>>>                          uint32_t valid:1;
>>>>> -                       uint32_t reserved4:8;
>>>>> +                       uint32_t process_cnt:4;
>>>>> +                       uint32_t reserved3:4;
>>>>>                  } bitfields4;
>>>>>                  uint32_t ordinal4;
>>>>>          };
>>>>> @@ -143,8 +144,8 @@ struct pm4_mes_runlist {
>>>>>
>>>>>   struct pm4_mes_map_process {
>>>>>          union {
>>>>> -               union PM4_MES_TYPE_3_HEADER   header;            /* header */
>>>>> -               uint32_t            ordinal1;
>>>>> +               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>> +               uint32_t ordinal1;
>>>>>          };
>>>>>
>>>>>          union {
>>>>> @@ -155,36 +156,48 @@ struct pm4_mes_map_process {
>>>>>                          uint32_t process_quantum:7;
>>>>>                  } bitfields2;
>>>>>                  uint32_t ordinal2;
>>>>> -};
>>>>> +       };
>>>>>
>>>>>          union {
>>>>>                  struct {
>>>>>                          uint32_t page_table_base:28;
>>>>> -                       uint32_t reserved2:4;
>>>>> +                       uint32_t reserved3:4;
>>>>>                  } bitfields3;
>>>>>                  uint32_t ordinal3;
>>>>>          };
>>>>>
>>>>> +       uint32_t reserved;
>>>>> +
>>>>>          uint32_t sh_mem_bases;
>>>>> +       uint32_t sh_mem_config;
>>>>>          uint32_t sh_mem_ape1_base;
>>>>>          uint32_t sh_mem_ape1_limit;
>>>>> -       uint32_t sh_mem_config;
>>>>> +
>>>>> +       uint32_t sh_hidden_private_base_vmid;
>>>>> +
>>>>> +       uint32_t reserved2;
>>>>> +       uint32_t reserved3;
>>>>> +
>>>>>          uint32_t gds_addr_lo;
>>>>>          uint32_t gds_addr_hi;
>>>>>
>>>>>          union {
>>>>>                  struct {
>>>>>                          uint32_t num_gws:6;
>>>>> -                       uint32_t reserved3:2;
>>>>> +                       uint32_t reserved4:2;
>>>>>                          uint32_t num_oac:4;
>>>>> -                       uint32_t reserved4:4;
>>>>> +                       uint32_t reserved5:4;
>>>>>                          uint32_t gds_size:6;
>>>>>                          uint32_t num_queues:10;
>>>>>                  } bitfields10;
>>>>>                  uint32_t ordinal10;
>>>>>          };
>>>>>
>>>>> +       uint32_t completion_signal_lo;
>>>>> +       uint32_t completion_signal_hi;
>>>>> +
>>>>>   };
>>>>> +
>>>>>   #endif
>>>>>
>>>>>   /*--------------------MES_MAP_QUEUES--------------------*/
>>>>> @@ -337,7 +350,7 @@ enum mes_unmap_queues_engine_sel_enum {
>>>>>          engine_sel__mes_unmap_queues__sdmal = 3
>>>>>   };
>>>>>
>>>>> -struct PM4_MES_UNMAP_QUEUES {
>>>>> +struct pm4_mes_unmap_queues {
>>>>>          union {
>>>>>                  union PM4_MES_TYPE_3_HEADER   header;            /* header */
>>>>>                  uint32_t            ordinal1;
>>>>> @@ -397,4 +410,101 @@ struct PM4_MES_UNMAP_QUEUES {
>>>>>   };
>>>>>   #endif
>>>>>
>>>>> +#ifndef PM4_MEC_RELEASE_MEM_DEFINED
>>>>> +#define PM4_MEC_RELEASE_MEM_DEFINED
>>>>> +enum RELEASE_MEM_event_index_enum {
>>>>> +       event_index___release_mem__end_of_pipe = 5,
>>>>> +       event_index___release_mem__shader_done = 6 };
>>>>> +
>>>>> +enum RELEASE_MEM_cache_policy_enum {
>>>>> +       cache_policy___release_mem__lru = 0,
>>>>> +       cache_policy___release_mem__stream = 1,
>>>>> +       cache_policy___release_mem__bypass = 2 };
>>>>> +
>>>>> +enum RELEASE_MEM_dst_sel_enum {
>>>>> +       dst_sel___release_mem__memory_controller = 0,
>>>>> +       dst_sel___release_mem__tc_l2 = 1,
>>>>> +       dst_sel___release_mem__queue_write_pointer_register = 2,
>>>>> +       dst_sel___release_mem__queue_write_pointer_poll_mask_bit = 3
>>>>> +};
>>>>> +
>>>>> +enum RELEASE_MEM_int_sel_enum {
>>>>> +       int_sel___release_mem__none = 0,
>>>>> +       int_sel___release_mem__send_interrupt_only = 1,
>>>>> +       int_sel___release_mem__send_interrupt_after_write_confirm = 2,
>>>>> +       int_sel___release_mem__send_data_after_write_confirm = 3 };
>>>>> +
>>>>> +enum RELEASE_MEM_data_sel_enum {
>>>>> +       data_sel___release_mem__none = 0,
>>>>> +       data_sel___release_mem__send_32_bit_low = 1,
>>>>> +       data_sel___release_mem__send_64_bit_data = 2,
>>>>> +       data_sel___release_mem__send_gpu_clock_counter = 3,
>>>>> +       data_sel___release_mem__send_cp_perfcounter_hi_lo = 4,
>>>>> +       data_sel___release_mem__store_gds_data_to_memory = 5 };
>>>>> +
>>>>> +struct pm4_mec_release_mem {
>>>>> +       union {
>>>>> +               union PM4_MES_TYPE_3_HEADER header;     /*header */
>>>>> +               unsigned int ordinal1;
>>>>> +       };
>>>>> +
>>>>> +       union {
>>>>> +               struct {
>>>>> +                       unsigned int event_type:6;
>>>>> +                       unsigned int reserved1:2;
>>>>> +                       enum RELEASE_MEM_event_index_enum event_index:4;
>>>>> +                       unsigned int tcl1_vol_action_ena:1;
>>>>> +                       unsigned int tc_vol_action_ena:1;
>>>>> +                       unsigned int reserved2:1;
>>>>> +                       unsigned int tc_wb_action_ena:1;
>>>>> +                       unsigned int tcl1_action_ena:1;
>>>>> +                       unsigned int tc_action_ena:1;
>>>>> +                       unsigned int reserved3:6;
>>>>> +                       unsigned int atc:1;
>>>>> +                       enum RELEASE_MEM_cache_policy_enum cache_policy:2;
>>>>> +                       unsigned int reserved4:5;
>>>>> +               } bitfields2;
>>>>> +               unsigned int ordinal2;
>>>>> +       };
>>>>> +
>>>>> +       union {
>>>>> +               struct {
>>>>> +                       unsigned int reserved5:16;
>>>>> +                       enum RELEASE_MEM_dst_sel_enum dst_sel:2;
>>>>> +                       unsigned int reserved6:6;
>>>>> +                       enum RELEASE_MEM_int_sel_enum int_sel:3;
>>>>> +                       unsigned int reserved7:2;
>>>>> +                       enum RELEASE_MEM_data_sel_enum data_sel:3;
>>>>> +               } bitfields3;
>>>>> +               unsigned int ordinal3;
>>>>> +       };
>>>>> +
>>>>> +       union {
>>>>> +               struct {
>>>>> +                       unsigned int reserved8:2;
>>>>> +                       unsigned int address_lo_32b:30;
>>>>> +               } bitfields4;
>>>>> +               struct {
>>>>> +                       unsigned int reserved9:3;
>>>>> +                       unsigned int address_lo_64b:29;
>>>>> +               } bitfields5;
>>>>> +               unsigned int ordinal4;
>>>>> +       };
>>>>> +
>>>>> +       unsigned int address_hi;
>>>>> +
>>>>> +       unsigned int data_lo;
>>>>> +
>>>>> +       unsigned int data_hi;
>>>>> +};
>>>>> +#endif
>>>>> +
>>>>> +enum {
>>>>> +       CACHE_FLUSH_AND_INV_TS_EVENT = 0x00000014 };
>>>>> +
>>>>>   #endif
>>>>> --
>>>>> 2.7.4
>>>>>
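For illustration, the RELEASE_MEM packet added above is the one a driver would
use to write a fence value and raise an interrupt at end of pipe. A minimal
sketch of filling it, assuming IT_RELEASE_MEM is available in
kfd_pm4_opcodes.h and that the destination address is 8-byte aligned; this
helper is not part of the patch:

    /* Sketch only: build a RELEASE_MEM packet that flushes caches at end of
     * pipe, writes a 64-bit fence value and then raises an interrupt.
     */
    #include <linux/kernel.h>
    #include <linux/string.h>
    #include "kfd_pm4_headers_vi.h"
    #include "kfd_pm4_opcodes.h"

    static void fill_release_mem_fence(struct pm4_mec_release_mem *packet,
    				   uint64_t fence_address, uint64_t fence_value)
    {
    	memset(packet, 0, sizeof(*packet));

    	packet->header.u32All = 0;
    	packet->header.type = 3;		/* PM4 type-3 packet */
    	packet->header.opcode = IT_RELEASE_MEM;	/* assumed opcode name */
    	packet->header.count = sizeof(*packet) / sizeof(uint32_t) - 2;

    	packet->bitfields2.event_type = CACHE_FLUSH_AND_INV_TS_EVENT;
    	packet->bitfields2.event_index = event_index___release_mem__end_of_pipe;
    	packet->bitfields2.tcl1_action_ena = 1;
    	packet->bitfields2.tc_action_ena = 1;
    	packet->bitfields2.cache_policy = cache_policy___release_mem__lru;

    	packet->bitfields3.data_sel = data_sel___release_mem__send_64_bit_data;
    	packet->bitfields3.int_sel =
    		int_sel___release_mem__send_interrupt_after_write_confirm;

    	/* for 64-bit data the low address is encoded in qword units */
    	packet->bitfields5.address_lo_64b = lower_32_bits(fence_address) >> 3;
    	packet->address_hi = upper_32_bits(fence_address);

    	packet->data_lo = lower_32_bits(fence_value);
    	packet->data_hi = upper_32_bits(fence_value);
    }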
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
       [not found]                             ` <4a8512a4-21df-cf48-4500-0424b08cd357-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
@ 2017-08-16  9:54                               ` Oded Gabbay
  2017-08-16 22:31                               ` Felix Kuehling
  1 sibling, 0 replies; 70+ messages in thread
From: Oded Gabbay @ 2017-08-16  9:54 UTC (permalink / raw)
  To: Christian König
  Cc: Deucher, Alexander, Felix Kuehling, amd-gfx list, Bridgman, John

On Wed, Aug 16, 2017 at 10:37 AM, Christian König
<deathsimple@vodafone.de> wrote:
> Hi Felix,
>
> in general Alex handles that by pushing firmware to the linux-firmware tree.
>
>> IIRC the amdgpu devs had been holding back on publishing the updated MEC
>> microcode (with scratch support) because that WOULD have broken Kaveri. With
>> this change from Felix we should be able to publish the newest microcode for
>> both amdgpu and amdkfd WITHOUT breaking Kaveri.
>
> Well what does breaking Kaveri mean here?
>
> Please note that firmware *MUST* always be backward compatible. In other
> words new firmware must still work with older kernels without that fix.
>
> If that is not guaranteed, we need to step back and talk to the firmware
> team once more to implement the new feature in a backward compatible way.
>
> Regards,
> Christian.

Can we make amdgpu load a different firmware than radeon does for
kaveri_mec and kaveri_mec2?
In that case, updating the firmware will only affect amdgpu, and people
who for some strange reason still use Kaveri and HSA can continue
with radeon.

Oded
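
For what it's worth, a minimal sketch of that kind of split, loosely in the
style of the gfx_v7 firmware loading code. The "amdgpu/kaveri_mec.bin" name
and the fallback policy are hypothetical, not existing amdgpu behaviour; today
both drivers request the radeon/ files:

    /* Hypothetical: let amdgpu prefer its own Kaveri MEC image (which could
     * carry the scratch changes) and fall back to the file radeon also uses.
     */
    #include <linux/firmware.h>
    #include "amdgpu.h"

    static int gfx_v7_request_kaveri_mec_fw(struct amdgpu_device *adev)
    {
    	int err;

    	err = request_firmware(&adev->gfx.mec_fw, "amdgpu/kaveri_mec.bin",
    			       adev->dev);
    	if (err)
    		err = request_firmware(&adev->gfx.mec_fw,
    				       "radeon/kaveri_mec.bin", adev->dev);
    	return err;
    }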

>
>
> On 16.08.2017 at 03:19, Felix Kuehling wrote:
>>
>> Hi Alex,
>>
>> How does firmware get published for the upstream driver? Where can I
>> check the currently published version of both CZ and KV firmware for
>> upstream?
>>
>> Do you publish firmware updates at the same time as patches that depend
>> on them?
>>
>> Thanks,
>>    Felix
>>
>>
>> On 2017-08-13 04:49 AM, Oded Gabbay wrote:
>>>
>>> On Sat, Aug 12, 2017 at 10:09 PM, Bridgman, John <John.Bridgman@amd.com>
>>> wrote:
>>>>
>>>> IIRC the amdgpu devs had been holding back on publishing the updated MEC
>>>> microcode (with scratch support) because that WOULD have broken Kaveri. With
>>>> this change from Felix we should be able to publish the newest microcode for
>>>> both amdgpu and amdkfd WITHOUT breaking Kaveri.
>>>>
>>>> IOW this is the "scratch fix for Kaveri KFD" you have wanted for a
>>>> couple of years :)
>>>
>>> ah, ok.
>>>
>>> In that case, this patch is:
>>> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
>>>
>>>
>>>>> -----Original Message-----
>>>>> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf
>>>>> Of Kuehling, Felix
>>>>> Sent: Saturday, August 12, 2017 2:16 PM
>>>>> To: Oded Gabbay
>>>>> Cc: amd-gfx list
>>>>> Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
>>>>>
>>>>>> Do you mean that it won't work with Kaveri anymore ?
>>>>>
>>>>> Kaveri got the same firmware changes, mostly for scratch memory support.
>>>>> The Kaveri firmware headers name the structures and fields a bit
>>>>> differently, but they should be binary compatible. So we simplified the
>>>>> code to use only one set of headers. I'll grab a Kaveri system to confirm
>>>>> that it works.
>>>>>
>>>>> Regards,
>>>>>   Felix
>>>>>
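Felix's binary-compatibility claim could in principle be pinned down at build
time. A rough sketch, assuming both layouts (struct pm4_map_process_scratch_kv
kept in kfd_pm4_headers.h and struct pm4_mes_map_process from
kfd_pm4_headers_vi.h) can be made visible in one translation unit; the include
guards of the two headers may need care, and this check is not part of the
series:

    /* Sketch only: compile-time check that the KV-style and VI-style
     * MAP_PROCESS layouts agree in size and in the offsets of the fields
     * the packet manager writes.
     */
    #include <linux/bug.h>
    #include <linux/stddef.h>
    #include "kfd_pm4_headers.h"	/* struct pm4_map_process_scratch_kv */
    #include "kfd_pm4_headers_vi.h"	/* struct pm4_mes_map_process */

    static inline void check_map_process_abi(void)
    {
    	BUILD_BUG_ON(sizeof(struct pm4_map_process_scratch_kv) !=
    		     sizeof(struct pm4_mes_map_process));
    	BUILD_BUG_ON(offsetof(struct pm4_map_process_scratch_kv, sh_mem_bases) !=
    		     offsetof(struct pm4_mes_map_process, sh_mem_bases));
    	BUILD_BUG_ON(offsetof(struct pm4_map_process_scratch_kv, gds_addr_lo) !=
    		     offsetof(struct pm4_mes_map_process, gds_addr_lo));
    }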
>>>>> From: Oded Gabbay <oded.gabbay@gmail.com>
>>>>> Sent: Saturday, August 12, 2017 11:10 AM
>>>>> To: Kuehling, Felix
>>>>> Cc: amd-gfx list
>>>>> Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
>>>>>
>>>>> On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling
>>>>> <Felix.Kuehling@amd.com>
>>>>> wrote:
>>>>>>
>>>>>> To match current firmware. The map process packet has been extended to
>>>>>> support scratch. This is a non-backwards compatible change and it's
>>>>>> about two years old. So no point keeping the old version around
>>>>>> conditionally.
>>>>>
>>>>> Do you mean that it won't work with Kaveri anymore ?
>>>>> I believe we aren't allowed to break older H/W support without some
>>>>> serious justification.
>>>>>
>>>>> Oded
>>>>>
>>>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_device.c         |   8 +-
>>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 161 ++++--------
>>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h    | 314
>>>>>> +++---------------------
>>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h | 130 +++++++++-
>>>>>>   4 files changed, 199 insertions(+), 414 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>>>> index e1c2ad2..e790e7f 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>>>> @@ -26,7 +26,7 @@
>>>>>>   #include <linux/slab.h>
>>>>>>   #include "kfd_priv.h"
>>>>>>   #include "kfd_device_queue_manager.h"
>>>>>> -#include "kfd_pm4_headers.h"
>>>>>> +#include "kfd_pm4_headers_vi.h"
>>>>>>
>>>>>>   #define MQD_SIZE_ALIGNED 768
>>>>>>
>>>>>> @@ -238,9 +238,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>>>>>           * calculate max size of runlist packet.
>>>>>>           * There can be only 2 packets at once
>>>>>>           */
>>>>>> -       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct
>>>>>> pm4_map_process) +
>>>>>> -               max_num_of_queues_per_device *
>>>>>> -               sizeof(struct pm4_map_queues) + sizeof(struct
>>>>>> pm4_runlist)) * 2;
>>>>>> +       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct
>>>>>> +pm4_mes_map_process) +
>>>>>> +               max_num_of_queues_per_device * sizeof(struct
>>>>>> +pm4_mes_map_queues)
>>>>>> +               + sizeof(struct pm4_mes_runlist)) * 2;
>>>>>>
>>>>>>          /* Add size of HIQ & DIQ */
>>>>>>          size += KFD_KERNEL_QUEUE_SIZE * 2;  diff --git
>>>>>> a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>>>> index 77a6f2b..3141e05 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>>>> @@ -26,7 +26,6 @@
>>>>>>   #include "kfd_device_queue_manager.h"
>>>>>>   #include "kfd_kernel_queue.h"
>>>>>>   #include "kfd_priv.h"
>>>>>> -#include "kfd_pm4_headers.h"
>>>>>>   #include "kfd_pm4_headers_vi.h"
>>>>>>   #include "kfd_pm4_opcodes.h"
>>>>>>
>>>>>> @@ -44,12 +43,12 @@ static unsigned int build_pm4_header(unsigned int
>>>>>> opcode, size_t packet_size)
>>>>>>   {
>>>>>>          union PM4_MES_TYPE_3_HEADER header;
>>>>>>
>>>>>> -       header.u32all = 0;
>>>>>> +       header.u32All = 0;
>>>>>>          header.opcode = opcode;
>>>>>>          header.count = packet_size/sizeof(uint32_t) - 2;
>>>>>>          header.type = PM4_TYPE_3;
>>>>>>
>>>>>> -       return header.u32all;
>>>>>> +       return header.u32All;
>>>>>>   }
>>>>>>
>>>>>>   static void pm_calc_rlib_size(struct packet_manager *pm,  @@ -69,12
>>>>>> +68,9 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>>>>>>                  pr_debug("Over subscribed runlist\n");
>>>>>>          }
>>>>>>
>>>>>> -       map_queue_size =
>>>>>> -               (pm->dqm->dev->device_info->asic_family ==
>>>>>> CHIP_CARRIZO) ?
>>>>>> -               sizeof(struct pm4_mes_map_queues) :
>>>>>> -               sizeof(struct pm4_map_queues);
>>>>>> +       map_queue_size = sizeof(struct pm4_mes_map_queues);
>>>>>>          /* calculate run list ib allocation size */
>>>>>> -       *rlib_size = process_count * sizeof(struct pm4_map_process) +
>>>>>> +       *rlib_size = process_count * sizeof(struct
>>>>>> +pm4_mes_map_process) +
>>>>>>                       queue_count * map_queue_size;
>>>>>>
>>>>>>          /*
>>>>>> @@ -82,7 +78,7 @@ static void pm_calc_rlib_size(struct packet_manager
>>>>>> *pm,
>>>>>>           * when over subscription
>>>>>>           */
>>>>>>          if (*over_subscription)
>>>>>> -               *rlib_size += sizeof(struct pm4_runlist);
>>>>>> +               *rlib_size += sizeof(struct pm4_mes_runlist);
>>>>>>
>>>>>>          pr_debug("runlist ib size %d\n", *rlib_size);
>>>>>>   }
>>>>>> @@ -119,16 +115,16 @@ static int pm_allocate_runlist_ib(struct
>>>>>> packet_manager *pm,
>>>>>>   static int pm_create_runlist(struct packet_manager *pm, uint32_t
>>>>>> *buffer,
>>>>>>                          uint64_t ib, size_t ib_size_in_dwords, bool
>>>>>> chain)
>>>>>>   {
>>>>>> -       struct pm4_runlist *packet;
>>>>>> +       struct pm4_mes_runlist *packet;
>>>>>>
>>>>>>          if (WARN_ON(!ib))
>>>>>>                  return -EFAULT;
>>>>>>
>>>>>> -       packet = (struct pm4_runlist *)buffer;
>>>>>> +       packet = (struct pm4_mes_runlist *)buffer;
>>>>>>
>>>>>> -       memset(buffer, 0, sizeof(struct pm4_runlist));
>>>>>> -       packet->header.u32all = build_pm4_header(IT_RUN_LIST,
>>>>>> -                                               sizeof(struct
>>>>>> pm4_runlist));
>>>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_runlist));
>>>>>> +       packet->header.u32All = build_pm4_header(IT_RUN_LIST,
>>>>>> +                                               sizeof(struct
>>>>>> +pm4_mes_runlist));
>>>>>>
>>>>>>          packet->bitfields4.ib_size = ib_size_in_dwords;
>>>>>>          packet->bitfields4.chain = chain ? 1 : 0;  @@ -143,16 +139,16
>>>>>> @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t
>>>>>> *buffer,
>>>>>>   static int pm_create_map_process(struct packet_manager *pm, uint32_t
>>>>>> *buffer,
>>>>>>                                  struct qcm_process_device *qpd)
>>>>>>   {
>>>>>> -       struct pm4_map_process *packet;
>>>>>> +       struct pm4_mes_map_process *packet;
>>>>>>          struct queue *cur;
>>>>>>          uint32_t num_queues;
>>>>>>
>>>>>> -       packet = (struct pm4_map_process *)buffer;
>>>>>> +       packet = (struct pm4_mes_map_process *)buffer;
>>>>>>
>>>>>> -       memset(buffer, 0, sizeof(struct pm4_map_process));
>>>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_map_process));
>>>>>>
>>>>>> -       packet->header.u32all = build_pm4_header(IT_MAP_PROCESS,
>>>>>> -                                       sizeof(struct
>>>>>> pm4_map_process));
>>>>>> +       packet->header.u32All = build_pm4_header(IT_MAP_PROCESS,
>>>>>> +                                       sizeof(struct
>>>>>> +pm4_mes_map_process));
>>>>>>          packet->bitfields2.diq_enable = (qpd->is_debug) ? 1 : 0;
>>>>>>          packet->bitfields2.process_quantum = 1;
>>>>>>          packet->bitfields2.pasid = qpd->pqm->process->pasid;  @@
>>>>>> -170,23 +166,26 @@ static int pm_create_map_process(struct
>>>>>> packet_manager *pm, uint32_t *buffer,
>>>>>>          packet->sh_mem_ape1_base = qpd->sh_mem_ape1_base;
>>>>>>          packet->sh_mem_ape1_limit = qpd->sh_mem_ape1_limit;
>>>>>>
>>>>>> +       /* TODO: scratch support */
>>>>>> +       packet->sh_hidden_private_base_vmid = 0;
>>>>>> +
>>>>>>          packet->gds_addr_lo = lower_32_bits(qpd->gds_context_area);
>>>>>>          packet->gds_addr_hi = upper_32_bits(qpd->gds_context_area);
>>>>>>
>>>>>>          return 0;
>>>>>>   }
>>>>>>
>>>>>> -static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t
>>>>>> *buffer,
>>>>>> +static int pm_create_map_queue(struct packet_manager *pm, uint32_t
>>>>>> +*buffer,
>>>>>>                  struct queue *q, bool is_static)
>>>>>>   {
>>>>>>          struct pm4_mes_map_queues *packet;
>>>>>>          bool use_static = is_static;
>>>>>>
>>>>>>          packet = (struct pm4_mes_map_queues *)buffer;
>>>>>> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
>>>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_map_queues));
>>>>>>
>>>>>> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
>>>>>> -                                               sizeof(struct
>>>>>> pm4_map_queues));
>>>>>> +       packet->header.u32All = build_pm4_header(IT_MAP_QUEUES,
>>>>>> +                                               sizeof(struct
>>>>>> +pm4_mes_map_queues));
>>>>>>          packet->bitfields2.alloc_format =
>>>>>>                  alloc_format__mes_map_queues__one_per_pipe_vi;
>>>>>>          packet->bitfields2.num_queues = 1;  @@ -235,64 +234,6 @@
>>>>>> static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t
>>>>>> *buffer,
>>>>>>          return 0;
>>>>>>   }
>>>>>>
>>>>>> -static int pm_create_map_queue(struct packet_manager *pm, uint32_t
>>>>>> *buffer,
>>>>>> -                               struct queue *q, bool is_static)  -{
>>>>>> -       struct pm4_map_queues *packet;
>>>>>> -       bool use_static = is_static;
>>>>>> -
>>>>>> -       packet = (struct pm4_map_queues *)buffer;
>>>>>> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
>>>>>> -
>>>>>> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
>>>>>> -                                               sizeof(struct
>>>>>> pm4_map_queues));
>>>>>> -       packet->bitfields2.alloc_format =
>>>>>> -
>>>>>> alloc_format__mes_map_queues__one_per_pipe;
>>>>>> -       packet->bitfields2.num_queues = 1;
>>>>>> -       packet->bitfields2.queue_sel =
>>>>>> -
>>>>>> queue_sel__mes_map_queues__map_to_hws_determined_queue_slots;
>>>>>> -
>>>>>> -       packet->bitfields2.vidmem = (q->properties.is_interop) ?
>>>>>> -                       vidmem__mes_map_queues__uses_video_memory :
>>>>>> -                       vidmem__mes_map_queues__uses_no_video_memory;
>>>>>> -
>>>>>> -       switch (q->properties.type) {
>>>>>> -       case KFD_QUEUE_TYPE_COMPUTE:
>>>>>> -       case KFD_QUEUE_TYPE_DIQ:
>>>>>> -               packet->bitfields2.engine_sel =
>>>>>> -                               engine_sel__mes_map_queues__compute;
>>>>>> -               break;
>>>>>> -       case KFD_QUEUE_TYPE_SDMA:
>>>>>> -               packet->bitfields2.engine_sel =
>>>>>> -                               engine_sel__mes_map_queues__sdma0;
>>>>>> -               use_static = false; /* no static queues under SDMA */
>>>>>> -               break;
>>>>>> -       default:
>>>>>> -               WARN(1, "queue type %d", q->properties.type);
>>>>>> -               return -EINVAL;
>>>>>> -       }
>>>>>> -
>>>>>> -       packet->mes_map_queues_ordinals[0].bitfields3.doorbell_offset
>>>>>> =
>>>>>> -                       q->properties.doorbell_off;
>>>>>> -
>>>>>> -       packet->mes_map_queues_ordinals[0].bitfields3.is_static =
>>>>>> -                       (use_static) ? 1 : 0;
>>>>>> -
>>>>>> -       packet->mes_map_queues_ordinals[0].mqd_addr_lo =
>>>>>> -                       lower_32_bits(q->gart_mqd_addr);
>>>>>> -
>>>>>> -       packet->mes_map_queues_ordinals[0].mqd_addr_hi =
>>>>>> -                       upper_32_bits(q->gart_mqd_addr);
>>>>>> -
>>>>>> -       packet->mes_map_queues_ordinals[0].wptr_addr_lo =
>>>>>> -
>>>>>> lower_32_bits((uint64_t)q->properties.write_ptr);
>>>>>> -
>>>>>> -       packet->mes_map_queues_ordinals[0].wptr_addr_hi =
>>>>>> -
>>>>>> upper_32_bits((uint64_t)q->properties.write_ptr);
>>>>>> -
>>>>>> -       return 0;
>>>>>> -}
>>>>>> -
>>>>>>   static int pm_create_runlist_ib(struct packet_manager *pm,
>>>>>>                                  struct list_head *queues,
>>>>>>                                  uint64_t *rl_gpu_addr,  @@ -334,7
>>>>>> +275,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>>>>>>                          return retval;
>>>>>>
>>>>>>                  proccesses_mapped++;
>>>>>> -               inc_wptr(&rl_wptr, sizeof(struct pm4_map_process),
>>>>>> +               inc_wptr(&rl_wptr, sizeof(struct pm4_mes_map_process),
>>>>>>                                  alloc_size_bytes);
>>>>>>
>>>>>>                  list_for_each_entry(kq, &qpd->priv_queue_list, list)
>>>>>> {
>>>>>> @@ -344,14 +285,7 @@ static int pm_create_runlist_ib(struct
>>>>>> packet_manager *pm,
>>>>>>                          pr_debug("static_queue, mapping kernel q %d,
>>>>>> is debug status %d\n",
>>>>>>                                  kq->queue->queue, qpd->is_debug);
>>>>>>
>>>>>> -                       if (pm->dqm->dev->device_info->asic_family ==
>>>>>> -                                       CHIP_CARRIZO)
>>>>>> -                               retval = pm_create_map_queue_vi(pm,
>>>>>> -                                               &rl_buffer[rl_wptr],
>>>>>> -                                               kq->queue,
>>>>>> -                                               qpd->is_debug);
>>>>>> -                       else
>>>>>> -                               retval = pm_create_map_queue(pm,
>>>>>> +                       retval = pm_create_map_queue(pm,
>>>>>>                                                  &rl_buffer[rl_wptr],
>>>>>>                                                  kq->queue,
>>>>>>                                                  qpd->is_debug);  @@
>>>>>> -359,7 +293,7 @@ static int pm_create_runlist_ib(struct packet_manager
>>>>>> *pm,
>>>>>>                                  return retval;
>>>>>>
>>>>>>                          inc_wptr(&rl_wptr,
>>>>>> -                               sizeof(struct pm4_map_queues),
>>>>>> +                               sizeof(struct pm4_mes_map_queues),
>>>>>>                                  alloc_size_bytes);
>>>>>>                  }
>>>>>>
>>>>>> @@ -370,14 +304,7 @@ static int pm_create_runlist_ib(struct
>>>>>> packet_manager *pm,
>>>>>>                          pr_debug("static_queue, mapping user queue
>>>>>> %d,
>>>>>> is debug status %d\n",
>>>>>>                                  q->queue, qpd->is_debug);
>>>>>>
>>>>>> -                       if (pm->dqm->dev->device_info->asic_family ==
>>>>>> -                                       CHIP_CARRIZO)
>>>>>> -                               retval = pm_create_map_queue_vi(pm,
>>>>>> -                                               &rl_buffer[rl_wptr],
>>>>>> -                                               q,
>>>>>> -                                               qpd->is_debug);
>>>>>> -                       else
>>>>>> -                               retval = pm_create_map_queue(pm,
>>>>>> +                       retval = pm_create_map_queue(pm,
>>>>>>                                                  &rl_buffer[rl_wptr],
>>>>>>                                                  q,
>>>>>>                                                  qpd->is_debug);  @@
>>>>>> -386,7 +313,7 @@ static int pm_create_runlist_ib(struct packet_manager
>>>>>> *pm,
>>>>>>                                  return retval;
>>>>>>
>>>>>>                          inc_wptr(&rl_wptr,
>>>>>> -                               sizeof(struct pm4_map_queues),
>>>>>> +                               sizeof(struct pm4_mes_map_queues),
>>>>>>                                  alloc_size_bytes);
>>>>>>                  }
>>>>>>          }
>>>>>> @@ -429,7 +356,7 @@ void pm_uninit(struct packet_manager *pm)
>>>>>>   int pm_send_set_resources(struct packet_manager *pm,
>>>>>>                                  struct scheduling_resources *res)
>>>>>>   {
>>>>>> -       struct pm4_set_resources *packet;
>>>>>> +       struct pm4_mes_set_resources *packet;
>>>>>>          int retval = 0;
>>>>>>
>>>>>>          mutex_lock(&pm->lock);
>>>>>> @@ -442,9 +369,9 @@ int pm_send_set_resources(struct packet_manager
>>>>>> *pm,
>>>>>>                  goto out;
>>>>>>          }
>>>>>>
>>>>>> -       memset(packet, 0, sizeof(struct pm4_set_resources));
>>>>>> -       packet->header.u32all = build_pm4_header(IT_SET_RESOURCES,
>>>>>> -                                       sizeof(struct
>>>>>> pm4_set_resources));
>>>>>> +       memset(packet, 0, sizeof(struct pm4_mes_set_resources));
>>>>>> +       packet->header.u32All = build_pm4_header(IT_SET_RESOURCES,
>>>>>> +                                       sizeof(struct
>>>>>> +pm4_mes_set_resources));
>>>>>>
>>>>>>          packet->bitfields2.queue_type =
>>>>>>
>>>>>> queue_type__mes_set_resources__hsa_interface_queue_hiq;
>>>>>> @@ -482,7 +409,7 @@ int pm_send_runlist(struct packet_manager *pm,
>>>>>> struct list_head *dqm_queues)
>>>>>>
>>>>>>          pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
>>>>>>
>>>>>> -       packet_size_dwords = sizeof(struct pm4_runlist) /
>>>>>> sizeof(uint32_t);
>>>>>> +       packet_size_dwords = sizeof(struct pm4_mes_runlist) /
>>>>>> +sizeof(uint32_t);
>>>>>>          mutex_lock(&pm->lock);
>>>>>>
>>>>>>          retval =
>>>>>> pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
>>>>>> @@ -514,7 +441,7 @@ int pm_send_query_status(struct packet_manager
>>>>>> *pm, uint64_t fence_address,
>>>>>>                          uint32_t fence_value)
>>>>>>   {
>>>>>>          int retval;
>>>>>> -       struct pm4_query_status *packet;
>>>>>> +       struct pm4_mes_query_status *packet;
>>>>>>
>>>>>>          if (WARN_ON(!fence_address))
>>>>>>                  return -EFAULT;
>>>>>> @@ -522,13 +449,13 @@ int pm_send_query_status(struct packet_manager
>>>>>> *pm, uint64_t fence_address,
>>>>>>          mutex_lock(&pm->lock);
>>>>>>          retval = pm->priv_queue->ops.acquire_packet_buffer(
>>>>>>                          pm->priv_queue,
>>>>>> -                       sizeof(struct pm4_query_status) /
>>>>>> sizeof(uint32_t),
>>>>>> +                       sizeof(struct pm4_mes_query_status) /
>>>>>> +sizeof(uint32_t),
>>>>>>                          (unsigned int **)&packet);
>>>>>>          if (retval)
>>>>>>                  goto fail_acquire_packet_buffer;
>>>>>>
>>>>>> -       packet->header.u32all = build_pm4_header(IT_QUERY_STATUS,
>>>>>> -                                       sizeof(struct
>>>>>> pm4_query_status));
>>>>>> +       packet->header.u32All = build_pm4_header(IT_QUERY_STATUS,
>>>>>> +                                       sizeof(struct
>>>>>> +pm4_mes_query_status));
>>>>>>
>>>>>>          packet->bitfields2.context_id = 0;
>>>>>>          packet->bitfields2.interrupt_sel =  @@ -555,22 +482,22 @@ int
>>>>>> pm_send_unmap_queue(struct packet_manager *pm, enum
>>>>>
>>>>> kfd_queue_type
>>>>>>
>>>>>> type,
>>>>>>   {
>>>>>>          int retval;
>>>>>>          uint32_t *buffer;
>>>>>> -       struct pm4_unmap_queues *packet;
>>>>>> +       struct pm4_mes_unmap_queues *packet;
>>>>>>
>>>>>>          mutex_lock(&pm->lock);
>>>>>>          retval = pm->priv_queue->ops.acquire_packet_buffer(
>>>>>>                          pm->priv_queue,
>>>>>> -                       sizeof(struct pm4_unmap_queues) /
>>>>>> sizeof(uint32_t),
>>>>>> +                       sizeof(struct pm4_mes_unmap_queues) /
>>>>>> +sizeof(uint32_t),
>>>>>>                          &buffer);
>>>>>>          if (retval)
>>>>>>                  goto err_acquire_packet_buffer;
>>>>>>
>>>>>> -       packet = (struct pm4_unmap_queues *)buffer;
>>>>>> -       memset(buffer, 0, sizeof(struct pm4_unmap_queues));
>>>>>> +       packet = (struct pm4_mes_unmap_queues *)buffer;
>>>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_unmap_queues));
>>>>>>          pr_debug("static_queue: unmapping queues: mode is %d , reset
>>>>>> is %d , type is %d\n",
>>>>>>                  mode, reset, type);
>>>>>> -       packet->header.u32all = build_pm4_header(IT_UNMAP_QUEUES,
>>>>>> -                                       sizeof(struct
>>>>>> pm4_unmap_queues));
>>>>>> +       packet->header.u32All = build_pm4_header(IT_UNMAP_QUEUES,
>>>>>> +                                       sizeof(struct
>>>>>> +pm4_mes_unmap_queues));
>>>>>>          switch (type) {
>>>>>>          case KFD_QUEUE_TYPE_COMPUTE:
>>>>>>          case KFD_QUEUE_TYPE_DIQ:
>>>>>> @@ -608,12 +535,12 @@ int pm_send_unmap_queue(struct
>>>>>
>>>>> packet_manager
>>>>>>
>>>>>> *pm, enum kfd_queue_type type,
>>>>>>                  break;
>>>>>>          case KFD_PREEMPT_TYPE_FILTER_ALL_QUEUES:
>>>>>>                  packet->bitfields2.queue_sel =
>>>>>> -
>>>>>> queue_sel__mes_unmap_queues__perform_request_on_all_active_queue
>>>>>
>>>>> s;
>>>>>>
>>>>>> +
>>>>>> +queue_sel__mes_unmap_queues__unmap_all_queues;
>>>>>>                  break;
>>>>>>          case KFD_PREEMPT_TYPE_FILTER_DYNAMIC_QUEUES:
>>>>>>                  /* in this case, we do not preempt static queues */
>>>>>>                  packet->bitfields2.queue_sel =
>>>>>> -
>>>>>> queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues
>>>>>
>>>>> _only;
>>>>>>
>>>>>> +
>>>>>> +queue_sel__mes_unmap_queues__unmap_all_non_static_queues;
>>>>>>                  break;
>>>>>>          default:
>>>>>>                  WARN(1, "filter %d", mode);  diff --git
>>>>>> a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>>>> index 97e5442..e50f73d 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>>>> @@ -41,99 +41,6 @@ union PM4_MES_TYPE_3_HEADER {
>>>>>>   };
>>>>>>   #endif /* PM4_MES_HEADER_DEFINED */
>>>>>>
>>>>>> -/* --------------------MES_SET_RESOURCES-------------------- */
>>>>>> -
>>>>>> -#ifndef PM4_MES_SET_RESOURCES_DEFINED -#define
>>>>>> PM4_MES_SET_RESOURCES_DEFINED -enum
>>>>>
>>>>> set_resources_queue_type_enum {
>>>>>>
>>>>>> -       queue_type__mes_set_resources__kernel_interface_queue_kiq = 0,
>>>>>> -       queue_type__mes_set_resources__hsa_interface_queue_hiq = 1,
>>>>>> -       queue_type__mes_set_resources__hsa_debug_interface_queue = 4
>>>>>> -};
>>>>>> -
>>>>>> -struct pm4_set_resources {
>>>>>> -       union {
>>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>>> -               uint32_t ordinal1;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t vmid_mask:16;
>>>>>> -                       uint32_t unmap_latency:8;
>>>>>> -                       uint32_t reserved1:5;
>>>>>> -                       enum set_resources_queue_type_enum
>>>>>> queue_type:3;
>>>>>> -               } bitfields2;
>>>>>> -               uint32_t ordinal2;
>>>>>> -       };
>>>>>> -
>>>>>> -       uint32_t queue_mask_lo;
>>>>>> -       uint32_t queue_mask_hi;
>>>>>> -       uint32_t gws_mask_lo;
>>>>>> -       uint32_t gws_mask_hi;
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t oac_mask:16;
>>>>>> -                       uint32_t reserved2:16;
>>>>>> -               } bitfields7;
>>>>>> -               uint32_t ordinal7;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t gds_heap_base:6;
>>>>>> -                       uint32_t reserved3:5;
>>>>>> -                       uint32_t gds_heap_size:6;
>>>>>> -                       uint32_t reserved4:15;
>>>>>> -               } bitfields8;
>>>>>> -               uint32_t ordinal8;
>>>>>> -       };
>>>>>> -
>>>>>> -};
>>>>>> -#endif
>>>>>> -
>>>>>> -/*--------------------MES_RUN_LIST-------------------- */
>>>>>> -
>>>>>> -#ifndef PM4_MES_RUN_LIST_DEFINED
>>>>>> -#define PM4_MES_RUN_LIST_DEFINED
>>>>>> -
>>>>>> -struct pm4_runlist {
>>>>>> -       union {
>>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>>> -               uint32_t ordinal1;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t reserved1:2;
>>>>>> -                       uint32_t ib_base_lo:30;
>>>>>> -               } bitfields2;
>>>>>> -               uint32_t ordinal2;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t ib_base_hi:16;
>>>>>> -                       uint32_t reserved2:16;
>>>>>> -               } bitfields3;
>>>>>> -               uint32_t ordinal3;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t ib_size:20;
>>>>>> -                       uint32_t chain:1;
>>>>>> -                       uint32_t offload_polling:1;
>>>>>> -                       uint32_t reserved3:1;
>>>>>> -                       uint32_t valid:1;
>>>>>> -                       uint32_t reserved4:8;
>>>>>> -               } bitfields4;
>>>>>> -               uint32_t ordinal4;
>>>>>> -       };
>>>>>> -
>>>>>> -};
>>>>>> -#endif
>>>>>>
>>>>>>   /*--------------------MES_MAP_PROCESS-------------------- */
>>>>>>
>>>>>> @@ -186,217 +93,58 @@ struct pm4_map_process {
>>>>>>   };
>>>>>>   #endif
>>>>>>
>>>>>> -/*--------------------MES_MAP_QUEUES--------------------*/
>>>>>> -
>>>>>> -#ifndef PM4_MES_MAP_QUEUES_DEFINED
>>>>>> -#define PM4_MES_MAP_QUEUES_DEFINED
>>>>>> -enum map_queues_queue_sel_enum {
>>>>>> -       queue_sel__mes_map_queues__map_to_specified_queue_slots = 0,
>>>>>> -
>>>>>
>>>>>        queue_sel__mes_map_queues__map_to_hws_determined_queue_slots
>>>>> =
>>>>>>
>>>>>> 1,
>>>>>> -       queue_sel__mes_map_queues__enable_process_queues = 2 -};
>>>>>> +#ifndef PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
>>>>>> +#define PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
>>>>>>
>>>>>> -enum map_queues_vidmem_enum {
>>>>>> -       vidmem__mes_map_queues__uses_no_video_memory = 0,
>>>>>> -       vidmem__mes_map_queues__uses_video_memory = 1 -};
>>>>>> -
>>>>>> -enum map_queues_alloc_format_enum {
>>>>>> -       alloc_format__mes_map_queues__one_per_pipe = 0,
>>>>>> -       alloc_format__mes_map_queues__all_on_one_pipe = 1 -};
>>>>>> -
>>>>>> -enum map_queues_engine_sel_enum {
>>>>>> -       engine_sel__mes_map_queues__compute = 0,
>>>>>> -       engine_sel__mes_map_queues__sdma0 = 2,
>>>>>> -       engine_sel__mes_map_queues__sdma1 = 3 -};
>>>>>> -
>>>>>> -struct pm4_map_queues {
>>>>>> +struct pm4_map_process_scratch_kv {
>>>>>>          union {
>>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>>> -               uint32_t ordinal1;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t reserved1:4;
>>>>>> -                       enum map_queues_queue_sel_enum queue_sel:2;
>>>>>> -                       uint32_t reserved2:2;
>>>>>> -                       uint32_t vmid:4;
>>>>>> -                       uint32_t reserved3:4;
>>>>>> -                       enum map_queues_vidmem_enum vidmem:2;
>>>>>> -                       uint32_t reserved4:6;
>>>>>> -                       enum map_queues_alloc_format_enum
>>>>>> alloc_format:2;
>>>>>> -                       enum map_queues_engine_sel_enum engine_sel:3;
>>>>>> -                       uint32_t num_queues:3;
>>>>>> -               } bitfields2;
>>>>>> -               uint32_t ordinal2;
>>>>>> -       };
>>>>>> -
>>>>>> -       struct {
>>>>>> -               union {
>>>>>> -                       struct {
>>>>>> -                               uint32_t is_static:1;
>>>>>> -                               uint32_t reserved5:1;
>>>>>> -                               uint32_t doorbell_offset:21;
>>>>>> -                               uint32_t reserved6:3;
>>>>>> -                               uint32_t queue:6;
>>>>>> -                       } bitfields3;
>>>>>> -                       uint32_t ordinal3;
>>>>>> -               };
>>>>>> -
>>>>>> -               uint32_t mqd_addr_lo;
>>>>>> -               uint32_t mqd_addr_hi;
>>>>>> -               uint32_t wptr_addr_lo;
>>>>>> -               uint32_t wptr_addr_hi;
>>>>>> -
>>>>>> -       } mes_map_queues_ordinals[1];   /* 1..N of these ordinal
>>>>>> groups */
>>>>>> -
>>>>>> -};
>>>>>> -#endif
>>>>>> -
>>>>>> -/*--------------------MES_QUERY_STATUS--------------------*/
>>>>>> -
>>>>>> -#ifndef PM4_MES_QUERY_STATUS_DEFINED
>>>>>> -#define PM4_MES_QUERY_STATUS_DEFINED
>>>>>> -enum query_status_interrupt_sel_enum {
>>>>>> -       interrupt_sel__mes_query_status__completion_status = 0,
>>>>>> -       interrupt_sel__mes_query_status__process_status = 1,
>>>>>> -       interrupt_sel__mes_query_status__queue_status = 2  -};
>>>>>> -
>>>>>> -enum query_status_command_enum {
>>>>>> -       command__mes_query_status__interrupt_only = 0,
>>>>>> -       command__mes_query_status__fence_only_immediate = 1,
>>>>>> -       command__mes_query_status__fence_only_after_write_ack = 2,
>>>>>> -
>>>>>> command__mes_query_status__fence_wait_for_write_ack_send_interrupt
>>>>>
>>>>> = 3
>>>>>>
>>>>>> -};
>>>>>> -
>>>>>> -enum query_status_engine_sel_enum {
>>>>>> -       engine_sel__mes_query_status__compute = 0,
>>>>>> -       engine_sel__mes_query_status__sdma0_queue = 2,
>>>>>> -       engine_sel__mes_query_status__sdma1_queue = 3  -};
>>>>>> -
>>>>>> -struct pm4_query_status {
>>>>>> -       union {
>>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>>> -               uint32_t ordinal1;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t context_id:28;
>>>>>> -                       enum query_status_interrupt_sel_enum
>>>>>> interrupt_sel:2;
>>>>>> -                       enum query_status_command_enum command:2;
>>>>>> -               } bitfields2;
>>>>>> -               uint32_t ordinal2;
>>>>>> +               union PM4_MES_TYPE_3_HEADER   header; /* header */
>>>>>> +               uint32_t            ordinal1;
>>>>>>          };
>>>>>>
>>>>>>          union {
>>>>>>                  struct {
>>>>>>                          uint32_t pasid:16;
>>>>>> -                       uint32_t reserved1:16;
>>>>>> -               } bitfields3a;
>>>>>> -               struct {
>>>>>> -                       uint32_t reserved2:2;
>>>>>> -                       uint32_t doorbell_offset:21;
>>>>>> -                       uint32_t reserved3:3;
>>>>>> -                       enum query_status_engine_sel_enum
>>>>>> engine_sel:3;
>>>>>> -                       uint32_t reserved4:3;
>>>>>> -               } bitfields3b;
>>>>>> -               uint32_t ordinal3;
>>>>>> -       };
>>>>>> -
>>>>>> -       uint32_t addr_lo;
>>>>>> -       uint32_t addr_hi;
>>>>>> -       uint32_t data_lo;
>>>>>> -       uint32_t data_hi;
>>>>>> -};
>>>>>> -#endif
>>>>>> -
>>>>>> -/*--------------------MES_UNMAP_QUEUES--------------------*/
>>>>>> -
>>>>>> -#ifndef PM4_MES_UNMAP_QUEUES_DEFINED
>>>>>> -#define PM4_MES_UNMAP_QUEUES_DEFINED
>>>>>> -enum unmap_queues_action_enum {
>>>>>> -       action__mes_unmap_queues__preempt_queues = 0,
>>>>>> -       action__mes_unmap_queues__reset_queues = 1,
>>>>>> -       action__mes_unmap_queues__disable_process_queues = 2  -};
>>>>>> -
>>>>>> -enum unmap_queues_queue_sel_enum {
>>>>>> -
>>>>>> queue_sel__mes_unmap_queues__perform_request_on_specified_queues
>>>>>
>>>>> = 0,
>>>>>>
>>>>>> -
>>>>>
>>>>>        queue_sel__mes_unmap_queues__perform_request_on_pasid_queues =
>>>>>>
>>>>>> 1,
>>>>>> -
>>>>>> queue_sel__mes_unmap_queues__perform_request_on_all_active_queue
>>>>>
>>>>> s = 2,
>>>>>>
>>>>>> -
>>>>>> queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues
>>>>>
>>>>> _only = 3
>>>>>>
>>>>>> -};
>>>>>> -
>>>>>> -enum unmap_queues_engine_sel_enum {
>>>>>> -       engine_sel__mes_unmap_queues__compute = 0,
>>>>>> -       engine_sel__mes_unmap_queues__sdma0 = 2,
>>>>>> -       engine_sel__mes_unmap_queues__sdma1 = 3  -};
>>>>>> -
>>>>>> -struct pm4_unmap_queues {
>>>>>> -       union {
>>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>>> -               uint32_t ordinal1;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       enum unmap_queues_action_enum action:2;
>>>>>> -                       uint32_t reserved1:2;
>>>>>> -                       enum unmap_queues_queue_sel_enum queue_sel:2;
>>>>>> -                       uint32_t reserved2:20;
>>>>>> -                       enum unmap_queues_engine_sel_enum
>>>>>> engine_sel:3;
>>>>>> -                       uint32_t num_queues:3;
>>>>>> +                       uint32_t reserved1:8;
>>>>>> +                       uint32_t diq_enable:1;
>>>>>> +                       uint32_t process_quantum:7;
>>>>>>                  } bitfields2;
>>>>>>                  uint32_t ordinal2;
>>>>>>          };
>>>>>>
>>>>>>          union {
>>>>>>                  struct {
>>>>>> -                       uint32_t pasid:16;
>>>>>> -                       uint32_t reserved3:16;
>>>>>> -               } bitfields3a;
>>>>>> -               struct {
>>>>>> -                       uint32_t reserved4:2;
>>>>>> -                       uint32_t doorbell_offset0:21;
>>>>>> -                       uint32_t reserved5:9;
>>>>>> -               } bitfields3b;
>>>>>> +                       uint32_t page_table_base:28;
>>>>>> +                       uint32_t reserved2:4;
>>>>>> +               } bitfields3;
>>>>>>                  uint32_t ordinal3;
>>>>>>          };
>>>>>>
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t reserved6:2;
>>>>>> -                       uint32_t doorbell_offset1:21;
>>>>>> -                       uint32_t reserved7:9;
>>>>>> -               } bitfields4;
>>>>>> -               uint32_t ordinal4;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t reserved8:2;
>>>>>> -                       uint32_t doorbell_offset2:21;
>>>>>> -                       uint32_t reserved9:9;
>>>>>> -               } bitfields5;
>>>>>> -               uint32_t ordinal5;
>>>>>> -       };
>>>>>> +       uint32_t reserved3;
>>>>>> +       uint32_t sh_mem_bases;
>>>>>> +       uint32_t sh_mem_config;
>>>>>> +       uint32_t sh_mem_ape1_base;
>>>>>> +       uint32_t sh_mem_ape1_limit;
>>>>>> +       uint32_t sh_hidden_private_base_vmid;
>>>>>> +       uint32_t reserved4;
>>>>>> +       uint32_t reserved5;
>>>>>> +       uint32_t gds_addr_lo;
>>>>>> +       uint32_t gds_addr_hi;
>>>>>>
>>>>>>          union {
>>>>>>                  struct {
>>>>>> -                       uint32_t reserved10:2;
>>>>>> -                       uint32_t doorbell_offset3:21;
>>>>>> -                       uint32_t reserved11:9;
>>>>>> -               } bitfields6;
>>>>>> -               uint32_t ordinal6;
>>>>>> +                       uint32_t num_gws:6;
>>>>>> +                       uint32_t reserved6:2;
>>>>>> +                       uint32_t num_oac:4;
>>>>>> +                       uint32_t reserved7:4;
>>>>>> +                       uint32_t gds_size:6;
>>>>>> +                       uint32_t num_queues:10;
>>>>>> +               } bitfields14;
>>>>>> +               uint32_t ordinal14;
>>>>>>          };
>>>>>>
>>>>>> +       uint32_t completion_signal_lo32;
>>>>>> +       uint32_t completion_signal_hi32;
>>>>>>   };
>>>>>>   #endif
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>>>> index c4eda6f..7c8d9b3 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>>>> @@ -126,9 +126,10 @@ struct pm4_mes_runlist {
>>>>>>                          uint32_t ib_size:20;
>>>>>>                          uint32_t chain:1;
>>>>>>                          uint32_t offload_polling:1;
>>>>>> -                       uint32_t reserved3:1;
>>>>>> +                       uint32_t reserved2:1;
>>>>>>                          uint32_t valid:1;
>>>>>> -                       uint32_t reserved4:8;
>>>>>> +                       uint32_t process_cnt:4;
>>>>>> +                       uint32_t reserved3:4;
>>>>>>                  } bitfields4;
>>>>>>                  uint32_t ordinal4;
>>>>>>          };
>>>>>> @@ -143,8 +144,8 @@ struct pm4_mes_runlist {
>>>>>>
>>>>>>   struct pm4_mes_map_process {
>>>>>>          union {
>>>>>> -               union PM4_MES_TYPE_3_HEADER   header;            /* header */
>>>>>> -               uint32_t            ordinal1;
>>>>>> +               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>>> +               uint32_t ordinal1;
>>>>>>          };
>>>>>>
>>>>>>          union {
>>>>>> @@ -155,36 +156,48 @@ struct pm4_mes_map_process {
>>>>>>                          uint32_t process_quantum:7;
>>>>>>                  } bitfields2;
>>>>>>                  uint32_t ordinal2;
>>>>>> -};
>>>>>> +       };
>>>>>>
>>>>>>          union {
>>>>>>                  struct {
>>>>>>                          uint32_t page_table_base:28;
>>>>>> -                       uint32_t reserved2:4;
>>>>>> +                       uint32_t reserved3:4;
>>>>>>                  } bitfields3;
>>>>>>                  uint32_t ordinal3;
>>>>>>          };
>>>>>>
>>>>>> +       uint32_t reserved;
>>>>>> +
>>>>>>          uint32_t sh_mem_bases;
>>>>>> +       uint32_t sh_mem_config;
>>>>>>          uint32_t sh_mem_ape1_base;
>>>>>>          uint32_t sh_mem_ape1_limit;
>>>>>> -       uint32_t sh_mem_config;
>>>>>> +
>>>>>> +       uint32_t sh_hidden_private_base_vmid;
>>>>>> +
>>>>>> +       uint32_t reserved2;
>>>>>> +       uint32_t reserved3;
>>>>>> +
>>>>>>          uint32_t gds_addr_lo;
>>>>>>          uint32_t gds_addr_hi;
>>>>>>
>>>>>>          union {
>>>>>>                  struct {
>>>>>>                          uint32_t num_gws:6;
>>>>>> -                       uint32_t reserved3:2;
>>>>>> +                       uint32_t reserved4:2;
>>>>>>                          uint32_t num_oac:4;
>>>>>> -                       uint32_t reserved4:4;
>>>>>> +                       uint32_t reserved5:4;
>>>>>>                          uint32_t gds_size:6;
>>>>>>                          uint32_t num_queues:10;
>>>>>>                  } bitfields10;
>>>>>>                  uint32_t ordinal10;
>>>>>>          };
>>>>>>
>>>>>> +       uint32_t completion_signal_lo;
>>>>>> +       uint32_t completion_signal_hi;
>>>>>> +
>>>>>>   };
>>>>>> +
>>>>>>   #endif
>>>>>>
>>>>>>   /*--------------------MES_MAP_QUEUES--------------------*/
>>>>>> @@ -337,7 +350,7 @@ enum mes_unmap_queues_engine_sel_enum {
>>>>>>          engine_sel__mes_unmap_queues__sdmal = 3
>>>>>>   };
>>>>>>
>>>>>> -struct PM4_MES_UNMAP_QUEUES {
>>>>>> +struct pm4_mes_unmap_queues {
>>>>>>          union {
>>>>>>                  union PM4_MES_TYPE_3_HEADER   header;            /* header */
>>>>>>                  uint32_t            ordinal1;
>>>>>> @@ -397,4 +410,101 @@ struct PM4_MES_UNMAP_QUEUES {
>>>>>>   };
>>>>>>   #endif
>>>>>>
>>>>>> +#ifndef PM4_MEC_RELEASE_MEM_DEFINED
>>>>>> +#define PM4_MEC_RELEASE_MEM_DEFINED
>>>>>> +enum RELEASE_MEM_event_index_enum {
>>>>>> +       event_index___release_mem__end_of_pipe = 5,
>>>>>> +       event_index___release_mem__shader_done = 6 };
>>>>>> +
>>>>>> +enum RELEASE_MEM_cache_policy_enum {
>>>>>> +       cache_policy___release_mem__lru = 0,
>>>>>> +       cache_policy___release_mem__stream = 1,
>>>>>> +       cache_policy___release_mem__bypass = 2 };
>>>>>> +
>>>>>> +enum RELEASE_MEM_dst_sel_enum {
>>>>>> +       dst_sel___release_mem__memory_controller = 0,
>>>>>> +       dst_sel___release_mem__tc_l2 = 1,
>>>>>> +       dst_sel___release_mem__queue_write_pointer_register = 2,
>>>>>> +       dst_sel___release_mem__queue_write_pointer_poll_mask_bit = 3
>>>>>> +};
>>>>>> +
>>>>>> +enum RELEASE_MEM_int_sel_enum {
>>>>>> +       int_sel___release_mem__none = 0,
>>>>>> +       int_sel___release_mem__send_interrupt_only = 1,
>>>>>> +       int_sel___release_mem__send_interrupt_after_write_confirm = 2,
>>>>>> +       int_sel___release_mem__send_data_after_write_confirm = 3 };
>>>>>> +
>>>>>> +enum RELEASE_MEM_data_sel_enum {
>>>>>> +       data_sel___release_mem__none = 0,
>>>>>> +       data_sel___release_mem__send_32_bit_low = 1,
>>>>>> +       data_sel___release_mem__send_64_bit_data = 2,
>>>>>> +       data_sel___release_mem__send_gpu_clock_counter = 3,
>>>>>> +       data_sel___release_mem__send_cp_perfcounter_hi_lo = 4,
>>>>>> +       data_sel___release_mem__store_gds_data_to_memory = 5 };
>>>>>> +
>>>>>> +struct pm4_mec_release_mem {
>>>>>> +       union {
>>>>>> +               union PM4_MES_TYPE_3_HEADER header;     /*header */
>>>>>> +               unsigned int ordinal1;
>>>>>> +       };
>>>>>> +
>>>>>> +       union {
>>>>>> +               struct {
>>>>>> +                       unsigned int event_type:6;
>>>>>> +                       unsigned int reserved1:2;
>>>>>> +                       enum RELEASE_MEM_event_index_enum event_index:4;
>>>>>> +                       unsigned int tcl1_vol_action_ena:1;
>>>>>> +                       unsigned int tc_vol_action_ena:1;
>>>>>> +                       unsigned int reserved2:1;
>>>>>> +                       unsigned int tc_wb_action_ena:1;
>>>>>> +                       unsigned int tcl1_action_ena:1;
>>>>>> +                       unsigned int tc_action_ena:1;
>>>>>> +                       unsigned int reserved3:6;
>>>>>> +                       unsigned int atc:1;
>>>>>> +                       enum RELEASE_MEM_cache_policy_enum cache_policy:2;
>>>>>> +                       unsigned int reserved4:5;
>>>>>> +               } bitfields2;
>>>>>> +               unsigned int ordinal2;
>>>>>> +       };
>>>>>> +
>>>>>> +       union {
>>>>>> +               struct {
>>>>>> +                       unsigned int reserved5:16;
>>>>>> +                       enum RELEASE_MEM_dst_sel_enum dst_sel:2;
>>>>>> +                       unsigned int reserved6:6;
>>>>>> +                       enum RELEASE_MEM_int_sel_enum int_sel:3;
>>>>>> +                       unsigned int reserved7:2;
>>>>>> +                       enum RELEASE_MEM_data_sel_enum data_sel:3;
>>>>>> +               } bitfields3;
>>>>>> +               unsigned int ordinal3;
>>>>>> +       };
>>>>>> +
>>>>>> +       union {
>>>>>> +               struct {
>>>>>> +                       unsigned int reserved8:2;
>>>>>> +                       unsigned int address_lo_32b:30;
>>>>>> +               } bitfields4;
>>>>>> +               struct {
>>>>>> +                       unsigned int reserved9:3;
>>>>>> +                       unsigned int address_lo_64b:29;
>>>>>> +               } bitfields5;
>>>>>> +               unsigned int ordinal4;
>>>>>> +       };
>>>>>> +
>>>>>> +       unsigned int address_hi;
>>>>>> +
>>>>>> +       unsigned int data_lo;
>>>>>> +
>>>>>> +       unsigned int data_hi;
>>>>>> +};
>>>>>> +#endif
>>>>>> +
>>>>>> +enum {
>>>>>> +       CACHE_FLUSH_AND_INV_TS_EVENT = 0x00000014 };
>>>>>> +
>>>>>>   #endif
>>>>>> --
>>>>>> 2.7.4
>>>>>>
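On the "/* TODO: scratch support */" left in pm_create_map_process() above:
the new sh_hidden_private_base_vmid dword is presumably where the per-process
scratch aperture base will eventually be programmed. A rough sketch of that
follow-up, where qpd->sh_hidden_private_base is an assumed member that this
series does not add:

    /* Hypothetical follow-up to the TODO in pm_create_map_process(). */
    #include "kfd_priv.h"
    #include "kfd_pm4_headers_vi.h"

    static void pm_map_process_set_scratch(struct pm4_mes_map_process *packet,
    				       struct qcm_process_device *qpd)
    {
    	/* qpd->sh_hidden_private_base is assumed, not part of this patch */
    	packet->sh_hidden_private_base_vmid = qpd->sh_hidden_private_base;
    }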
>>>>> _______________________________________________
>>>>> amd-gfx mailing list
>>>>> amd-gfx@lists.freedesktop.org
>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* RE: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
       [not found]                         ` <6137eb72-cb41-d65a-7863-71adf31a3506-5C7GfCeVMHo@public.gmane.org>
  2017-08-16  7:37                           ` Christian König
@ 2017-08-16 16:10                           ` Deucher, Alexander
       [not found]                             ` <BN6PR12MB16521615E13EE720D7D5FE51F7820-/b2+HYfkarQqUD6E6FAiowdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  1 sibling, 1 reply; 70+ messages in thread
From: Deucher, Alexander @ 2017-08-16 16:10 UTC (permalink / raw)
  To: Kuehling, Felix, Oded Gabbay, Bridgman, John; +Cc: amd-gfx list

> -----Original Message-----
> From: Kuehling, Felix
> Sent: Tuesday, August 15, 2017 9:20 PM
> To: Oded Gabbay; Bridgman, John; Deucher, Alexander
> Cc: amd-gfx list
> Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
> 
> Hi Alex,
> 
> How does firmware get published for the upstream driver? Where can I
> check the currently published version of both CZ and KV firmware for
> upstream?
> 
> Do you publish firmware updates at the same time as patches that depend
> on them?

I submit patches to the linux-firmware tree periodically.  Just let me know what firmwares you want to update and I can submit patches.

Alex

> 
> Thanks,
>   Felix
> 
> 
> On 2017-08-13 04:49 AM, Oded Gabbay wrote:
> > On Sat, Aug 12, 2017 at 10:09 PM, Bridgman, John
> <John.Bridgman@amd.com> wrote:
> >> IIRC the amdgpu devs had been holding back on publishing the updated
> >> MEC microcode (with scratch support) because that WOULD have broken
> >> Kaveri. With this change from Felix we should be able to publish the
> >> newest microcode for both amdgpu and amdkfd WITHOUT breaking Kaveri.
> >>
> >> IOW this is the "scratch fix for Kaveri KFD" you have wanted for a
> >> couple of years :)
> > ah, ok.
> >
> > In that case, this patch is:
> > Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
> >
> >
> >>> -----Original Message-----
> >>> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On
> Behalf
> >>> Of Kuehling, Felix
> >>> Sent: Saturday, August 12, 2017 2:16 PM
> >>> To: Oded Gabbay
> >>> Cc: amd-gfx list
> >>> Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
> >>>
> >>>> Do you mean that it won't work with Kaveri anymore ?
> >>> Kaveri got the same firmware changes, mostly for scratch memory
> >>> support.
> >>> The Kaveri firmware headers name the structures and fields a bit
> >>> differently, but they should be binary compatible. So we simplified the
> >>> code to use only one set of headers. I'll grab a Kaveri system to confirm
> >>> that it works.
> >>>
> >>> Regards,
> >>>  Felix
> >>>
> >>> From: Oded Gabbay <oded.gabbay@gmail.com>
> >>> Sent: Saturday, August 12, 2017 11:10 AM
> >>> To: Kuehling, Felix
> >>> Cc: amd-gfx list
> >>> Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
> >>>
> >>> On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling
> <Felix.Kuehling@amd.com>
> >>> wrote:
> >>>> To match current firmware. The map process packet has been
> extended to
> >>>> support scratch. This is a non-backwards compatible change and it's
> >>>> about two years old. So no point keeping the old version around
> >>>> conditionally.
> >>> Do you mean that it won't work with Kaveri anymore ?
> >>> I believe we aren't allowed to break older H/W support without some
> >>> serious justification.
> >>>
> >>> Oded
> >>>
> >>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> >>>> ---
> >>>>  drivers/gpu/drm/amd/amdkfd/kfd_device.c         |   8 +-
> >>>>  drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 161 ++++-----
> ---
> >>>>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h    | 314
> >>>> +++---------------------
> >>>>  drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h | 130
> +++++++++-
> >>>>  4 files changed, 199 insertions(+), 414 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> >>>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> >>>> index e1c2ad2..e790e7f 100644
> >>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> >>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> >>>> @@ -26,7 +26,7 @@
> >>>>  #include <linux/slab.h>
> >>>>  #include "kfd_priv.h"
> >>>>  #include "kfd_device_queue_manager.h"
> >>>> -#include "kfd_pm4_headers.h"
> >>>> +#include "kfd_pm4_headers_vi.h"
> >>>>
> >>>>  #define MQD_SIZE_ALIGNED 768
> >>>>
> >>>> @@ -238,9 +238,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
> >>>>          * calculate max size of runlist packet.
> >>>>          * There can be only 2 packets at once
> >>>>          */
> >>>> -       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct
> >>>> pm4_map_process) +
> >>>> -               max_num_of_queues_per_device *
> >>>> -               sizeof(struct pm4_map_queues) + sizeof(struct
> >>>> pm4_runlist)) * 2;
> >>>> +       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct
> >>>> +pm4_mes_map_process) +
> >>>> +               max_num_of_queues_per_device * sizeof(struct
> >>>> +pm4_mes_map_queues)
> >>>> +               + sizeof(struct pm4_mes_runlist)) * 2;
> >>>>
> >>>>         /* Add size of HIQ & DIQ */
> >>>>         size += KFD_KERNEL_QUEUE_SIZE * 2;  diff --git
> >>>> a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> >>>> b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> >>>> index 77a6f2b..3141e05 100644
> >>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> >>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
> >>>> @@ -26,7 +26,6 @@
> >>>>  #include "kfd_device_queue_manager.h"
> >>>>  #include "kfd_kernel_queue.h"
> >>>>  #include "kfd_priv.h"
> >>>> -#include "kfd_pm4_headers.h"
> >>>>  #include "kfd_pm4_headers_vi.h"
> >>>>  #include "kfd_pm4_opcodes.h"
> >>>>
> >>>> @@ -44,12 +43,12 @@ static unsigned int build_pm4_header(unsigned
> int
> >>>> opcode, size_t packet_size)
> >>>>  {
> >>>>         union PM4_MES_TYPE_3_HEADER header;
> >>>>
> >>>> -       header.u32all = 0;
> >>>> +       header.u32All = 0;
> >>>>         header.opcode = opcode;
> >>>>         header.count = packet_size/sizeof(uint32_t) - 2;
> >>>>         header.type = PM4_TYPE_3;
> >>>>
> >>>> -       return header.u32all;
> >>>> +       return header.u32All;
> >>>>  }
> >>>>
> >>>>  static void pm_calc_rlib_size(struct packet_manager *pm,  @@ -69,12
> >>>> +68,9 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
> >>>>                 pr_debug("Over subscribed runlist\n");
> >>>>         }
> >>>>
> >>>> -       map_queue_size =
> >>>> -               (pm->dqm->dev->device_info->asic_family == CHIP_CARRIZO)
> ?
> >>>> -               sizeof(struct pm4_mes_map_queues) :
> >>>> -               sizeof(struct pm4_map_queues);
> >>>> +       map_queue_size = sizeof(struct pm4_mes_map_queues);
> >>>>         /* calculate run list ib allocation size */
> >>>> -       *rlib_size = process_count * sizeof(struct pm4_map_process) +
> >>>> +       *rlib_size = process_count * sizeof(struct
> >>>> +pm4_mes_map_process) +
> >>>>                      queue_count * map_queue_size;
> >>>>
> >>>>         /*
> >>>> @@ -82,7 +78,7 @@ static void pm_calc_rlib_size(struct
> packet_manager
> >>>> *pm,
> >>>>          * when over subscription
> >>>>          */
> >>>>         if (*over_subscription)
> >>>> -               *rlib_size += sizeof(struct pm4_runlist);
> >>>> +               *rlib_size += sizeof(struct pm4_mes_runlist);
> >>>>
> >>>>         pr_debug("runlist ib size %d\n", *rlib_size);
> >>>>  }
> >>>> @@ -119,16 +115,16 @@ static int pm_allocate_runlist_ib(struct
> >>>> packet_manager *pm,
> >>>>  static int pm_create_runlist(struct packet_manager *pm, uint32_t
> >>>> *buffer,
> >>>>                         uint64_t ib, size_t ib_size_in_dwords, bool
> >>>> chain)
> >>>>  {
> >>>> -       struct pm4_runlist *packet;
> >>>> +       struct pm4_mes_runlist *packet;
> >>>>
> >>>>         if (WARN_ON(!ib))
> >>>>                 return -EFAULT;
> >>>>
> >>>> -       packet = (struct pm4_runlist *)buffer;
> >>>> +       packet = (struct pm4_mes_runlist *)buffer;
> >>>>
> >>>> -       memset(buffer, 0, sizeof(struct pm4_runlist));
> >>>> -       packet->header.u32all = build_pm4_header(IT_RUN_LIST,
> >>>> -                                               sizeof(struct
> >>>> pm4_runlist));
> >>>> +       memset(buffer, 0, sizeof(struct pm4_mes_runlist));
> >>>> +       packet->header.u32All = build_pm4_header(IT_RUN_LIST,
> >>>> +                                               sizeof(struct
> >>>> +pm4_mes_runlist));
> >>>>
> >>>>         packet->bitfields4.ib_size = ib_size_in_dwords;
> >>>>         packet->bitfields4.chain = chain ? 1 : 0;  @@ -143,16 +139,16
> >>>> @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t
> >>>> *buffer,
> >>>>  static int pm_create_map_process(struct packet_manager *pm,
> uint32_t
> >>>> *buffer,
> >>>>                                 struct qcm_process_device *qpd)
> >>>>  {
> >>>> -       struct pm4_map_process *packet;
> >>>> +       struct pm4_mes_map_process *packet;
> >>>>         struct queue *cur;
> >>>>         uint32_t num_queues;
> >>>>
> >>>> -       packet = (struct pm4_map_process *)buffer;
> >>>> +       packet = (struct pm4_mes_map_process *)buffer;
> >>>>
> >>>> -       memset(buffer, 0, sizeof(struct pm4_map_process));
> >>>> +       memset(buffer, 0, sizeof(struct pm4_mes_map_process));
> >>>>
> >>>> -       packet->header.u32all = build_pm4_header(IT_MAP_PROCESS,
> >>>> -                                       sizeof(struct
> >>>> pm4_map_process));
> >>>> +       packet->header.u32All = build_pm4_header(IT_MAP_PROCESS,
> >>>> +                                       sizeof(struct
> >>>> +pm4_mes_map_process));
> >>>>         packet->bitfields2.diq_enable = (qpd->is_debug) ? 1 : 0;
> >>>>         packet->bitfields2.process_quantum = 1;
> >>>>         packet->bitfields2.pasid = qpd->pqm->process->pasid;  @@
> >>>> -170,23 +166,26 @@ static int pm_create_map_process(struct
> >>>> packet_manager *pm, uint32_t *buffer,
> >>>>         packet->sh_mem_ape1_base = qpd->sh_mem_ape1_base;
> >>>>         packet->sh_mem_ape1_limit = qpd->sh_mem_ape1_limit;
> >>>>
> >>>> +       /* TODO: scratch support */
> >>>> +       packet->sh_hidden_private_base_vmid = 0;
> >>>> +
> >>>>         packet->gds_addr_lo = lower_32_bits(qpd->gds_context_area);
> >>>>         packet->gds_addr_hi = upper_32_bits(qpd->gds_context_area);
> >>>>
> >>>>         return 0;
> >>>>  }
> >>>>
> >>>> -static int pm_create_map_queue_vi(struct packet_manager *pm,
> uint32_t
> >>>> *buffer,
> >>>> +static int pm_create_map_queue(struct packet_manager *pm,
> uint32_t
> >>>> +*buffer,
> >>>>                 struct queue *q, bool is_static)
> >>>>  {
> >>>>         struct pm4_mes_map_queues *packet;
> >>>>         bool use_static = is_static;
> >>>>
> >>>>         packet = (struct pm4_mes_map_queues *)buffer;
> >>>> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
> >>>> +       memset(buffer, 0, sizeof(struct pm4_mes_map_queues));
> >>>>
> >>>> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
> >>>> -                                               sizeof(struct
> >>>> pm4_map_queues));
> >>>> +       packet->header.u32All = build_pm4_header(IT_MAP_QUEUES,
> >>>> +                                               sizeof(struct
> >>>> +pm4_mes_map_queues));
> >>>>         packet->bitfields2.alloc_format =
> >>>>                 alloc_format__mes_map_queues__one_per_pipe_vi;
> >>>>         packet->bitfields2.num_queues = 1;  @@ -235,64 +234,6 @@
> >>>> static int pm_create_map_queue_vi(struct packet_manager *pm,
> uint32_t
> >>>> *buffer,
> >>>>         return 0;
> >>>>  }
> >>>>
> >>>> -static int pm_create_map_queue(struct packet_manager *pm,
> uint32_t
> >>>> *buffer,
> >>>> -                               struct queue *q, bool is_static)  -{
> >>>> -       struct pm4_map_queues *packet;
> >>>> -       bool use_static = is_static;
> >>>> -
> >>>> -       packet = (struct pm4_map_queues *)buffer;
> >>>> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
> >>>> -
> >>>> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
> >>>> -                                               sizeof(struct
> >>>> pm4_map_queues));
> >>>> -       packet->bitfields2.alloc_format =
> >>>> -
> >>>> alloc_format__mes_map_queues__one_per_pipe;
> >>>> -       packet->bitfields2.num_queues = 1;
> >>>> -       packet->bitfields2.queue_sel =
> >>>> -
> >>>>
> queue_sel__mes_map_queues__map_to_hws_determined_queue_slots;
> >>>> -
> >>>> -       packet->bitfields2.vidmem = (q->properties.is_interop) ?
> >>>> -                       vidmem__mes_map_queues__uses_video_memory :
> >>>> -                       vidmem__mes_map_queues__uses_no_video_memory;
> >>>> -
> >>>> -       switch (q->properties.type) {
> >>>> -       case KFD_QUEUE_TYPE_COMPUTE:
> >>>> -       case KFD_QUEUE_TYPE_DIQ:
> >>>> -               packet->bitfields2.engine_sel =
> >>>> -                               engine_sel__mes_map_queues__compute;
> >>>> -               break;
> >>>> -       case KFD_QUEUE_TYPE_SDMA:
> >>>> -               packet->bitfields2.engine_sel =
> >>>> -                               engine_sel__mes_map_queues__sdma0;
> >>>> -               use_static = false; /* no static queues under SDMA */
> >>>> -               break;
> >>>> -       default:
> >>>> -               WARN(1, "queue type %d", q->properties.type);
> >>>> -               return -EINVAL;
> >>>> -       }
> >>>> -
> >>>> -       packet->mes_map_queues_ordinals[0].bitfields3.doorbell_offset
> >>>> =
> >>>> -                       q->properties.doorbell_off;
> >>>> -
> >>>> -       packet->mes_map_queues_ordinals[0].bitfields3.is_static =
> >>>> -                       (use_static) ? 1 : 0;
> >>>> -
> >>>> -       packet->mes_map_queues_ordinals[0].mqd_addr_lo =
> >>>> -                       lower_32_bits(q->gart_mqd_addr);
> >>>> -
> >>>> -       packet->mes_map_queues_ordinals[0].mqd_addr_hi =
> >>>> -                       upper_32_bits(q->gart_mqd_addr);
> >>>> -
> >>>> -       packet->mes_map_queues_ordinals[0].wptr_addr_lo =
> >>>> -
> >>>> lower_32_bits((uint64_t)q->properties.write_ptr);
> >>>> -
> >>>> -       packet->mes_map_queues_ordinals[0].wptr_addr_hi =
> >>>> -
> >>>> upper_32_bits((uint64_t)q->properties.write_ptr);
> >>>> -
> >>>> -       return 0;
> >>>> -}
> >>>> -
> >>>>  static int pm_create_runlist_ib(struct packet_manager *pm,
> >>>>                                 struct list_head *queues,
> >>>>                                 uint64_t *rl_gpu_addr,  @@ -334,7
> >>>> +275,7 @@ static int pm_create_runlist_ib(struct packet_manager
> *pm,
> >>>>                         return retval;
> >>>>
> >>>>                 proccesses_mapped++;
> >>>> -               inc_wptr(&rl_wptr, sizeof(struct pm4_map_process),
> >>>> +               inc_wptr(&rl_wptr, sizeof(struct pm4_mes_map_process),
> >>>>                                 alloc_size_bytes);
> >>>>
> >>>>                 list_for_each_entry(kq, &qpd->priv_queue_list, list) {
> >>>> @@ -344,14 +285,7 @@ static int pm_create_runlist_ib(struct
> >>>> packet_manager *pm,
> >>>>                         pr_debug("static_queue, mapping kernel q %d,
> >>>> is debug status %d\n",
> >>>>                                 kq->queue->queue, qpd->is_debug);
> >>>>
> >>>> -                       if (pm->dqm->dev->device_info->asic_family ==
> >>>> -                                       CHIP_CARRIZO)
> >>>> -                               retval = pm_create_map_queue_vi(pm,
> >>>> -                                               &rl_buffer[rl_wptr],
> >>>> -                                               kq->queue,
> >>>> -                                               qpd->is_debug);
> >>>> -                       else
> >>>> -                               retval = pm_create_map_queue(pm,
> >>>> +                       retval = pm_create_map_queue(pm,
> >>>>                                                 &rl_buffer[rl_wptr],
> >>>>                                                 kq->queue,
> >>>>                                                 qpd->is_debug);  @@
> >>>> -359,7 +293,7 @@ static int pm_create_runlist_ib(struct
> packet_manager
> >>>> *pm,
> >>>>                                 return retval;
> >>>>
> >>>>                         inc_wptr(&rl_wptr,
> >>>> -                               sizeof(struct pm4_map_queues),
> >>>> +                               sizeof(struct pm4_mes_map_queues),
> >>>>                                 alloc_size_bytes);
> >>>>                 }
> >>>>
> >>>> @@ -370,14 +304,7 @@ static int pm_create_runlist_ib(struct
> >>>> packet_manager *pm,
> >>>>                         pr_debug("static_queue, mapping user queue %d,
> >>>> is debug status %d\n",
> >>>>                                 q->queue, qpd->is_debug);
> >>>>
> >>>> -                       if (pm->dqm->dev->device_info->asic_family ==
> >>>> -                                       CHIP_CARRIZO)
> >>>> -                               retval = pm_create_map_queue_vi(pm,
> >>>> -                                               &rl_buffer[rl_wptr],
> >>>> -                                               q,
> >>>> -                                               qpd->is_debug);
> >>>> -                       else
> >>>> -                               retval = pm_create_map_queue(pm,
> >>>> +                       retval = pm_create_map_queue(pm,
> >>>>                                                 &rl_buffer[rl_wptr],
> >>>>                                                 q,
> >>>>                                                 qpd->is_debug);  @@
> >>>> -386,7 +313,7 @@ static int pm_create_runlist_ib(struct
> packet_manager
> >>>> *pm,
> >>>>                                 return retval;
> >>>>
> >>>>                         inc_wptr(&rl_wptr,
> >>>> -                               sizeof(struct pm4_map_queues),
> >>>> +                               sizeof(struct pm4_mes_map_queues),
> >>>>                                 alloc_size_bytes);
> >>>>                 }
> >>>>         }
> >>>> @@ -429,7 +356,7 @@ void pm_uninit(struct packet_manager *pm)
> >>>>  int pm_send_set_resources(struct packet_manager *pm,
> >>>>                                 struct scheduling_resources *res)
> >>>>  {
> >>>> -       struct pm4_set_resources *packet;
> >>>> +       struct pm4_mes_set_resources *packet;
> >>>>         int retval = 0;
> >>>>
> >>>>         mutex_lock(&pm->lock);
> >>>> @@ -442,9 +369,9 @@ int pm_send_set_resources(struct
> packet_manager
> >>>> *pm,
> >>>>                 goto out;
> >>>>         }
> >>>>
> >>>> -       memset(packet, 0, sizeof(struct pm4_set_resources));
> >>>> -       packet->header.u32all = build_pm4_header(IT_SET_RESOURCES,
> >>>> -                                       sizeof(struct
> >>>> pm4_set_resources));
> >>>> +       memset(packet, 0, sizeof(struct pm4_mes_set_resources));
> >>>> +       packet->header.u32All = build_pm4_header(IT_SET_RESOURCES,
> >>>> +                                       sizeof(struct
> >>>> +pm4_mes_set_resources));
> >>>>
> >>>>         packet->bitfields2.queue_type =
> >>>>
> >>>> queue_type__mes_set_resources__hsa_interface_queue_hiq;
> >>>> @@ -482,7 +409,7 @@ int pm_send_runlist(struct packet_manager
> *pm,
> >>>> struct list_head *dqm_queues)
> >>>>
> >>>>         pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
> >>>>
> >>>> -       packet_size_dwords = sizeof(struct pm4_runlist) /
> >>>> sizeof(uint32_t);
> >>>> +       packet_size_dwords = sizeof(struct pm4_mes_runlist) /
> >>>> +sizeof(uint32_t);
> >>>>         mutex_lock(&pm->lock);
> >>>>
> >>>>         retval =
> >>>> pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
> >>>> @@ -514,7 +441,7 @@ int pm_send_query_status(struct
> packet_manager
> >>>> *pm, uint64_t fence_address,
> >>>>                         uint32_t fence_value)
> >>>>  {
> >>>>         int retval;
> >>>> -       struct pm4_query_status *packet;
> >>>> +       struct pm4_mes_query_status *packet;
> >>>>
> >>>>         if (WARN_ON(!fence_address))
> >>>>                 return -EFAULT;
> >>>> @@ -522,13 +449,13 @@ int pm_send_query_status(struct
> packet_manager
> >>>> *pm, uint64_t fence_address,
> >>>>         mutex_lock(&pm->lock);
> >>>>         retval = pm->priv_queue->ops.acquire_packet_buffer(
> >>>>                         pm->priv_queue,
> >>>> -                       sizeof(struct pm4_query_status) /
> >>>> sizeof(uint32_t),
> >>>> +                       sizeof(struct pm4_mes_query_status) /
> >>>> +sizeof(uint32_t),
> >>>>                         (unsigned int **)&packet);
> >>>>         if (retval)
> >>>>                 goto fail_acquire_packet_buffer;
> >>>>
> >>>> -       packet->header.u32all = build_pm4_header(IT_QUERY_STATUS,
> >>>> -                                       sizeof(struct
> >>>> pm4_query_status));
> >>>> +       packet->header.u32All = build_pm4_header(IT_QUERY_STATUS,
> >>>> +                                       sizeof(struct
> >>>> +pm4_mes_query_status));
> >>>>
> >>>>         packet->bitfields2.context_id = 0;
> >>>>         packet->bitfields2.interrupt_sel =  @@ -555,22 +482,22 @@ int
> >>>> pm_send_unmap_queue(struct packet_manager *pm, enum
> >>> kfd_queue_type
> >>>> type,
> >>>>  {
> >>>>         int retval;
> >>>>         uint32_t *buffer;
> >>>> -       struct pm4_unmap_queues *packet;
> >>>> +       struct pm4_mes_unmap_queues *packet;
> >>>>
> >>>>         mutex_lock(&pm->lock);
> >>>>         retval = pm->priv_queue->ops.acquire_packet_buffer(
> >>>>                         pm->priv_queue,
> >>>> -                       sizeof(struct pm4_unmap_queues) /
> >>>> sizeof(uint32_t),
> >>>> +                       sizeof(struct pm4_mes_unmap_queues) /
> >>>> +sizeof(uint32_t),
> >>>>                         &buffer);
> >>>>         if (retval)
> >>>>                 goto err_acquire_packet_buffer;
> >>>>
> >>>> -       packet = (struct pm4_unmap_queues *)buffer;
> >>>> -       memset(buffer, 0, sizeof(struct pm4_unmap_queues));
> >>>> +       packet = (struct pm4_mes_unmap_queues *)buffer;
> >>>> +       memset(buffer, 0, sizeof(struct pm4_mes_unmap_queues));
> >>>>         pr_debug("static_queue: unmapping queues: mode is %d , reset
> >>>> is %d , type is %d\n",
> >>>>                 mode, reset, type);
> >>>> -       packet->header.u32all = build_pm4_header(IT_UNMAP_QUEUES,
> >>>> -                                       sizeof(struct
> >>>> pm4_unmap_queues));
> >>>> +       packet->header.u32All =
> build_pm4_header(IT_UNMAP_QUEUES,
> >>>> +                                       sizeof(struct
> >>>> +pm4_mes_unmap_queues));
> >>>>         switch (type) {
> >>>>         case KFD_QUEUE_TYPE_COMPUTE:
> >>>>         case KFD_QUEUE_TYPE_DIQ:
> >>>> @@ -608,12 +535,12 @@ int pm_send_unmap_queue(struct
> >>> packet_manager
> >>>> *pm, enum kfd_queue_type type,
> >>>>                 break;
> >>>>         case KFD_PREEMPT_TYPE_FILTER_ALL_QUEUES:
> >>>>                 packet->bitfields2.queue_sel =
> >>>> -                       queue_sel__mes_unmap_queues__perform_request_on_all_active_queues;
> >>>> +                       queue_sel__mes_unmap_queues__unmap_all_queues;
> >>>>                 break;
> >>>>         case KFD_PREEMPT_TYPE_FILTER_DYNAMIC_QUEUES:
> >>>>                 /* in this case, we do not preempt static queues */
> >>>>                 packet->bitfields2.queue_sel =
> >>>> -                       queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only;
> >>>> +                       queue_sel__mes_unmap_queues__unmap_all_non_static_queues;
> >>>>                 break;
> >>>>         default:
> >>>>                 WARN(1, "filter %d", mode);  diff --git
> >>>> a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
> >>>> b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
> >>>> index 97e5442..e50f73d 100644
> >>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
> >>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
> >>>> @@ -41,99 +41,6 @@ union PM4_MES_TYPE_3_HEADER {
> >>>>  };
> >>>>  #endif /* PM4_MES_HEADER_DEFINED */
> >>>>
> >>>> -/* --------------------MES_SET_RESOURCES-------------------- */
> >>>> -
> >>>> -#ifndef PM4_MES_SET_RESOURCES_DEFINED
> >>>> -#define PM4_MES_SET_RESOURCES_DEFINED
> >>>> -enum set_resources_queue_type_enum {
> >>>> -       queue_type__mes_set_resources__kernel_interface_queue_kiq = 0,
> >>>> -       queue_type__mes_set_resources__hsa_interface_queue_hiq = 1,
> >>>> -       queue_type__mes_set_resources__hsa_debug_interface_queue = 4
> >>>> -};
> >>>> -
> >>>> -struct pm4_set_resources {
> >>>> -       union {
> >>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> >>>> -               uint32_t ordinal1;
> >>>> -       };
> >>>> -
> >>>> -       union {
> >>>> -               struct {
> >>>> -                       uint32_t vmid_mask:16;
> >>>> -                       uint32_t unmap_latency:8;
> >>>> -                       uint32_t reserved1:5;
> >>>> -                       enum set_resources_queue_type_enum
> >>>> queue_type:3;
> >>>> -               } bitfields2;
> >>>> -               uint32_t ordinal2;
> >>>> -       };
> >>>> -
> >>>> -       uint32_t queue_mask_lo;
> >>>> -       uint32_t queue_mask_hi;
> >>>> -       uint32_t gws_mask_lo;
> >>>> -       uint32_t gws_mask_hi;
> >>>> -
> >>>> -       union {
> >>>> -               struct {
> >>>> -                       uint32_t oac_mask:16;
> >>>> -                       uint32_t reserved2:16;
> >>>> -               } bitfields7;
> >>>> -               uint32_t ordinal7;
> >>>> -       };
> >>>> -
> >>>> -       union {
> >>>> -               struct {
> >>>> -                       uint32_t gds_heap_base:6;
> >>>> -                       uint32_t reserved3:5;
> >>>> -                       uint32_t gds_heap_size:6;
> >>>> -                       uint32_t reserved4:15;
> >>>> -               } bitfields8;
> >>>> -               uint32_t ordinal8;
> >>>> -       };
> >>>> -
> >>>> -};
> >>>> -#endif
> >>>> -
> >>>> -/*--------------------MES_RUN_LIST-------------------- */
> >>>> -
> >>>> -#ifndef PM4_MES_RUN_LIST_DEFINED
> >>>> -#define PM4_MES_RUN_LIST_DEFINED
> >>>> -
> >>>> -struct pm4_runlist {
> >>>> -       union {
> >>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> >>>> -               uint32_t ordinal1;
> >>>> -       };
> >>>> -
> >>>> -       union {
> >>>> -               struct {
> >>>> -                       uint32_t reserved1:2;
> >>>> -                       uint32_t ib_base_lo:30;
> >>>> -               } bitfields2;
> >>>> -               uint32_t ordinal2;
> >>>> -       };
> >>>> -
> >>>> -       union {
> >>>> -               struct {
> >>>> -                       uint32_t ib_base_hi:16;
> >>>> -                       uint32_t reserved2:16;
> >>>> -               } bitfields3;
> >>>> -               uint32_t ordinal3;
> >>>> -       };
> >>>> -
> >>>> -       union {
> >>>> -               struct {
> >>>> -                       uint32_t ib_size:20;
> >>>> -                       uint32_t chain:1;
> >>>> -                       uint32_t offload_polling:1;
> >>>> -                       uint32_t reserved3:1;
> >>>> -                       uint32_t valid:1;
> >>>> -                       uint32_t reserved4:8;
> >>>> -               } bitfields4;
> >>>> -               uint32_t ordinal4;
> >>>> -       };
> >>>> -
> >>>> -};
> >>>> -#endif
> >>>>
> >>>>  /*--------------------MES_MAP_PROCESS-------------------- */
> >>>>
> >>>> @@ -186,217 +93,58 @@ struct pm4_map_process {
> >>>>  };
> >>>>  #endif
> >>>>
> >>>> -/*--------------------MES_MAP_QUEUES--------------------*/
> >>>> -
> >>>> -#ifndef PM4_MES_MAP_QUEUES_DEFINED
> >>>> -#define PM4_MES_MAP_QUEUES_DEFINED
> >>>> -enum map_queues_queue_sel_enum {
> >>>> -       queue_sel__mes_map_queues__map_to_specified_queue_slots
> = 0,
> >>>> -
> >>>
> queue_sel__mes_map_queues__map_to_hws_determined_queue_slots
> >>> =
> >>>> 1,
> >>>> -       queue_sel__mes_map_queues__enable_process_queues = 2 -};
> >>>> +#ifndef PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
> >>>> +#define PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
> >>>>
> >>>> -enum map_queues_vidmem_enum {
> >>>> -       vidmem__mes_map_queues__uses_no_video_memory = 0,
> >>>> -       vidmem__mes_map_queues__uses_video_memory = 1 -};
> >>>> -
> >>>> -enum map_queues_alloc_format_enum {
> >>>> -       alloc_format__mes_map_queues__one_per_pipe = 0,
> >>>> -       alloc_format__mes_map_queues__all_on_one_pipe = 1 -};
> >>>> -
> >>>> -enum map_queues_engine_sel_enum {
> >>>> -       engine_sel__mes_map_queues__compute = 0,
> >>>> -       engine_sel__mes_map_queues__sdma0 = 2,
> >>>> -       engine_sel__mes_map_queues__sdma1 = 3 -};
> >>>> -
> >>>> -struct pm4_map_queues {
> >>>> +struct pm4_map_process_scratch_kv {
> >>>>         union {
> >>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> >>>> -               uint32_t ordinal1;
> >>>> -       };
> >>>> -
> >>>> -       union {
> >>>> -               struct {
> >>>> -                       uint32_t reserved1:4;
> >>>> -                       enum map_queues_queue_sel_enum queue_sel:2;
> >>>> -                       uint32_t reserved2:2;
> >>>> -                       uint32_t vmid:4;
> >>>> -                       uint32_t reserved3:4;
> >>>> -                       enum map_queues_vidmem_enum vidmem:2;
> >>>> -                       uint32_t reserved4:6;
> >>>> -                       enum map_queues_alloc_format_enum
> >>>> alloc_format:2;
> >>>> -                       enum map_queues_engine_sel_enum engine_sel:3;
> >>>> -                       uint32_t num_queues:3;
> >>>> -               } bitfields2;
> >>>> -               uint32_t ordinal2;
> >>>> -       };
> >>>> -
> >>>> -       struct {
> >>>> -               union {
> >>>> -                       struct {
> >>>> -                               uint32_t is_static:1;
> >>>> -                               uint32_t reserved5:1;
> >>>> -                               uint32_t doorbell_offset:21;
> >>>> -                               uint32_t reserved6:3;
> >>>> -                               uint32_t queue:6;
> >>>> -                       } bitfields3;
> >>>> -                       uint32_t ordinal3;
> >>>> -               };
> >>>> -
> >>>> -               uint32_t mqd_addr_lo;
> >>>> -               uint32_t mqd_addr_hi;
> >>>> -               uint32_t wptr_addr_lo;
> >>>> -               uint32_t wptr_addr_hi;
> >>>> -
> >>>> -       } mes_map_queues_ordinals[1];   /* 1..N of these ordinal
> >>>> groups */
> >>>> -
> >>>> -};
> >>>> -#endif
> >>>> -
> >>>> -/*--------------------MES_QUERY_STATUS--------------------*/
> >>>> -
> >>>> -#ifndef PM4_MES_QUERY_STATUS_DEFINED
> >>>> -#define PM4_MES_QUERY_STATUS_DEFINED
> >>>> -enum query_status_interrupt_sel_enum {
> >>>> -       interrupt_sel__mes_query_status__completion_status = 0,
> >>>> -       interrupt_sel__mes_query_status__process_status = 1,
> >>>> -       interrupt_sel__mes_query_status__queue_status = 2  -};
> >>>> -
> >>>> -enum query_status_command_enum {
> >>>> -       command__mes_query_status__interrupt_only = 0,
> >>>> -       command__mes_query_status__fence_only_immediate = 1,
> >>>> -       command__mes_query_status__fence_only_after_write_ack = 2,
> >>>> -       command__mes_query_status__fence_wait_for_write_ack_send_interrupt = 3
> >>>> -};
> >>>> -
> >>>> -enum query_status_engine_sel_enum {
> >>>> -       engine_sel__mes_query_status__compute = 0,
> >>>> -       engine_sel__mes_query_status__sdma0_queue = 2,
> >>>> -       engine_sel__mes_query_status__sdma1_queue = 3  -};
> >>>> -
> >>>> -struct pm4_query_status {
> >>>> -       union {
> >>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> >>>> -               uint32_t ordinal1;
> >>>> -       };
> >>>> -
> >>>> -       union {
> >>>> -               struct {
> >>>> -                       uint32_t context_id:28;
> >>>> -                       enum query_status_interrupt_sel_enum
> >>>> interrupt_sel:2;
> >>>> -                       enum query_status_command_enum command:2;
> >>>> -               } bitfields2;
> >>>> -               uint32_t ordinal2;
> >>>> +               union PM4_MES_TYPE_3_HEADER   header; /* header */
> >>>> +               uint32_t            ordinal1;
> >>>>         };
> >>>>
> >>>>         union {
> >>>>                 struct {
> >>>>                         uint32_t pasid:16;
> >>>> -                       uint32_t reserved1:16;
> >>>> -               } bitfields3a;
> >>>> -               struct {
> >>>> -                       uint32_t reserved2:2;
> >>>> -                       uint32_t doorbell_offset:21;
> >>>> -                       uint32_t reserved3:3;
> >>>> -                       enum query_status_engine_sel_enum
> >>>> engine_sel:3;
> >>>> -                       uint32_t reserved4:3;
> >>>> -               } bitfields3b;
> >>>> -               uint32_t ordinal3;
> >>>> -       };
> >>>> -
> >>>> -       uint32_t addr_lo;
> >>>> -       uint32_t addr_hi;
> >>>> -       uint32_t data_lo;
> >>>> -       uint32_t data_hi;
> >>>> -};
> >>>> -#endif
> >>>> -
> >>>> -/*--------------------MES_UNMAP_QUEUES--------------------*/
> >>>> -
> >>>> -#ifndef PM4_MES_UNMAP_QUEUES_DEFINED
> >>>> -#define PM4_MES_UNMAP_QUEUES_DEFINED
> >>>> -enum unmap_queues_action_enum {
> >>>> -       action__mes_unmap_queues__preempt_queues = 0,
> >>>> -       action__mes_unmap_queues__reset_queues = 1,
> >>>> -       action__mes_unmap_queues__disable_process_queues = 2  -};
> >>>> -
> >>>> -enum unmap_queues_queue_sel_enum {
> >>>> -       queue_sel__mes_unmap_queues__perform_request_on_specified_queues = 0,
> >>>> -       queue_sel__mes_unmap_queues__perform_request_on_pasid_queues = 1,
> >>>> -       queue_sel__mes_unmap_queues__perform_request_on_all_active_queues = 2,
> >>>> -       queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only = 3
> >>>> -};
> >>>> -
> >>>> -enum unmap_queues_engine_sel_enum {
> >>>> -       engine_sel__mes_unmap_queues__compute = 0,
> >>>> -       engine_sel__mes_unmap_queues__sdma0 = 2,
> >>>> -       engine_sel__mes_unmap_queues__sdma1 = 3  -};
> >>>> -
> >>>> -struct pm4_unmap_queues {
> >>>> -       union {
> >>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
> >>>> -               uint32_t ordinal1;
> >>>> -       };
> >>>> -
> >>>> -       union {
> >>>> -               struct {
> >>>> -                       enum unmap_queues_action_enum action:2;
> >>>> -                       uint32_t reserved1:2;
> >>>> -                       enum unmap_queues_queue_sel_enum queue_sel:2;
> >>>> -                       uint32_t reserved2:20;
> >>>> -                       enum unmap_queues_engine_sel_enum
> >>>> engine_sel:3;
> >>>> -                       uint32_t num_queues:3;
> >>>> +                       uint32_t reserved1:8;
> >>>> +                       uint32_t diq_enable:1;
> >>>> +                       uint32_t process_quantum:7;
> >>>>                 } bitfields2;
> >>>>                 uint32_t ordinal2;
> >>>>         };
> >>>>
> >>>>         union {
> >>>>                 struct {
> >>>> -                       uint32_t pasid:16;
> >>>> -                       uint32_t reserved3:16;
> >>>> -               } bitfields3a;
> >>>> -               struct {
> >>>> -                       uint32_t reserved4:2;
> >>>> -                       uint32_t doorbell_offset0:21;
> >>>> -                       uint32_t reserved5:9;
> >>>> -               } bitfields3b;
> >>>> +                       uint32_t page_table_base:28;
> >>>> +                       uint32_t reserved2:4;
> >>>> +               } bitfields3;
> >>>>                 uint32_t ordinal3;
> >>>>         };
> >>>>
> >>>> -       union {
> >>>> -               struct {
> >>>> -                       uint32_t reserved6:2;
> >>>> -                       uint32_t doorbell_offset1:21;
> >>>> -                       uint32_t reserved7:9;
> >>>> -               } bitfields4;
> >>>> -               uint32_t ordinal4;
> >>>> -       };
> >>>> -
> >>>> -       union {
> >>>> -               struct {
> >>>> -                       uint32_t reserved8:2;
> >>>> -                       uint32_t doorbell_offset2:21;
> >>>> -                       uint32_t reserved9:9;
> >>>> -               } bitfields5;
> >>>> -               uint32_t ordinal5;
> >>>> -       };
> >>>> +       uint32_t reserved3;
> >>>> +       uint32_t sh_mem_bases;
> >>>> +       uint32_t sh_mem_config;
> >>>> +       uint32_t sh_mem_ape1_base;
> >>>> +       uint32_t sh_mem_ape1_limit;
> >>>> +       uint32_t sh_hidden_private_base_vmid;
> >>>> +       uint32_t reserved4;
> >>>> +       uint32_t reserved5;
> >>>> +       uint32_t gds_addr_lo;
> >>>> +       uint32_t gds_addr_hi;
> >>>>
> >>>>         union {
> >>>>                 struct {
> >>>> -                       uint32_t reserved10:2;
> >>>> -                       uint32_t doorbell_offset3:21;
> >>>> -                       uint32_t reserved11:9;
> >>>> -               } bitfields6;
> >>>> -               uint32_t ordinal6;
> >>>> +                       uint32_t num_gws:6;
> >>>> +                       uint32_t reserved6:2;
> >>>> +                       uint32_t num_oac:4;
> >>>> +                       uint32_t reserved7:4;
> >>>> +                       uint32_t gds_size:6;
> >>>> +                       uint32_t num_queues:10;
> >>>> +               } bitfields14;
> >>>> +               uint32_t ordinal14;
> >>>>         };
> >>>>
> >>>> +       uint32_t completion_signal_lo32;
> >>>> +       uint32_t completion_signal_hi32;
> >>>>  };
> >>>>  #endif
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
> >>>> b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
> >>>> index c4eda6f..7c8d9b3 100644
> >>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
> >>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
> >>>> @@ -126,9 +126,10 @@ struct pm4_mes_runlist {
> >>>>                         uint32_t ib_size:20;
> >>>>                         uint32_t chain:1;
> >>>>                         uint32_t offload_polling:1;
> >>>> -                       uint32_t reserved3:1;
> >>>> +                       uint32_t reserved2:1;
> >>>>                         uint32_t valid:1;
> >>>> -                       uint32_t reserved4:8;
> >>>> +                       uint32_t process_cnt:4;
> >>>> +                       uint32_t reserved3:4;
> >>>>                 } bitfields4;
> >>>>                 uint32_t ordinal4;
> >>>>         };
> >>>> @@ -143,8 +144,8 @@ struct pm4_mes_runlist {
> >>>>
> >>>>  struct pm4_mes_map_process {
> >>>>         union {
> >>>> -               union PM4_MES_TYPE_3_HEADER   header;            /*
> >>>> header */
> >>>> -               uint32_t            ordinal1;
> >>>> +               union PM4_MES_TYPE_3_HEADER header;     /* header */
> >>>> +               uint32_t ordinal1;
> >>>>         };
> >>>>
> >>>>         union {
> >>>> @@ -155,36 +156,48 @@ struct pm4_mes_map_process {
> >>>>                         uint32_t process_quantum:7;
> >>>>                 } bitfields2;
> >>>>                 uint32_t ordinal2;
> >>>> -};
> >>>> +       };
> >>>>
> >>>>         union {
> >>>>                 struct {
> >>>>                         uint32_t page_table_base:28;
> >>>> -                       uint32_t reserved2:4;
> >>>> +                       uint32_t reserved3:4;
> >>>>                 } bitfields3;
> >>>>                 uint32_t ordinal3;
> >>>>         };
> >>>>
> >>>> +       uint32_t reserved;
> >>>> +
> >>>>         uint32_t sh_mem_bases;
> >>>> +       uint32_t sh_mem_config;
> >>>>         uint32_t sh_mem_ape1_base;
> >>>>         uint32_t sh_mem_ape1_limit;
> >>>> -       uint32_t sh_mem_config;
> >>>> +
> >>>> +       uint32_t sh_hidden_private_base_vmid;
> >>>> +
> >>>> +       uint32_t reserved2;
> >>>> +       uint32_t reserved3;
> >>>> +
> >>>>         uint32_t gds_addr_lo;
> >>>>         uint32_t gds_addr_hi;
> >>>>
> >>>>         union {
> >>>>                 struct {
> >>>>                         uint32_t num_gws:6;
> >>>> -                       uint32_t reserved3:2;
> >>>> +                       uint32_t reserved4:2;
> >>>>                         uint32_t num_oac:4;
> >>>> -                       uint32_t reserved4:4;
> >>>> +                       uint32_t reserved5:4;
> >>>>                         uint32_t gds_size:6;
> >>>>                         uint32_t num_queues:10;
> >>>>                 } bitfields10;
> >>>>                 uint32_t ordinal10;
> >>>>         };
> >>>>
> >>>> +       uint32_t completion_signal_lo;
> >>>> +       uint32_t completion_signal_hi;
> >>>> +
> >>>>  };
> >>>> +
> >>>>  #endif
> >>>>
> >>>>  /*--------------------MES_MAP_QUEUES--------------------*/
> >>>> @@ -337,7 +350,7 @@ enum mes_unmap_queues_engine_sel_enum
> {
> >>>>         engine_sel__mes_unmap_queues__sdmal = 3
> >>>>  };
> >>>>
> >>>> -struct PM4_MES_UNMAP_QUEUES {
> >>>> +struct pm4_mes_unmap_queues {
> >>>>         union {
> >>>>                 union PM4_MES_TYPE_3_HEADER   header;            /*
> >>>> header */
> >>>>                 uint32_t            ordinal1;  @@ -397,4 +410,101 @@
> >>>> struct PM4_MES_UNMAP_QUEUES {
> >>>>  };
> >>>>  #endif
> >>>>
> >>>> +#ifndef PM4_MEC_RELEASE_MEM_DEFINED
> >>>> +#define PM4_MEC_RELEASE_MEM_DEFINED
> >>>> +enum RELEASE_MEM_event_index_enum {
> >>>> +       event_index___release_mem__end_of_pipe = 5,
> >>>> +       event_index___release_mem__shader_done = 6 };
> >>>> +
> >>>> +enum RELEASE_MEM_cache_policy_enum {
> >>>> +       cache_policy___release_mem__lru = 0,
> >>>> +       cache_policy___release_mem__stream = 1,
> >>>> +       cache_policy___release_mem__bypass = 2 };
> >>>> +
> >>>> +enum RELEASE_MEM_dst_sel_enum {
> >>>> +       dst_sel___release_mem__memory_controller = 0,
> >>>> +       dst_sel___release_mem__tc_l2 = 1,
> >>>> +       dst_sel___release_mem__queue_write_pointer_register = 2,
> >>>> +       dst_sel___release_mem__queue_write_pointer_poll_mask_bit
> = 3
> >>>> +};
> >>>> +
> >>>> +enum RELEASE_MEM_int_sel_enum {
> >>>> +       int_sel___release_mem__none = 0,
> >>>> +       int_sel___release_mem__send_interrupt_only = 1,
> >>>> +       int_sel___release_mem__send_interrupt_after_write_confirm =
> 2,
> >>>> +       int_sel___release_mem__send_data_after_write_confirm = 3 };
> >>>> +
> >>>> +enum RELEASE_MEM_data_sel_enum {
> >>>> +       data_sel___release_mem__none = 0,
> >>>> +       data_sel___release_mem__send_32_bit_low = 1,
> >>>> +       data_sel___release_mem__send_64_bit_data = 2,
> >>>> +       data_sel___release_mem__send_gpu_clock_counter = 3,
> >>>> +       data_sel___release_mem__send_cp_perfcounter_hi_lo = 4,
> >>>> +       data_sel___release_mem__store_gds_data_to_memory = 5 };
> >>>> +
> >>>> +struct pm4_mec_release_mem {
> >>>> +       union {
> >>>> +               union PM4_MES_TYPE_3_HEADER header;     /*header */
> >>>> +               unsigned int ordinal1;
> >>>> +       };
> >>>> +
> >>>> +       union {
> >>>> +               struct {
> >>>> +                       unsigned int event_type:6;
> >>>> +                       unsigned int reserved1:2;
> >>>> +                       enum RELEASE_MEM_event_index_enum
> >>>> +event_index:4;
> >>>> +                       unsigned int tcl1_vol_action_ena:1;
> >>>> +                       unsigned int tc_vol_action_ena:1;
> >>>> +                       unsigned int reserved2:1;
> >>>> +                       unsigned int tc_wb_action_ena:1;
> >>>> +                       unsigned int tcl1_action_ena:1;
> >>>> +                       unsigned int tc_action_ena:1;
> >>>> +                       unsigned int reserved3:6;
> >>>> +                       unsigned int atc:1;
> >>>> +                       enum RELEASE_MEM_cache_policy_enum
> >>>> +cache_policy:2;
> >>>> +                       unsigned int reserved4:5;
> >>>> +               } bitfields2;
> >>>> +               unsigned int ordinal2;
> >>>> +       };
> >>>> +
> >>>> +       union {
> >>>> +               struct {
> >>>> +                       unsigned int reserved5:16;
> >>>> +                       enum RELEASE_MEM_dst_sel_enum dst_sel:2;
> >>>> +                       unsigned int reserved6:6;
> >>>> +                       enum RELEASE_MEM_int_sel_enum int_sel:3;
> >>>> +                       unsigned int reserved7:2;
> >>>> +                       enum RELEASE_MEM_data_sel_enum data_sel:3;
> >>>> +               } bitfields3;
> >>>> +               unsigned int ordinal3;
> >>>> +       };
> >>>> +
> >>>> +       union {
> >>>> +               struct {
> >>>> +                       unsigned int reserved8:2;
> >>>> +                       unsigned int address_lo_32b:30;
> >>>> +               } bitfields4;
> >>>> +               struct {
> >>>> +                       unsigned int reserved9:3;
> >>>> +                       unsigned int address_lo_64b:29;
> >>>> +               } bitfields5;
> >>>> +               unsigned int ordinal4;
> >>>> +       };
> >>>> +
> >>>> +       unsigned int address_hi;
> >>>> +
> >>>> +       unsigned int data_lo;
> >>>> +
> >>>> +       unsigned int data_hi;
> >>>> +};
> >>>> +#endif
> >>>> +
> >>>> +enum {
> >>>> +       CACHE_FLUSH_AND_INV_TS_EVENT = 0x00000014 };
> >>>> +
> >>>>  #endif
> >>>> --
> >>>> 2.7.4
> >>>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
       [not found]                             ` <4a8512a4-21df-cf48-4500-0424b08cd357-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
  2017-08-16  9:54                               ` Oded Gabbay
@ 2017-08-16 22:31                               ` Felix Kuehling
  1 sibling, 0 replies; 70+ messages in thread
From: Felix Kuehling @ 2017-08-16 22:31 UTC (permalink / raw)
  To: Christian König, Oded Gabbay, Bridgman, John, Deucher, Alexander
  Cc: amd-gfx list

On 2017-08-16 03:37 AM, Christian König wrote:
> Hi Felix,
>
> in general Alex handles that by pushing firmware to the linux-firmware
> tree.
>
>> IIRC the amdgpu devs had been holding back on publishing the updated
>> MEC microcode (with scratch support) because that WOULD have broken
>> Kaveri. With this change from Felix we should be able to publish the
>> newest microcode for both amdgpu and amdkfd WITHOUT breaking Kaveri.
> Well what does breaking Kaveri mean here?
>
> Please note that firmware *MUST* always be backward compatible. In
> other words new firmware must still work with older kernels without
> that fix.

FWIW, I checked the firmware versions currently in the upstream
linux-firmware repository and in our rocm tree, for both CZ and KV,
along with the firmware versions that introduced scratch support:

   | linux-firmware | rocm | scratch
---+----------------+------+---------
CZ |            665 |  713 |     600
KV |            396 |  421 |     413

In other words, the CZ firmware in linux-firmware already supports
scratch and is broken with the upstream KFD. My patch series fixes it.

KV doesn't have firmware in linux-firmware that supports scratch. So my
patches would break it, unless updated firmware gets upstreamed at the
same time.
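
For illustration, here is a minimal sketch of the kind of init-time guard
that could flag this (not part of this series; it assumes the kfd2kgd
get_fw_version() hook reports the MEC firmware version for the device, and
it reuses the scratch thresholds from the table above):

/*
 * Hypothetical sanity check at KFD device init: warn when the loaded MEC
 * firmware predates the scratch-enabled MAP_PROCESS layout. 600 and 413
 * are the "scratch" column from the table above (CZ and KV respectively).
 */
static void kfd_warn_on_pre_scratch_mec_fw(struct kfd_dev *kfd)
{
	unsigned int fw_ver = kfd->kfd2kgd->get_fw_version(kfd->kgd,
							   KGD_ENGINE_MEC1);
	unsigned int required =
		(kfd->device_info->asic_family == CHIP_CARRIZO) ? 600 : 413;

	if (fw_ver < required)
		pr_warn("MEC firmware %u predates scratch support (need >= %u)\n",
			fw_ver, required);
}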

I agree that we shouldn't break firmware compatibility like this.
However, this was done about two years ago. It's too late to go back and
fix it now. Let's keep an eye on future firmware changes and stop
incompatible changes when they are introduced, before they get published
widely.
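
To make the "keep an eye on it" part mechanical, something like the sketch
below could sit next to the header definitions (again hypothetical, not part
of this series; BUILD_BUG_ON comes from <linux/bug.h>). Since the runlist
builder now emits only the VI-style packet for both ASICs, the KV scratch
layout in kfd_pm4_headers.h and the VI layout in kfd_pm4_headers_vi.h have
to stay the same size:

/*
 * Hypothetical compile-time guard: the KV scratch MAP_PROCESS layout and
 * the VI layout must remain binary compatible, because the same packet
 * writer is now used for both Kaveri and Carrizo.
 */
static inline void kfd_check_map_process_layout(void)
{
	BUILD_BUG_ON(sizeof(struct pm4_map_process_scratch_kv) !=
		     sizeof(struct pm4_mes_map_process));
}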

Regards,
  Felix

>
> If that is not guaranteed we need to get step back and talk to the
> firmware team once more to implement the new feature in a backward
> compatible way.
>
> Regards,
> Christian.
>
> Am 16.08.2017 um 03:19 schrieb Felix Kuehling:
>> Hi Alex,
>>
>> How does firmware get published for the upstream driver? Where can I
>> check the currently published version of both CZ and KV firmware for
>> upstream?
>>
>> Do you publish firmware updates at the same time as patches that depend
>> on them?
>>
>> Thanks,
>>    Felix
>>
>>
>> On 2017-08-13 04:49 AM, Oded Gabbay wrote:
>>> On Sat, Aug 12, 2017 at 10:09 PM, Bridgman, John
>>> <John.Bridgman@amd.com> wrote:
>>>> IIRC the amdgpu devs had been holding back on publishing the
>>>> updated MEC microcode (with scratch support) because that WOULD
>>>> have broken Kaveri. With this change from Felix we should be able
>>>> to publish the newest microcode for both amdgpu and amdkfd WITHOUT
>>>> breaking Kaveri.
>>>>
>>>> IOW this is the "scratch fix for Kaveri KFD" you have wanted for a
>>>> couple of years :)
>>> ah, ok.
>>>
>>> In that case, this patch is:
>>> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
>>>
>>>
>>>>> -----Original Message-----
>>>>> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On
>>>>> Behalf
>>>>> Of Kuehling, Felix
>>>>> Sent: Saturday, August 12, 2017 2:16 PM
>>>>> To: Oded Gabbay
>>>>> Cc: amd-gfx list
>>>>> Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
>>>>>
>>>>>> Do you mean that it won't work with Kaveri anymore ?
>>>>> Kaveri got the same firmware changes, mostly for scratch memory
>>>>> support.
>>>>> The Kaveri firmware headers name the structures and fields a bit
>>>>> differently
>>>>> but they should be binary compatible. So we simplified the code to
>>>>> use only
>>>>> one set of headers. I'll grab a Kaveri system to confirm that it
>>>>> works.
>>>>>
>>>>> Regards,
>>>>>   Felix
>>>>>
>>>>> From: Oded Gabbay <oded.gabbay@gmail.com>
>>>>> Sent: Saturday, August 12, 2017 11:10 AM
>>>>> To: Kuehling, Felix
>>>>> Cc: amd-gfx list
>>>>> Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
>>>>>
>>>>> On Sat, Aug 12, 2017 at 12:56 AM, Felix Kuehling
>>>>> <Felix.Kuehling@amd.com>
>>>>> wrote:
>>>>>> To match current firmware. The map process packet has been
>>>>>> extended to
>>>>>> support scratch. This is a non-backwards compatible change and it's
>>>>>> about two years old. So no point keeping the old version around
>>>>>> conditionally.
>>>>> Do you mean that it won't work with Kaveri anymore ?
>>>>> I believe we aren't allowed to break older H/W support without some
>>>>> serious justification.
>>>>>
>>>>> Oded
>>>>>
>>>>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_device.c         |   8 +-
>>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 161 ++++--------
>>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h    | 314
>>>>>> +++---------------------
>>>>>>   drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h | 130 +++++++++-
>>>>>>   4 files changed, 199 insertions(+), 414 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>>>> index e1c2ad2..e790e7f 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>>>>>> @@ -26,7 +26,7 @@
>>>>>>   #include <linux/slab.h>
>>>>>>   #include "kfd_priv.h"
>>>>>>   #include "kfd_device_queue_manager.h"
>>>>>> -#include "kfd_pm4_headers.h"
>>>>>> +#include "kfd_pm4_headers_vi.h"
>>>>>>
>>>>>>   #define MQD_SIZE_ALIGNED 768
>>>>>>
>>>>>> @@ -238,9 +238,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
>>>>>>           * calculate max size of runlist packet.
>>>>>>           * There can be only 2 packets at once
>>>>>>           */
>>>>>> -       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct
>>>>>> pm4_map_process) +
>>>>>> -               max_num_of_queues_per_device *
>>>>>> -               sizeof(struct pm4_map_queues) + sizeof(struct
>>>>>> pm4_runlist)) * 2;
>>>>>> +       size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct
>>>>>> +pm4_mes_map_process) +
>>>>>> +               max_num_of_queues_per_device * sizeof(struct
>>>>>> +pm4_mes_map_queues)
>>>>>> +               + sizeof(struct pm4_mes_runlist)) * 2;
>>>>>>
>>>>>>          /* Add size of HIQ & DIQ */
>>>>>>          size += KFD_KERNEL_QUEUE_SIZE * 2;  diff --git
>>>>>> a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>>>> index 77a6f2b..3141e05 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
>>>>>> @@ -26,7 +26,6 @@
>>>>>>   #include "kfd_device_queue_manager.h"
>>>>>>   #include "kfd_kernel_queue.h"
>>>>>>   #include "kfd_priv.h"
>>>>>> -#include "kfd_pm4_headers.h"
>>>>>>   #include "kfd_pm4_headers_vi.h"
>>>>>>   #include "kfd_pm4_opcodes.h"
>>>>>>
>>>>>> @@ -44,12 +43,12 @@ static unsigned int build_pm4_header(unsigned
>>>>>> int
>>>>>> opcode, size_t packet_size)
>>>>>>   {
>>>>>>          union PM4_MES_TYPE_3_HEADER header;
>>>>>>
>>>>>> -       header.u32all = 0;
>>>>>> +       header.u32All = 0;
>>>>>>          header.opcode = opcode;
>>>>>>          header.count = packet_size/sizeof(uint32_t) - 2;
>>>>>>          header.type = PM4_TYPE_3;
>>>>>>
>>>>>> -       return header.u32all;
>>>>>> +       return header.u32All;
>>>>>>   }
>>>>>>
>>>>>>   static void pm_calc_rlib_size(struct packet_manager *pm,  @@
>>>>>> -69,12
>>>>>> +68,9 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
>>>>>>                  pr_debug("Over subscribed runlist\n");
>>>>>>          }
>>>>>>
>>>>>> -       map_queue_size =
>>>>>> -               (pm->dqm->dev->device_info->asic_family ==
>>>>>> CHIP_CARRIZO) ?
>>>>>> -               sizeof(struct pm4_mes_map_queues) :
>>>>>> -               sizeof(struct pm4_map_queues);
>>>>>> +       map_queue_size = sizeof(struct pm4_mes_map_queues);
>>>>>>          /* calculate run list ib allocation size */
>>>>>> -       *rlib_size = process_count * sizeof(struct
>>>>>> pm4_map_process) +
>>>>>> +       *rlib_size = process_count * sizeof(struct
>>>>>> +pm4_mes_map_process) +
>>>>>>                       queue_count * map_queue_size;
>>>>>>
>>>>>>          /*
>>>>>> @@ -82,7 +78,7 @@ static void pm_calc_rlib_size(struct
>>>>>> packet_manager
>>>>>> *pm,
>>>>>>           * when over subscription
>>>>>>           */
>>>>>>          if (*over_subscription)
>>>>>> -               *rlib_size += sizeof(struct pm4_runlist);
>>>>>> +               *rlib_size += sizeof(struct pm4_mes_runlist);
>>>>>>
>>>>>>          pr_debug("runlist ib size %d\n", *rlib_size);
>>>>>>   }
>>>>>> @@ -119,16 +115,16 @@ static int pm_allocate_runlist_ib(struct
>>>>>> packet_manager *pm,
>>>>>>   static int pm_create_runlist(struct packet_manager *pm, uint32_t
>>>>>> *buffer,
>>>>>>                          uint64_t ib, size_t ib_size_in_dwords, bool
>>>>>> chain)
>>>>>>   {
>>>>>> -       struct pm4_runlist *packet;
>>>>>> +       struct pm4_mes_runlist *packet;
>>>>>>
>>>>>>          if (WARN_ON(!ib))
>>>>>>                  return -EFAULT;
>>>>>>
>>>>>> -       packet = (struct pm4_runlist *)buffer;
>>>>>> +       packet = (struct pm4_mes_runlist *)buffer;
>>>>>>
>>>>>> -       memset(buffer, 0, sizeof(struct pm4_runlist));
>>>>>> -       packet->header.u32all = build_pm4_header(IT_RUN_LIST,
>>>>>> -                                               sizeof(struct
>>>>>> pm4_runlist));
>>>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_runlist));
>>>>>> +       packet->header.u32All = build_pm4_header(IT_RUN_LIST,
>>>>>> +                                               sizeof(struct
>>>>>> +pm4_mes_runlist));
>>>>>>
>>>>>>          packet->bitfields4.ib_size = ib_size_in_dwords;
>>>>>>          packet->bitfields4.chain = chain ? 1 : 0;  @@ -143,16
>>>>>> +139,16
>>>>>> @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t
>>>>>> *buffer,
>>>>>>   static int pm_create_map_process(struct packet_manager *pm,
>>>>>> uint32_t
>>>>>> *buffer,
>>>>>>                                  struct qcm_process_device *qpd)
>>>>>>   {
>>>>>> -       struct pm4_map_process *packet;
>>>>>> +       struct pm4_mes_map_process *packet;
>>>>>>          struct queue *cur;
>>>>>>          uint32_t num_queues;
>>>>>>
>>>>>> -       packet = (struct pm4_map_process *)buffer;
>>>>>> +       packet = (struct pm4_mes_map_process *)buffer;
>>>>>>
>>>>>> -       memset(buffer, 0, sizeof(struct pm4_map_process));
>>>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_map_process));
>>>>>>
>>>>>> -       packet->header.u32all = build_pm4_header(IT_MAP_PROCESS,
>>>>>> -                                       sizeof(struct
>>>>>> pm4_map_process));
>>>>>> +       packet->header.u32All = build_pm4_header(IT_MAP_PROCESS,
>>>>>> +                                       sizeof(struct
>>>>>> +pm4_mes_map_process));
>>>>>>          packet->bitfields2.diq_enable = (qpd->is_debug) ? 1 : 0;
>>>>>>          packet->bitfields2.process_quantum = 1;
>>>>>>          packet->bitfields2.pasid = qpd->pqm->process->pasid;  @@
>>>>>> -170,23 +166,26 @@ static int pm_create_map_process(struct
>>>>>> packet_manager *pm, uint32_t *buffer,
>>>>>>          packet->sh_mem_ape1_base = qpd->sh_mem_ape1_base;
>>>>>>          packet->sh_mem_ape1_limit = qpd->sh_mem_ape1_limit;
>>>>>>
>>>>>> +       /* TODO: scratch support */
>>>>>> +       packet->sh_hidden_private_base_vmid = 0;
>>>>>> +
>>>>>>          packet->gds_addr_lo = lower_32_bits(qpd->gds_context_area);
>>>>>>          packet->gds_addr_hi = upper_32_bits(qpd->gds_context_area);
>>>>>>
>>>>>>          return 0;
>>>>>>   }
>>>>>>
>>>>>> -static int pm_create_map_queue_vi(struct packet_manager *pm,
>>>>>> uint32_t
>>>>>> *buffer,
>>>>>> +static int pm_create_map_queue(struct packet_manager *pm, uint32_t
>>>>>> +*buffer,
>>>>>>                  struct queue *q, bool is_static)
>>>>>>   {
>>>>>>          struct pm4_mes_map_queues *packet;
>>>>>>          bool use_static = is_static;
>>>>>>
>>>>>>          packet = (struct pm4_mes_map_queues *)buffer;
>>>>>> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
>>>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_map_queues));
>>>>>>
>>>>>> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
>>>>>> -                                               sizeof(struct
>>>>>> pm4_map_queues));
>>>>>> +       packet->header.u32All = build_pm4_header(IT_MAP_QUEUES,
>>>>>> +                                               sizeof(struct
>>>>>> +pm4_mes_map_queues));
>>>>>>          packet->bitfields2.alloc_format =
>>>>>>                  alloc_format__mes_map_queues__one_per_pipe_vi;
>>>>>>          packet->bitfields2.num_queues = 1;  @@ -235,64 +234,6 @@
>>>>>> static int pm_create_map_queue_vi(struct packet_manager *pm,
>>>>>> uint32_t
>>>>>> *buffer,
>>>>>>          return 0;
>>>>>>   }
>>>>>>
>>>>>> -static int pm_create_map_queue(struct packet_manager *pm, uint32_t
>>>>>> *buffer,
>>>>>> -                               struct queue *q, bool is_static)  -{
>>>>>> -       struct pm4_map_queues *packet;
>>>>>> -       bool use_static = is_static;
>>>>>> -
>>>>>> -       packet = (struct pm4_map_queues *)buffer;
>>>>>> -       memset(buffer, 0, sizeof(struct pm4_map_queues));
>>>>>> -
>>>>>> -       packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
>>>>>> -                                               sizeof(struct
>>>>>> pm4_map_queues));
>>>>>> -       packet->bitfields2.alloc_format =
>>>>>> -
>>>>>> alloc_format__mes_map_queues__one_per_pipe;
>>>>>> -       packet->bitfields2.num_queues = 1;
>>>>>> -       packet->bitfields2.queue_sel =
>>>>>> -
>>>>>> queue_sel__mes_map_queues__map_to_hws_determined_queue_slots;
>>>>>> -
>>>>>> -       packet->bitfields2.vidmem = (q->properties.is_interop) ?
>>>>>> -                       vidmem__mes_map_queues__uses_video_memory :
>>>>>> -                       vidmem__mes_map_queues__uses_no_video_memory;
>>>>>> -
>>>>>> -       switch (q->properties.type) {
>>>>>> -       case KFD_QUEUE_TYPE_COMPUTE:
>>>>>> -       case KFD_QUEUE_TYPE_DIQ:
>>>>>> -               packet->bitfields2.engine_sel =
>>>>>> -                               engine_sel__mes_map_queues__compute;
>>>>>> -               break;
>>>>>> -       case KFD_QUEUE_TYPE_SDMA:
>>>>>> -               packet->bitfields2.engine_sel =
>>>>>> -                               engine_sel__mes_map_queues__sdma0;
>>>>>> -               use_static = false; /* no static queues under SDMA */
>>>>>> -               break;
>>>>>> -       default:
>>>>>> -               WARN(1, "queue type %d", q->properties.type);
>>>>>> -               return -EINVAL;
>>>>>> -       }
>>>>>> -
>>>>>> -       packet->mes_map_queues_ordinals[0].bitfields3.doorbell_offset =
>>>>>> -                       q->properties.doorbell_off;
>>>>>> -
>>>>>> -       packet->mes_map_queues_ordinals[0].bitfields3.is_static =
>>>>>> -                       (use_static) ? 1 : 0;
>>>>>> -
>>>>>> -       packet->mes_map_queues_ordinals[0].mqd_addr_lo =
>>>>>> -                       lower_32_bits(q->gart_mqd_addr);
>>>>>> -
>>>>>> -       packet->mes_map_queues_ordinals[0].mqd_addr_hi =
>>>>>> -                       upper_32_bits(q->gart_mqd_addr);
>>>>>> -
>>>>>> -       packet->mes_map_queues_ordinals[0].wptr_addr_lo =
>>>>>> -                       lower_32_bits((uint64_t)q->properties.write_ptr);
>>>>>> -
>>>>>> -       packet->mes_map_queues_ordinals[0].wptr_addr_hi =
>>>>>> -                       upper_32_bits((uint64_t)q->properties.write_ptr);
>>>>>> -
>>>>>> -       return 0;
>>>>>> -}
>>>>>> -
>>>>>>   static int pm_create_runlist_ib(struct packet_manager *pm,
>>>>>>                                  struct list_head *queues,
>>>>>>                                  uint64_t *rl_gpu_addr,
>>>>>> @@ -334,7 +275,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>>>>>>                          return retval;
>>>>>>
>>>>>>                  proccesses_mapped++;
>>>>>> -               inc_wptr(&rl_wptr, sizeof(struct pm4_map_process),
>>>>>> +               inc_wptr(&rl_wptr, sizeof(struct pm4_mes_map_process),
>>>>>>                                  alloc_size_bytes);
>>>>>>
>>>>>>                  list_for_each_entry(kq, &qpd->priv_queue_list, list) {
>>>>>> @@ -344,14 +285,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>>>>>>                          pr_debug("static_queue, mapping kernel q %d, is debug status %d\n",
>>>>>>                                  kq->queue->queue, qpd->is_debug);
>>>>>>
>>>>>> -                       if (pm->dqm->dev->device_info->asic_family ==
>>>>>> -                                       CHIP_CARRIZO)
>>>>>> -                               retval = pm_create_map_queue_vi(pm,
>>>>>> -                                               &rl_buffer[rl_wptr],
>>>>>> -                                               kq->queue,
>>>>>> -                                               qpd->is_debug);
>>>>>> -                       else
>>>>>> -                               retval = pm_create_map_queue(pm,
>>>>>> +                       retval = pm_create_map_queue(pm,
>>>>>>                                                  &rl_buffer[rl_wptr],
>>>>>>                                                  kq->queue,
>>>>>>                                                  qpd->is_debug);
>>>>>> @@ -359,7 +293,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>>>>>>                                  return retval;
>>>>>>
>>>>>>                          inc_wptr(&rl_wptr,
>>>>>> -                               sizeof(struct pm4_map_queues),
>>>>>> +                               sizeof(struct pm4_mes_map_queues),
>>>>>>                                  alloc_size_bytes);
>>>>>>                  }
>>>>>>
>>>>>> @@ -370,14 +304,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>>>>>>                          pr_debug("static_queue, mapping user queue %d, is debug status %d\n",
>>>>>>                                  q->queue, qpd->is_debug);
>>>>>>
>>>>>> -                       if (pm->dqm->dev->device_info->asic_family ==
>>>>>> -                                       CHIP_CARRIZO)
>>>>>> -                               retval = pm_create_map_queue_vi(pm,
>>>>>> -                                               &rl_buffer[rl_wptr],
>>>>>> -                                               q,
>>>>>> -                                               qpd->is_debug);
>>>>>> -                       else
>>>>>> -                               retval = pm_create_map_queue(pm,
>>>>>> +                       retval = pm_create_map_queue(pm,
>>>>>>                                                  &rl_buffer[rl_wptr],
>>>>>>                                                  q,
>>>>>>                                                  qpd->is_debug);
>>>>>> @@ -386,7 +313,7 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
>>>>>>                                  return retval;
>>>>>>
>>>>>>                          inc_wptr(&rl_wptr,
>>>>>> -                               sizeof(struct pm4_map_queues),
>>>>>> +                               sizeof(struct pm4_mes_map_queues),
>>>>>>                                  alloc_size_bytes);
>>>>>>                  }
>>>>>>          }
>>>>>> @@ -429,7 +356,7 @@ void pm_uninit(struct packet_manager *pm)
>>>>>>   int pm_send_set_resources(struct packet_manager *pm,
>>>>>>                                  struct scheduling_resources *res)
>>>>>>   {
>>>>>> -       struct pm4_set_resources *packet;
>>>>>> +       struct pm4_mes_set_resources *packet;
>>>>>>          int retval = 0;
>>>>>>
>>>>>>          mutex_lock(&pm->lock);
>>>>>> @@ -442,9 +369,9 @@ int pm_send_set_resources(struct packet_manager *pm,
>>>>>>                  goto out;
>>>>>>          }
>>>>>>
>>>>>> -       memset(packet, 0, sizeof(struct pm4_set_resources));
>>>>>> -       packet->header.u32all = build_pm4_header(IT_SET_RESOURCES,
>>>>>> -                                       sizeof(struct pm4_set_resources));
>>>>>> +       memset(packet, 0, sizeof(struct pm4_mes_set_resources));
>>>>>> +       packet->header.u32All = build_pm4_header(IT_SET_RESOURCES,
>>>>>> +                                       sizeof(struct pm4_mes_set_resources));
>>>>>>
>>>>>>          packet->bitfields2.queue_type =
>>>>>>                          queue_type__mes_set_resources__hsa_interface_queue_hiq;
>>>>>> @@ -482,7 +409,7 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
>>>>>>
>>>>>>          pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
>>>>>>
>>>>>> -       packet_size_dwords = sizeof(struct pm4_runlist) / sizeof(uint32_t);
>>>>>> +       packet_size_dwords = sizeof(struct pm4_mes_runlist) / sizeof(uint32_t);
>>>>>>          mutex_lock(&pm->lock);
>>>>>>
>>>>>>          retval = pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
>>>>>> @@ -514,7 +441,7 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
>>>>>>                          uint32_t fence_value)
>>>>>>   {
>>>>>>          int retval;
>>>>>> -       struct pm4_query_status *packet;
>>>>>> +       struct pm4_mes_query_status *packet;
>>>>>>
>>>>>>          if (WARN_ON(!fence_address))
>>>>>>                  return -EFAULT;
>>>>>> @@ -522,13 +449,13 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
>>>>>>          mutex_lock(&pm->lock);
>>>>>>          retval = pm->priv_queue->ops.acquire_packet_buffer(
>>>>>>                          pm->priv_queue,
>>>>>> -                       sizeof(struct pm4_query_status) / sizeof(uint32_t),
>>>>>> +                       sizeof(struct pm4_mes_query_status) / sizeof(uint32_t),
>>>>>>                          (unsigned int **)&packet);
>>>>>>          if (retval)
>>>>>>                  goto fail_acquire_packet_buffer;
>>>>>>
>>>>>> -       packet->header.u32all = build_pm4_header(IT_QUERY_STATUS,
>>>>>> -                                       sizeof(struct pm4_query_status));
>>>>>> +       packet->header.u32All = build_pm4_header(IT_QUERY_STATUS,
>>>>>> +                                       sizeof(struct pm4_mes_query_status));
>>>>>>
>>>>>>          packet->bitfields2.context_id = 0;
>>>>>>          packet->bitfields2.interrupt_sel =
>>>>>> @@ -555,22 +482,22 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>>>>>>   {
>>>>>>          int retval;
>>>>>>          uint32_t *buffer;
>>>>>> -       struct pm4_unmap_queues *packet;
>>>>>> +       struct pm4_mes_unmap_queues *packet;
>>>>>>
>>>>>>          mutex_lock(&pm->lock);
>>>>>>          retval = pm->priv_queue->ops.acquire_packet_buffer(
>>>>>>                          pm->priv_queue,
>>>>>> -                       sizeof(struct pm4_unmap_queues) / sizeof(uint32_t),
>>>>>> +                       sizeof(struct pm4_mes_unmap_queues) / sizeof(uint32_t),
>>>>>>                          &buffer);
>>>>>>          if (retval)
>>>>>>                  goto err_acquire_packet_buffer;
>>>>>>
>>>>>> -       packet = (struct pm4_unmap_queues *)buffer;
>>>>>> -       memset(buffer, 0, sizeof(struct pm4_unmap_queues));
>>>>>> +       packet = (struct pm4_mes_unmap_queues *)buffer;
>>>>>> +       memset(buffer, 0, sizeof(struct pm4_mes_unmap_queues));
>>>>>>          pr_debug("static_queue: unmapping queues: mode is %d , reset is %d , type is %d\n",
>>>>>>                  mode, reset, type);
>>>>>> -       packet->header.u32all = build_pm4_header(IT_UNMAP_QUEUES,
>>>>>> -                                       sizeof(struct pm4_unmap_queues));
>>>>>> +       packet->header.u32All = build_pm4_header(IT_UNMAP_QUEUES,
>>>>>> +                                       sizeof(struct pm4_mes_unmap_queues));
>>>>>>          switch (type) {
>>>>>>          case KFD_QUEUE_TYPE_COMPUTE:
>>>>>>          case KFD_QUEUE_TYPE_DIQ:
>>>>>> @@ -608,12 +535,12 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
>>>>>>                  break;
>>>>>>          case KFD_PREEMPT_TYPE_FILTER_ALL_QUEUES:
>>>>>>                  packet->bitfields2.queue_sel =
>>>>>> -                       queue_sel__mes_unmap_queues__perform_request_on_all_active_queues;
>>>>>> +                       queue_sel__mes_unmap_queues__unmap_all_queues;
>>>>>>                  break;
>>>>>>          case KFD_PREEMPT_TYPE_FILTER_DYNAMIC_QUEUES:
>>>>>>                  /* in this case, we do not preempt static queues */
>>>>>>                  packet->bitfields2.queue_sel =
>>>>>> -                       queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only;
>>>>>> +                       queue_sel__mes_unmap_queues__unmap_all_non_static_queues;
>>>>>>                  break;
>>>>>>          default:
>>>>>>                  WARN(1, "filter %d", mode);  diff --git
>>>>>> a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>>>> index 97e5442..e50f73d 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
>>>>>> @@ -41,99 +41,6 @@ union PM4_MES_TYPE_3_HEADER {
>>>>>>   };
>>>>>>   #endif /* PM4_MES_HEADER_DEFINED */
>>>>>>
>>>>>> -/* --------------------MES_SET_RESOURCES-------------------- */
>>>>>> -
>>>>>> -#ifndef PM4_MES_SET_RESOURCES_DEFINED
>>>>>> -#define PM4_MES_SET_RESOURCES_DEFINED
>>>>>> -enum set_resources_queue_type_enum {
>>>>>> -       queue_type__mes_set_resources__kernel_interface_queue_kiq = 0,
>>>>>> -       queue_type__mes_set_resources__hsa_interface_queue_hiq = 1,
>>>>>> -       queue_type__mes_set_resources__hsa_debug_interface_queue = 4
>>>>>> -};
>>>>>> -
>>>>>> -struct pm4_set_resources {
>>>>>> -       union {
>>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>>> -               uint32_t ordinal1;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t vmid_mask:16;
>>>>>> -                       uint32_t unmap_latency:8;
>>>>>> -                       uint32_t reserved1:5;
>>>>>> -                       enum set_resources_queue_type_enum queue_type:3;
>>>>>> -               } bitfields2;
>>>>>> -               uint32_t ordinal2;
>>>>>> -       };
>>>>>> -
>>>>>> -       uint32_t queue_mask_lo;
>>>>>> -       uint32_t queue_mask_hi;
>>>>>> -       uint32_t gws_mask_lo;
>>>>>> -       uint32_t gws_mask_hi;
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t oac_mask:16;
>>>>>> -                       uint32_t reserved2:16;
>>>>>> -               } bitfields7;
>>>>>> -               uint32_t ordinal7;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t gds_heap_base:6;
>>>>>> -                       uint32_t reserved3:5;
>>>>>> -                       uint32_t gds_heap_size:6;
>>>>>> -                       uint32_t reserved4:15;
>>>>>> -               } bitfields8;
>>>>>> -               uint32_t ordinal8;
>>>>>> -       };
>>>>>> -
>>>>>> -};
>>>>>> -#endif
>>>>>> -
>>>>>> -/*--------------------MES_RUN_LIST-------------------- */
>>>>>> -
>>>>>> -#ifndef PM4_MES_RUN_LIST_DEFINED
>>>>>> -#define PM4_MES_RUN_LIST_DEFINED
>>>>>> -
>>>>>> -struct pm4_runlist {
>>>>>> -       union {
>>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>>> -               uint32_t ordinal1;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t reserved1:2;
>>>>>> -                       uint32_t ib_base_lo:30;
>>>>>> -               } bitfields2;
>>>>>> -               uint32_t ordinal2;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t ib_base_hi:16;
>>>>>> -                       uint32_t reserved2:16;
>>>>>> -               } bitfields3;
>>>>>> -               uint32_t ordinal3;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t ib_size:20;
>>>>>> -                       uint32_t chain:1;
>>>>>> -                       uint32_t offload_polling:1;
>>>>>> -                       uint32_t reserved3:1;
>>>>>> -                       uint32_t valid:1;
>>>>>> -                       uint32_t reserved4:8;
>>>>>> -               } bitfields4;
>>>>>> -               uint32_t ordinal4;
>>>>>> -       };
>>>>>> -
>>>>>> -};
>>>>>> -#endif
>>>>>>
>>>>>>   /*--------------------MES_MAP_PROCESS-------------------- */
>>>>>>
>>>>>> @@ -186,217 +93,58 @@ struct pm4_map_process {
>>>>>>   };
>>>>>>   #endif
>>>>>>
>>>>>> -/*--------------------MES_MAP_QUEUES--------------------*/
>>>>>> -
>>>>>> -#ifndef PM4_MES_MAP_QUEUES_DEFINED
>>>>>> -#define PM4_MES_MAP_QUEUES_DEFINED
>>>>>> -enum map_queues_queue_sel_enum {
>>>>>> -       queue_sel__mes_map_queues__map_to_specified_queue_slots = 0,
>>>>>> -       queue_sel__mes_map_queues__map_to_hws_determined_queue_slots = 1,
>>>>>> -       queue_sel__mes_map_queues__enable_process_queues = 2
>>>>>> -};
>>>>>> +#ifndef PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
>>>>>> +#define PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
>>>>>>
>>>>>> -enum map_queues_vidmem_enum {
>>>>>> -       vidmem__mes_map_queues__uses_no_video_memory = 0,
>>>>>> -       vidmem__mes_map_queues__uses_video_memory = 1
>>>>>> -};
>>>>>> -
>>>>>> -enum map_queues_alloc_format_enum {
>>>>>> -       alloc_format__mes_map_queues__one_per_pipe = 0,
>>>>>> -       alloc_format__mes_map_queues__all_on_one_pipe = 1
>>>>>> -};
>>>>>> -
>>>>>> -enum map_queues_engine_sel_enum {
>>>>>> -       engine_sel__mes_map_queues__compute = 0,
>>>>>> -       engine_sel__mes_map_queues__sdma0 = 2,
>>>>>> -       engine_sel__mes_map_queues__sdma1 = 3
>>>>>> -};
>>>>>> -
>>>>>> -struct pm4_map_queues {
>>>>>> +struct pm4_map_process_scratch_kv {
>>>>>>          union {
>>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>>> -               uint32_t ordinal1;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t reserved1:4;
>>>>>> -                       enum map_queues_queue_sel_enum queue_sel:2;
>>>>>> -                       uint32_t reserved2:2;
>>>>>> -                       uint32_t vmid:4;
>>>>>> -                       uint32_t reserved3:4;
>>>>>> -                       enum map_queues_vidmem_enum vidmem:2;
>>>>>> -                       uint32_t reserved4:6;
>>>>>> -                       enum map_queues_alloc_format_enum alloc_format:2;
>>>>>> -                       enum map_queues_engine_sel_enum engine_sel:3;
>>>>>> -                       uint32_t num_queues:3;
>>>>>> -               } bitfields2;
>>>>>> -               uint32_t ordinal2;
>>>>>> -       };
>>>>>> -
>>>>>> -       struct {
>>>>>> -               union {
>>>>>> -                       struct {
>>>>>> -                               uint32_t is_static:1;
>>>>>> -                               uint32_t reserved5:1;
>>>>>> -                               uint32_t doorbell_offset:21;
>>>>>> -                               uint32_t reserved6:3;
>>>>>> -                               uint32_t queue:6;
>>>>>> -                       } bitfields3;
>>>>>> -                       uint32_t ordinal3;
>>>>>> -               };
>>>>>> -
>>>>>> -               uint32_t mqd_addr_lo;
>>>>>> -               uint32_t mqd_addr_hi;
>>>>>> -               uint32_t wptr_addr_lo;
>>>>>> -               uint32_t wptr_addr_hi;
>>>>>> -
>>>>>> -       } mes_map_queues_ordinals[1];   /* 1..N of these ordinal groups */
>>>>>> -
>>>>>> -};
>>>>>> -#endif
>>>>>> -
>>>>>> -/*--------------------MES_QUERY_STATUS--------------------*/
>>>>>> -
>>>>>> -#ifndef PM4_MES_QUERY_STATUS_DEFINED
>>>>>> -#define PM4_MES_QUERY_STATUS_DEFINED
>>>>>> -enum query_status_interrupt_sel_enum {
>>>>>> -       interrupt_sel__mes_query_status__completion_status = 0,
>>>>>> -       interrupt_sel__mes_query_status__process_status = 1,
>>>>>> -       interrupt_sel__mes_query_status__queue_status = 2
>>>>>> -};
>>>>>> -
>>>>>> -enum query_status_command_enum {
>>>>>> -       command__mes_query_status__interrupt_only = 0,
>>>>>> -       command__mes_query_status__fence_only_immediate = 1,
>>>>>> -       command__mes_query_status__fence_only_after_write_ack = 2,
>>>>>> -       command__mes_query_status__fence_wait_for_write_ack_send_interrupt = 3
>>>>>> -};
>>>>>> -
>>>>>> -enum query_status_engine_sel_enum {
>>>>>> -       engine_sel__mes_query_status__compute = 0,
>>>>>> -       engine_sel__mes_query_status__sdma0_queue = 2,
>>>>>> -       engine_sel__mes_query_status__sdma1_queue = 3
>>>>>> -};
>>>>>> -
>>>>>> -struct pm4_query_status {
>>>>>> -       union {
>>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>>> -               uint32_t ordinal1;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t context_id:28;
>>>>>> -                       enum query_status_interrupt_sel_enum interrupt_sel:2;
>>>>>> -                       enum query_status_command_enum command:2;
>>>>>> -               } bitfields2;
>>>>>> -               uint32_t ordinal2;
>>>>>> +               union PM4_MES_TYPE_3_HEADER   header; /* header */
>>>>>> +               uint32_t            ordinal1;
>>>>>>          };
>>>>>>
>>>>>>          union {
>>>>>>                  struct {
>>>>>>                          uint32_t pasid:16;
>>>>>> -                       uint32_t reserved1:16;
>>>>>> -               } bitfields3a;
>>>>>> -               struct {
>>>>>> -                       uint32_t reserved2:2;
>>>>>> -                       uint32_t doorbell_offset:21;
>>>>>> -                       uint32_t reserved3:3;
>>>>>> -                       enum query_status_engine_sel_enum engine_sel:3;
>>>>>> -                       uint32_t reserved4:3;
>>>>>> -               } bitfields3b;
>>>>>> -               uint32_t ordinal3;
>>>>>> -       };
>>>>>> -
>>>>>> -       uint32_t addr_lo;
>>>>>> -       uint32_t addr_hi;
>>>>>> -       uint32_t data_lo;
>>>>>> -       uint32_t data_hi;
>>>>>> -};
>>>>>> -#endif
>>>>>> -
>>>>>> -/*--------------------MES_UNMAP_QUEUES--------------------*/
>>>>>> -
>>>>>> -#ifndef PM4_MES_UNMAP_QUEUES_DEFINED
>>>>>> -#define PM4_MES_UNMAP_QUEUES_DEFINED
>>>>>> -enum unmap_queues_action_enum {
>>>>>> -       action__mes_unmap_queues__preempt_queues = 0,
>>>>>> -       action__mes_unmap_queues__reset_queues = 1,
>>>>>> -       action__mes_unmap_queues__disable_process_queues = 2
>>>>>> -};
>>>>>> -
>>>>>> -enum unmap_queues_queue_sel_enum {
>>>>>> -       queue_sel__mes_unmap_queues__perform_request_on_specified_queues = 0,
>>>>>> -       queue_sel__mes_unmap_queues__perform_request_on_pasid_queues = 1,
>>>>>> -       queue_sel__mes_unmap_queues__perform_request_on_all_active_queues = 2,
>>>>>> -       queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only = 3
>>>>>> -};
>>>>>> -
>>>>>> -enum unmap_queues_engine_sel_enum {
>>>>>> -       engine_sel__mes_unmap_queues__compute = 0,
>>>>>> -       engine_sel__mes_unmap_queues__sdma0 = 2,
>>>>>> -       engine_sel__mes_unmap_queues__sdma1 = 3
>>>>>> -};
>>>>>> -
>>>>>> -struct pm4_unmap_queues {
>>>>>> -       union {
>>>>>> -               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>>> -               uint32_t ordinal1;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       enum unmap_queues_action_enum action:2;
>>>>>> -                       uint32_t reserved1:2;
>>>>>> -                       enum unmap_queues_queue_sel_enum queue_sel:2;
>>>>>> -                       uint32_t reserved2:20;
>>>>>> -                       enum unmap_queues_engine_sel_enum engine_sel:3;
>>>>>> -                       uint32_t num_queues:3;
>>>>>> +                       uint32_t reserved1:8;
>>>>>> +                       uint32_t diq_enable:1;
>>>>>> +                       uint32_t process_quantum:7;
>>>>>>                  } bitfields2;
>>>>>>                  uint32_t ordinal2;
>>>>>>          };
>>>>>>
>>>>>>          union {
>>>>>>                  struct {
>>>>>> -                       uint32_t pasid:16;
>>>>>> -                       uint32_t reserved3:16;
>>>>>> -               } bitfields3a;
>>>>>> -               struct {
>>>>>> -                       uint32_t reserved4:2;
>>>>>> -                       uint32_t doorbell_offset0:21;
>>>>>> -                       uint32_t reserved5:9;
>>>>>> -               } bitfields3b;
>>>>>> +                       uint32_t page_table_base:28;
>>>>>> +                       uint32_t reserved2:4;
>>>>>> +               } bitfields3;
>>>>>>                  uint32_t ordinal3;
>>>>>>          };
>>>>>>
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t reserved6:2;
>>>>>> -                       uint32_t doorbell_offset1:21;
>>>>>> -                       uint32_t reserved7:9;
>>>>>> -               } bitfields4;
>>>>>> -               uint32_t ordinal4;
>>>>>> -       };
>>>>>> -
>>>>>> -       union {
>>>>>> -               struct {
>>>>>> -                       uint32_t reserved8:2;
>>>>>> -                       uint32_t doorbell_offset2:21;
>>>>>> -                       uint32_t reserved9:9;
>>>>>> -               } bitfields5;
>>>>>> -               uint32_t ordinal5;
>>>>>> -       };
>>>>>> +       uint32_t reserved3;
>>>>>> +       uint32_t sh_mem_bases;
>>>>>> +       uint32_t sh_mem_config;
>>>>>> +       uint32_t sh_mem_ape1_base;
>>>>>> +       uint32_t sh_mem_ape1_limit;
>>>>>> +       uint32_t sh_hidden_private_base_vmid;
>>>>>> +       uint32_t reserved4;
>>>>>> +       uint32_t reserved5;
>>>>>> +       uint32_t gds_addr_lo;
>>>>>> +       uint32_t gds_addr_hi;
>>>>>>
>>>>>>          union {
>>>>>>                  struct {
>>>>>> -                       uint32_t reserved10:2;
>>>>>> -                       uint32_t doorbell_offset3:21;
>>>>>> -                       uint32_t reserved11:9;
>>>>>> -               } bitfields6;
>>>>>> -               uint32_t ordinal6;
>>>>>> +                       uint32_t num_gws:6;
>>>>>> +                       uint32_t reserved6:2;
>>>>>> +                       uint32_t num_oac:4;
>>>>>> +                       uint32_t reserved7:4;
>>>>>> +                       uint32_t gds_size:6;
>>>>>> +                       uint32_t num_queues:10;
>>>>>> +               } bitfields14;
>>>>>> +               uint32_t ordinal14;
>>>>>>          };
>>>>>>
>>>>>> +       uint32_t completion_signal_lo32;
>>>>>> +       uint32_t completion_signal_hi32;
>>>>>>   };
>>>>>>   #endif
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>>>> index c4eda6f..7c8d9b3 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
>>>>>> @@ -126,9 +126,10 @@ struct pm4_mes_runlist {
>>>>>>                          uint32_t ib_size:20;
>>>>>>                          uint32_t chain:1;
>>>>>>                          uint32_t offload_polling:1;
>>>>>> -                       uint32_t reserved3:1;
>>>>>> +                       uint32_t reserved2:1;
>>>>>>                          uint32_t valid:1;
>>>>>> -                       uint32_t reserved4:8;
>>>>>> +                       uint32_t process_cnt:4;
>>>>>> +                       uint32_t reserved3:4;
>>>>>>                  } bitfields4;
>>>>>>                  uint32_t ordinal4;
>>>>>>          };
>>>>>> @@ -143,8 +144,8 @@ struct pm4_mes_runlist {
>>>>>>
>>>>>>   struct pm4_mes_map_process {
>>>>>>          union {
>>>>>> -               union PM4_MES_TYPE_3_HEADER   header;            /* header */
>>>>>> -               uint32_t            ordinal1;
>>>>>> +               union PM4_MES_TYPE_3_HEADER header;     /* header */
>>>>>> +               uint32_t ordinal1;
>>>>>>          };
>>>>>>
>>>>>>          union {
>>>>>> @@ -155,36 +156,48 @@ struct pm4_mes_map_process {
>>>>>>                          uint32_t process_quantum:7;
>>>>>>                  } bitfields2;
>>>>>>                  uint32_t ordinal2;
>>>>>> -};
>>>>>> +       };
>>>>>>
>>>>>>          union {
>>>>>>                  struct {
>>>>>>                          uint32_t page_table_base:28;
>>>>>> -                       uint32_t reserved2:4;
>>>>>> +                       uint32_t reserved3:4;
>>>>>>                  } bitfields3;
>>>>>>                  uint32_t ordinal3;
>>>>>>          };
>>>>>>
>>>>>> +       uint32_t reserved;
>>>>>> +
>>>>>>          uint32_t sh_mem_bases;
>>>>>> +       uint32_t sh_mem_config;
>>>>>>          uint32_t sh_mem_ape1_base;
>>>>>>          uint32_t sh_mem_ape1_limit;
>>>>>> -       uint32_t sh_mem_config;
>>>>>> +
>>>>>> +       uint32_t sh_hidden_private_base_vmid;
>>>>>> +
>>>>>> +       uint32_t reserved2;
>>>>>> +       uint32_t reserved3;
>>>>>> +
>>>>>>          uint32_t gds_addr_lo;
>>>>>>          uint32_t gds_addr_hi;
>>>>>>
>>>>>>          union {
>>>>>>                  struct {
>>>>>>                          uint32_t num_gws:6;
>>>>>> -                       uint32_t reserved3:2;
>>>>>> +                       uint32_t reserved4:2;
>>>>>>                          uint32_t num_oac:4;
>>>>>> -                       uint32_t reserved4:4;
>>>>>> +                       uint32_t reserved5:4;
>>>>>>                          uint32_t gds_size:6;
>>>>>>                          uint32_t num_queues:10;
>>>>>>                  } bitfields10;
>>>>>>                  uint32_t ordinal10;
>>>>>>          };
>>>>>>
>>>>>> +       uint32_t completion_signal_lo;
>>>>>> +       uint32_t completion_signal_hi;
>>>>>> +
>>>>>>   };
>>>>>> +
>>>>>>   #endif
>>>>>>
>>>>>>   /*--------------------MES_MAP_QUEUES--------------------*/
>>>>>> @@ -337,7 +350,7 @@ enum mes_unmap_queues_engine_sel_enum {
>>>>>>          engine_sel__mes_unmap_queues__sdmal = 3
>>>>>>   };
>>>>>>
>>>>>> -struct PM4_MES_UNMAP_QUEUES {
>>>>>> +struct pm4_mes_unmap_queues {
>>>>>>          union {
>>>>>>                  union PM4_MES_TYPE_3_HEADER   header;            /* header */
>>>>>>                  uint32_t            ordinal1;
>>>>>> @@ -397,4 +410,101 @@ struct PM4_MES_UNMAP_QUEUES {
>>>>>>   };
>>>>>>   #endif
>>>>>>
>>>>>> +#ifndef PM4_MEC_RELEASE_MEM_DEFINED
>>>>>> +#define PM4_MEC_RELEASE_MEM_DEFINED
>>>>>> +enum RELEASE_MEM_event_index_enum {
>>>>>> +       event_index___release_mem__end_of_pipe = 5,
>>>>>> +       event_index___release_mem__shader_done = 6
>>>>>> +};
>>>>>> +
>>>>>> +enum RELEASE_MEM_cache_policy_enum {
>>>>>> +       cache_policy___release_mem__lru = 0,
>>>>>> +       cache_policy___release_mem__stream = 1,
>>>>>> +       cache_policy___release_mem__bypass = 2
>>>>>> +};
>>>>>> +
>>>>>> +enum RELEASE_MEM_dst_sel_enum {
>>>>>> +       dst_sel___release_mem__memory_controller = 0,
>>>>>> +       dst_sel___release_mem__tc_l2 = 1,
>>>>>> +       dst_sel___release_mem__queue_write_pointer_register = 2,
>>>>>> +       dst_sel___release_mem__queue_write_pointer_poll_mask_bit = 3
>>>>>> +};
>>>>>> +
>>>>>> +enum RELEASE_MEM_int_sel_enum {
>>>>>> +       int_sel___release_mem__none = 0,
>>>>>> +       int_sel___release_mem__send_interrupt_only = 1,
>>>>>> +       int_sel___release_mem__send_interrupt_after_write_confirm = 2,
>>>>>> +       int_sel___release_mem__send_data_after_write_confirm = 3
>>>>>> +};
>>>>>> +
>>>>>> +enum RELEASE_MEM_data_sel_enum {
>>>>>> +       data_sel___release_mem__none = 0,
>>>>>> +       data_sel___release_mem__send_32_bit_low = 1,
>>>>>> +       data_sel___release_mem__send_64_bit_data = 2,
>>>>>> +       data_sel___release_mem__send_gpu_clock_counter = 3,
>>>>>> +       data_sel___release_mem__send_cp_perfcounter_hi_lo = 4,
>>>>>> +       data_sel___release_mem__store_gds_data_to_memory = 5
>>>>>> +};
>>>>>> +
>>>>>> +struct pm4_mec_release_mem {
>>>>>> +       union {
>>>>>> +               union PM4_MES_TYPE_3_HEADER header;     /*header */
>>>>>> +               unsigned int ordinal1;
>>>>>> +       };
>>>>>> +
>>>>>> +       union {
>>>>>> +               struct {
>>>>>> +                       unsigned int event_type:6;
>>>>>> +                       unsigned int reserved1:2;
>>>>>> +                       enum RELEASE_MEM_event_index_enum event_index:4;
>>>>>> +                       unsigned int tcl1_vol_action_ena:1;
>>>>>> +                       unsigned int tc_vol_action_ena:1;
>>>>>> +                       unsigned int reserved2:1;
>>>>>> +                       unsigned int tc_wb_action_ena:1;
>>>>>> +                       unsigned int tcl1_action_ena:1;
>>>>>> +                       unsigned int tc_action_ena:1;
>>>>>> +                       unsigned int reserved3:6;
>>>>>> +                       unsigned int atc:1;
>>>>>> +                       enum RELEASE_MEM_cache_policy_enum cache_policy:2;
>>>>>> +                       unsigned int reserved4:5;
>>>>>> +               } bitfields2;
>>>>>> +               unsigned int ordinal2;
>>>>>> +       };
>>>>>> +
>>>>>> +       union {
>>>>>> +               struct {
>>>>>> +                       unsigned int reserved5:16;
>>>>>> +                       enum RELEASE_MEM_dst_sel_enum dst_sel:2;
>>>>>> +                       unsigned int reserved6:6;
>>>>>> +                       enum RELEASE_MEM_int_sel_enum int_sel:3;
>>>>>> +                       unsigned int reserved7:2;
>>>>>> +                       enum RELEASE_MEM_data_sel_enum data_sel:3;
>>>>>> +               } bitfields3;
>>>>>> +               unsigned int ordinal3;
>>>>>> +       };
>>>>>> +
>>>>>> +       union {
>>>>>> +               struct {
>>>>>> +                       unsigned int reserved8:2;
>>>>>> +                       unsigned int address_lo_32b:30;
>>>>>> +               } bitfields4;
>>>>>> +               struct {
>>>>>> +                       unsigned int reserved9:3;
>>>>>> +                       unsigned int address_lo_64b:29;
>>>>>> +               } bitfields5;
>>>>>> +               unsigned int ordinal4;
>>>>>> +       };
>>>>>> +
>>>>>> +       unsigned int address_hi;
>>>>>> +
>>>>>> +       unsigned int data_lo;
>>>>>> +
>>>>>> +       unsigned int data_hi;
>>>>>> +};
>>>>>> +#endif
>>>>>> +
>>>>>> +enum {
>>>>>> +       CACHE_FLUSH_AND_INV_TS_EVENT = 0x00000014
>>>>>> +};
>>>>>> +
>>>>>>   #endif
>>>>>> -- 
>>>>>> 2.7.4
>>>>>>
>
>


* Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
       [not found]                             ` <BN6PR12MB16521615E13EE720D7D5FE51F7820-/b2+HYfkarQqUD6E6FAiowdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2017-08-16 22:33                               ` Felix Kuehling
       [not found]                                 ` <0a2aee42-c1ad-c356-dbdd-812633e570bf-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 70+ messages in thread
From: Felix Kuehling @ 2017-08-16 22:33 UTC (permalink / raw)
  To: Deucher, Alexander, Oded Gabbay, Bridgman, John; +Cc: amd-gfx list

On 2017-08-16 12:10 PM, Deucher, Alexander wrote:
>> -----Original Message-----
>> From: Kuehling, Felix
>> Sent: Tuesday, August 15, 2017 9:20 PM
>> To: Oded Gabbay; Bridgman, John; Deucher, Alexander
>> Cc: amd-gfx list
>> Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
>>
>> Hi Alex,
>>
>> How does firmware get published for the upstream driver? Where can I
>> check the currently published version of both CZ and KV firmware for
>> upstream?
>>
>> Do you publish firmware updates at the same time as patches that depend
>> on them?
> I submit patches to the linux-firmware tree periodically.  Just let me know what firmwares you want to update and I can submit patches.

I noticed that linux-firmware currently doesn't have firmware for CIK
and SI in the amdgpu folder. Should we add at least CIK if we expect
people to start using amdgpu on CIK hardware?

Regards,
  Felix

>
> Alex
>


* RE: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
       [not found]                                 ` <0a2aee42-c1ad-c356-dbdd-812633e570bf-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-16 22:36                                   ` Deucher, Alexander
  0 siblings, 0 replies; 70+ messages in thread
From: Deucher, Alexander @ 2017-08-16 22:36 UTC (permalink / raw)
  To: Kuehling, Felix, Oded Gabbay, Bridgman, John; +Cc: amd-gfx list

> -----Original Message-----
> From: Kuehling, Felix
> Sent: Wednesday, August 16, 2017 6:34 PM
> To: Deucher, Alexander; Oded Gabbay; Bridgman, John
> Cc: amd-gfx list
> Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
> 
> On 2017-08-16 12:10 PM, Deucher, Alexander wrote:
> >> -----Original Message-----
> >> From: Kuehling, Felix
> >> Sent: Tuesday, August 15, 2017 9:20 PM
> >> To: Oded Gabbay; Bridgman, John; Deucher, Alexander
> >> Cc: amd-gfx list
> >> Subject: Re: [PATCH 16/19] drm/amdkfd: Update PM4 packet headers
> >>
> >> Hi Alex,
> >>
> >> How does firmware get published for the upstream driver? Where can I
> >> check the currently published version of both CZ and KV firmware for
> >> upstream?
> >>
> >> Do you publish firmware updates at the same time as patches that
> depend
> >> on them?
> > I submit patches to the linux-firmware tree periodically.  Just let me know
> what firmwares you want to update and I can submit patches.
> 
> I noticed that linux-firmware currently doesn't have firmware for CIK
> and SI in the amdgpu folder. Should we add at least CIK if we expect
> people to start using amdgpu on CIK hardware?

Right now amdgpu uses the same firmware files as radeon.  We could switch to the amdgpu folder if we ever want amdgpu and radeon to ship different firmware; that just means patching the driver to request the other firmware path and adding the new files to linux-firmware.  It's something we've considered for a while, but there has never been enough need to justify doing it.
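
For reference, the path switch described above mostly comes down to the
prefix passed to request_firmware().  A rough sketch of that pattern
(illustrative only: the example_load_mec_fw() helper, the chip_name
parameter and the "_mec.bin" naming are assumptions here, not the actual
amdgpu code):

#include <linux/device.h>
#include <linux/firmware.h>
#include <linux/kernel.h>

/* Build "radeon/<chip>_mec.bin" or "amdgpu/<chip>_mec.bin" and load it. */
static int example_load_mec_fw(struct device *dev, const char *chip_name,
			       bool use_amdgpu_dir,
			       const struct firmware **fw)
{
	char fw_name[40];

	snprintf(fw_name, sizeof(fw_name), "%s/%s_mec.bin",
		 use_amdgpu_dir ? "amdgpu" : "radeon", chip_name);

	return request_firmware(fw, fw_name, dev);
}

The caller pairs this with release_firmware() once the blob has been
consumed, so shipping CIK firmware under amdgpu/ would then mostly mean
adding the files to linux-firmware plus the matching MODULE_FIRMWARE()
lines in the driver.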

Alex

> 
> Regards,
>   Felix
> 
> >
> > Alex
> >


end of thread, other threads:[~2017-08-16 22:36 UTC | newest]

Thread overview: 70+ messages (download: mbox.gz / follow: Atom feed)
2017-08-11 21:56 [PATCH 00/19] KFD fixes and cleanups Felix Kuehling
     [not found] ` <1502488589-30272-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-11 21:56   ` [PATCH 01/19] drm/amdkfd: Fix double Mutex lock order Felix Kuehling
     [not found]     ` <1502488589-30272-2-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 12:23       ` Oded Gabbay
     [not found]         ` <CAFCwf12A9Qr-HCyQFR2eDN_TEExzxHEBVK9XQ9_xuwPKErHg3w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-12 19:28           ` Kuehling, Felix
2017-08-11 21:56   ` [PATCH 02/19] drm/amdkfd: Fix typo in dbgdev_wave_reset_wavefronts Felix Kuehling
     [not found]     ` <1502488589-30272-3-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 12:29       ` Oded Gabbay
2017-08-11 21:56   ` [PATCH 03/19] drm/amdkfd: Remove bogus divide-by-sizeof(uint32_t) Felix Kuehling
     [not found]     ` <1502488589-30272-4-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 12:37       ` Oded Gabbay
     [not found]         ` <CAFCwf11Bg41FNg2sChh6EZkczb6quSzxdFXoJ-qhoE8JwqgGJw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-12 18:02           ` Kuehling, Felix
     [not found]             ` <DM5PR1201MB02356DFA5A4EAE4CC747F12F928E0-grEf7a3NxMBd8L2jMOIKKmrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2017-08-12 20:00               ` Oded Gabbay
2017-08-11 21:56   ` [PATCH 04/19] drm/amdkfd: Fix allocated_queues bitmap initialization Felix Kuehling
     [not found]     ` <1502488589-30272-5-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 12:45       ` Oded Gabbay
2017-08-11 21:56   ` [PATCH 05/19] drm/amdkfd: Clean up KFD style errors and warnings Felix Kuehling
     [not found]     ` <1502488589-30272-6-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 12:46       ` Oded Gabbay
     [not found]         ` <CAFCwf13wgCm9qPk4XX5DNOx0toPmZxogpxt5zBvEKzcMd2x3tw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-12 12:58           ` Oded Gabbay
2017-08-11 21:56   ` [PATCH 06/19] drm/amdkfd: Consolidate and clean up log commands Felix Kuehling
     [not found]     ` <1502488589-30272-7-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 13:01       ` Oded Gabbay
2017-08-11 21:56   ` [PATCH 07/19] drm/amdkfd: Change x==NULL/false references to !x Felix Kuehling
     [not found]     ` <1502488589-30272-8-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 13:07       ` Oded Gabbay
2017-08-11 21:56   ` [PATCH 08/19] drm/amdkfd: Fix goto usage Felix Kuehling
     [not found]     ` <1502488589-30272-9-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 13:21       ` Oded Gabbay
2017-08-11 21:56   ` [PATCH 09/19] drm/amdkfd: Remove usage of alloc(sizeof(struct Felix Kuehling
     [not found]     ` <1502488589-30272-10-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 13:23       ` Oded Gabbay
2017-08-11 21:56   ` [PATCH 10/19] drm/amdkfd: Remove BUG_ONs for NULL pointer arguments Felix Kuehling
     [not found]     ` <1502488589-30272-11-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 14:19       ` Oded Gabbay
2017-08-11 21:56   ` [PATCH 11/19] drm/amdkfd: Fix doorbell initialization and finalization Felix Kuehling
     [not found]     ` <1502488589-30272-12-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 14:21       ` Oded Gabbay
2017-08-11 21:56   ` [PATCH 12/19] drm/amdkfd: Allocate gtt_sa_bitmap in long units Felix Kuehling
     [not found]     ` <1502488589-30272-13-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 14:26       ` Oded Gabbay
2017-08-11 21:56   ` [PATCH 13/19] drm/amdkfd: Handle remaining BUG_ONs more gracefully Felix Kuehling
     [not found]     ` <1502488589-30272-14-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 14:39       ` Oded Gabbay
     [not found]         ` <CAFCwf129vQLO5owYBQt6S-V5WcxSrO7tu+v62-HhH2eOzATS1A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-12 18:37           ` Kuehling, Felix
     [not found]             ` <DM5PR1201MB0235FD5550E28485BCF31EB1928E0-grEf7a3NxMBd8L2jMOIKKmrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2017-08-13  8:48               ` Oded Gabbay
2017-08-11 21:56   ` [PATCH 14/19] drm/amdkfd: Add more error printing to help bringup Felix Kuehling
     [not found]     ` <1502488589-30272-15-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 14:54       ` Oded Gabbay
     [not found]         ` <CAFCwf12LtX8Me-DSVvnf72eZr=UQm6sWnBoSuB2DM8jbqk3nOQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-15  3:50           ` Zhao, Yong
     [not found]             ` <CY1PR1201MB109725CE39D52B9E482824FAF08D0-JBJ/M6OpXY/YBI+VM8qCl2rFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2017-08-15 18:39               ` Felix Kuehling
2017-08-16  1:14               ` Felix Kuehling
2017-08-11 21:56   ` [PATCH 15/19] drm/amdkfd: Clamp EOP queue size correctly on Gfx8 Felix Kuehling
     [not found]     ` <1502488589-30272-16-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 15:04       ` Oded Gabbay
2017-08-11 21:56   ` [PATCH 16/19] drm/amdkfd: Update PM4 packet headers Felix Kuehling
     [not found]     ` <1502488589-30272-17-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-12 15:10       ` Oded Gabbay
     [not found]         ` <CAFCwf12mAxpYF0-AWA=4hJiEe093KkakUYO28-+VxV=Uo+X4Tw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-12 18:16           ` Kuehling, Felix
     [not found]             ` <DM5PR1201MB023536CBDE40370D9703EBD6928E0-grEf7a3NxMBd8L2jMOIKKmrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2017-08-12 19:09               ` Bridgman, John
     [not found]                 ` <BN6PR12MB13489181F8A90932135C1CBBE88E0-/b2+HYfkarQX0pEhCR5T8QdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2017-08-13  8:49                   ` Oded Gabbay
     [not found]                     ` <CAFCwf12edu4VXLP8UTTJk+x9uu9D1bkgO23FpiJbuz3BEreYzg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-16  1:19                       ` Felix Kuehling
     [not found]                         ` <6137eb72-cb41-d65a-7863-71adf31a3506-5C7GfCeVMHo@public.gmane.org>
2017-08-16  7:37                           ` Christian König
     [not found]                             ` <4a8512a4-21df-cf48-4500-0424b08cd357-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2017-08-16  9:54                               ` Oded Gabbay
2017-08-16 22:31                               ` Felix Kuehling
2017-08-16 16:10                           ` Deucher, Alexander
     [not found]                             ` <BN6PR12MB16521615E13EE720D7D5FE51F7820-/b2+HYfkarQqUD6E6FAiowdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2017-08-16 22:33                               ` Felix Kuehling
     [not found]                                 ` <0a2aee42-c1ad-c356-dbdd-812633e570bf-5C7GfCeVMHo@public.gmane.org>
2017-08-16 22:36                                   ` Deucher, Alexander
2017-08-11 21:56   ` [PATCH 17/19] drm/amdgpu: Remove hard-coded assumptions about compute pipes Felix Kuehling
     [not found]     ` <1502488589-30272-18-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-13  8:29       ` Oded Gabbay
     [not found]         ` <CAFCwf108X+f6+jehRvykPy0NPCnYa6uHjoVAXWDvoN+35h-N5A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-14 14:53           ` Felix Kuehling
     [not found]             ` <0eec4793-c9fa-4dfb-c8d7-41995d20e738-5C7GfCeVMHo@public.gmane.org>
2017-08-14 15:06               ` Oded Gabbay
2017-08-11 21:56   ` [PATCH 18/19] drm/amdgpu: Disable GFX PG on CZ Felix Kuehling
     [not found]     ` <1502488589-30272-19-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-11 23:45       ` StDenis, Tom
     [not found]         ` <DM5PR1201MB0074CE2415F056F739B6A8B7F7890-grEf7a3NxMAwZliakWjjqGrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2017-08-11 23:56           ` Felix Kuehling
     [not found]             ` <b5039bc6-5be9-d8b3-c995-082a60b40490-5C7GfCeVMHo@public.gmane.org>
2017-08-12  0:08               ` StDenis, Tom
     [not found]                 ` <DM5PR1201MB0074364796DD927ADB4746A7F78E0-grEf7a3NxMAwZliakWjjqGrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2017-08-12  0:40                   ` Felix Kuehling
     [not found]                     ` <1d263ee5-e1e3-aa3a-a8aa-c1fbfbcbb8ab-5C7GfCeVMHo@public.gmane.org>
2017-08-12  0:54                       ` StDenis, Tom
     [not found]                         ` <DM5PR1201MB00745120AE898231F7BAD8DFF78E0-grEf7a3NxMAwZliakWjjqGrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2017-08-12  1:18                           ` Felix Kuehling
2017-08-14 15:28                       ` Deucher, Alexander
2017-08-11 21:56   ` [PATCH 19/19] drm/amd: Update MEC HQD loading code for KFD Felix Kuehling
     [not found]     ` <1502488589-30272-20-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-13  8:33       ` Oded Gabbay
2017-08-12 12:28   ` [PATCH 00/19] KFD fixes and cleanups Oded Gabbay
     [not found]     ` <CAFCwf10L0sMCWnPxOi=zLuXLor4X90--m-a6UnervTmEguGL9g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-12 18:07       ` Kuehling, Felix
     [not found]         ` <DM5PR1201MB0235D9AA9FAEF6F2635942C8928E0-grEf7a3NxMBd8L2jMOIKKmrFom/aUZj6nBOFsp37pqbUKgpGm//BTAC/G2K4zDHf@public.gmane.org>
2017-08-14 16:25           ` Deucher, Alexander
     [not found]             ` <CY4PR12MB1653ACDE8226C3E06245F8EEF78C0-rpdhrqHFk06apTa93KjAaQdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2017-08-14 22:23               ` Felix Kuehling
