* [PATCH 0/6] Raven support for KFD
@ 2018-07-12 21:24 Felix Kuehling
       [not found] ` <1531430694-23966-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Felix Kuehling @ 2018-07-12 21:24 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Felix Kuehling

Raven refers to Ryzen APUs with integrated GFXv9 GPU.
This patch series completes Raven support for KFD:

* fix up memory banks info from CRAT
* support different number of SDMA engines
* workaround IOMMUv2 PPR issues
* add device info

Yong Zhao (6):
  drm/amdkfd: Consolidate duplicate memory banks info in topology
  drm/amdkfd: Make SDMA engine number an ASIC-dependent variable
  drm/amdkfd: Avoid flooding dmesg on Raven due to IOMMU issues
  drm/amdkfd: Workaround to accommodate Raven too many PPR issue
  drm/amdkfd: Optimize out some duplicated code in
    kfd_signal_iommu_event()
  drm/amdkfd: Enable Raven for KFD

 drivers/gpu/drm/amd/amdkfd/kfd_crat.c              | 57 +++++++++++++++++-----
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 28 +++++++++++
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 29 +++++++----
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |  6 +--
 drivers/gpu/drm/amd/amdkfd/kfd_events.c            | 47 ++++++++++--------
 drivers/gpu/drm/amd/amdkfd/kfd_iommu.c             |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  1 +
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  3 +-
 8 files changed, 126 insertions(+), 47 deletions(-)

-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

* [PATCH 1/6] drm/amdkfd: Consolidate duplicate memory banks info in topology
       [not found] ` <1531430694-23966-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-07-12 21:24   ` Felix Kuehling
       [not found]     ` <1531430694-23966-2-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-07-12 21:24   ` [PATCH 2/6] drm/amdkfd: Make SDMA engine number an ASIC-dependent variable Felix Kuehling
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Felix Kuehling @ 2018-07-12 21:24 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Yong Zhao, Felix Kuehling

From: Yong Zhao <yong.zhao@amd.com>

If several memory banks in CRAT have the same properties, we
aggregate them into one memory bank. This cleans up the memory banks on
APUs (e.g. Raven), where CRAT reports each memory channel as a
separate bank. That would only confuse user mode, which deals
exclusively with virtual memory.
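The aggregation idea can be sketched as standalone code. This is a minimal illustration, not the driver code: `struct mem_bank` and the array-based storage below are simplified stand-ins for the kernel's `struct kfd_mem_properties` and its linked list, and the function names are ours.

```c
#include <stdint.h>

/* Simplified stand-in for struct kfd_mem_properties: only the
 * fields the aggregation logic compares and accumulates. */
struct mem_bank {
	uint32_t heap_type;
	uint32_t flags;
	uint32_t width;
	uint64_t size_in_bytes;
};

/* Return the index of an existing bank with identical properties,
 * or -1 if none matches (same idea as find_subtype_mem(), but over
 * an array instead of a kernel linked list). */
static int find_matching_bank(const struct mem_bank *banks, int count,
			      uint32_t heap_type, uint32_t flags,
			      uint32_t width)
{
	for (int i = 0; i < count; i++) {
		if (banks[i].heap_type == heap_type &&
		    banks[i].flags == flags &&
		    banks[i].width == width)
			return i;
	}
	return -1;
}

/* Add a bank, folding it into an existing entry when all properties
 * match (only the sizes are summed). Returns the resulting number of
 * distinct banks. */
static int add_bank(struct mem_bank *banks, int count,
		    struct mem_bank new_bank)
{
	int i = find_matching_bank(banks, count, new_bank.heap_type,
				   new_bank.flags, new_bank.width);
	if (i >= 0) {
		banks[i].size_in_bytes += new_bank.size_in_bytes;
		return count;
	}
	banks[count] = new_bank;
	return count + 1;
}
```

With this model, two Raven memory channels with identical heap type, flags and width collapse into one bank whose size is the sum of both, which is what user mode expects to see.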

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 57 ++++++++++++++++++++++++++++-------
 1 file changed, 46 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
index 296b3f2..ee49960 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
@@ -189,6 +189,21 @@ static int kfd_parse_subtype_cu(struct crat_subtype_computeunit *cu,
 	return 0;
 }
 
+static struct kfd_mem_properties *
+find_subtype_mem(uint32_t heap_type, uint32_t flags, uint32_t width,
+		struct kfd_topology_device *dev)
+{
+	struct kfd_mem_properties *props;
+
+	list_for_each_entry(props, &dev->mem_props, list) {
+		if (props->heap_type == heap_type
+				&& props->flags == flags
+				&& props->width == width)
+			return props;
+	}
+
+	return NULL;
+}
 /* kfd_parse_subtype_mem - parse memory subtypes and attach it to correct
  * topology device present in the device_list
  */
@@ -197,36 +212,56 @@ static int kfd_parse_subtype_mem(struct crat_subtype_memory *mem,
 {
 	struct kfd_mem_properties *props;
 	struct kfd_topology_device *dev;
+	uint32_t heap_type;
+	uint64_t size_in_bytes;
+	uint32_t flags = 0;
+	uint32_t width;
 
 	pr_debug("Found memory entry in CRAT table with proximity_domain=%d\n",
 			mem->proximity_domain);
 	list_for_each_entry(dev, device_list, list) {
 		if (mem->proximity_domain == dev->proximity_domain) {
-			props = kfd_alloc_struct(props);
-			if (!props)
-				return -ENOMEM;
-
 			/* We're on GPU node */
 			if (dev->node_props.cpu_cores_count == 0) {
 				/* APU */
 				if (mem->visibility_type == 0)
-					props->heap_type =
+					heap_type =
 						HSA_MEM_HEAP_TYPE_FB_PRIVATE;
 				/* dGPU */
 				else
-					props->heap_type = mem->visibility_type;
+					heap_type = mem->visibility_type;
 			} else
-				props->heap_type = HSA_MEM_HEAP_TYPE_SYSTEM;
+				heap_type = HSA_MEM_HEAP_TYPE_SYSTEM;
 
 			if (mem->flags & CRAT_MEM_FLAGS_HOT_PLUGGABLE)
-				props->flags |= HSA_MEM_FLAGS_HOT_PLUGGABLE;
+				flags |= HSA_MEM_FLAGS_HOT_PLUGGABLE;
 			if (mem->flags & CRAT_MEM_FLAGS_NON_VOLATILE)
-				props->flags |= HSA_MEM_FLAGS_NON_VOLATILE;
+				flags |= HSA_MEM_FLAGS_NON_VOLATILE;
 
-			props->size_in_bytes =
+			size_in_bytes =
 				((uint64_t)mem->length_high << 32) +
 							mem->length_low;
-			props->width = mem->width;
+			width = mem->width;
+
+			/* Multiple banks of the same type are aggregated into
+			 * one. User mode doesn't care about multiple physical
+			 * memory segments. It's managed as a single virtual
+			 * heap for user mode.
+			 */
+			props = find_subtype_mem(heap_type, flags, width, dev);
+			if (props) {
+				props->size_in_bytes += size_in_bytes;
+				break;
+			}
+
+			props = kfd_alloc_struct(props);
+			if (!props)
+				return -ENOMEM;
+
+			props->heap_type = heap_type;
+			props->flags = flags;
+			props->size_in_bytes = size_in_bytes;
+			props->width = width;
 
 			dev->node_props.mem_banks_count++;
 			list_add_tail(&props->list, &dev->mem_props);
-- 
2.7.4

* [PATCH 2/6] drm/amdkfd: Make SDMA engine number an ASIC-dependent variable
       [not found] ` <1531430694-23966-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-07-12 21:24   ` [PATCH 1/6] drm/amdkfd: Consolidate duplicate memory banks info in topology Felix Kuehling
@ 2018-07-12 21:24   ` Felix Kuehling
       [not found]     ` <1531430694-23966-3-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-07-12 21:24   ` [PATCH 3/6] drm/amdkfd: Avoid flooding dmesg on Raven due to IOMMU issues Felix Kuehling
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Felix Kuehling @ 2018-07-12 21:24 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Yong Zhao, Felix Kuehling

From: Yong Zhao <yong.zhao@amd.com>

On Raven there is only one SDMA engine instead of the previously assumed
two, so we need to adapt the code to this new scenario.
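The mapping from a flat sdma_id to an (engine, per-engine queue) pair can be sketched as standalone code. This is an illustrative model with our own function names; it mirrors the get_num_sdma_queues(), modulo and divide logic in the diff below under the assumption of two queues per engine.

```c
#include <stdint.h>

#define KFD_SDMA_QUEUES_PER_ENGINE 2

/* Total SDMA queues for a device with a given engine count
 * (same idea as get_num_sdma_queues()). */
static unsigned int num_sdma_queues(unsigned int num_engines)
{
	return num_engines * KFD_SDMA_QUEUES_PER_ENGINE;
}

/* A flat sdma_id is distributed round-robin across engines:
 * the engine index cycles through the engines, and the per-engine
 * queue index advances once per full cycle. */
static unsigned int sdma_engine_id(unsigned int sdma_id,
				   unsigned int num_engines)
{
	return sdma_id % num_engines;
}

static unsigned int sdma_queue_id(unsigned int sdma_id,
				  unsigned int num_engines)
{
	return sdma_id / num_engines;
}

/* Free-queue bitmap with one set bit per available SDMA queue
 * (same shape as the dqm->sdma_bitmap initialization). */
static unsigned int sdma_bitmap_init(unsigned int num_engines)
{
	return (1u << num_sdma_queues(num_engines)) - 1;
}
```

With one engine (Raven) the bitmap has 2 bits and every sdma_id lands on engine 0; with two engines it has 4 bits and odd ids land on engine 1, which is why the constants had to become per-ASIC variables.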

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 12 +++++++++
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 29 +++++++++++++++-------
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |  6 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  1 +
 .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  3 +--
 5 files changed, 36 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 8faa8db..572235c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -52,6 +52,7 @@ static const struct kfd_device_info kaveri_device_info = {
 	.supports_cwsr = false,
 	.needs_iommu_device = true,
 	.needs_pci_atomics = false,
+	.num_sdma_engines = 2,
 };
 
 static const struct kfd_device_info carrizo_device_info = {
@@ -67,6 +68,7 @@ static const struct kfd_device_info carrizo_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = true,
 	.needs_pci_atomics = false,
+	.num_sdma_engines = 2,
 };
 #endif
 
@@ -83,6 +85,7 @@ static const struct kfd_device_info hawaii_device_info = {
 	.supports_cwsr = false,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = false,
+	.num_sdma_engines = 2,
 };
 
 static const struct kfd_device_info tonga_device_info = {
@@ -97,6 +100,7 @@ static const struct kfd_device_info tonga_device_info = {
 	.supports_cwsr = false,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = true,
+	.num_sdma_engines = 2,
 };
 
 static const struct kfd_device_info tonga_vf_device_info = {
@@ -111,6 +115,7 @@ static const struct kfd_device_info tonga_vf_device_info = {
 	.supports_cwsr = false,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = false,
+	.num_sdma_engines = 2,
 };
 
 static const struct kfd_device_info fiji_device_info = {
@@ -125,6 +130,7 @@ static const struct kfd_device_info fiji_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = true,
+	.num_sdma_engines = 2,
 };
 
 static const struct kfd_device_info fiji_vf_device_info = {
@@ -139,6 +145,7 @@ static const struct kfd_device_info fiji_vf_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = false,
+	.num_sdma_engines = 2,
 };
 
 
@@ -154,6 +161,7 @@ static const struct kfd_device_info polaris10_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = true,
+	.num_sdma_engines = 2,
 };
 
 static const struct kfd_device_info polaris10_vf_device_info = {
@@ -168,6 +176,7 @@ static const struct kfd_device_info polaris10_vf_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = false,
+	.num_sdma_engines = 2,
 };
 
 static const struct kfd_device_info polaris11_device_info = {
@@ -182,6 +191,7 @@ static const struct kfd_device_info polaris11_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = true,
+	.num_sdma_engines = 2,
 };
 
 static const struct kfd_device_info vega10_device_info = {
@@ -196,6 +206,7 @@ static const struct kfd_device_info vega10_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = false,
+	.num_sdma_engines = 2,
 };
 
 static const struct kfd_device_info vega10_vf_device_info = {
@@ -210,6 +221,7 @@ static const struct kfd_device_info vega10_vf_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = false,
+	.num_sdma_engines = 2,
 };
 
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 97c9f10..ace94d6 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -101,6 +101,17 @@ unsigned int get_pipes_per_mec(struct device_queue_manager *dqm)
 	return dqm->dev->shared_resources.num_pipe_per_mec;
 }
 
+static unsigned int get_num_sdma_engines(struct device_queue_manager *dqm)
+{
+	return dqm->dev->device_info->num_sdma_engines;
+}
+
+unsigned int get_num_sdma_queues(struct device_queue_manager *dqm)
+{
+	return dqm->dev->device_info->num_sdma_engines
+			* KFD_SDMA_QUEUES_PER_ENGINE;
+}
+
 void program_sh_mem_settings(struct device_queue_manager *dqm,
 					struct qcm_process_device *qpd)
 {
@@ -855,7 +866,7 @@ static int initialize_nocpsch(struct device_queue_manager *dqm)
 	}
 
 	dqm->vmid_bitmap = (1 << dqm->dev->vm_info.vmid_num_kfd) - 1;
-	dqm->sdma_bitmap = (1 << CIK_SDMA_QUEUES) - 1;
+	dqm->sdma_bitmap = (1 << get_num_sdma_queues(dqm)) - 1;
 
 	return 0;
 }
@@ -903,7 +914,7 @@ static int allocate_sdma_queue(struct device_queue_manager *dqm,
 static void deallocate_sdma_queue(struct device_queue_manager *dqm,
 				unsigned int sdma_queue_id)
 {
-	if (sdma_queue_id >= CIK_SDMA_QUEUES)
+	if (sdma_queue_id >= get_num_sdma_queues(dqm))
 		return;
 	dqm->sdma_bitmap |= (1 << sdma_queue_id);
 }
@@ -923,8 +934,8 @@ static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
 	if (retval)
 		return retval;
 
-	q->properties.sdma_queue_id = q->sdma_id / CIK_SDMA_QUEUES_PER_ENGINE;
-	q->properties.sdma_engine_id = q->sdma_id % CIK_SDMA_QUEUES_PER_ENGINE;
+	q->properties.sdma_queue_id = q->sdma_id / get_num_sdma_engines(dqm);
+	q->properties.sdma_engine_id = q->sdma_id % get_num_sdma_engines(dqm);
 
 	retval = allocate_doorbell(qpd, q);
 	if (retval)
@@ -1011,7 +1022,7 @@ static int initialize_cpsch(struct device_queue_manager *dqm)
 	dqm->queue_count = dqm->processes_count = 0;
 	dqm->sdma_queue_count = 0;
 	dqm->active_runlist = false;
-	dqm->sdma_bitmap = (1 << CIK_SDMA_QUEUES) - 1;
+	dqm->sdma_bitmap = (1 << get_num_sdma_queues(dqm)) - 1;
 
 	INIT_WORK(&dqm->hw_exception_work, kfd_process_hw_exception);
 
@@ -1142,9 +1153,9 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 		if (retval)
 			goto out_unlock;
 		q->properties.sdma_queue_id =
-			q->sdma_id / CIK_SDMA_QUEUES_PER_ENGINE;
+			q->sdma_id / get_num_sdma_engines(dqm);
 		q->properties.sdma_engine_id =
-			q->sdma_id % CIK_SDMA_QUEUES_PER_ENGINE;
+			q->sdma_id % get_num_sdma_engines(dqm);
 	}
 
 	retval = allocate_doorbell(qpd, q);
@@ -1791,8 +1802,8 @@ int dqm_debugfs_hqds(struct seq_file *m, void *data)
 		}
 	}
 
-	for (pipe = 0; pipe < CIK_SDMA_ENGINE_NUM; pipe++) {
-		for (queue = 0; queue < CIK_SDMA_QUEUES_PER_ENGINE; queue++) {
+	for (pipe = 0; pipe < get_num_sdma_engines(dqm); pipe++) {
+		for (queue = 0; queue < KFD_SDMA_QUEUES_PER_ENGINE; queue++) {
 			r = dqm->dev->kfd2kgd->hqd_sdma_dump(
 				dqm->dev->kgd, pipe, queue, &dump, &n_regs);
 			if (r)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
index 52e708c..00da316 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
@@ -33,10 +33,7 @@
 
 #define KFD_UNMAP_LATENCY_MS			(4000)
 #define QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS (2 * KFD_UNMAP_LATENCY_MS + 1000)
-
-#define CIK_SDMA_QUEUES				(4)
-#define CIK_SDMA_QUEUES_PER_ENGINE		(2)
-#define CIK_SDMA_ENGINE_NUM			(2)
+#define KFD_SDMA_QUEUES_PER_ENGINE		(2)
 
 struct device_process_node {
 	struct qcm_process_device *qpd;
@@ -214,6 +211,7 @@ void program_sh_mem_settings(struct device_queue_manager *dqm,
 unsigned int get_queues_num(struct device_queue_manager *dqm);
 unsigned int get_queues_per_pipe(struct device_queue_manager *dqm);
 unsigned int get_pipes_per_mec(struct device_queue_manager *dqm);
+unsigned int get_num_sdma_queues(struct device_queue_manager *dqm);
 
 static inline unsigned int get_sh_mem_bases_32(struct kfd_process_device *pdd)
 {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 37d179e..ca83254 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -203,6 +203,7 @@ struct kfd_device_info {
 	bool supports_cwsr;
 	bool needs_iommu_device;
 	bool needs_pci_atomics;
+	unsigned int num_sdma_engines;
 };
 
 struct kfd_mem_obj {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 1303b14..eb4e5fb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -186,8 +186,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 
 	switch (type) {
 	case KFD_QUEUE_TYPE_SDMA:
-		if (dev->dqm->queue_count >=
-			CIK_SDMA_QUEUES_PER_ENGINE * CIK_SDMA_ENGINE_NUM) {
+		if (dev->dqm->queue_count >= get_num_sdma_queues(dev->dqm)) {
 			pr_err("Over-subscription is not allowed for SDMA.\n");
 			retval = -EPERM;
 			goto err_create_queue;
-- 
2.7.4

* [PATCH 3/6] drm/amdkfd: Avoid flooding dmesg on Raven due to IOMMU issues
       [not found] ` <1531430694-23966-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-07-12 21:24   ` [PATCH 1/6] drm/amdkfd: Consolidate duplicate memory banks info in topology Felix Kuehling
  2018-07-12 21:24   ` [PATCH 2/6] drm/amdkfd: Make SDMA engine number an ASIC-dependent variable Felix Kuehling
@ 2018-07-12 21:24   ` Felix Kuehling
       [not found]     ` <1531430694-23966-4-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-07-12 21:24   ` [PATCH 4/6] drm/amdkfd: Workaround to accommodate Raven too many PPR issue Felix Kuehling
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Felix Kuehling @ 2018-07-12 21:24 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Yong Zhao, Felix Kuehling

From: Yong Zhao <yong.zhao@amd.com>

On Raven, invalid PPRs can be reported because multiple PPRs can still
be queued when memory is freed. Apply a rate limit to avoid flooding
the log in this case.
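The idea behind dev_warn_ratelimited() can be sketched in userspace as a fixed-window limiter: at most `burst` messages per `interval`. This is an illustrative approximation with hypothetical names, not the kernel's actual ___ratelimit() implementation (which tracks jiffies under a spinlock and counts the suppressed messages); the caller here supplies the current time explicitly to keep the sketch testable.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical userspace model of a rate-limit state. */
struct ratelimit {
	uint64_t interval;      /* window length, in arbitrary time units */
	unsigned int burst;     /* messages allowed per window */
	uint64_t window_start;  /* start of the current window */
	unsigned int printed;   /* messages emitted in this window */
};

/* Return true if a message may be emitted at time `now`;
 * false means the message is suppressed, so a burst of invalid
 * PPRs cannot flood the log. */
static bool ratelimit_allow(struct ratelimit *rl, uint64_t now)
{
	if (now - rl->window_start >= rl->interval) {
		/* Window expired: open a fresh one. */
		rl->window_start = now;
		rl->printed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	return false;
}
```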

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
index c718179..7a61f38 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
@@ -190,7 +190,7 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
 {
 	struct kfd_dev *dev;
 
-	dev_warn(kfd_device,
+	dev_warn_ratelimited(kfd_device,
 			"Invalid PPR device %x:%x.%x pasid %d address 0x%lX flags 0x%X",
 			PCI_BUS_NUM(pdev->devfn),
 			PCI_SLOT(pdev->devfn),
-- 
2.7.4

* [PATCH 4/6] drm/amdkfd: Workaround to accommodate Raven too many PPR issue
       [not found] ` <1531430694-23966-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (2 preceding siblings ...)
  2018-07-12 21:24   ` [PATCH 3/6] drm/amdkfd: Avoid flooding dmesg on Raven due to IOMMU issues Felix Kuehling
@ 2018-07-12 21:24   ` Felix Kuehling
       [not found]     ` <1531430694-23966-5-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-07-12 21:24   ` [PATCH 5/6] drm/amdkfd: Optimize out some duplicated code in kfd_signal_iommu_event() Felix Kuehling
                     ` (2 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Felix Kuehling @ 2018-07-12 21:24 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Yong Zhao, Felix Kuehling

From: Yong Zhao <yong.zhao@amd.com>

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_events.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index 820133c..4dcacce 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -932,13 +932,24 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
 	up_read(&mm->mmap_sem);
 	mmput(mm);
 
-	mutex_lock(&p->event_mutex);
+	pr_debug("notpresent %d, noexecute %d, readonly %d\n",
+			memory_exception_data.failure.NotPresent,
+			memory_exception_data.failure.NoExecute,
+			memory_exception_data.failure.ReadOnly);
 
-	/* Lookup events by type and signal them */
-	lookup_events_by_type_and_signal(p, KFD_EVENT_TYPE_MEMORY,
-			&memory_exception_data);
+	/* Workaround on Raven to not kill the process when memory is freed
+	 * before IOMMU is able to finish processing all the excessive PPRs
+	 */
+	if (dev->device_info->asic_family != CHIP_RAVEN) {
+		mutex_lock(&p->event_mutex);
+
+		/* Lookup events by type and signal them */
+		lookup_events_by_type_and_signal(p, KFD_EVENT_TYPE_MEMORY,
+				&memory_exception_data);
+
+		mutex_unlock(&p->event_mutex);
+	}
 
-	mutex_unlock(&p->event_mutex);
 	kfd_unref_process(p);
 }
 #endif /* KFD_SUPPORT_IOMMU_V2 */
-- 
2.7.4

* [PATCH 5/6] drm/amdkfd: Optimize out some duplicated code in kfd_signal_iommu_event()
       [not found] ` <1531430694-23966-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (3 preceding siblings ...)
  2018-07-12 21:24   ` [PATCH 4/6] drm/amdkfd: Workaround to accommodate Raven too many PPR issue Felix Kuehling
@ 2018-07-12 21:24   ` Felix Kuehling
       [not found]     ` <1531430694-23966-6-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-07-12 21:24   ` [PATCH 6/6] drm/amdkfd: Enable Raven for KFD Felix Kuehling
  2018-07-12 22:28   ` [PATCH 0/6] Raven support " Mike Lothian
  6 siblings, 1 reply; 16+ messages in thread
From: Felix Kuehling @ 2018-07-12 21:24 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Yong Zhao, Felix Kuehling

From: Yong Zhao <yong.zhao@amd.com>

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_events.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index 4dcacce..e9f0e0a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -911,22 +911,18 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
 	memory_exception_data.failure.NotPresent = 1;
 	memory_exception_data.failure.NoExecute = 0;
 	memory_exception_data.failure.ReadOnly = 0;
-	if (vma) {
-		if (vma->vm_start > address) {
-			memory_exception_data.failure.NotPresent = 1;
-			memory_exception_data.failure.NoExecute = 0;
+	if (vma && address >= vma->vm_start) {
+		memory_exception_data.failure.NotPresent = 0;
+
+		if (is_write_requested && !(vma->vm_flags & VM_WRITE))
+			memory_exception_data.failure.ReadOnly = 1;
+		else
 			memory_exception_data.failure.ReadOnly = 0;
-		} else {
-			memory_exception_data.failure.NotPresent = 0;
-			if (is_write_requested && !(vma->vm_flags & VM_WRITE))
-				memory_exception_data.failure.ReadOnly = 1;
-			else
-				memory_exception_data.failure.ReadOnly = 0;
-			if (is_execute_requested && !(vma->vm_flags & VM_EXEC))
-				memory_exception_data.failure.NoExecute = 1;
-			else
-				memory_exception_data.failure.NoExecute = 0;
-		}
+
+		if (is_execute_requested && !(vma->vm_flags & VM_EXEC))
+			memory_exception_data.failure.NoExecute = 1;
+		else
+			memory_exception_data.failure.NoExecute = 0;
 	}
 
 	up_read(&mm->mmap_sem);
-- 
2.7.4

* [PATCH 6/6] drm/amdkfd: Enable Raven for KFD
       [not found] ` <1531430694-23966-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (4 preceding siblings ...)
  2018-07-12 21:24   ` [PATCH 5/6] drm/amdkfd: Optimize out some duplicated code in kfd_signal_iommu_event() Felix Kuehling
@ 2018-07-12 21:24   ` Felix Kuehling
       [not found]     ` <1531430694-23966-7-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2018-07-12 22:28   ` [PATCH 0/6] Raven support " Mike Lothian
  6 siblings, 1 reply; 16+ messages in thread
From: Felix Kuehling @ 2018-07-12 21:24 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Yong Zhao, Felix Kuehling

From: Yong Zhao <Yong.Zhao@amd.com>

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 572235c..1b04871 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -70,6 +70,21 @@ static const struct kfd_device_info carrizo_device_info = {
 	.needs_pci_atomics = false,
 	.num_sdma_engines = 2,
 };
+
+static const struct kfd_device_info raven_device_info = {
+	.asic_family = CHIP_RAVEN,
+	.max_pasid_bits = 16,
+	.max_no_of_hqd  = 24,
+	.doorbell_size  = 8,
+	.ih_ring_entry_size = 8 * sizeof(uint32_t),
+	.event_interrupt_class = &event_interrupt_class_v9,
+	.num_of_watch_points = 4,
+	.mqd_size_aligned = MQD_SIZE_ALIGNED,
+	.supports_cwsr = true,
+	.needs_iommu_device = true,
+	.needs_pci_atomics = true,
+	.num_sdma_engines = 1,
+};
 #endif
 
 static const struct kfd_device_info hawaii_device_info = {
@@ -259,6 +274,7 @@ static const struct kfd_deviceid supported_devices[] = {
 	{ 0x9875, &carrizo_device_info },	/* Carrizo */
 	{ 0x9876, &carrizo_device_info },	/* Carrizo */
 	{ 0x9877, &carrizo_device_info },	/* Carrizo */
+	{ 0x15DD, &raven_device_info },		/* Raven */
 #endif
 	{ 0x67A0, &hawaii_device_info },	/* Hawaii */
 	{ 0x67A1, &hawaii_device_info },	/* Hawaii */
-- 
2.7.4

* Re: [PATCH 0/6] Raven support for KFD
       [not found] ` <1531430694-23966-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (5 preceding siblings ...)
  2018-07-12 21:24   ` [PATCH 6/6] drm/amdkfd: Enable Raven for KFD Felix Kuehling
@ 2018-07-12 22:28   ` Mike Lothian
       [not found]     ` <CAHbf0-Fv9npKsGZmgrNLmx6tS7800YOP4ijFEWGjCa3EwfzftQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  6 siblings, 1 reply; 16+ messages in thread
From: Mike Lothian @ 2018-07-12 22:28 UTC (permalink / raw)
  To: Felix Kuehling
  Cc: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


Hi

I'm happy to test this out, just wondering what userspace I should pair it
with

Cheers

Mike

On Thu, 12 Jul 2018 at 22:25 Felix Kuehling <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org> wrote:

> Raven refers to Ryzen APUs with integrated GFXv9 GPU.
> This patch series completes Raven support for KFD:
>
> * fix up memory banks info from CRAT
> * support different number of SDMA engines
> * workaround IOMMUv2 PPR issues
> * add device info
>
> Yong Zhao (6):
>   drm/amdkfd: Consolidate duplicate memory banks info in topology
>   drm/amdkfd: Make SDMA engine number an ASIC-dependent variable
>   drm/amdkfd: Avoid flooding dmesg on Raven due to IOMMU issues
>   drm/amdkfd: Workaround to accommodate Raven too many PPR issue
>   drm/amdkfd: Optimize out some duplicated code in
>     kfd_signal_iommu_event()
>   drm/amdkfd: Enable Raven for KFD
>
>  drivers/gpu/drm/amd/amdkfd/kfd_crat.c              | 57
> +++++++++++++++++-----
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 28 +++++++++++
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 29 +++++++----
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |  6 +--
>  drivers/gpu/drm/amd/amdkfd/kfd_events.c            | 47 ++++++++++--------
>  drivers/gpu/drm/amd/amdkfd/kfd_iommu.c             |  2 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  1 +
>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  3 +-
>  8 files changed, 126 insertions(+), 47 deletions(-)
>
> --
> 2.7.4
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>

* Re: [PATCH 0/6] Raven support for KFD
       [not found]     ` <CAHbf0-Fv9npKsGZmgrNLmx6tS7800YOP4ijFEWGjCa3EwfzftQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-07-13 19:20       ` Felix Kuehling
  0 siblings, 0 replies; 16+ messages in thread
From: Felix Kuehling @ 2018-07-13 19:20 UTC (permalink / raw)
  To: Mike Lothian
  Cc: oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

This patch series applies on top of my previous 25-patch series on top
of Oded's amdkfd-next branch.

On the user mode side, you need my drm-next-wip branch in
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/tree/fxkamd/drm-next-wip.
The rest of ROCm user mode should work unmodified.

Regards,
  Felix


On 2018-07-12 06:28 PM, Mike Lothian wrote:
> Hi
>
> I'm happy to test this out, just wondering what userspace I should
> pair it with
>
> Cheers
>
> Mike
>
> On Thu, 12 Jul 2018 at 22:25 Felix Kuehling <Felix.Kuehling@amd.com
> <mailto:Felix.Kuehling@amd.com>> wrote:
>
>     Raven refers to Ryzen APUs with integrated GFXv9 GPU.
>     This patch series completes Raven support for KFD:
>
>     * fix up memory banks info from CRAT
>     * support different number of SDMA engines
>     * workaround IOMMUv2 PPR issues
>     * add device info
>
>     Yong Zhao (6):
>       drm/amdkfd: Consolidate duplicate memory banks info in topology
>       drm/amdkfd: Make SDMA engine number an ASIC-dependent variable
>       drm/amdkfd: Avoid flooding dmesg on Raven due to IOMMU issues
>       drm/amdkfd: Workaround to accommodate Raven too many PPR issue
>       drm/amdkfd: Optimize out some duplicated code in
>         kfd_signal_iommu_event()
>       drm/amdkfd: Enable Raven for KFD
>
>      drivers/gpu/drm/amd/amdkfd/kfd_crat.c              | 57
>     +++++++++++++++++-----
>      drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 28 +++++++++++
>      .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 29 +++++++----
>      .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |  6 +--
>      drivers/gpu/drm/amd/amdkfd/kfd_events.c            | 47
>     ++++++++++--------
>      drivers/gpu/drm/amd/amdkfd/kfd_iommu.c             |  2 +-
>      drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  1 +
>      .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  3 +-
>      8 files changed, 126 insertions(+), 47 deletions(-)
>
>     -- 
>     2.7.4
>
>     _______________________________________________
>     amd-gfx mailing list
>     amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org>
>     https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>

* Re: [PATCH 1/6] drm/amdkfd: Consolidate duplicate memory banks info in topology
       [not found]     ` <1531430694-23966-2-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-07-13 19:24       ` Alex Deucher
  0 siblings, 0 replies; 16+ messages in thread
From: Alex Deucher @ 2018-07-13 19:24 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Oded Gabbay, Yong Zhao, amd-gfx list

On Thu, Jul 12, 2018 at 5:24 PM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Yong Zhao <yong.zhao@amd.com>
>
> If there are several memory banks that have the same properties in CRAT,
> we aggregate them into one memory bank. This cleans up memory banks on
> APUs (e.g. Raven) where the CRAT reports each memory channel as a
> separate bank. This only confuses user mode, which only deals with
> virtual memory.
>
> Signed-off-by: Yong Zhao <yong.zhao@amd.com>
> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 57 ++++++++++++++++++++++++++++-------
>  1 file changed, 46 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> index 296b3f2..ee49960 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
> @@ -189,6 +189,21 @@ static int kfd_parse_subtype_cu(struct crat_subtype_computeunit *cu,
>         return 0;
>  }
>
> +static struct kfd_mem_properties *
> +find_subtype_mem(uint32_t heap_type, uint32_t flags, uint32_t width,
> +               struct kfd_topology_device *dev)
> +{
> +       struct kfd_mem_properties *props;
> +
> +       list_for_each_entry(props, &dev->mem_props, list) {
> +               if (props->heap_type == heap_type
> +                               && props->flags == flags
> +                               && props->width == width)
> +                       return props;
> +       }
> +
> +       return NULL;
> +}
>  /* kfd_parse_subtype_mem - parse memory subtypes and attach it to correct
>   * topology device present in the device_list
>   */
> @@ -197,36 +212,56 @@ static int kfd_parse_subtype_mem(struct crat_subtype_memory *mem,
>  {
>         struct kfd_mem_properties *props;
>         struct kfd_topology_device *dev;
> +       uint32_t heap_type;
> +       uint64_t size_in_bytes;
> +       uint32_t flags = 0;
> +       uint32_t width;
>
>         pr_debug("Found memory entry in CRAT table with proximity_domain=%d\n",
>                         mem->proximity_domain);
>         list_for_each_entry(dev, device_list, list) {
>                 if (mem->proximity_domain == dev->proximity_domain) {
> -                       props = kfd_alloc_struct(props);
> -                       if (!props)
> -                               return -ENOMEM;
> -
>                         /* We're on GPU node */
>                         if (dev->node_props.cpu_cores_count == 0) {
>                                 /* APU */
>                                 if (mem->visibility_type == 0)
> -                                       props->heap_type =
> +                                       heap_type =
>                                                 HSA_MEM_HEAP_TYPE_FB_PRIVATE;
>                                 /* dGPU */
>                                 else
> -                                       props->heap_type = mem->visibility_type;
> +                                       heap_type = mem->visibility_type;
>                         } else
> -                               props->heap_type = HSA_MEM_HEAP_TYPE_SYSTEM;
> +                               heap_type = HSA_MEM_HEAP_TYPE_SYSTEM;
>
>                         if (mem->flags & CRAT_MEM_FLAGS_HOT_PLUGGABLE)
> -                               props->flags |= HSA_MEM_FLAGS_HOT_PLUGGABLE;
> +                               flags |= HSA_MEM_FLAGS_HOT_PLUGGABLE;
>                         if (mem->flags & CRAT_MEM_FLAGS_NON_VOLATILE)
> -                               props->flags |= HSA_MEM_FLAGS_NON_VOLATILE;
> +                               flags |= HSA_MEM_FLAGS_NON_VOLATILE;
>
> -                       props->size_in_bytes =
> +                       size_in_bytes =
>                                 ((uint64_t)mem->length_high << 32) +
>                                                         mem->length_low;
> -                       props->width = mem->width;
> +                       width = mem->width;
> +
> +                       /* Multiple banks of the same type are aggregated into
> +                        * one. User mode doesn't care about multiple physical
> +                        * memory segments. It's managed as a single virtual
> +                        * heap for user mode.
> +                        */
> +                       props = find_subtype_mem(heap_type, flags, width, dev);
> +                       if (props) {
> +                               props->size_in_bytes += size_in_bytes;
> +                               break;
> +                       }
> +
> +                       props = kfd_alloc_struct(props);
> +                       if (!props)
> +                               return -ENOMEM;
> +
> +                       props->heap_type = heap_type;
> +                       props->flags = flags;
> +                       props->size_in_bytes = size_in_bytes;
> +                       props->width = width;
>
>                         dev->node_props.mem_banks_count++;
>                         list_add_tail(&props->list, &dev->mem_props);
> --
> 2.7.4
>
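
[Editor's illustration] The aggregation logic in kfd_parse_subtype_mem() can be sketched outside the kernel as standalone C. The struct and helper below are hypothetical stand-ins for kfd_mem_properties and the list walk, not the actual kfd code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct kfd_mem_properties; names are illustrative. */
struct mem_bank {
	uint32_t heap_type;
	uint32_t flags;
	uint32_t width;
	uint64_t size_in_bytes;
};

/* Add a bank reported by CRAT: if an existing bank has the same
 * heap_type/flags/width, fold the new bank's size into it instead of
 * appending a duplicate entry. Returns the updated bank count.
 */
size_t add_mem_bank(struct mem_bank *banks, size_t count,
		    const struct mem_bank *new_bank)
{
	for (size_t i = 0; i < count; i++) {
		if (banks[i].heap_type == new_bank->heap_type &&
		    banks[i].flags == new_bank->flags &&
		    banks[i].width == new_bank->width) {
			banks[i].size_in_bytes += new_bank->size_in_bytes;
			return count; /* aggregated, no new entry */
		}
	}
	banks[count] = *new_bank;
	return count + 1;
}
```

On an APU like Raven, each memory channel CRAT reports with identical properties collapses into one entry this way, matching the single virtual heap that user mode expects.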

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/6] drm/amdkfd: Make SDMA engine number an ASIC-dependent variable
       [not found]     ` <1531430694-23966-3-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-07-13 19:25       ` Alex Deucher
  0 siblings, 0 replies; 16+ messages in thread
From: Alex Deucher @ 2018-07-13 19:25 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Oded Gabbay, Yong Zhao, amd-gfx list

On Thu, Jul 12, 2018 at 5:24 PM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Yong Zhao <yong.zhao@amd.com>
>
> On Raven there is only one SDMA engine instead of the previously assumed two,
> so we need to adapt our code to this new scenario.
>
> Signed-off-by: Yong Zhao <yong.zhao@amd.com>
> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>

Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c            | 12 +++++++++
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  | 29 +++++++++++++++-------
>  .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h  |  6 ++---
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h              |  1 +
>  .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c |  3 +--
>  5 files changed, 36 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 8faa8db..572235c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -52,6 +52,7 @@ static const struct kfd_device_info kaveri_device_info = {
>         .supports_cwsr = false,
>         .needs_iommu_device = true,
>         .needs_pci_atomics = false,
> +       .num_sdma_engines = 2,
>  };
>
>  static const struct kfd_device_info carrizo_device_info = {
> @@ -67,6 +68,7 @@ static const struct kfd_device_info carrizo_device_info = {
>         .supports_cwsr = true,
>         .needs_iommu_device = true,
>         .needs_pci_atomics = false,
> +       .num_sdma_engines = 2,
>  };
>  #endif
>
> @@ -83,6 +85,7 @@ static const struct kfd_device_info hawaii_device_info = {
>         .supports_cwsr = false,
>         .needs_iommu_device = false,
>         .needs_pci_atomics = false,
> +       .num_sdma_engines = 2,
>  };
>
>  static const struct kfd_device_info tonga_device_info = {
> @@ -97,6 +100,7 @@ static const struct kfd_device_info tonga_device_info = {
>         .supports_cwsr = false,
>         .needs_iommu_device = false,
>         .needs_pci_atomics = true,
> +       .num_sdma_engines = 2,
>  };
>
>  static const struct kfd_device_info tonga_vf_device_info = {
> @@ -111,6 +115,7 @@ static const struct kfd_device_info tonga_vf_device_info = {
>         .supports_cwsr = false,
>         .needs_iommu_device = false,
>         .needs_pci_atomics = false,
> +       .num_sdma_engines = 2,
>  };
>
>  static const struct kfd_device_info fiji_device_info = {
> @@ -125,6 +130,7 @@ static const struct kfd_device_info fiji_device_info = {
>         .supports_cwsr = true,
>         .needs_iommu_device = false,
>         .needs_pci_atomics = true,
> +       .num_sdma_engines = 2,
>  };
>
>  static const struct kfd_device_info fiji_vf_device_info = {
> @@ -139,6 +145,7 @@ static const struct kfd_device_info fiji_vf_device_info = {
>         .supports_cwsr = true,
>         .needs_iommu_device = false,
>         .needs_pci_atomics = false,
> +       .num_sdma_engines = 2,
>  };
>
>
> @@ -154,6 +161,7 @@ static const struct kfd_device_info polaris10_device_info = {
>         .supports_cwsr = true,
>         .needs_iommu_device = false,
>         .needs_pci_atomics = true,
> +       .num_sdma_engines = 2,
>  };
>
>  static const struct kfd_device_info polaris10_vf_device_info = {
> @@ -168,6 +176,7 @@ static const struct kfd_device_info polaris10_vf_device_info = {
>         .supports_cwsr = true,
>         .needs_iommu_device = false,
>         .needs_pci_atomics = false,
> +       .num_sdma_engines = 2,
>  };
>
>  static const struct kfd_device_info polaris11_device_info = {
> @@ -182,6 +191,7 @@ static const struct kfd_device_info polaris11_device_info = {
>         .supports_cwsr = true,
>         .needs_iommu_device = false,
>         .needs_pci_atomics = true,
> +       .num_sdma_engines = 2,
>  };
>
>  static const struct kfd_device_info vega10_device_info = {
> @@ -196,6 +206,7 @@ static const struct kfd_device_info vega10_device_info = {
>         .supports_cwsr = true,
>         .needs_iommu_device = false,
>         .needs_pci_atomics = false,
> +       .num_sdma_engines = 2,
>  };
>
>  static const struct kfd_device_info vega10_vf_device_info = {
> @@ -210,6 +221,7 @@ static const struct kfd_device_info vega10_vf_device_info = {
>         .supports_cwsr = true,
>         .needs_iommu_device = false,
>         .needs_pci_atomics = false,
> +       .num_sdma_engines = 2,
>  };
>
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 97c9f10..ace94d6 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> @@ -101,6 +101,17 @@ unsigned int get_pipes_per_mec(struct device_queue_manager *dqm)
>         return dqm->dev->shared_resources.num_pipe_per_mec;
>  }
>
> +static unsigned int get_num_sdma_engines(struct device_queue_manager *dqm)
> +{
> +       return dqm->dev->device_info->num_sdma_engines;
> +}
> +
> +unsigned int get_num_sdma_queues(struct device_queue_manager *dqm)
> +{
> +       return dqm->dev->device_info->num_sdma_engines
> +                       * KFD_SDMA_QUEUES_PER_ENGINE;
> +}
> +
>  void program_sh_mem_settings(struct device_queue_manager *dqm,
>                                         struct qcm_process_device *qpd)
>  {
> @@ -855,7 +866,7 @@ static int initialize_nocpsch(struct device_queue_manager *dqm)
>         }
>
>         dqm->vmid_bitmap = (1 << dqm->dev->vm_info.vmid_num_kfd) - 1;
> -       dqm->sdma_bitmap = (1 << CIK_SDMA_QUEUES) - 1;
> +       dqm->sdma_bitmap = (1 << get_num_sdma_queues(dqm)) - 1;
>
>         return 0;
>  }
> @@ -903,7 +914,7 @@ static int allocate_sdma_queue(struct device_queue_manager *dqm,
>  static void deallocate_sdma_queue(struct device_queue_manager *dqm,
>                                 unsigned int sdma_queue_id)
>  {
> -       if (sdma_queue_id >= CIK_SDMA_QUEUES)
> +       if (sdma_queue_id >= get_num_sdma_queues(dqm))
>                 return;
>         dqm->sdma_bitmap |= (1 << sdma_queue_id);
>  }
> @@ -923,8 +934,8 @@ static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
>         if (retval)
>                 return retval;
>
> -       q->properties.sdma_queue_id = q->sdma_id / CIK_SDMA_QUEUES_PER_ENGINE;
> -       q->properties.sdma_engine_id = q->sdma_id % CIK_SDMA_QUEUES_PER_ENGINE;
> +       q->properties.sdma_queue_id = q->sdma_id / get_num_sdma_engines(dqm);
> +       q->properties.sdma_engine_id = q->sdma_id % get_num_sdma_engines(dqm);
>
>         retval = allocate_doorbell(qpd, q);
>         if (retval)
> @@ -1011,7 +1022,7 @@ static int initialize_cpsch(struct device_queue_manager *dqm)
>         dqm->queue_count = dqm->processes_count = 0;
>         dqm->sdma_queue_count = 0;
>         dqm->active_runlist = false;
> -       dqm->sdma_bitmap = (1 << CIK_SDMA_QUEUES) - 1;
> +       dqm->sdma_bitmap = (1 << get_num_sdma_queues(dqm)) - 1;
>
>         INIT_WORK(&dqm->hw_exception_work, kfd_process_hw_exception);
>
> @@ -1142,9 +1153,9 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
>                 if (retval)
>                         goto out_unlock;
>                 q->properties.sdma_queue_id =
> -                       q->sdma_id / CIK_SDMA_QUEUES_PER_ENGINE;
> +                       q->sdma_id / get_num_sdma_engines(dqm);
>                 q->properties.sdma_engine_id =
> -                       q->sdma_id % CIK_SDMA_QUEUES_PER_ENGINE;
> +                       q->sdma_id % get_num_sdma_engines(dqm);
>         }
>
>         retval = allocate_doorbell(qpd, q);
> @@ -1791,8 +1802,8 @@ int dqm_debugfs_hqds(struct seq_file *m, void *data)
>                 }
>         }
>
> -       for (pipe = 0; pipe < CIK_SDMA_ENGINE_NUM; pipe++) {
> -               for (queue = 0; queue < CIK_SDMA_QUEUES_PER_ENGINE; queue++) {
> +       for (pipe = 0; pipe < get_num_sdma_engines(dqm); pipe++) {
> +               for (queue = 0; queue < KFD_SDMA_QUEUES_PER_ENGINE; queue++) {
>                         r = dqm->dev->kfd2kgd->hqd_sdma_dump(
>                                 dqm->dev->kgd, pipe, queue, &dump, &n_regs);
>                         if (r)
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
> index 52e708c..00da316 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
> @@ -33,10 +33,7 @@
>
>  #define KFD_UNMAP_LATENCY_MS                   (4000)
>  #define QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS (2 * KFD_UNMAP_LATENCY_MS + 1000)
> -
> -#define CIK_SDMA_QUEUES                                (4)
> -#define CIK_SDMA_QUEUES_PER_ENGINE             (2)
> -#define CIK_SDMA_ENGINE_NUM                    (2)
> +#define KFD_SDMA_QUEUES_PER_ENGINE             (2)
>
>  struct device_process_node {
>         struct qcm_process_device *qpd;
> @@ -214,6 +211,7 @@ void program_sh_mem_settings(struct device_queue_manager *dqm,
>  unsigned int get_queues_num(struct device_queue_manager *dqm);
>  unsigned int get_queues_per_pipe(struct device_queue_manager *dqm);
>  unsigned int get_pipes_per_mec(struct device_queue_manager *dqm);
> +unsigned int get_num_sdma_queues(struct device_queue_manager *dqm);
>
>  static inline unsigned int get_sh_mem_bases_32(struct kfd_process_device *pdd)
>  {
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index 37d179e..ca83254 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -203,6 +203,7 @@ struct kfd_device_info {
>         bool supports_cwsr;
>         bool needs_iommu_device;
>         bool needs_pci_atomics;
> +       unsigned int num_sdma_engines;
>  };
>
>  struct kfd_mem_obj {
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> index 1303b14..eb4e5fb 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
> @@ -186,8 +186,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
>
>         switch (type) {
>         case KFD_QUEUE_TYPE_SDMA:
> -               if (dev->dqm->queue_count >=
> -                       CIK_SDMA_QUEUES_PER_ENGINE * CIK_SDMA_ENGINE_NUM) {
> +               if (dev->dqm->queue_count >= get_num_sdma_queues(dev->dqm)) {
>                         pr_err("Over-subscription is not allowed for SDMA.\n");
>                         retval = -EPERM;
>                         goto err_create_queue;
> --
> 2.7.4
>
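
[Editor's illustration] The engine/queue arithmetic this patch parameterizes can be sketched as standalone C. These helpers are hypothetical stand-ins for get_num_sdma_queues() and the sdma_id decomposition in the diff, not the kernel functions themselves:

```c
#include <assert.h>
#include <stdint.h>

#define KFD_SDMA_QUEUES_PER_ENGINE 2

/* The per-ASIC engine count replaces the old hard-coded CIK_* constants;
 * the bitmap of free SDMA queues is sized from it. */
static unsigned int num_sdma_queues(unsigned int num_sdma_engines)
{
	return num_sdma_engines * KFD_SDMA_QUEUES_PER_ENGINE;
}

static uint32_t init_sdma_bitmap(unsigned int num_sdma_engines)
{
	/* one free bit per queue */
	return (1u << num_sdma_queues(num_sdma_engines)) - 1;
}

/* A global sdma_id is spread round-robin across engines:
 * engine = id % engines, per-engine queue = id / engines. */
static void sdma_id_to_engine_queue(unsigned int sdma_id,
				    unsigned int num_sdma_engines,
				    unsigned int *engine,
				    unsigned int *queue)
{
	*engine = sdma_id % num_sdma_engines;
	*queue  = sdma_id / num_sdma_engines;
}
```

With one engine on Raven the bitmap covers only two queues; the old hard-coded (1 << 4) - 1 would have marked nonexistent queues as allocatable.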

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/6] drm/amdkfd: Avoid flooding dmesg on Raven due to IOMMU issues
       [not found]     ` <1531430694-23966-4-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-07-13 19:25       ` Alex Deucher
  0 siblings, 0 replies; 16+ messages in thread
From: Alex Deucher @ 2018-07-13 19:25 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Oded Gabbay, Yong Zhao, amd-gfx list

On Thu, Jul 12, 2018 at 5:24 PM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Yong Zhao <yong.zhao@amd.com>
>
> On Raven, invalid PPRs (peripheral page requests) can be reported because
> multiple PPRs can still be queued when memory is freed. Apply a rate limit to avoid
> flooding the log in this case.
>
> Signed-off-by: Yong Zhao <yong.zhao@amd.com>
> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>

Maybe define PPRs in the description.  With that fixed:
Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_iommu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
> index c718179..7a61f38 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
> @@ -190,7 +190,7 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
>  {
>         struct kfd_dev *dev;
>
> -       dev_warn(kfd_device,
> +       dev_warn_ratelimited(kfd_device,
>                         "Invalid PPR device %x:%x.%x pasid %d address 0x%lX flags 0x%X",
>                         PCI_BUS_NUM(pdev->devfn),
>                         PCI_SLOT(pdev->devfn),
> --
> 2.7.4
>
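
[Editor's illustration] dev_warn_ratelimited() follows the kernel's interval/burst policy (by default up to 10 messages per 5-second window). A minimal userspace sketch of that policy, with illustrative names rather than the kernel's ratelimit_state implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Minimal sketch of interval/burst rate limiting. */
struct ratelimit {
	time_t window_start;
	int interval_sec;	/* length of each window */
	int burst;		/* messages allowed per window */
	int printed;		/* messages emitted in current window */
};

static bool ratelimit_allow(struct ratelimit *rl, time_t now)
{
	if (now - rl->window_start >= rl->interval_sec) {
		rl->window_start = now;	/* new window, reset budget */
		rl->printed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;		/* caller may log */
	}
	return false;			/* suppress: over budget */
}
```

This is why a burst of queued PPRs arriving after a free produces a handful of warnings instead of flooding dmesg.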

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/6] drm/amdkfd: Workaround to accommodate Raven too many PPR issue
       [not found]     ` <1531430694-23966-5-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-07-13 19:26       ` Alex Deucher
  0 siblings, 0 replies; 16+ messages in thread
From: Alex Deucher @ 2018-07-13 19:26 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Oded Gabbay, Yong Zhao, amd-gfx list

On Thu, Jul 12, 2018 at 5:24 PM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Yong Zhao <yong.zhao@amd.com>

Please add a patch description explaining why this is needed for Raven.

Alex


>
> Signed-off-by: Yong Zhao <yong.zhao@amd.com>
> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_events.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> index 820133c..4dcacce 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> @@ -932,13 +932,24 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>         up_read(&mm->mmap_sem);
>         mmput(mm);
>
> -       mutex_lock(&p->event_mutex);
> +       pr_debug("notpresent %d, noexecute %d, readonly %d\n",
> +                       memory_exception_data.failure.NotPresent,
> +                       memory_exception_data.failure.NoExecute,
> +                       memory_exception_data.failure.ReadOnly);
>
> -       /* Lookup events by type and signal them */
> -       lookup_events_by_type_and_signal(p, KFD_EVENT_TYPE_MEMORY,
> -                       &memory_exception_data);
> +       /* Workaround on Raven to not kill the process when memory is freed
> +        * before IOMMU is able to finish processing all the excessive PPRs
> +        */
> +       if (dev->device_info->asic_family != CHIP_RAVEN) {
> +               mutex_lock(&p->event_mutex);
> +
> +               /* Lookup events by type and signal them */
> +               lookup_events_by_type_and_signal(p, KFD_EVENT_TYPE_MEMORY,
> +                               &memory_exception_data);
> +
> +               mutex_unlock(&p->event_mutex);
> +       }
>
> -       mutex_unlock(&p->event_mutex);
>         kfd_unref_process(p);
>  }
>  #endif /* KFD_SUPPORT_IOMMU_V2 */
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 5/6] drm/amdkfd: Optimize out some duplicated code in kfd_signal_iommu_event()
       [not found]     ` <1531430694-23966-6-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-07-13 19:27       ` Alex Deucher
  0 siblings, 0 replies; 16+ messages in thread
From: Alex Deucher @ 2018-07-13 19:27 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Oded Gabbay, Yong Zhao, amd-gfx list

On Thu, Jul 12, 2018 at 5:24 PM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Yong Zhao <yong.zhao@amd.com>

Please add a patch description.

Alex

>
> Signed-off-by: Yong Zhao <yong.zhao@amd.com>
> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_events.c | 26 +++++++++++---------------
>  1 file changed, 11 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> index 4dcacce..e9f0e0a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
> @@ -911,22 +911,18 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
>         memory_exception_data.failure.NotPresent = 1;
>         memory_exception_data.failure.NoExecute = 0;
>         memory_exception_data.failure.ReadOnly = 0;
> -       if (vma) {
> -               if (vma->vm_start > address) {
> -                       memory_exception_data.failure.NotPresent = 1;
> -                       memory_exception_data.failure.NoExecute = 0;
> +       if (vma && address >= vma->vm_start) {
> +               memory_exception_data.failure.NotPresent = 0;
> +
> +               if (is_write_requested && !(vma->vm_flags & VM_WRITE))
> +                       memory_exception_data.failure.ReadOnly = 1;
> +               else
>                         memory_exception_data.failure.ReadOnly = 0;
> -               } else {
> -                       memory_exception_data.failure.NotPresent = 0;
> -                       if (is_write_requested && !(vma->vm_flags & VM_WRITE))
> -                               memory_exception_data.failure.ReadOnly = 1;
> -                       else
> -                               memory_exception_data.failure.ReadOnly = 0;
> -                       if (is_execute_requested && !(vma->vm_flags & VM_EXEC))
> -                               memory_exception_data.failure.NoExecute = 1;
> -                       else
> -                               memory_exception_data.failure.NoExecute = 0;
> -               }
> +
> +               if (is_execute_requested && !(vma->vm_flags & VM_EXEC))
> +                       memory_exception_data.failure.NoExecute = 1;
> +               else
> +                       memory_exception_data.failure.NoExecute = 0;
>         }
>
>         up_read(&mm->mmap_sem);
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 6/6] drm/amdkfd: Enable Raven for KFD
       [not found]     ` <1531430694-23966-7-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-07-13 19:28       ` Alex Deucher
  0 siblings, 0 replies; 16+ messages in thread
From: Alex Deucher @ 2018-07-13 19:28 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Oded Gabbay, Yong Zhao, amd-gfx list

On Thu, Jul 12, 2018 at 5:24 PM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> From: Yong Zhao <Yong.Zhao@amd.com>
>
> Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>

Please add a patch description.  With that fixed:

Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 572235c..1b04871 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -70,6 +70,21 @@ static const struct kfd_device_info carrizo_device_info = {
>         .needs_pci_atomics = false,
>         .num_sdma_engines = 2,
>  };
> +
> +static const struct kfd_device_info raven_device_info = {
> +       .asic_family = CHIP_RAVEN,
> +       .max_pasid_bits = 16,
> +       .max_no_of_hqd  = 24,
> +       .doorbell_size  = 8,
> +       .ih_ring_entry_size = 8 * sizeof(uint32_t),
> +       .event_interrupt_class = &event_interrupt_class_v9,
> +       .num_of_watch_points = 4,
> +       .mqd_size_aligned = MQD_SIZE_ALIGNED,
> +       .supports_cwsr = true,
> +       .needs_iommu_device = true,
> +       .needs_pci_atomics = true,
> +       .num_sdma_engines = 1,
> +};
>  #endif
>
>  static const struct kfd_device_info hawaii_device_info = {
> @@ -259,6 +274,7 @@ static const struct kfd_deviceid supported_devices[] = {
>         { 0x9875, &carrizo_device_info },       /* Carrizo */
>         { 0x9876, &carrizo_device_info },       /* Carrizo */
>         { 0x9877, &carrizo_device_info },       /* Carrizo */
> +       { 0x15DD, &raven_device_info },         /* Raven */
>  #endif
>         { 0x67A0, &hawaii_device_info },        /* Hawaii */
>         { 0x67A1, &hawaii_device_info },        /* Hawaii */
> --
> 2.7.4
>
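
[Editor's illustration] The supported_devices[] table is a plain PCI-device-id to device-info lookup. A standalone sketch of that pattern follows; the struct fields and table contents are illustrative stand-ins, not the full kernel tables:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-in for struct kfd_device_info. */
struct kfd_device_info {
	const char *asic_name;
	unsigned int num_sdma_engines;
	int needs_iommu_device;
};

static const struct kfd_device_info raven_info  = { "raven",  1, 1 };
static const struct kfd_device_info vega10_info = { "vega10", 2, 0 };

static const struct { uint16_t did; const struct kfd_device_info *info; }
supported_devices[] = {
	{ 0x15DD, &raven_info },	/* Raven, as added by this patch */
	{ 0x6860, &vega10_info },	/* illustrative Vega10 entry */
};

static const struct kfd_device_info *lookup_device_info(uint16_t did)
{
	for (size_t i = 0;
	     i < sizeof(supported_devices) / sizeof(supported_devices[0]); i++)
		if (supported_devices[i].did == did)
			return supported_devices[i].info;
	return NULL;	/* unsupported device: KFD probe fails */
}
```

Enabling a new ASIC is then just a new info struct plus one table row, which is all this patch adds.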

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 5/6] drm/amdkfd: Optimize out some duplicated code in kfd_signal_iommu_event()
       [not found] ` <1531513068-3805-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2018-07-13 20:17   ` Felix Kuehling
  0 siblings, 0 replies; 16+ messages in thread
From: Felix Kuehling @ 2018-07-13 20:17 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	oded.gabbay-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Yong Zhao, Felix Kuehling

From: Yong Zhao <yong.zhao@amd.com>

memory_exception_data is already initialized for not-present faults.
It only needs to be overridden for permission faults.

Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_events.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index 4dcacce..e9f0e0a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -911,22 +911,18 @@ void kfd_signal_iommu_event(struct kfd_dev *dev, unsigned int pasid,
 	memory_exception_data.failure.NotPresent = 1;
 	memory_exception_data.failure.NoExecute = 0;
 	memory_exception_data.failure.ReadOnly = 0;
-	if (vma) {
-		if (vma->vm_start > address) {
-			memory_exception_data.failure.NotPresent = 1;
-			memory_exception_data.failure.NoExecute = 0;
+	if (vma && address >= vma->vm_start) {
+		memory_exception_data.failure.NotPresent = 0;
+
+		if (is_write_requested && !(vma->vm_flags & VM_WRITE))
+			memory_exception_data.failure.ReadOnly = 1;
+		else
 			memory_exception_data.failure.ReadOnly = 0;
-		} else {
-			memory_exception_data.failure.NotPresent = 0;
-			if (is_write_requested && !(vma->vm_flags & VM_WRITE))
-				memory_exception_data.failure.ReadOnly = 1;
-			else
-				memory_exception_data.failure.ReadOnly = 0;
-			if (is_execute_requested && !(vma->vm_flags & VM_EXEC))
-				memory_exception_data.failure.NoExecute = 1;
-			else
-				memory_exception_data.failure.NoExecute = 0;
-		}
+
+		if (is_execute_requested && !(vma->vm_flags & VM_EXEC))
+			memory_exception_data.failure.NoExecute = 1;
+		else
+			memory_exception_data.failure.NoExecute = 0;
 	}
 
 	up_read(&mm->mmap_sem);
-- 
2.7.4
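
[Editor's illustration] The simplified control flow can be captured as a small pure function: start from the not-present default and override only when the address falls inside a VMA (a permission fault). The bool parameters are illustrative stand-ins for the vma->vm_start and vma->vm_flags checks:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the failure bits in the memory exception data. */
struct mem_fault_failure {
	uint8_t NotPresent;
	uint8_t NoExecute;
	uint8_t ReadOnly;
};

static struct mem_fault_failure classify_fault(bool have_vma,
					       bool addr_in_vma,
					       bool vma_writable,
					       bool vma_executable,
					       bool write_requested,
					       bool exec_requested)
{
	struct mem_fault_failure f = { .NotPresent = 1 };	/* default */

	if (have_vma && addr_in_vma) {
		f.NotPresent = 0;	/* permission fault, not a page fault */
		f.ReadOnly  = (write_requested && !vma_writable);
		f.NoExecute = (exec_requested && !vma_executable);
	}
	return f;
}
```

Both original branches that re-set NotPresent/NoExecute/ReadOnly to the defaults become redundant, which is exactly the duplication the patch removes.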


^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-07-13 20:17 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
2018-07-12 21:24 [PATCH 0/6] Raven support for KFD Felix Kuehling
     [not found] ` <1531430694-23966-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-07-12 21:24   ` [PATCH 1/6] drm/amdkfd: Consolidate duplicate memory banks info in topology Felix Kuehling
     [not found]     ` <1531430694-23966-2-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-07-13 19:24       ` Alex Deucher
2018-07-12 21:24   ` [PATCH 2/6] drm/amdkfd: Make SDMA engine number an ASIC-dependent variable Felix Kuehling
     [not found]     ` <1531430694-23966-3-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-07-13 19:25       ` Alex Deucher
2018-07-12 21:24   ` [PATCH 3/6] drm/amdkfd: Avoid flooding dmesg on Raven due to IOMMU issues Felix Kuehling
     [not found]     ` <1531430694-23966-4-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-07-13 19:25       ` Alex Deucher
2018-07-12 21:24   ` [PATCH 4/6] drm/amdkfd: Workaround to accommodate Raven too many PPR issue Felix Kuehling
     [not found]     ` <1531430694-23966-5-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-07-13 19:26       ` Alex Deucher
2018-07-12 21:24   ` [PATCH 5/6] drm/amdkfd: Optimize out some duplicated code in kfd_signal_iommu_event() Felix Kuehling
     [not found]     ` <1531430694-23966-6-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-07-13 19:27       ` Alex Deucher
2018-07-12 21:24   ` [PATCH 6/6] drm/amdkfd: Enable Raven for KFD Felix Kuehling
     [not found]     ` <1531430694-23966-7-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-07-13 19:28       ` Alex Deucher
2018-07-12 22:28   ` [PATCH 0/6] Raven support " Mike Lothian
     [not found]     ` <CAHbf0-Fv9npKsGZmgrNLmx6tS7800YOP4ijFEWGjCa3EwfzftQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-07-13 19:20       ` Felix Kuehling
2018-07-13 20:17 [PATCH 0/6] Raven support for KFD v2 Felix Kuehling
     [not found] ` <1531513068-3805-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2018-07-13 20:17   ` [PATCH 5/6] drm/amdkfd: Optimize out some duplicated code in kfd_signal_iommu_event() Felix Kuehling
