* [PATCH 0/8] Retry page fault handling for Vega10
@ 2017-08-29 22:25 Felix Kuehling
       [not found] ` <1504045524-23853-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Felix Kuehling @ 2017-08-29 22:25 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

Rebased on the public drm-next-4.15-wip. Patch 8 from the WIP patch
series did not apply at all, because upstream KFD doesn't support
GPUVM yet.

The "lib: Closed hash table ..." change is updated and the same as
what I sent to LKML yesterday. Changes are mainly in the way the self
test is hooked up, Kconfig options and some checkpatch fixes. If it
takes too long to get accepted upstream, I could add it under
drivers/gpu/drm/amd/chash in the interim.

This is only compile tested on this branch. I can't do much more
because the upstream KFD doesn't support Vega10 and GPUVM yet. Someone
will have to add PASID support for graphics on top of this.

TODO:
* Finish upstreaming KFD
* Allocate PASIDs for graphics contexts
* Set up VMID-PASID mapping during graphics command submission
* Confirm that graphics page faults have the correct PASID in the IV

Felix Kuehling (8):
  drm/amdgpu: Fix error handling in amdgpu_vm_init
  drm/amdgpu: Add PASID management
  drm/radeon: Add PASID manager for KFD
  drm/amdkfd: Separate doorbell allocation from PASID
  drm/amdkfd: Use PASID manager from KGD
  drm/amdgpu: Add prescreening stage in IH processing
  lib: Closed hash table with low overhead
  drm/amdgpu: Track pending retry faults in IH and VM (v2)

 drivers/gpu/drm/Kconfig                           |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu.h               |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c            |  82 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h            |  12 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |  84 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            |  21 +-
 drivers/gpu/drm/amd/amdgpu/cik_ih.c               |  14 +
 drivers/gpu/drm/amd/amdgpu/cz_ih.c                |  14 +
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |  14 +
 drivers/gpu/drm/amd/amdgpu/si_ih.c                |  14 +
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |  14 +
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |  90 ++++
 drivers/gpu/drm/amd/amdkfd/kfd_device.c           |   7 -
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c         |  50 +-
 drivers/gpu/drm/amd/amdkfd/kfd_module.c           |   6 -
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c            |  90 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h             |  10 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c          |   6 +
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |   6 +
 drivers/gpu/drm/radeon/radeon_kfd.c               |  31 ++
 include/linux/chash.h                             | 358 +++++++++++++
 lib/Kconfig                                       |  24 +
 lib/Makefile                                      |   2 +
 lib/chash.c                                       | 622 ++++++++++++++++++++++
 27 files changed, 1489 insertions(+), 91 deletions(-)
 create mode 100644 include/linux/chash.h
 create mode 100644 lib/chash.c

-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* [PATCH 1/8] drm/amdgpu: Fix error handling in amdgpu_vm_init
       [not found] ` <1504045524-23853-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-29 22:25   ` Felix Kuehling
  2017-08-29 22:25   ` [PATCH 2/8] drm/amdgpu: Add PASID management Felix Kuehling
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Felix Kuehling @ 2017-08-29 22:25 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

Make sure vm->root.bo is not left reserved if amdgpu_bo_kmap fails.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 6ff3c1b..0e068fb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2541,9 +2541,9 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 			goto error_free_root;
 
 		r = amdgpu_bo_kmap(vm->root.base.bo, NULL);
+		amdgpu_bo_unreserve(vm->root.base.bo);
 		if (r)
 			goto error_free_root;
-		amdgpu_bo_unreserve(vm->root.base.bo);
 	}
 
 	return 0;
-- 
2.7.4
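
The fix above moves the unreserve call ahead of the error check so that both outcomes release the reservation. A minimal user-space sketch of that ordering follows. `struct sketch_bo`, `sketch_map()` and `sketch_init()` are hypothetical stand-ins invented for illustration and are not amdgpu functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer object with a reservation lock. */
struct sketch_bo {
	bool reserved;
	bool map_should_fail;
};

/* Hypothetical mapping step that may fail while the object is reserved. */
static int sketch_map(struct sketch_bo *bo)
{
	assert(bo->reserved); /* mapping requires the reservation */
	return bo->map_should_fail ? -ENOMEM : 0;
}

/*
 * Drop the reservation right after the call that needed it and before
 * looking at the result, so the success path and the error path both
 * leave the object unreserved.
 */
int sketch_init(struct sketch_bo *bo)
{
	int r;

	bo->reserved = true;	/* reserve */
	r = sketch_map(bo);
	bo->reserved = false;	/* unreserve on both paths */
	if (r)
		return r;

	return 0;
}
```

With the unreserve placed after the `if (r)` check instead, the failing case would return with `reserved` still set, which is the situation the patch avoids.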


* [PATCH 2/8] drm/amdgpu: Add PASID management
       [not found] ` <1504045524-23853-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-29 22:25   ` [PATCH 1/8] drm/amdgpu: Fix error handling in amdgpu_vm_init Felix Kuehling
@ 2017-08-29 22:25   ` Felix Kuehling
  2017-08-29 22:25   ` [PATCH 3/8] drm/radeon: Add PASID manager for KFD Felix Kuehling
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Felix Kuehling @ 2017-08-29 22:25 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

Allows assigning a PASID to a VM for identifying VMs involved in page
faults. The global PASID manager is also exported in the KFD
interface so that AMDGPU and KFD can share the PASID space.

PASIDs of different sizes can be requested. On APUs, the PASID size
is determined by the capabilities of the IOMMU. So KFD must be able
to allocate PASIDs in a smaller range.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            | 75 ++++++++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            | 14 ++++-
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |  6 ++
 6 files changed, 97 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index b9dbbf9..dc7e25c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -169,6 +169,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.get_vmem_size = get_vmem_size,
 	.get_gpu_clock_counter = get_gpu_clock_counter,
 	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
+	.alloc_pasid = amdgpu_vm_alloc_pasid,
+	.free_pasid = amdgpu_vm_free_pasid,
 	.program_sh_mem_settings = kgd_program_sh_mem_settings,
 	.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
 	.init_pipeline = kgd_init_pipeline,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 309f241..c678c69 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -128,6 +128,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.get_vmem_size = get_vmem_size,
 	.get_gpu_clock_counter = get_gpu_clock_counter,
 	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
+	.alloc_pasid = amdgpu_vm_alloc_pasid,
+	.free_pasid = amdgpu_vm_free_pasid,
 	.program_sh_mem_settings = kgd_program_sh_mem_settings,
 	.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
 	.init_pipeline = kgd_init_pipeline,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index e162290..79d9ab4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -825,7 +825,7 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
 	}
 
 	r = amdgpu_vm_init(adev, &fpriv->vm,
-			   AMDGPU_VM_CONTEXT_GFX);
+			   AMDGPU_VM_CONTEXT_GFX, 0);
 	if (r) {
 		kfree(fpriv);
 		goto out_suspend;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 0e068fb..f07b3b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -27,12 +27,59 @@
  */
 #include <linux/dma-fence-array.h>
 #include <linux/interval_tree_generic.h>
+#include <linux/idr.h>
 #include <drm/drmP.h>
 #include <drm/amdgpu_drm.h>
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 
 /*
+ * PASID manager
+ *
+ * PASIDs are global address space identifiers that can be shared
+ * between the GPU, an IOMMU and the driver. VMs on different devices
+ * may use the same PASID if they share the same address
+ * space. Therefore PASIDs are allocated using a global IDA. VMs are
+ * looked up from the PASID per amdgpu_device.
+ */
+static DEFINE_IDA(amdgpu_vm_pasid_ida);
+
+/**
+ * amdgpu_vm_alloc_pasid - Allocate a PASID
+ * @bits: Maximum width of the PASID in bits, must be at least 1
+ *
+ * Allocates a PASID of the given width while keeping smaller PASIDs
+ * available if possible.
+ *
+ * Returns a positive integer on success. Returns %-EINVAL if bits==0.
+ * Returns %-ENOSPC if no PASID was available. Returns %-ENOMEM on
+ * memory allocation failure.
+ */
+int amdgpu_vm_alloc_pasid(unsigned int bits)
+{
+	int pasid = -EINVAL;
+
+	for (bits = min(bits, 31U); bits > 0; bits--) {
+		pasid = ida_simple_get(&amdgpu_vm_pasid_ida,
+				       1U << (bits - 1), 1U << bits,
+				       GFP_KERNEL);
+		if (pasid != -ENOSPC)
+			break;
+	}
+
+	return pasid;
+}
+
+/**
+ * amdgpu_vm_free_pasid - Free a PASID
+ * @pasid: PASID to free
+ */
+void amdgpu_vm_free_pasid(unsigned int pasid)
+{
+	ida_simple_remove(&amdgpu_vm_pasid_ida, pasid);
+}
+
+/*
  * GPUVM
  * GPUVM is similar to the legacy gart on older asics, however
  * rather than there being a single global gart table
@@ -2466,7 +2513,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size, uint32_
  * Init @vm fields.
  */
 int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
-		   int vm_context)
+		   int vm_context, unsigned int pasid)
 {
 	const unsigned align = min(AMDGPU_VM_PTB_ALIGN_SIZE,
 		AMDGPU_VM_PTE_COUNT(adev) * 8);
@@ -2546,6 +2593,19 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 			goto error_free_root;
 	}
 
+	if (pasid) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
+		r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
+			      GFP_ATOMIC);
+		spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
+		if (r < 0)
+			goto error_free_root;
+
+		vm->pasid = pasid;
+	}
+
 	return 0;
 
 error_free_root:
@@ -2599,6 +2659,14 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 	bool prt_fini_needed = !!adev->gart.gart_funcs->set_prt;
 	int i;
 
+	if (vm->pasid) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
+		idr_remove(&adev->vm_manager.pasid_idr, vm->pasid);
+		spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
+	}
+
 	amd_sched_entity_fini(vm->entity.sched, &vm->entity);
 
 	if (!RB_EMPTY_ROOT(&vm->va)) {
@@ -2678,6 +2746,8 @@ void amdgpu_vm_manager_init(struct amdgpu_device *adev)
 	adev->vm_manager.vm_update_mode = 0;
 #endif
 
+	idr_init(&adev->vm_manager.pasid_idr);
+	spin_lock_init(&adev->vm_manager.pasid_lock);
 }
 
 /**
@@ -2691,6 +2761,9 @@ void amdgpu_vm_manager_fini(struct amdgpu_device *adev)
 {
 	unsigned i, j;
 
+	WARN_ON(!idr_is_empty(&adev->vm_manager.pasid_idr));
+	idr_destroy(&adev->vm_manager.pasid_idr);
+
 	for (i = 0; i < AMDGPU_MAX_VMHUBS; ++i) {
 		struct amdgpu_vm_id_manager *id_mgr =
 			&adev->vm_manager.id_mgr[i];
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 4e465e8..861d457 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -25,6 +25,7 @@
 #define __AMDGPU_VM_H__
 
 #include <linux/rbtree.h>
+#include <linux/idr.h>
 
 #include "gpu_scheduler.h"
 #include "amdgpu_sync.h"
@@ -145,8 +146,9 @@ struct amdgpu_vm {
 	/* Scheduler entity for page table updates */
 	struct amd_sched_entity	entity;
 
-	/* client id */
+	/* client id and PASID (TODO: replace client_id with PASID) */
 	u64                     client_id;
+	unsigned int		pasid;
 	/* dedicated to vm */
 	struct amdgpu_vm_id	*reserved_vmid[AMDGPU_MAX_VMHUBS];
 
@@ -217,12 +219,20 @@ struct amdgpu_vm_manager {
 	 * BIT1[= 0] Compute updated by SDMA [= 1] by CPU
 	 */
 	int					vm_update_mode;
+
+	/* PASID to VM mapping, will be used in interrupt context to
+	 * look up VM of a page fault
+	 */
+	struct idr				pasid_idr;
+	spinlock_t				pasid_lock;
 };
 
+int amdgpu_vm_alloc_pasid(unsigned int bits);
+void amdgpu_vm_free_pasid(unsigned int pasid);
 void amdgpu_vm_manager_init(struct amdgpu_device *adev);
 void amdgpu_vm_manager_fini(struct amdgpu_device *adev);
 int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
-		   int vm_context);
+		   int vm_context, unsigned int pasid);
 void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm);
 void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
 			 struct list_head *validated,
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index 94277cb..f516fd1 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -112,6 +112,9 @@ struct tile_config {
  *
  * @get_max_engine_clock_in_mhz: Retrieves maximum GPU clock in MHz
  *
+ * @alloc_pasid: Allocate a PASID
+ * @free_pasid: Free a PASID
+ *
  * @program_sh_mem_settings: A function that should initiate the memory
  * properties such as main aperture memory type (cache / non cached) and
  * secondary aperture base address, size and memory type.
@@ -160,6 +163,9 @@ struct kfd2kgd_calls {
 
 	uint32_t (*get_max_engine_clock_in_mhz)(struct kgd_dev *kgd);
 
+	int (*alloc_pasid)(unsigned int bits);
+	void (*free_pasid)(unsigned int pasid);
+
 	/* Register access functions */
 	void (*program_sh_mem_settings)(struct kgd_dev *kgd, uint32_t vmid,
 			uint32_t sh_mem_config,	uint32_t sh_mem_ape1_base,
-- 
2.7.4
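
The allocation loop in amdgpu_vm_alloc_pasid() tries the upper half of the requested width first and only falls back to narrower ranges when that half is full, which keeps small PASIDs free for callers that need them. The following user-space sketch reproduces that loop shape under stated assumptions: the `sketch_*` names and the 256-entry array are invented here as a stand-in for the kernel IDA, and the width is clamped to 8 bits instead of 31.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel IDA: a flat "in use" array. */
#define SKETCH_MAX_ID 256
static bool sketch_used[SKETCH_MAX_ID];

/* Return the lowest free ID in [start, end), or -ENOSPC if the range is full. */
static int sketch_range_get(unsigned int start, unsigned int end)
{
	assert(start < end);
	for (unsigned int id = start; id < end && id < SKETCH_MAX_ID; id++) {
		if (!sketch_used[id]) {
			sketch_used[id] = true;
			return (int)id;
		}
	}
	return -ENOSPC;
}

/*
 * Same loop shape as the patch: try [2^(bits-1), 2^bits) first, then
 * fall back to narrower ranges only when that half is exhausted.
 * ID 0 is never handed out because the narrowest range is [1, 2).
 */
int sketch_alloc_pasid(unsigned int bits)
{
	int pasid = -EINVAL;

	if (bits > 8)
		bits = 8;	/* sketch limit; the patch clamps to 31 */

	for (; bits > 0; bits--) {
		pasid = sketch_range_get(1U << (bits - 1), 1U << bits);
		if (pasid != -ENOSPC)
			break;
	}

	return pasid;
}
```

In this sketch a request for 8 bits starts at ID 128, while repeated 2-bit requests hand out 2 and 3 before falling back to 1 and finally reporting -ENOSPC.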


* [PATCH 3/8] drm/radeon: Add PASID manager for KFD
       [not found] ` <1504045524-23853-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-29 22:25   ` [PATCH 1/8] drm/amdgpu: Fix error handling in amdgpu_vm_init Felix Kuehling
  2017-08-29 22:25   ` [PATCH 2/8] drm/amdgpu: Add PASID management Felix Kuehling
@ 2017-08-29 22:25   ` Felix Kuehling
  2017-08-29 22:25   ` [PATCH 4/8] drm/amdkfd: Separate doorbell allocation from PASID Felix Kuehling
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Felix Kuehling @ 2017-08-29 22:25 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index f6578c9..a2ac8ac 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -58,6 +58,10 @@ static uint64_t get_vmem_size(struct kgd_dev *kgd);
 static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd);
 
 static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);
+
+static int alloc_pasid(unsigned int bits);
+static void free_pasid(unsigned int pasid);
+
 static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type);
 
 /*
@@ -112,6 +116,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.get_vmem_size = get_vmem_size,
 	.get_gpu_clock_counter = get_gpu_clock_counter,
 	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
+	.alloc_pasid = alloc_pasid,
+	.free_pasid = free_pasid,
 	.program_sh_mem_settings = kgd_program_sh_mem_settings,
 	.set_pasid_vmid_mapping = kgd_set_pasid_vmid_mapping,
 	.init_pipeline = kgd_init_pipeline,
@@ -341,6 +347,31 @@ static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd)
 	return rdev->pm.dpm.dyn_state.max_clock_voltage_on_ac.sclk / 100;
 }
 
+/*
+ * PASID manager
+ */
+static DEFINE_IDA(pasid_ida);
+
+static int alloc_pasid(unsigned int bits)
+{
+	int pasid = -EINVAL;
+
+	for (bits = min(bits, 31U); bits > 0; bits--) {
+		pasid = ida_simple_get(&pasid_ida,
+				       1U << (bits - 1), 1U << bits,
+				       GFP_KERNEL);
+		if (pasid != -ENOSPC)
+			break;
+	}
+
+	return pasid;
+}
+
+static void free_pasid(unsigned int pasid)
+{
+	ida_simple_remove(&pasid_ida, pasid);
+}
+
 static inline struct radeon_device *get_radeon_device(struct kgd_dev *kgd)
 {
 	return (struct radeon_device *)kgd;
-- 
2.7.4


* [PATCH 4/8] drm/amdkfd: Separate doorbell allocation from PASID
       [not found] ` <1504045524-23853-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (2 preceding siblings ...)
  2017-08-29 22:25   ` [PATCH 3/8] drm/radeon: Add PASID manager for KFD Felix Kuehling
@ 2017-08-29 22:25   ` Felix Kuehling
  2017-08-29 22:25   ` [PATCH 5/8] drm/amdkfd: Use PASID manager from KGD Felix Kuehling
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Felix Kuehling @ 2017-08-29 22:25 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

PASID management is moving into KGD. Limiting the PASID range to the
number of doorbell pages is no longer practical.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  7 -----
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c | 50 +++++++++++++++++++++----------
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h     | 10 +++----
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  6 ++++
 4 files changed, 45 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 61fff25..5df12b2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -168,13 +168,6 @@ static bool device_iommu_pasid_init(struct kfd_dev *kfd)
 	pasid_limit = min_t(unsigned int,
 			(unsigned int)(1 << kfd->device_info->max_pasid_bits),
 			iommu_info.max_pasids);
-	/*
-	 * last pasid is used for kernel queues doorbells
-	 * in the future the last pasid might be used for a kernel thread.
-	 */
-	pasid_limit = min_t(unsigned int,
-				pasid_limit,
-				kfd->doorbell_process_limit - 1);
 
 	err = amd_iommu_init_device(kfd->pdev, pasid_limit);
 	if (err < 0) {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index acf4d2a..feb76c2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -24,16 +24,15 @@
 #include <linux/mman.h>
 #include <linux/slab.h>
 #include <linux/io.h>
+#include <linux/idr.h>
 
 /*
- * This extension supports a kernel level doorbells management for
- * the kernel queues.
- * Basically the last doorbells page is devoted to kernel queues
- * and that's assures that any user process won't get access to the
- * kernel doorbells page
+ * This extension supports a kernel level doorbells management for the
+ * kernel queues using the first doorbell page reserved for the kernel.
  */
 
-#define KERNEL_DOORBELL_PASID 1
+static DEFINE_IDA(doorbell_ida);
+static unsigned int max_doorbell_slices;
 #define KFD_SIZE_OF_DOORBELL_IN_BYTES 4
 
 /*
@@ -84,13 +83,16 @@ int kfd_doorbell_init(struct kfd_dev *kfd)
 			(doorbell_aperture_size - doorbell_start_offset) /
 						doorbell_process_allocation();
 	else
-		doorbell_process_limit = 0;
+		return -ENOSPC;
+
+	if (!max_doorbell_slices ||
+	    doorbell_process_limit < max_doorbell_slices)
+		max_doorbell_slices = doorbell_process_limit;
 
 	kfd->doorbell_base = kfd->shared_resources.doorbell_physical_address +
 				doorbell_start_offset;
 
 	kfd->doorbell_id_offset = doorbell_start_offset / sizeof(u32);
-	kfd->doorbell_process_limit = doorbell_process_limit - 1;
 
 	kfd->doorbell_kernel_ptr = ioremap(kfd->doorbell_base,
 						doorbell_process_allocation());
@@ -185,11 +187,10 @@ u32 __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
 		return NULL;
 
 	/*
-	 * Calculating the kernel doorbell offset using "faked" kernel
-	 * pasid that allocated for kernel queues only
+	 * Calculating the kernel doorbell offset using the first
+	 * doorbell page.
 	 */
-	*doorbell_off = KERNEL_DOORBELL_PASID * (doorbell_process_allocation() /
-							sizeof(u32)) + inx;
+	*doorbell_off = kfd->doorbell_id_offset + inx;
 
 	pr_debug("Get kernel queue doorbell\n"
 			 "     doorbell offset   == 0x%08X\n"
@@ -228,11 +229,12 @@ unsigned int kfd_queue_id_to_doorbell(struct kfd_dev *kfd,
 {
 	/*
 	 * doorbell_id_offset accounts for doorbells taken by KGD.
-	 * pasid * doorbell_process_allocation/sizeof(u32) adjusts
-	 * to the process's doorbells
+	 * index * doorbell_process_allocation/sizeof(u32) adjusts to
+	 * the process's doorbells.
 	 */
 	return kfd->doorbell_id_offset +
-		process->pasid * (doorbell_process_allocation()/sizeof(u32)) +
+		process->doorbell_index
+		* doorbell_process_allocation() / sizeof(u32) +
 		queue_id;
 }
 
@@ -250,5 +252,21 @@ phys_addr_t kfd_get_process_doorbells(struct kfd_dev *dev,
 					struct kfd_process *process)
 {
 	return dev->doorbell_base +
-		process->pasid * doorbell_process_allocation();
+		process->doorbell_index * doorbell_process_allocation();
+}
+
+int kfd_alloc_process_doorbells(struct kfd_process *process)
+{
+	int r = ida_simple_get(&doorbell_ida, 1, max_doorbell_slices,
+				GFP_KERNEL);
+	if (r > 0)
+		process->doorbell_index = r;
+
+	return r;
+}
+
+void kfd_free_process_doorbells(struct kfd_process *process)
+{
+	if (process->doorbell_index)
+		ida_simple_remove(&doorbell_ida, process->doorbell_index);
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index b397ec7..4cb90f5 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -157,9 +157,6 @@ struct kfd_dev {
 					 * to HW doorbell, GFX reserved some
 					 * at the start)
 					 */
-	size_t doorbell_process_limit;	/* Number of processes we have doorbell
-					 * space for.
-					 */
 	u32 __iomem *doorbell_kernel_ptr; /* This is a pointer for a doorbells
 					   * page used by kernel queue
 					   */
@@ -495,6 +492,7 @@ struct kfd_process {
 	struct rcu_head	rcu;
 
 	unsigned int pasid;
+	unsigned int doorbell_index;
 
 	/*
 	 * List of kfd_process_device structures,
@@ -583,6 +581,10 @@ void write_kernel_doorbell(u32 __iomem *db, u32 value);
 unsigned int kfd_queue_id_to_doorbell(struct kfd_dev *kfd,
 					struct kfd_process *process,
 					unsigned int queue_id);
+phys_addr_t kfd_get_process_doorbells(struct kfd_dev *dev,
+					struct kfd_process *process);
+int kfd_alloc_process_doorbells(struct kfd_process *process);
+void kfd_free_process_doorbells(struct kfd_process *process);
 
 /* GTT Sub-Allocator */
 
@@ -694,8 +696,6 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 void pm_release_ib(struct packet_manager *pm);
 
 uint64_t kfd_get_number_elems(struct kfd_dev *kfd);
-phys_addr_t kfd_get_process_doorbells(struct kfd_dev *dev,
-					struct kfd_process *process);
 
 /* Events */
 extern const struct kfd_event_interrupt_class event_interrupt_class_cik;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index c74cf22..9e65ce3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -183,6 +183,7 @@ static void kfd_process_wq_release(struct work_struct *work)
 	kfd_event_free_process(p);
 
 	kfd_pasid_free(p->pasid);
+	kfd_free_process_doorbells(p);
 
 	mutex_unlock(&p->mutex);
 
@@ -288,6 +289,9 @@ static struct kfd_process *create_process(const struct task_struct *thread)
 	if (process->pasid == 0)
 		goto err_alloc_pasid;
 
+	if (kfd_alloc_process_doorbells(process) < 0)
+		goto err_alloc_doorbells;
+
 	mutex_init(&process->mutex);
 
 	process->mm = thread->mm;
@@ -329,6 +333,8 @@ static struct kfd_process *create_process(const struct task_struct *thread)
 	mmu_notifier_unregister_no_release(&process->mmu_notifier, process->mm);
 err_mmu_notifier:
 	mutex_destroy(&process->mutex);
+	kfd_free_process_doorbells(process);
+err_alloc_doorbells:
 	kfd_pasid_free(process->pasid);
 err_alloc_pasid:
 	kfree(process->queues);
-- 
2.7.4
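
After this change a process's doorbells are located by its doorbell_index rather than by its PASID, with slice 0 kept for kernel queues. The arithmetic can be sketched as below. The 4096-byte slice size and the `sketch_*` names are assumptions made for illustration; the driver derives the real slice size in doorbell_process_allocation().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define SKETCH_SLICE_BYTES	4096u	/* doorbell bytes reserved per process */
#define SKETCH_DOORBELL_BYTES	4u	/* as KFD_SIZE_OF_DOORBELL_IN_BYTES */

/*
 * Doorbell ID (in 32-bit doorbell units) of a queue: doorbells taken
 * by KGD, plus the process's slice, plus the queue within that slice.
 */
unsigned int sketch_queue_to_doorbell(unsigned int id_offset,
				      unsigned int doorbell_index,
				      unsigned int queue_id)
{
	unsigned int per_slice = SKETCH_SLICE_BYTES / SKETCH_DOORBELL_BYTES;

	assert(queue_id < per_slice);
	return id_offset + doorbell_index * per_slice + queue_id;
}

/* Physical address of a process's slice; slice 0 is the kernel's. */
uint64_t sketch_process_doorbells(uint64_t doorbell_base,
				  unsigned int doorbell_index)
{
	return doorbell_base + (uint64_t)doorbell_index * SKETCH_SLICE_BYTES;
}
```

Because user processes get indices starting at 1, their slices never overlap the first slice that the kernel queues use.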


* [PATCH 5/8] drm/amdkfd: Use PASID manager from KGD
       [not found] ` <1504045524-23853-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (3 preceding siblings ...)
  2017-08-29 22:25   ` [PATCH 4/8] drm/amdkfd: Separate doorbell allocation from PASID Felix Kuehling
@ 2017-08-29 22:25   ` Felix Kuehling
       [not found]     ` <1504045524-23853-6-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-29 22:25   ` [PATCH 6/8] drm/amdgpu: Add prescreening stage in IH processing Felix Kuehling
                     ` (3 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Felix Kuehling @ 2017-08-29 22:25 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_module.c |  6 ---
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c  | 90 ++++++++++++++-------------------
 2 files changed, 38 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 0d73bea..6c5a9ca 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -103,10 +103,6 @@ static int __init kfd_module_init(void)
 		return -1;
 	}
 
-	err = kfd_pasid_init();
-	if (err < 0)
-		return err;
-
 	err = kfd_chardev_init();
 	if (err < 0)
 		goto err_ioctl;
@@ -126,7 +122,6 @@ static int __init kfd_module_init(void)
 err_topology:
 	kfd_chardev_exit();
 err_ioctl:
-	kfd_pasid_exit();
 	return err;
 }
 
@@ -137,7 +132,6 @@ static void __exit kfd_module_exit(void)
 	kfd_process_destroy_wq();
 	kfd_topology_shutdown();
 	kfd_chardev_exit();
-	kfd_pasid_exit();
 	dev_info(kfd_device, "Removed module\n");
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
index 1e06de0..d6a7961 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
@@ -20,78 +20,64 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-#include <linux/slab.h>
 #include <linux/types.h>
 #include "kfd_priv.h"
 
-static unsigned long *pasid_bitmap;
-static unsigned int pasid_limit;
-static DEFINE_MUTEX(pasid_mutex);
-
-int kfd_pasid_init(void)
-{
-	pasid_limit = KFD_MAX_NUM_OF_PROCESSES;
-
-	pasid_bitmap = kcalloc(BITS_TO_LONGS(pasid_limit), sizeof(long),
-				GFP_KERNEL);
-	if (!pasid_bitmap)
-		return -ENOMEM;
-
-	set_bit(0, pasid_bitmap); /* PASID 0 is reserved. */
-
-	return 0;
-}
-
-void kfd_pasid_exit(void)
-{
-	kfree(pasid_bitmap);
-}
+static unsigned int pasid_bits = 16;
+static const struct kfd2kgd_calls *kfd2kgd;
 
 bool kfd_set_pasid_limit(unsigned int new_limit)
 {
-	if (new_limit < pasid_limit) {
-		bool ok;
-
-		mutex_lock(&pasid_mutex);
-
-		/* ensure that no pasids >= new_limit are in-use */
-		ok = (find_next_bit(pasid_bitmap, pasid_limit, new_limit) ==
-								pasid_limit);
-		if (ok)
-			pasid_limit = new_limit;
-
-		mutex_unlock(&pasid_mutex);
-
-		return ok;
+	if (new_limit < 2)
+		return false;
+
+	if (new_limit < (1U << pasid_bits)) {
+		if (kfd2kgd)
+			/* We've already allocated user PASIDs, too late to
+			 * change the limit
+			 */
+			return false;
+
+		while (new_limit < (1U << pasid_bits))
+			pasid_bits--;
 	}
 
 	return true;
 }
 
-inline unsigned int kfd_get_pasid_limit(void)
+unsigned int kfd_get_pasid_limit(void)
 {
-	return pasid_limit;
+	return 1U << pasid_bits;
 }
 
 unsigned int kfd_pasid_alloc(void)
 {
-	unsigned int found;
-
-	mutex_lock(&pasid_mutex);
-
-	found = find_first_zero_bit(pasid_bitmap, pasid_limit);
-	if (found == pasid_limit)
-		found = 0;
-	else
-		set_bit(found, pasid_bitmap);
+	int r;
+
+	/* Find the first best KFD device for calling KGD */
+	if (!kfd2kgd) {
+		struct kfd_dev *dev = NULL;
+		unsigned int i = 0;
+
+		while ((dev = kfd_topology_enum_kfd_devices(i)) != NULL) {
+			if (dev && dev->kfd2kgd) {
+				kfd2kgd = dev->kfd2kgd;
+				break;
+			}
+			i++;
+		}
+
+		if (!kfd2kgd)
+			return 0;
+	}
 
-	mutex_unlock(&pasid_mutex);
+	r = kfd2kgd->alloc_pasid(pasid_bits);
 
-	return found;
+	return r > 0 ? r : 0;
 }
 
 void kfd_pasid_free(unsigned int pasid)
 {
-	if (!WARN_ON(pasid == 0 || pasid >= pasid_limit))
-		clear_bit(pasid, pasid_bitmap);
+	if (kfd2kgd)
+		kfd2kgd->free_pasid(pasid);
 }
-- 
2.7.4
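
kfd_set_pasid_limit() now stores a bit width instead of a count, shrinking pasid_bits until 2^pasid_bits no longer exceeds the requested limit. A small stand-alone sketch of that conversion follows; `sketch_limit_to_bits()` is a hypothetical helper written for illustration and does not exist in the driver.

```c
#include <assert.h>

/*
 * Largest width, starting from start_bits, such that 2^bits <= limit.
 * Mirrors the while loop in the patch. The limit must be at least 2,
 * so the loop always stops with a width of 1 bit or more.
 */
unsigned int sketch_limit_to_bits(unsigned int limit, unsigned int start_bits)
{
	unsigned int bits = start_bits;

	assert(limit >= 2 && start_bits >= 1 && start_bits < 32);
	while (limit < (1U << bits))
		bits--;

	return bits;
}
```

A limit that is not a power of two rounds down, so the value reported by kfd_get_pasid_limit() may be smaller than the limit that was requested.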


* [PATCH 6/8] drm/amdgpu: Add prescreening stage in IH processing
       [not found] ` <1504045524-23853-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (4 preceding siblings ...)
  2017-08-29 22:25   ` [PATCH 5/8] drm/amdkfd: Use PASID manager from KGD Felix Kuehling
@ 2017-08-29 22:25   ` Felix Kuehling
  2017-08-29 22:25   ` [PATCH 7/8] lib: Closed hash table with low overhead Felix Kuehling
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Felix Kuehling @ 2017-08-29 22:25 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

To filter out high-frequency interrupts that can be safely ignored.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h     |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c  |  6 ++++++
 drivers/gpu/drm/amd/amdgpu/cik_ih.c     | 14 ++++++++++++++
 drivers/gpu/drm/amd/amdgpu/cz_ih.c      | 14 ++++++++++++++
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c | 14 ++++++++++++++
 drivers/gpu/drm/amd/amdgpu/si_ih.c      | 14 ++++++++++++++
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c   | 14 ++++++++++++++
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c  | 14 ++++++++++++++
 8 files changed, 92 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 103635a..8db6b23 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -332,6 +332,7 @@ struct amdgpu_gart_funcs {
 struct amdgpu_ih_funcs {
 	/* ring read/write ptr handling, called from interrupt context */
 	u32 (*get_wptr)(struct amdgpu_device *adev);
+	bool (*prescreen_iv)(struct amdgpu_device *adev);
 	void (*decode_iv)(struct amdgpu_device *adev,
 			  struct amdgpu_iv_entry *entry);
 	void (*set_rptr)(struct amdgpu_device *adev);
@@ -1759,6 +1760,7 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
 #define amdgpu_ring_init_cond_exec(r) (r)->funcs->init_cond_exec((r))
 #define amdgpu_ring_patch_cond_exec(r,o) (r)->funcs->patch_cond_exec((r),(o))
 #define amdgpu_ih_get_wptr(adev) (adev)->irq.ih_funcs->get_wptr((adev))
+#define amdgpu_ih_prescreen_iv(adev) (adev)->irq.ih_funcs->prescreen_iv((adev))
 #define amdgpu_ih_decode_iv(adev, iv) (adev)->irq.ih_funcs->decode_iv((adev), (iv))
 #define amdgpu_ih_set_rptr(adev) (adev)->irq.ih_funcs->set_rptr((adev))
 #define amdgpu_display_vblank_get_counter(adev, crtc) (adev)->mode_info.funcs->vblank_get_counter((adev), (crtc))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
index 3ab4c65..c834a40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
@@ -169,6 +169,12 @@ int amdgpu_ih_process(struct amdgpu_device *adev)
 	while (adev->irq.ih.rptr != wptr) {
 		u32 ring_index = adev->irq.ih.rptr >> 2;
 
+		/* Prescreening of high-frequency interrupts */
+		if (!amdgpu_ih_prescreen_iv(adev)) {
+			adev->irq.ih.rptr &= adev->irq.ih.ptr_mask;
+			continue;
+		}
+
 		/* Before dispatching irq to IP blocks, send it to amdkfd */
 		amdgpu_amdkfd_interrupt(adev,
 				(const void *) &adev->irq.ih.ring[ring_index]);
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
index b891843..07d3d89 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
@@ -228,6 +228,19 @@ static u32 cik_ih_get_wptr(struct amdgpu_device *adev)
  * [127:96] - reserved
  */
 
+/**
+ * cik_ih_prescreen_iv - prescreen an interrupt vector
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns true if the interrupt vector should be further processed.
+ */
+static bool cik_ih_prescreen_iv(struct amdgpu_device *adev)
+{
+	/* Process all interrupts */
+	return true;
+}
+
  /**
  * cik_ih_decode_iv - decode an interrupt vector
  *
@@ -433,6 +446,7 @@ static const struct amd_ip_funcs cik_ih_ip_funcs = {
 
 static const struct amdgpu_ih_funcs cik_ih_funcs = {
 	.get_wptr = cik_ih_get_wptr,
+	.prescreen_iv = cik_ih_prescreen_iv,
 	.decode_iv = cik_ih_decode_iv,
 	.set_rptr = cik_ih_set_rptr
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
index 0c1209c..b6cdf4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
@@ -208,6 +208,19 @@ static u32 cz_ih_get_wptr(struct amdgpu_device *adev)
 }
 
 /**
+ * cz_ih_prescreen_iv - prescreen an interrupt vector
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns true if the interrupt vector should be further processed.
+ */
+static bool cz_ih_prescreen_iv(struct amdgpu_device *adev)
+{
+	/* Process all interrupts */
+	return true;
+}
+
+/**
  * cz_ih_decode_iv - decode an interrupt vector
  *
  * @adev: amdgpu_device pointer
@@ -414,6 +427,7 @@ static const struct amd_ip_funcs cz_ih_ip_funcs = {
 
 static const struct amdgpu_ih_funcs cz_ih_funcs = {
 	.get_wptr = cz_ih_get_wptr,
+	.prescreen_iv = cz_ih_prescreen_iv,
 	.decode_iv = cz_ih_decode_iv,
 	.set_rptr = cz_ih_set_rptr
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
index 7a0ea27..65ed6d3 100644
--- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
@@ -208,6 +208,19 @@ static u32 iceland_ih_get_wptr(struct amdgpu_device *adev)
 }
 
 /**
+ * iceland_ih_prescreen_iv - prescreen an interrupt vector
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns true if the interrupt vector should be further processed.
+ */
+static bool iceland_ih_prescreen_iv(struct amdgpu_device *adev)
+{
+	/* Process all interrupts */
+	return true;
+}
+
+/**
  * iceland_ih_decode_iv - decode an interrupt vector
  *
  * @adev: amdgpu_device pointer
@@ -412,6 +425,7 @@ static const struct amd_ip_funcs iceland_ih_ip_funcs = {
 
 static const struct amdgpu_ih_funcs iceland_ih_funcs = {
 	.get_wptr = iceland_ih_get_wptr,
+	.prescreen_iv = iceland_ih_prescreen_iv,
 	.decode_iv = iceland_ih_decode_iv,
 	.set_rptr = iceland_ih_set_rptr
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c
index ce25e03..588fa4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
@@ -118,6 +118,19 @@ static u32 si_ih_get_wptr(struct amdgpu_device *adev)
 	return (wptr & adev->irq.ih.ptr_mask);
 }
 
+/**
+ * si_ih_prescreen_iv - prescreen an interrupt vector
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns true if the interrupt vector should be further processed.
+ */
+static bool si_ih_prescreen_iv(struct amdgpu_device *adev)
+{
+	/* Process all interrupts */
+	return true;
+}
+
 static void si_ih_decode_iv(struct amdgpu_device *adev,
 			     struct amdgpu_iv_entry *entry)
 {
@@ -288,6 +301,7 @@ static const struct amd_ip_funcs si_ih_ip_funcs = {
 
 static const struct amdgpu_ih_funcs si_ih_funcs = {
 	.get_wptr = si_ih_get_wptr,
+	.prescreen_iv = si_ih_prescreen_iv,
 	.decode_iv = si_ih_decode_iv,
 	.set_rptr = si_ih_set_rptr
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
index 923df2c..5ed0069 100644
--- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
@@ -219,6 +219,19 @@ static u32 tonga_ih_get_wptr(struct amdgpu_device *adev)
 }
 
 /**
+ * tonga_ih_prescreen_iv - prescreen an interrupt vector
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns true if the interrupt vector should be further processed.
+ */
+static bool tonga_ih_prescreen_iv(struct amdgpu_device *adev)
+{
+	/* Process all interrupts */
+	return true;
+}
+
+/**
  * tonga_ih_decode_iv - decode an interrupt vector
  *
  * @adev: amdgpu_device pointer
@@ -478,6 +491,7 @@ static const struct amd_ip_funcs tonga_ih_ip_funcs = {
 
 static const struct amdgpu_ih_funcs tonga_ih_funcs = {
 	.get_wptr = tonga_ih_get_wptr,
+	.prescreen_iv = tonga_ih_prescreen_iv,
 	.decode_iv = tonga_ih_decode_iv,
 	.set_rptr = tonga_ih_set_rptr
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
index 56150e8..eda4771 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
@@ -227,6 +227,19 @@ static u32 vega10_ih_get_wptr(struct amdgpu_device *adev)
 }
 
 /**
+ * vega10_ih_prescreen_iv - prescreen an interrupt vector
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns true if the interrupt vector should be further processed.
+ */
+static bool vega10_ih_prescreen_iv(struct amdgpu_device *adev)
+{
+	/* TODO: Filter known pending page faults */
+	return true;
+}
+
+/**
  * vega10_ih_decode_iv - decode an interrupt vector
  *
  * @adev: amdgpu_device pointer
@@ -410,6 +423,7 @@ const struct amd_ip_funcs vega10_ih_ip_funcs = {
 
 static const struct amdgpu_ih_funcs vega10_ih_funcs = {
 	.get_wptr = vega10_ih_get_wptr,
+	.prescreen_iv = vega10_ih_prescreen_iv,
 	.decode_iv = vega10_ih_decode_iv,
 	.set_rptr = vega10_ih_set_rptr
 };
-- 
2.7.4


* [PATCH 7/8] lib: Closed hash table with low overhead
       [not found] ` <1504045524-23853-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (5 preceding siblings ...)
  2017-08-29 22:25   ` [PATCH 6/8] drm/amdgpu: Add prescreening stage in IH processing Felix Kuehling
@ 2017-08-29 22:25   ` Felix Kuehling
  2017-08-29 22:25   ` [PATCH 8/8] drm/amdgpu: Track pending retry faults in IH and VM (v2) Felix Kuehling
  2017-09-06 21:53   ` [PATCH 0/8] Retry page fault handling for Vega10 Felix Kuehling
  8 siblings, 0 replies; 18+ messages in thread
From: Felix Kuehling @ 2017-08-29 22:25 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

This adds a statically sized closed hash table implementation with
low memory and CPU overhead. The API is inspired by kfifo.

Storing, retrieving and deleting data does not involve any dynamic
memory management, which makes it ideal for use in interrupt context.
Static memory usage per entry comprises a 32- or 64-bit hash key, two
bits for occupancy tracking and the value itself (fixed size per table).
No list heads or pointers are needed. Therefore this data structure
should be quite cache-friendly, too.

It uses linear probing and lazy deletion. During lookups free space
is reclaimed and entries relocated to speed up future lookups.
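
As a rough illustration of linear probing with lazy deletion and two
state bits per slot, here is a minimal userspace sketch. It is not the
code from this patch: the toy_* names, the fixed 8-slot size and the
hash function are assumptions made for the example, values are left
out, and the relocation and tombstone reclaim done during lookups are
omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_BITS 3                      /* 2^3 = 8 slots */
#define TOY_SIZE (1u << TOY_BITS)
#define TOY_MASK (TOY_SIZE - 1)

/* Slot states: occupied=0 is empty; occupied=1, valid=0 is a
 * tombstone; both bits set is a live entry. */
struct toy_chash {
	uint32_t keys[TOY_SIZE];
	uint8_t occupied;               /* one bit per slot */
	uint8_t valid;                  /* one bit per slot */
};

/* Hypothetical multiplicative hash standing in for hash_32() */
static unsigned int toy_hash(uint32_t key)
{
	return (key * 0x61C88647u) >> (32 - TOY_BITS);
}

/* Returns the slot holding key, or -1 if it is not in the table */
static int toy_find(const struct toy_chash *t, uint32_t key)
{
	unsigned int slot = toy_hash(key);

	for (unsigned int n = 0; n < TOY_SIZE; n++) {
		uint8_t bit = (uint8_t)(1u << slot);

		if (!(t->occupied & bit))
			return -1;      /* empty slot ends the probe */
		if ((t->valid & bit) && t->keys[slot] == key)
			return (int)slot;
		slot = (slot + 1) & TOY_MASK;   /* tombstone or other key */
	}
	return -1;
}

/* Returns 1 if key already existed, 0 if added, -1 if the table is full */
static int toy_insert(struct toy_chash *t, uint32_t key)
{
	unsigned int slot = toy_hash(key);
	int avail = -1;

	for (unsigned int n = 0; n < TOY_SIZE; n++) {
		uint8_t bit = (uint8_t)(1u << slot);

		if ((t->valid & bit) && t->keys[slot] == key)
			return 1;
		if (!(t->valid & bit) && avail < 0)
			avail = (int)slot;      /* first tombstone or empty slot */
		if (!(t->occupied & bit))
			break;
		slot = (slot + 1) & TOY_MASK;
	}
	if (avail < 0)
		return -1;

	t->keys[avail] = key;
	t->occupied |= (uint8_t)(1u << avail);
	t->valid |= (uint8_t)(1u << avail);
	return 0;
}

/* Lazy deletion: clear only the valid bit so later probes walk past */
static bool toy_remove(struct toy_chash *t, uint32_t key)
{
	int slot = toy_find(t, key);

	if (slot < 0)
		return false;
	t->valid &= (uint8_t)~(1u << slot);
	return true;
}
```

No call allocates memory, and a removed key leaves a tombstone behind so
that entries placed after it by probing stay reachable.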

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
---
 include/linux/chash.h | 358 +++++++++++++++++++++++++++++
 lib/Kconfig           |  24 ++
 lib/Makefile          |   2 +
 lib/chash.c           | 622 ++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 1006 insertions(+)
 create mode 100644 include/linux/chash.h
 create mode 100644 lib/chash.c

diff --git a/include/linux/chash.h b/include/linux/chash.h
new file mode 100644
index 0000000..c89b92b
--- /dev/null
+++ b/include/linux/chash.h
@@ -0,0 +1,358 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef _LINUX_CHASH_H
+#define _LINUX_CHASH_H
+
+#include <linux/types.h>
+#include <linux/hash.h>
+#include <linux/bug.h>
+#include <linux/bitops.h>
+
+struct __chash_table {
+	u8 bits;
+	u8 key_size;
+	unsigned int value_size;
+	u32 size_mask;
+	unsigned long *occup_bitmap, *valid_bitmap;
+	union {
+		u32 *keys32;
+		u64 *keys64;
+	};
+	u8 *values;
+
+#ifdef CONFIG_CHASH_STATS
+	u64 hits, hits_steps, hits_time_ns;
+	u64 miss, miss_steps, miss_time_ns;
+	u64 relocs, reloc_dist;
+#endif
+};
+
+#define __CHASH_BITMAP_SIZE(bits)				\
+	(((1 << (bits)) + BITS_PER_LONG - 1) / BITS_PER_LONG)
+#define __CHASH_ARRAY_SIZE(bits, size)				\
+	((((size) << (bits)) + sizeof(long) - 1) / sizeof(long))
+
+#define __CHASH_DATA_SIZE(bits, key_size, value_size)	\
+	(__CHASH_BITMAP_SIZE(bits) * 2 +		\
+	 __CHASH_ARRAY_SIZE(bits, key_size) +		\
+	 __CHASH_ARRAY_SIZE(bits, value_size))
+
+#define STRUCT_CHASH_TABLE(bits, key_size, value_size)			\
+	struct {							\
+		struct __chash_table table;				\
+		unsigned long data					\
+			[__CHASH_DATA_SIZE(bits, key_size, value_size)];\
+	}
+
+/**
+ * struct chash_table - Dynamically allocated closed hash table
+ *
+ * Use this struct for dynamically allocated hash tables (using
+ * chash_table_alloc and chash_table_free), where the size is
+ * determined at runtime.
+ */
+struct chash_table {
+	struct __chash_table table;
+	unsigned long *data;
+};
+
+/**
+ * DECLARE_CHASH_TABLE - macro to declare a closed hash table
+ * @table: name of the declared hash table
+ * @bts: Table size will be 2^bits entries
+ * @key_sz: Size of hash keys in bytes, 4 or 8
+ * @val_sz: Size of data values in bytes, can be 0
+ *
+ * This declares the hash table variable with a static size.
+ *
+ * The closed hash table stores key-value pairs with low memory and
+ * lookup overhead. In operation it performs no dynamic memory
+ * management. The data being stored does not require any
+ * list_heads. The hash table performs best with small @val_sz and as
+ * long as some space (about 50%) is left free in the table. But the
+ * table can still work reasonably efficiently even when filled up to
+ * about 90%. If bigger data items need to be stored and looked up,
+ * store the pointer to it as value in the hash table.
+ *
+ * @val_sz may be 0. This can be useful when all the stored
+ * information is contained in the key itself and the fact that it is
+ * in the hash table (or not).
+ */
+#define DECLARE_CHASH_TABLE(table, bts, key_sz, val_sz)		\
+	STRUCT_CHASH_TABLE(bts, key_sz, val_sz) table
+
+#ifdef CONFIG_CHASH_STATS
+#define __CHASH_STATS_INIT(prefix),		\
+		prefix.hits = 0,		\
+		prefix.hits_steps = 0,		\
+		prefix.hits_time_ns = 0,	\
+		prefix.miss = 0,		\
+		prefix.miss_steps = 0,		\
+		prefix.miss_time_ns = 0,	\
+		prefix.relocs = 0,		\
+		prefix.reloc_dist = 0
+#else
+#define __CHASH_STATS_INIT(prefix)
+#endif
+
+#define __CHASH_TABLE_INIT(prefix, data, bts, key_sz, val_sz)	\
+	prefix.bits = (bts),					\
+		prefix.key_size = (key_sz),			\
+		prefix.value_size = (val_sz),			\
+		prefix.size_mask = ((1 << bts) - 1),		\
+		prefix.occup_bitmap = &data[0],			\
+		prefix.valid_bitmap = &data			\
+			[__CHASH_BITMAP_SIZE(bts)],		\
+		prefix.keys64 = (u64 *)&data			\
+			[__CHASH_BITMAP_SIZE(bts) * 2],		\
+		prefix.values = (u8 *)&data			\
+			[__CHASH_BITMAP_SIZE(bts) * 2 +		\
+			 __CHASH_ARRAY_SIZE(bts, key_sz)]	\
+		__CHASH_STATS_INIT(prefix)
+
+/**
+ * DEFINE_CHASH_TABLE - macro to define and initialize a closed hash table
+ * @tbl: name of the declared hash table
+ * @bts: Table size will be 2^bits entries
+ * @key_sz: Size of hash keys in bytes, 4 or 8
+ * @val_sz: Size of data values in bytes, can be 0
+ *
+ * Note: the macro can be used for global and local hash table variables.
+ */
+#define DEFINE_CHASH_TABLE(tbl, bts, key_sz, val_sz)			\
+	DECLARE_CHASH_TABLE(tbl, bts, key_sz, val_sz) = {		\
+		.table = {						\
+			__CHASH_TABLE_INIT(, (tbl).data, bts, key_sz, val_sz) \
+		},							\
+		.data = {0}						\
+	}
+
+/**
+ * INIT_CHASH_TABLE - Initialize a hash table declared by DECLARE_CHASH_TABLE
+ * @tbl: name of the declared hash table
+ * @bts: Table size will be 2^bits entries
+ * @key_sz: Size of hash keys in bytes, 4 or 8
+ * @val_sz: Size of data values in bytes, can be 0
+ */
+#define INIT_CHASH_TABLE(tbl, bts, key_sz, val_sz)			\
+	__CHASH_TABLE_INIT(((tbl).table), (tbl).data, bts, key_sz, val_sz)
+
+int chash_table_alloc(struct chash_table *table, u8 bits, u8 key_size,
+		      unsigned int value_size, gfp_t gfp_mask);
+void chash_table_free(struct chash_table *table);
+
+/**
+ * chash_table_dump_stats - Dump statistics of a closed hash table
+ * @tbl: Pointer to the table structure
+ *
+ * Dumps some performance statistics of the table gathered in operation
+ * in the kernel log using pr_debug. If CONFIG_DYNAMIC_DEBUG is enabled,
+ * user must turn on messages for chash.c (file chash.c +p).
+ */
+#ifdef CONFIG_CHASH_STATS
+#define chash_table_dump_stats(tbl) __chash_table_dump_stats(&(*tbl).table)
+
+void __chash_table_dump_stats(struct __chash_table *table);
+#else
+#define chash_table_dump_stats(tbl)
+#endif
+
+/**
+ * chash_table_reset_stats - Reset statistics of a closed hash table
+ * @tbl: Pointer to the table structure
+ */
+#ifdef CONFIG_CHASH_STATS
+#define chash_table_reset_stats(tbl) __chash_table_reset_stats(&(*tbl).table)
+
+static inline void __chash_table_reset_stats(struct __chash_table *table)
+{
+	(void)table __CHASH_STATS_INIT((*table));
+}
+#else
+#define chash_table_reset_stats(tbl)
+#endif
+
+/**
+ * chash_table_copy_in - Copy a new value into the hash table
+ * @tbl: Pointer to the table structure
+ * @key: Key of the entry to add or update
+ * @value: Pointer to value to copy, may be NULL
+ *
+ * If @key already has an entry, its value is replaced. Otherwise a
+ * new entry is added. If @value is NULL, the value is left unchanged
+ * or uninitialized. Returns 1 if an entry already existed, 0 if a new
+ * entry was added or %-ENOMEM if there was no free space in the
+ * table.
+ */
+#define chash_table_copy_in(tbl, key, value)			\
+	__chash_table_copy_in(&(*tbl).table, key, value)
+
+int __chash_table_copy_in(struct __chash_table *table, u64 key,
+			  const void *value);
+
+/**
+ * chash_table_copy_out - Copy a value out of the hash table
+ * @tbl: Pointer to the table structure
+ * @key: Key of the entry to find
+ * @value: Pointer to value to copy, may be NULL
+ *
+ * If @value is not NULL and the table has a non-0 value_size, the
+ * value at @key is copied to @value. Returns the slot index of the
+ * entry or %-EINVAL if @key was not found.
+ */
+#define chash_table_copy_out(tbl, key, value)			\
+	__chash_table_copy_out(&(*tbl).table, key, value, false)
+
+int __chash_table_copy_out(struct __chash_table *table, u64 key,
+			   void *value, bool remove);
+
+/**
+ * chash_table_remove - Remove an entry from the hash table
+ * @tbl: Pointer to the table structure
+ * @key: Key of the entry to find
+ * @value: Pointer to value to copy, may be NULL
+ *
+ * If @value is not NULL and the table has a non-0 value_size, the
+ * value at @key is copied to @value. The entry is removed from the
+ * table. Returns the slot index of the removed entry or %-EINVAL if
+ * @key was not found.
+ */
+#define chash_table_remove(tbl, key, value)			\
+	__chash_table_copy_out(&(*tbl).table, key, value, true)
+
+/*
+ * Low level iterator API used internally by the above functions.
+ */
+struct chash_iter {
+	struct __chash_table *table;
+	unsigned long mask;
+	int slot;
+};
+
+/**
+ * CHASH_ITER_INIT - Initialize a hash table iterator
+ * @table: Pointer to hash table to iterate over
+ * @s: Initial slot number
+ */
+#define CHASH_ITER_INIT(table, s) {			\
+		table,					\
+		1UL << ((s) & (BITS_PER_LONG - 1)),	\
+		s					\
+	}
+/**
+ * CHASH_ITER_SET - Set hash table iterator to new slot
+ * @iter: Iterator
+ * @s: Slot number
+ */
+#define CHASH_ITER_SET(iter, s)					\
+	(iter).mask = 1UL << ((s) & (BITS_PER_LONG - 1)),	\
+	(iter).slot = (s)
+/**
+ * CHASH_ITER_INC - Increment hash table iterator
+ * @iter: Iterator
+ *
+ * Wraps around at the end.
+ */
+#define CHASH_ITER_INC(iter) do {					\
+		(iter).mask = (iter).mask << 1 |			\
+			(iter).mask >> (BITS_PER_LONG - 1);		\
+		(iter).slot = ((iter).slot + 1) & (iter).table->size_mask; \
+	} while (0)
+
+static inline bool chash_iter_is_valid(const struct chash_iter iter)
+{
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	return !!(iter.table->valid_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &
+		  iter.mask);
+}
+static inline bool chash_iter_is_empty(const struct chash_iter iter)
+{
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	return !(iter.table->occup_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &
+		 iter.mask);
+}
+
+static inline void chash_iter_set_valid(const struct chash_iter iter)
+{
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	iter.table->valid_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] |= iter.mask;
+	iter.table->occup_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] |= iter.mask;
+}
+static inline void chash_iter_set_invalid(const struct chash_iter iter)
+{
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	iter.table->valid_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &= ~iter.mask;
+}
+static inline void chash_iter_set_empty(const struct chash_iter iter)
+{
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	iter.table->occup_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &= ~iter.mask;
+}
+
+static inline u32 chash_iter_key32(const struct chash_iter iter)
+{
+	BUG_ON(iter.table->key_size != 4);
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	return iter.table->keys32[iter.slot];
+}
+static inline u64 chash_iter_key64(const struct chash_iter iter)
+{
+	BUG_ON(iter.table->key_size != 8);
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	return iter.table->keys64[iter.slot];
+}
+static inline u64 chash_iter_key(const struct chash_iter iter)
+{
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	return (iter.table->key_size == 4) ?
+		iter.table->keys32[iter.slot] : iter.table->keys64[iter.slot];
+}
+
+static inline u32 chash_iter_hash32(const struct chash_iter iter)
+{
+	BUG_ON(iter.table->key_size != 4);
+	return hash_32(chash_iter_key32(iter), iter.table->bits);
+}
+
+static inline u32 chash_iter_hash64(const struct chash_iter iter)
+{
+	BUG_ON(iter.table->key_size != 8);
+	return hash_64(chash_iter_key64(iter), iter.table->bits);
+}
+
+static inline u32 chash_iter_hash(const struct chash_iter iter)
+{
+	return (iter.table->key_size == 4) ?
+		hash_32(chash_iter_key32(iter), iter.table->bits) :
+		hash_64(chash_iter_key64(iter), iter.table->bits);
+}
+
+static inline void *chash_iter_value(const struct chash_iter iter)
+{
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	return iter.table->values +
+		((unsigned long)iter.slot * iter.table->value_size);
+}
+
+#endif /* _LINUX_CHASH_H */
diff --git a/lib/Kconfig b/lib/Kconfig
index 6762529..9e49129 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -575,4 +575,28 @@ config PARMAN
 config PRIME_NUMBERS
 	tristate
 
+#
+# Closed hash table
+#
+config CHASH
+	tristate "Closed hash table"
+	help
+	 Statically sized closed hash table implementation with low
+	 memory and CPU overhead.
+
+config CHASH_STATS
+	bool "Closed hash table performance statistics"
+	depends on CHASH
+	default n
+	help
+	 Enable collection of performance statistics for closed hash tables.
+
+config CHASH_SELFTEST
+	bool "Closed hash table self test"
+	depends on CHASH
+	default n
+	help
+	 Runs a selftest during module load. Several module parameters
+	 are available to modify the behaviour of the test.
+
 endmenu
diff --git a/lib/Makefile b/lib/Makefile
index 40c1837..c332ed9 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -243,3 +243,5 @@ UBSAN_SANITIZE_ubsan.o := n
 obj-$(CONFIG_SBITMAP) += sbitmap.o
 
 obj-$(CONFIG_PARMAN) += parman.o
+
+obj-$(CONFIG_CHASH) += chash.o
diff --git a/lib/chash.c b/lib/chash.c
new file mode 100644
index 0000000..1bc4287
--- /dev/null
+++ b/lib/chash.c
@@ -0,0 +1,622 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include <linux/types.h>
+#include <linux/hash.h>
+#include <linux/bug.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/sched/clock.h>
+#include <linux/chash.h>
+
+/**
+ * chash_table_alloc - Allocate closed hash table
+ * @table: Pointer to the table structure
+ * @bits: Table size will be 2^bits entries
+ * @key_size: Size of hash keys in bytes, 4 or 8
+ * @value_size: Size of data values in bytes, can be 0
+ */
+int chash_table_alloc(struct chash_table *table, u8 bits, u8 key_size,
+		      unsigned int value_size, gfp_t gfp_mask)
+{
+	if (bits > 31)
+		return -EINVAL;
+
+	if (key_size != 4 && key_size != 8)
+		return -EINVAL;
+
+	table->data = kcalloc(__CHASH_DATA_SIZE(bits, key_size, value_size),
+		       sizeof(long), gfp_mask);
+	if (!table->data)
+		return -ENOMEM;
+
+	__CHASH_TABLE_INIT(table->table, table->data,
+			   bits, key_size, value_size);
+
+	return 0;
+}
+EXPORT_SYMBOL(chash_table_alloc);
+
+/**
+ * chash_table_free - Free closed hash table
+ * @table: Pointer to the table structure
+ */
+void chash_table_free(struct chash_table *table)
+{
+	kfree(table->data);
+}
+EXPORT_SYMBOL(chash_table_free);
+
+#ifdef CONFIG_CHASH_STATS
+
+#define DIV_FRAC(nom, denom, quot, frac, frac_digits) do {		\
+		(quot) = (nom) / (denom);				\
+		(frac) = ((nom) % (denom) * (frac_digits) +		\
+			  (denom) / 2) / (denom);			\
+	} while (0)
+
+void __chash_table_dump_stats(struct __chash_table *table)
+{
+	struct chash_iter iter = CHASH_ITER_INIT(table, 0);
+	u32 filled = 0, empty = 0, tombstones = 0;
+	u64 quot1, quot2;
+	u32 frac1, frac2;
+
+	do {
+		if (chash_iter_is_valid(iter))
+			filled++;
+		else if (chash_iter_is_empty(iter))
+			empty++;
+		else
+			tombstones++;
+		CHASH_ITER_INC(iter);
+	} while (iter.slot);
+
+	pr_debug("chash: key size %u, value size %u\n",
+		 table->key_size, table->value_size);
+	pr_debug("  Slots total/filled/empty/tombstones: %u / %u / %u / %u\n",
+		 1 << table->bits, filled, empty, tombstones);
+	if (table->hits > 0) {
+		DIV_FRAC(table->hits_steps, table->hits, quot1, frac1, 1000);
+		DIV_FRAC(table->hits * 1000, table->hits_time_ns,
+			 quot2, frac2, 1000);
+	} else {
+		quot1 = quot2 = 0;
+		frac1 = frac2 = 0;
+	}
+	pr_debug("  Hits   (avg.cost, rate): %llu (%llu.%03u, %llu.%03u M/s)\n",
+		 table->hits, quot1, frac1, quot2, frac2);
+	if (table->miss > 0) {
+		DIV_FRAC(table->miss_steps, table->miss, quot1, frac1, 1000);
+		DIV_FRAC(table->miss * 1000, table->miss_time_ns,
+			 quot2, frac2, 1000);
+	} else {
+		quot1 = quot2 = 0;
+		frac1 = frac2 = 0;
+	}
+	pr_debug("  Misses (avg.cost, rate): %llu (%llu.%03u, %llu.%03u M/s)\n",
+		 table->miss, quot1, frac1, quot2, frac2);
+	if (table->hits + table->miss > 0) {
+		DIV_FRAC(table->hits_steps + table->miss_steps,
+			 table->hits + table->miss, quot1, frac1, 1000);
+		DIV_FRAC((table->hits + table->miss) * 1000,
+			 (table->hits_time_ns + table->miss_time_ns),
+			 quot2, frac2, 1000);
+	} else {
+		quot1 = quot2 = 0;
+		frac1 = frac2 = 0;
+	}
+	pr_debug("  Total  (avg.cost, rate): %llu (%llu.%03u, %llu.%03u M/s)\n",
+		 table->hits + table->miss, quot1, frac1, quot2, frac2);
+	if (table->relocs > 0) {
+		DIV_FRAC(table->hits + table->miss, table->relocs,
+			 quot1, frac1, 1000);
+		DIV_FRAC(table->reloc_dist, table->relocs, quot2, frac2, 1000);
+		pr_debug("  Relocations (freq, avg.dist): %llu (1:%llu.%03u, %llu.%03u)\n",
+			 table->relocs, quot1, frac1, quot2, frac2);
+	} else {
+		pr_debug("  No relocations\n");
+	}
+}
+EXPORT_SYMBOL(__chash_table_dump_stats);
+
+#undef DIV_FRAC
+#endif
+
+#define CHASH_INC(table, a) ((a) = ((a) + 1) & (table)->size_mask)
+#define CHASH_ADD(table, a, b) (((a) + (b)) & (table)->size_mask)
+#define CHASH_SUB(table, a, b) (((a) - (b)) & (table)->size_mask)
+#define CHASH_IN_RANGE(table, slot, first, last) \
+	(CHASH_SUB(table, slot, first) <= CHASH_SUB(table, last, first))
+
+/*#define CHASH_DEBUG Uncomment this to enable verbose debug output*/
+#ifdef CHASH_DEBUG
+static void chash_table_dump(struct __chash_table *table)
+{
+	struct chash_iter iter = CHASH_ITER_INIT(table, 0);
+
+	do {
+		if ((iter.slot & 3) == 0)
+			pr_debug("%04x: ", iter.slot);
+
+		if (chash_iter_is_valid(iter))
+			pr_debug("[%016llx] ", chash_iter_key(iter));
+		else if (chash_iter_is_empty(iter))
+			pr_debug("[    <empty>     ] ");
+		else
+			pr_debug("[  <tombstone>   ] ");
+
+		if ((iter.slot & 3) == 3)
+			pr_debug("\n");
+
+		CHASH_ITER_INC(iter);
+	} while (iter.slot);
+
+	if ((iter.slot & 3) != 0)
+		pr_debug("\n");
+}
+
+static int chash_table_check(struct __chash_table *table)
+{
+	u32 hash;
+	struct chash_iter iter = CHASH_ITER_INIT(table, 0);
+	struct chash_iter cur = CHASH_ITER_INIT(table, 0);
+
+	do {
+		if (!chash_iter_is_valid(iter)) {
+			CHASH_ITER_INC(iter);
+			continue;
+		}
+
+		hash = chash_iter_hash(iter);
+		CHASH_ITER_SET(cur, hash);
+		while (cur.slot != iter.slot) {
+			if (chash_iter_is_empty(cur)) {
+				pr_err("Path to element at %x with hash %x broken at slot %x\n",
+				       iter.slot, hash, cur.slot);
+				chash_table_dump(table);
+				return -EINVAL;
+			}
+			CHASH_ITER_INC(cur);
+		}
+
+		CHASH_ITER_INC(iter);
+	} while (iter.slot);
+
+	return 0;
+}
+#endif
+
+static void chash_iter_relocate(struct chash_iter dst, struct chash_iter src)
+{
+	BUG_ON(src.table == dst.table && src.slot == dst.slot);
+	BUG_ON(src.table->key_size != dst.table->key_size);
+	BUG_ON(src.table->value_size != dst.table->value_size);
+
+	if (dst.table->key_size == 4)
+		dst.table->keys32[dst.slot] = src.table->keys32[src.slot];
+	else
+		dst.table->keys64[dst.slot] = src.table->keys64[src.slot];
+
+	if (dst.table->value_size)
+		memcpy(chash_iter_value(dst), chash_iter_value(src),
+		       dst.table->value_size);
+
+	chash_iter_set_valid(dst);
+	chash_iter_set_invalid(src);
+
+#ifdef CONFIG_CHASH_STATS
+	if (src.table == dst.table) {
+		dst.table->relocs++;
+		dst.table->reloc_dist +=
+			CHASH_SUB(dst.table, src.slot, dst.slot);
+	}
+#endif
+}
+
+/**
+ * chash_table_find - Helper for looking up a hash table entry
+ * @iter: Pointer to hash table iterator
+ * @key: Key of the entry to find
+ * @for_removal: set to true if the element will be removed soon
+ *
+ * Searches for an entry in the hash table with a given key. iter must
+ * be initialized by the caller to point to the home position of the
+ * hypothetical entry, i.e. it must be initialized with the hash table
+ * and the key's hash as the initial slot for the search.
+ *
+ * This function also does some local clean-up to speed up future
+ * look-ups by relocating entries to better slots and removing
+ * tombstones that are no longer needed.
+ *
+ * If @for_removal is true, the function avoids relocating the entry
+ * that is being returned.
+ *
+ * Returns 0 if the search is successful. In this case iter is updated
+ * to point to the found entry. Otherwise %-EINVAL is returned and the
+ * iter is updated to point to the first available slot for the given
+ * key. If the table is full, the slot is set to -1.
+ */
+static int chash_table_find(struct chash_iter *iter, u64 key,
+			    bool for_removal)
+{
+#ifdef CONFIG_CHASH_STATS
+	u64 ts1 = local_clock();
+#endif
+	u32 hash = iter->slot;
+	struct chash_iter first_redundant = CHASH_ITER_INIT(iter->table, -1);
+	int first_avail = (for_removal ? -2 : -1);
+
+	while (!chash_iter_is_valid(*iter) || chash_iter_key(*iter) != key) {
+		if (chash_iter_is_empty(*iter)) {
+			/* Found an empty slot, which ends the
+			 * search. Clean up any preceding tombstones
+			 * that are no longer needed because they lead
+			 * to nowhere
+			 */
+			if ((int)first_redundant.slot < 0)
+				goto not_found;
+			while (first_redundant.slot != iter->slot) {
+				if (!chash_iter_is_valid(first_redundant))
+					chash_iter_set_empty(first_redundant);
+				CHASH_ITER_INC(first_redundant);
+			}
+#ifdef CHASH_DEBUG
+			chash_table_check(iter->table);
+#endif
+			goto not_found;
+		} else if (!chash_iter_is_valid(*iter)) {
+			/* Found a tombstone. Remember it as candidate
+			 * for relocating the entry we're looking for
+			 * or for adding a new entry with the given key
+			 */
+			if (first_avail == -1)
+				first_avail = iter->slot;
+			/* Or mark it as the start of a series of
+			 * potentially redundant tombstones
+			 */
+			else if (first_redundant.slot == -1)
+				CHASH_ITER_SET(first_redundant, iter->slot);
+		} else if (first_redundant.slot >= 0) {
+			/* Found a valid, occupied slot with a
+			 * preceding series of tombstones. Relocate it
+			 * to a better position that no longer depends
+			 * on those tombstones
+			 */
+			u32 cur_hash = chash_iter_hash(*iter);
+
+			if (!CHASH_IN_RANGE(iter->table, cur_hash,
+					    first_redundant.slot + 1,
+					    iter->slot)) {
+				/* This entry has a hash at or before
+				 * the first tombstone we found. We
+				 * can relocate it to that tombstone
+				 * and advance to the next tombstone
+				 */
+				chash_iter_relocate(first_redundant, *iter);
+				do {
+					CHASH_ITER_INC(first_redundant);
+				} while (chash_iter_is_valid(first_redundant));
+			} else if (cur_hash != iter->slot) {
+				/* Relocate entry to its home position
+				 * or as close as possible so it no
+				 * longer depends on any preceding
+				 * tombstones
+				 */
+				struct chash_iter new_iter =
+					CHASH_ITER_INIT(iter->table, cur_hash);
+
+				while (new_iter.slot != iter->slot &&
+				       chash_iter_is_valid(new_iter))
+					CHASH_ITER_INC(new_iter);
+
+				if (new_iter.slot != iter->slot)
+					chash_iter_relocate(new_iter, *iter);
+			}
+		}
+
+		CHASH_ITER_INC(*iter);
+		if (iter->slot == hash) {
+			iter->slot = -1;
+			goto not_found;
+		}
+	}
+
+#ifdef CONFIG_CHASH_STATS
+	iter->table->hits++;
+	iter->table->hits_steps += CHASH_SUB(iter->table, iter->slot, hash) + 1;
+#endif
+
+	if (first_avail >= 0) {
+		CHASH_ITER_SET(first_redundant, first_avail);
+		chash_iter_relocate(first_redundant, *iter);
+		iter->slot = first_redundant.slot;
+		iter->mask = first_redundant.mask;
+	}
+
+#ifdef CONFIG_CHASH_STATS
+	iter->table->hits_time_ns += local_clock() - ts1;
+#endif
+
+	return 0;
+
+not_found:
+#ifdef CONFIG_CHASH_STATS
+	iter->table->miss++;
+	iter->table->miss_steps += (iter->slot < 0) ?
+		(1 << iter->table->bits) :
+		CHASH_SUB(iter->table, iter->slot, hash) + 1;
+#endif
+
+	if (first_avail >= 0)
+		CHASH_ITER_SET(*iter, first_avail);
+
+#ifdef CONFIG_CHASH_STATS
+	iter->table->miss_time_ns += local_clock() - ts1;
+#endif
+
+	return -EINVAL;
+}
+
+int __chash_table_copy_in(struct __chash_table *table, u64 key,
+			  const void *value)
+{
+	u32 hash = (table->key_size == 4) ?
+		hash_32(key, table->bits) : hash_64(key, table->bits);
+	struct chash_iter iter = CHASH_ITER_INIT(table, hash);
+	int r = chash_table_find(&iter, key, false);
+
+	/* Found an existing entry */
+	if (!r) {
+		if (value && table->value_size)
+			memcpy(chash_iter_value(iter), value,
+			       table->value_size);
+		return 1;
+	}
+
+	/* Is there a place to add a new entry? */
+	if (iter.slot < 0) {
+		pr_err("Hash table overflow\n");
+		return -ENOMEM;
+	}
+
+	chash_iter_set_valid(iter);
+
+	if (table->key_size == 4)
+		table->keys32[iter.slot] = key;
+	else
+		table->keys64[iter.slot] = key;
+	if (value && table->value_size)
+		memcpy(chash_iter_value(iter), value, table->value_size);
+
+	return 0;
+}
+EXPORT_SYMBOL(__chash_table_copy_in);
+
+int __chash_table_copy_out(struct __chash_table *table, u64 key,
+			   void *value, bool remove)
+{
+	u32 hash = (table->key_size == 4) ?
+		hash_32(key, table->bits) : hash_64(key, table->bits);
+	struct chash_iter iter = CHASH_ITER_INIT(table, hash);
+	int r = chash_table_find(&iter, key, remove);
+
+	if (r < 0)
+		return r;
+
+	if (value && table->value_size)
+		memcpy(value, chash_iter_value(iter), table->value_size);
+
+	if (remove)
+		chash_iter_set_invalid(iter);
+
+	return iter.slot;
+}
+EXPORT_SYMBOL(__chash_table_copy_out);
+
+#ifdef CONFIG_CHASH_SELFTEST
+/**
+ * chash_self_test - Run a self-test of the hash table implementation
+ * @bits: Table size will be 2^bits entries
+ * @key_size: Size of hash keys in bytes, 4 or 8
+ * @min_fill: Minimum fill level during the test
+ * @max_fill: Maximum fill level during the test
+ * @iterations: Number of test iterations
+ *
+ * The test adds and removes entries from a hash table, cycling the
+ * fill level between min_fill and max_fill entries. Also tests lookup
+ * and value retrieval.
+ */
+static int __init chash_self_test(u8 bits, u8 key_size,
+				  int min_fill, int max_fill,
+				  u64 iterations)
+{
+	struct chash_table table;
+	int ret;
+	u64 add_count, rmv_count;
+	u64 value;
+
+	if (key_size == 4 && iterations > 0xffffffff)
+		return -EINVAL;
+	if (min_fill >= max_fill)
+		return -EINVAL;
+
+	ret = chash_table_alloc(&table, bits, key_size, sizeof(u64),
+				GFP_KERNEL);
+	if (ret) {
+		pr_err("chash_table_alloc failed: %d\n", ret);
+		return ret;
+	}
+
+	for (add_count = 0, rmv_count = 0; add_count < iterations;
+	     add_count++) {
+		/* When we hit the max_fill level, remove entries down
+		 * to min_fill
+		 */
+		if (add_count - rmv_count == max_fill) {
+			u64 find_count = rmv_count;
+
+			/* First try to find all entries that we're
+			 * about to remove, confirm their value, test
+			 * writing them back a second time.
+			 */
+			for (; add_count - find_count > min_fill;
+			     find_count++) {
+				ret = chash_table_copy_out(&table, find_count,
+							   &value);
+				if (ret < 0) {
+					pr_err("chash_table_copy_out failed: %d\n",
+					       ret);
+					goto out;
+				}
+				if (value != ~find_count) {
+					pr_err("Wrong value retrieved for key 0x%llx, expected 0x%llx got 0x%llx\n",
+					       find_count, ~find_count, value);
+#ifdef CHASH_DEBUG
+					chash_table_dump(&table.table);
+#endif
+					ret = -EFAULT;
+					goto out;
+				}
+				ret = chash_table_copy_in(&table, find_count,
+							  &value);
+				if (ret != 1) {
+					pr_err("copy_in second time returned %d, expected 1\n",
+					       ret);
+					ret = -EFAULT;
+					goto out;
+				}
+			}
+			/* Remove them until we hit min_fill level */
+			for (; add_count - rmv_count > min_fill; rmv_count++) {
+				ret = chash_table_remove(&table, rmv_count,
+							 NULL);
+				if (ret < 0) {
+					pr_err("chash_table_remove failed: %d\n",
+					       ret);
+					goto out;
+				}
+			}
+		}
+
+		/* Add a new value */
+		value = ~add_count;
+		ret = chash_table_copy_in(&table, add_count, &value);
+		if (ret != 0) {
+			pr_err("copy_in first time returned %d, expected 0\n",
+			       ret);
+			ret = -EFAULT;
+			goto out;
+		}
+	}
+
+	chash_table_dump_stats(&table);
+	chash_table_reset_stats(&table);
+
+out:
+	chash_table_free(&table);
+	return ret;
+}
+
+static unsigned int chash_test_bits = 10;
+MODULE_PARM_DESC(test_bits,
+		 "Selftest number of hash bits ([4..20], default=10)");
+module_param_named(test_bits, chash_test_bits, uint, 0444);
+
+static unsigned int chash_test_keysize = 8;
+MODULE_PARM_DESC(test_keysize, "Selftest keysize (4 or 8, default=8)");
+module_param_named(test_keysize, chash_test_keysize, uint, 0444);
+
+static unsigned int chash_test_minfill;
+MODULE_PARM_DESC(test_minfill, "Selftest minimum #entries (default=50%)");
+module_param_named(test_minfill, chash_test_minfill, uint, 0444);
+
+static unsigned int chash_test_maxfill;
+MODULE_PARM_DESC(test_maxfill, "Selftest maximum #entries (default=80%)");
+module_param_named(test_maxfill, chash_test_maxfill, uint, 0444);
+
+static unsigned long chash_test_iters;
+MODULE_PARM_DESC(test_iters, "Selftest iterations (default=1000 x #entries)");
+module_param_named(test_iters, chash_test_iters, ulong, 0444);
+
+static int __init chash_init(void)
+{
+	int ret;
+	u64 ts1_ns, ts_delta_us;
+
+	/* Skip self test on user errors */
+	if (chash_test_bits < 4 || chash_test_bits > 20) {
+		pr_err("chash: test_bits out of range [4..20].\n");
+		return 0;
+	}
+	if (chash_test_keysize != 4 && chash_test_keysize != 8) {
+		pr_err("chash: test_keysize invalid. Must be 4 or 8.\n");
+		return 0;
+	}
+
+	if (!chash_test_minfill)
+		chash_test_minfill = (1 << chash_test_bits) / 2;
+	if (!chash_test_maxfill)
+		chash_test_maxfill = (1 << chash_test_bits) * 4 / 5;
+	if (!chash_test_iters)
+		chash_test_iters = (1 << chash_test_bits) * 1000;
+
+	if (chash_test_minfill >= (1 << chash_test_bits)) {
+		pr_err("chash: test_minfill too big. Must be < table size.\n");
+		return 0;
+	}
+	if (chash_test_maxfill >= (1 << chash_test_bits)) {
+		pr_err("chash: test_maxfill too big. Must be < table size.\n");
+		return 0;
+	}
+	if (chash_test_minfill >= chash_test_maxfill) {
+		pr_err("chash: test_minfill must be < test_maxfill.\n");
+		return 0;
+	}
+	if (chash_test_keysize == 4 && chash_test_iters > 0xffffffff) {
+		pr_err("chash: test_iters must be < 4G for 4 byte keys.\n");
+		return 0;
+	}
+
+	ts1_ns = local_clock();
+	ret = chash_self_test(chash_test_bits, chash_test_keysize,
+			      chash_test_minfill, chash_test_maxfill,
+			      chash_test_iters);
+	if (!ret) {
+		ts_delta_us = (local_clock() - ts1_ns) / 1000;
+		pr_info("chash: self test took %llu us, %llu iterations/s\n",
+			ts_delta_us,
+			(u64)chash_test_iters * 1000000 / ts_delta_us);
+	} else {
+		pr_err("chash: self test failed: %d\n", ret);
+	}
+
+	return ret;
+}
+
+module_init(chash_init);
+
+#endif /* CONFIG_CHASH_SELFTEST */
+
+MODULE_DESCRIPTION("Closed hash table");
+MODULE_LICENSE("GPL and additional rights");
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 8/8] drm/amdgpu: Track pending retry faults in IH and VM (v2)
       [not found] ` <1504045524-23853-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (6 preceding siblings ...)
  2017-08-29 22:25   ` [PATCH 7/8] lib: Closed hash table with low overhead Felix Kuehling
@ 2017-08-29 22:25   ` Felix Kuehling
  2017-09-06 21:53   ` [PATCH 0/8] Retry page fault handling for Vega10 Felix Kuehling
  8 siblings, 0 replies; 18+ messages in thread
From: Felix Kuehling @ 2017-08-29 22:25 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

IH tracks pending retry faults in a hash table for fast lookup in
interrupt context. Each VM has a short FIFO of pending VM faults for
processing in a bottom half.

The IH prescreening stage adds retry faults and filters out repeated
retry interrupts to minimize the impact of interrupt storms.

It's the VM's responsibility to remove pending faults once they are
handled. For now this is only done when the VM is destroyed.
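
The filtering idea described above can be illustrated with a small
userspace sketch. This is not the driver code: it replaces chash with a
plain linear-probing table with tombstones, and the names fault_filter,
fault_key, fault_hash, fault_find, fault_filter_add and
fault_filter_clear are hypothetical. Only the key layout (PASID in bits
63:48, page address below), the table size of 2^8 slots, the 50% fill
limit and the 0 / 1 / -ENOSPC return convention are taken from this
patch. The hash function is an arbitrary choice for the sketch, and
locking is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FAULT_HASH_BITS 8                 /* same value as AMDGPU_PAGEFAULT_HASH_BITS */
#define FAULT_SLOTS     (1u << FAULT_HASH_BITS)
#define FAULT_MAX       (FAULT_SLOTS / 2) /* stop adding at 50% fill */

enum slot_state { SLOT_EMPTY = 0, SLOT_USED, SLOT_TOMBSTONE };

struct fault_filter {
	uint64_t keys[FAULT_SLOTS];
	uint8_t state[FAULT_SLOTS];
	unsigned int count;
};

/* PASID in bits 63:48, page address in bits 47:12. Unlike AMDGPU_VM_FAULT(),
 * this sketch masks the address when encoding (an assumption for safety).
 */
static uint64_t fault_key(uint16_t pasid, uint64_t addr)
{
	return ((uint64_t)pasid << 48) | (addr & 0xfffffffff000ULL);
}

static void fault_filter_init(struct fault_filter *f)
{
	memset(f, 0, sizeof(*f));
}

/* Hypothetical hash: multiplicative, keeping the top FAULT_HASH_BITS bits */
static unsigned int fault_hash(uint64_t key)
{
	return (unsigned int)((key * 0x9E3779B97F4A7C15ULL) >>
			      (64 - FAULT_HASH_BITS));
}

/* Linear probe from the home slot. Returns the slot holding key, or -1.
 * *avail is set to the first reusable slot seen (tombstone or empty), or -1.
 */
static int fault_find(const struct fault_filter *f, uint64_t key, int *avail)
{
	unsigned int home = fault_hash(key), i;

	*avail = -1;
	for (i = 0; i < FAULT_SLOTS; i++) {
		unsigned int s = (home + i) & (FAULT_SLOTS - 1);

		if (f->state[s] == SLOT_USED) {
			if (f->keys[s] == key)
				return (int)s;
			continue;
		}
		if (*avail < 0)
			*avail = (int)s;
		if (f->state[s] == SLOT_EMPTY)
			break;	/* an empty slot ends the probe sequence */
	}
	return -1;
}

/* Returns 0 for a new fault, 1 if it is already pending, -ENOSPC if too
 * many faults are pending. The fill check comes first, as in the patch.
 */
static int fault_filter_add(struct fault_filter *f, uint64_t key)
{
	int avail;

	if (f->count >= FAULT_MAX)
		return -ENOSPC;
	if (fault_find(f, key, &avail) >= 0)
		return 1;
	if (avail < 0)
		return -ENOSPC;	/* not expected below the fill limit */

	f->keys[avail] = key;
	f->state[avail] = SLOT_USED;
	f->count++;
	return 0;
}

/* Returns true if the fault was pending and has been removed */
static bool fault_filter_clear(struct fault_filter *f, uint64_t key)
{
	int avail;
	int s = fault_find(f, key, &avail);

	if (s < 0)
		return false;
	assert(f->count > 0);
	f->state[s] = SLOT_TOMBSTONE;
	f->count--;
	return true;
}
```

In this sketch a caller would let an interrupt through only when
fault_filter_add() returns 0 and drop it otherwise, then call
fault_filter_clear() once the fault has been handled so that a later
fault on the same page counts as new again.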

v2:
- Made the hash table smaller and the FIFO longer. I never want the
  FIFO to fill up, because that would make prescreen take longer.
  128 pending page faults should be enough to keep migrations busy.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/Kconfig                |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 76 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 12 ++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  7 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  7 +++
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 78 +++++++++++++++++++++++++++++++++-
 6 files changed, 180 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 83cb2a8..a79ce4c 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -184,6 +184,7 @@ config DRM_AMDGPU
 	select BACKLIGHT_CLASS_DEVICE
 	select BACKLIGHT_LCD_SUPPORT
 	select INTERVAL_TREE
+	select CHASH
 	help
 	  Choose this option if you have a recent AMD Radeon graphics card.
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
index c834a40..f5f27e4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
@@ -196,3 +196,79 @@ int amdgpu_ih_process(struct amdgpu_device *adev)
 
 	return IRQ_HANDLED;
 }
+
+/**
+ * amdgpu_ih_add_fault - Add a page fault record
+ *
+ * @adev: amdgpu device pointer
+ * @key: 64-bit encoding of PASID and address
+ *
+ * This should be called when a retry page fault interrupt is
+ * received. If this is a new page fault, it will be added to a hash
+ * table. The return value indicates whether this is a new fault, or
+ * a fault that was already known and is already being handled.
+ *
+ * If there are too many pending page faults, this will fail. Retry
+ * interrupts should be ignored in this case until there is enough
+ * free space.
+ *
+ * Returns 0 if the fault was added, 1 if the fault was already known,
+ * -ENOSPC if there are too many pending faults.
+ */
+int amdgpu_ih_add_fault(struct amdgpu_device *adev, u64 key)
+{
+	unsigned long flags;
+	int r = -ENOSPC;
+
+	if (WARN_ON_ONCE(!adev->irq.ih.faults))
+		/* Should be allocated in <IP>_ih_sw_init on GPUs that
+		 * support retry faults and require retry filtering.
+		 */
+		return r;
+
+	spin_lock_irqsave(&adev->irq.ih.faults->lock, flags);
+
+	/* Only let the hash table fill up to 50% for best performance */
+	if (adev->irq.ih.faults->count >= (1 << (AMDGPU_PAGEFAULT_HASH_BITS-1)))
+		goto unlock_out;
+
+	r = chash_table_copy_in(&adev->irq.ih.faults->hash, key, NULL);
+	if (!r)
+		adev->irq.ih.faults->count++;
+
+	/* chash_table_copy_in should never fail unless we're losing count */
+	WARN_ON_ONCE(r < 0);
+
+unlock_out:
+	spin_unlock_irqrestore(&adev->irq.ih.faults->lock, flags);
+	return r;
+}
+
+/**
+ * amdgpu_ih_clear_fault - Remove a page fault record
+ *
+ * @adev: amdgpu device pointer
+ * @key: 64-bit encoding of PASID and address
+ *
+ * This should be called when a page fault has been handled. Any
+ * future interrupt with this key will be processed as a new
+ * page fault.
+ */
+void amdgpu_ih_clear_fault(struct amdgpu_device *adev, u64 key)
+{
+	unsigned long flags;
+	int r;
+
+	if (!adev->irq.ih.faults)
+		return;
+
+	spin_lock_irqsave(&adev->irq.ih.faults->lock, flags);
+
+	r = chash_table_remove(&adev->irq.ih.faults->hash, key, NULL);
+	if (!WARN_ON_ONCE(r < 0)) {
+		adev->irq.ih.faults->count--;
+		WARN_ON_ONCE(adev->irq.ih.faults->count < 0);
+	}
+
+	spin_unlock_irqrestore(&adev->irq.ih.faults->lock, flags);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
index 3de8e74..ada89358 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
@@ -24,6 +24,8 @@
 #ifndef __AMDGPU_IH_H__
 #define __AMDGPU_IH_H__
 
+#include <linux/chash.h>
+
 struct amdgpu_device;
  /*
   * vega10+ IH clients
@@ -69,6 +71,13 @@ enum amdgpu_ih_clientid
 
 #define AMDGPU_IH_CLIENTID_LEGACY 0
 
+#define AMDGPU_PAGEFAULT_HASH_BITS 8
+struct amdgpu_retryfault_hashtable {
+	DECLARE_CHASH_TABLE(hash, AMDGPU_PAGEFAULT_HASH_BITS, 8, 0);
+	spinlock_t	lock;
+	int		count;
+};
+
 /*
  * R6xx+ IH ring
  */
@@ -87,6 +96,7 @@ struct amdgpu_ih_ring {
 	bool			use_doorbell;
 	bool			use_bus_addr;
 	dma_addr_t		rb_dma_addr; /* only used when use_bus_addr = true */
+	struct amdgpu_retryfault_hashtable *faults;
 };
 
 #define AMDGPU_IH_SRC_DATA_MAX_SIZE_DW 4
@@ -109,5 +119,7 @@ int amdgpu_ih_ring_init(struct amdgpu_device *adev, unsigned ring_size,
 			bool use_bus_addr);
 void amdgpu_ih_ring_fini(struct amdgpu_device *adev);
 int amdgpu_ih_process(struct amdgpu_device *adev);
+int amdgpu_ih_add_fault(struct amdgpu_device *adev, u64 key);
+void amdgpu_ih_clear_fault(struct amdgpu_device *adev, u64 key);
 
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index f07b3b6..cb61750 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2606,6 +2606,8 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 		vm->pasid = pasid;
 	}
 
+	INIT_KFIFO(vm->faults);
+
 	return 0;
 
 error_free_root:
@@ -2657,8 +2659,13 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 {
 	struct amdgpu_bo_va_mapping *mapping, *tmp;
 	bool prt_fini_needed = !!adev->gart.gart_funcs->set_prt;
+	u64 fault;
 	int i;
 
+	/* Clear pending page faults from IH when the VM is destroyed */
+	while (kfifo_get(&vm->faults, &fault))
+		amdgpu_ih_clear_fault(adev, fault);
+
 	if (vm->pasid) {
 		unsigned long flags;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 861d457..e6003ed 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -120,6 +120,10 @@ struct amdgpu_vm_pt {
 	unsigned			last_entry_used;
 };
 
+#define AMDGPU_VM_FAULT(pasid, addr) (((u64)(pasid) << 48) | (addr))
+#define AMDGPU_VM_FAULT_PASID(fault) ((u64)(fault) >> 48)
+#define AMDGPU_VM_FAULT_ADDR(fault)  ((u64)(fault) & 0xfffffffff000ULL)
+
 struct amdgpu_vm {
 	/* tree of virtual addresses mapped */
 	struct rb_root		va;
@@ -157,6 +161,9 @@ struct amdgpu_vm {
 
 	/* Flag to indicate ATS support from PTE for GFX9 */
 	bool			pte_support_ats;
+
+	/* Up to 128 pending page faults */
+	DECLARE_KFIFO(faults, u64, 128);
 };
 
 struct amdgpu_vm_id {
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
index eda4771..dd6af21 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
@@ -235,8 +235,73 @@ static u32 vega10_ih_get_wptr(struct amdgpu_device *adev)
  */
 static bool vega10_ih_prescreen_iv(struct amdgpu_device *adev)
 {
-	/* TODO: Filter known pending page faults */
+	u32 ring_index = adev->irq.ih.rptr >> 2;
+	u32 dw0, dw3, dw4, dw5;
+	u16 pasid;
+	u64 addr, key;
+	struct amdgpu_vm *vm;
+	int r;
+
+	dw0 = le32_to_cpu(adev->irq.ih.ring[ring_index + 0]);
+	dw3 = le32_to_cpu(adev->irq.ih.ring[ring_index + 3]);
+	dw4 = le32_to_cpu(adev->irq.ih.ring[ring_index + 4]);
+	dw5 = le32_to_cpu(adev->irq.ih.ring[ring_index + 5]);
+
+	/* Filter retry page faults, let only the first one pass. If
+	 * there are too many outstanding faults, ignore them until
+	 * some faults get cleared.
+	 */
+	switch (dw0 & 0xff) {
+	case AMDGPU_IH_CLIENTID_VMC:
+	case AMDGPU_IH_CLIENTID_UTCL2:
+		break;
+	default:
+		/* Not a VM fault */
+		return true;
+	}
+
+	/* Not a retry fault */
+	if (!(dw5 & 0x80))
+		return true;
+
+	pasid = dw3 & 0xffff;
+	/* No PASID, can't identify faulting process */
+	if (!pasid)
+		return true;
+
+	addr = ((u64)(dw5 & 0xf) << 44) | ((u64)dw4 << 12);
+	key = AMDGPU_VM_FAULT(pasid, addr);
+	r = amdgpu_ih_add_fault(adev, key);
+
+	/* Hash table is full or the fault is already being processed,
+	 * ignore further page faults
+	 */
+	if (r != 0)
+		goto ignore_iv;
+
+	/* Track retry faults in per-VM fault FIFO. */
+	spin_lock(&adev->vm_manager.pasid_lock);
+	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
+	spin_unlock(&adev->vm_manager.pasid_lock);
+	if (WARN_ON_ONCE(!vm)) {
+		/* VM not found, process it normally */
+		amdgpu_ih_clear_fault(adev, key);
+		return true;
+	}
+	/* No locking required with single writer and single reader */
+	r = kfifo_put(&vm->faults, key);
+	if (!r) {
+		/* FIFO is full. Ignore it until there is space */
+		amdgpu_ih_clear_fault(adev, key);
+		goto ignore_iv;
+	}
+
+	/* It's the first fault for this address, process it normally */
 	return true;
+
+ignore_iv:
+	adev->irq.ih.rptr += 32;
+	return false;
 }
 
 /**
@@ -323,6 +388,14 @@ static int vega10_ih_sw_init(void *handle)
 	adev->irq.ih.use_doorbell = true;
 	adev->irq.ih.doorbell_index = AMDGPU_DOORBELL64_IH << 1;
 
+	adev->irq.ih.faults = kmalloc(sizeof(*adev->irq.ih.faults), GFP_KERNEL);
+	if (!adev->irq.ih.faults)
+		return -ENOMEM;
+	INIT_CHASH_TABLE(adev->irq.ih.faults->hash,
+			 AMDGPU_PAGEFAULT_HASH_BITS, 8, 0);
+	spin_lock_init(&adev->irq.ih.faults->lock);
+	adev->irq.ih.faults->count = 0;
+
 	r = amdgpu_irq_init(adev);
 
 	return r;
@@ -335,6 +408,9 @@ static int vega10_ih_sw_fini(void *handle)
 	amdgpu_irq_fini(adev);
 	amdgpu_ih_ring_fini(adev);
 
+	kfree(adev->irq.ih.faults);
+	adev->irq.ih.faults = NULL;
+
 	return 0;
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/8] drm/amdkfd: Use PASID manager from KGD
       [not found]     ` <1504045524-23853-6-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-09-03 11:54       ` Oded Gabbay
       [not found]         ` <CAFCwf13voQSyLFp8smtgMa=ZRRgrf+7H3wzfnF0cP+ak4tMhGw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Oded Gabbay @ 2017-09-03 11:54 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: amd-gfx list

On Wed, Aug 30, 2017 at 1:25 AM, Felix Kuehling <Felix.Kuehling@amd.com> wrote:
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_module.c |  6 ---
>  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c  | 90 ++++++++++++++-------------------
>  2 files changed, 38 insertions(+), 58 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> index 0d73bea..6c5a9ca 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> @@ -103,10 +103,6 @@ static int __init kfd_module_init(void)
>                 return -1;
>         }
>
> -       err = kfd_pasid_init();
> -       if (err < 0)
> -               return err;
> -
>         err = kfd_chardev_init();
>         if (err < 0)
>                 goto err_ioctl;
> @@ -126,7 +122,6 @@ static int __init kfd_module_init(void)
>  err_topology:
>         kfd_chardev_exit();
>  err_ioctl:
> -       kfd_pasid_exit();
>         return err;
>  }
>
> @@ -137,7 +132,6 @@ static void __exit kfd_module_exit(void)
>         kfd_process_destroy_wq();
>         kfd_topology_shutdown();
>         kfd_chardev_exit();
> -       kfd_pasid_exit();
>         dev_info(kfd_device, "Removed module\n");
>  }
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
> index 1e06de0..d6a7961 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
> @@ -20,78 +20,64 @@
>   * OTHER DEALINGS IN THE SOFTWARE.
>   */
>
> -#include <linux/slab.h>
>  #include <linux/types.h>
>  #include "kfd_priv.h"
>
> -static unsigned long *pasid_bitmap;
> -static unsigned int pasid_limit;
> -static DEFINE_MUTEX(pasid_mutex);
> -
> -int kfd_pasid_init(void)
> -{
> -       pasid_limit = KFD_MAX_NUM_OF_PROCESSES;
> -
> -       pasid_bitmap = kcalloc(BITS_TO_LONGS(pasid_limit), sizeof(long),
> -                               GFP_KERNEL);
> -       if (!pasid_bitmap)
> -               return -ENOMEM;
> -
> -       set_bit(0, pasid_bitmap); /* PASID 0 is reserved. */
> -
> -       return 0;
> -}
> -
> -void kfd_pasid_exit(void)
> -{
> -       kfree(pasid_bitmap);
> -}
> +static unsigned int pasid_bits = 16;
> +static const struct kfd2kgd_calls *kfd2kgd;
>
>  bool kfd_set_pasid_limit(unsigned int new_limit)
>  {
> -       if (new_limit < pasid_limit) {
> -               bool ok;
> -
> -               mutex_lock(&pasid_mutex);
> -
> -               /* ensure that no pasids >= new_limit are in-use */
> -               ok = (find_next_bit(pasid_bitmap, pasid_limit, new_limit) ==
> -                                                               pasid_limit);
> -               if (ok)
> -                       pasid_limit = new_limit;
> -
> -               mutex_unlock(&pasid_mutex);
> -
> -               return ok;
> +       if (new_limit < 2)
> +               return false;
> +
> +       if (new_limit < (1U << pasid_bits)) {
> +               if (kfd2kgd)
> +                       /* We've already allocated user PASIDs, too late to
> +                        * change the limit
> +                        */
> +                       return false;
> +
> +               while (new_limit < (1U << pasid_bits))
> +                       pasid_bits--;
>         }
>
>         return true;
>  }
>
> -inline unsigned int kfd_get_pasid_limit(void)
> +unsigned int kfd_get_pasid_limit(void)
>  {
> -       return pasid_limit;
> +       return 1U << pasid_bits;
>  }
>
>  unsigned int kfd_pasid_alloc(void)
>  {
> -       unsigned int found;
> -
> -       mutex_lock(&pasid_mutex);
> -
> -       found = find_first_zero_bit(pasid_bitmap, pasid_limit);
> -       if (found == pasid_limit)
> -               found = 0;
> -       else
> -               set_bit(found, pasid_bitmap);
> +       int r;
> +
> +       /* Find the first best KFD device for calling KGD */
> +       if (!kfd2kgd) {
> +               struct kfd_dev *dev = NULL;
> +               unsigned int i = 0;
> +
> +               while ((dev = kfd_topology_enum_kfd_devices(i)) != NULL) {
> +                       if (dev && dev->kfd2kgd) {
> +                               kfd2kgd = dev->kfd2kgd;
> +                               break;
> +                       }
> +                       i++;
> +               }
> +
> +               if (!kfd2kgd)
> +                       return false;
> +       }

Don't you need to allocate a PASID on all possible devices?
If so, does it need to be the same on all devices?


>
> -       mutex_unlock(&pasid_mutex);
> +       r = kfd2kgd->alloc_pasid(pasid_bits);
>
> -       return found;
> +       return r > 0 ? r : 0;
>  }
>
>  void kfd_pasid_free(unsigned int pasid)
>  {
> -       if (!WARN_ON(pasid == 0 || pasid >= pasid_limit))
> -               clear_bit(pasid, pasid_bitmap);
> +       if (kfd2kgd)
> +               kfd2kgd->free_pasid(pasid);
>  }
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/8] drm/amdkfd: Use PASID manager from KGD
       [not found]         ` <CAFCwf13voQSyLFp8smtgMa=ZRRgrf+7H3wzfnF0cP+ak4tMhGw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-09-05 15:39           ` Felix Kuehling
  0 siblings, 0 replies; 18+ messages in thread
From: Felix Kuehling @ 2017-09-05 15:39 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list


On 2017-09-03 07:54 AM, Oded Gabbay wrote:
> Don't you need to allocate PASID on all possible devices ?
> If so, does it need to be the same on all devices ?
The PASID allocator is global (not per-device), so the PASID is intended
to be the same for all devices. I only need to find a device for the
kfd2kgd interface pointer. In a multi-GPU system it doesn't matter which
one. They all share the same PASID allocator.

Technically, I think different devices could use different PASIDs for
the same process. But it would make the driver more complicated and
there is no reason to do this.
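
To illustrate the point about a single shared allocator, here is a
minimal userspace sketch. It is not the KGD implementation: the bitmap,
struct pasid_ops, struct gpu_dev and shared_pasid_ops are hypothetical
names made up for this example. The only parts borrowed from the patch
are the shape of the alloc_pasid(bits) / free_pasid(pasid) calls and the
rule that PASID 0 is reserved. Locking is left out.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

#define PASID_MAX_BITS 16
#define PASID_LIMIT    (1u << PASID_MAX_BITS)
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)

/* One global bitmap: every device allocates from the same PASID space */
static unsigned long pasid_bitmap[PASID_LIMIT / BITS_PER_WORD];

static int pasid_in_use(unsigned int id)
{
	return (pasid_bitmap[id / BITS_PER_WORD] >> (id % BITS_PER_WORD)) & 1UL;
}

/* Returns a PASID in [1, 2^bits), -EINVAL for a bad width, or -ENOSPC
 * if every PASID of that width is taken. PASID 0 is never handed out.
 */
static int pasid_alloc(unsigned int bits)
{
	unsigned int limit, id;

	if (bits == 0 || bits > PASID_MAX_BITS)
		return -EINVAL;

	limit = 1u << bits;
	for (id = 1; id < limit; id++) {
		if (!pasid_in_use(id)) {
			pasid_bitmap[id / BITS_PER_WORD] |=
				1UL << (id % BITS_PER_WORD);
			return (int)id;
		}
	}
	return -ENOSPC;
}

static void pasid_free(unsigned int pasid)
{
	if (pasid == 0 || pasid >= PASID_LIMIT)
		return;
	pasid_bitmap[pasid / BITS_PER_WORD] &= ~(1UL << (pasid % BITS_PER_WORD));
}

/* Hypothetical stand-in for the kfd2kgd interface pointer */
struct pasid_ops {
	int (*alloc_pasid)(unsigned int bits);
	void (*free_pasid)(unsigned int pasid);
};

static const struct pasid_ops shared_pasid_ops = {
	.alloc_pasid = pasid_alloc,
	.free_pasid = pasid_free,
};

/* Each device carries a pointer to the same ops, so it does not matter
 * which device is used to reach the allocator.
 */
struct gpu_dev {
	const struct pasid_ops *ops;
};
```

With two struct gpu_dev instances that both point at shared_pasid_ops,
a PASID allocated through one device can be freed through the other,
which is why picking any device for the interface pointer is enough in
this model.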

Regards,
  Felix



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/8] Retry page fault handling for Vega10
       [not found] ` <1504045524-23853-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (7 preceding siblings ...)
  2017-08-29 22:25   ` [PATCH 8/8] drm/amdgpu: Track pending retry faults in IH and VM (v2) Felix Kuehling
@ 2017-09-06 21:53   ` Felix Kuehling
       [not found]     ` <0816a963-54cc-0041-4b09-4bf41ee46fbf-5C7GfCeVMHo@public.gmane.org>
  8 siblings, 1 reply; 18+ messages in thread
From: Felix Kuehling @ 2017-09-06 21:53 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander,
	Oded Gabbay, Christian König

I realized that the drm-next-4.15-wip branch isn't very useful for
testing this, because it has no display support for Vega10 and no KFD
support for Vega10. So you can't test graphics or compute on Vega10 with
this branch. On the other hand, I need to make changes in both KFD and
AMDGPU, so I tried to avoid an amd-internal branch. But it seems until
either DAL or KFD is upstream, it'll have to remain on an AMD-internal
branch (amd-staging-4.12 for now, to be changed soon). Christian, would
this enable any of the work you were going to do?

Alex, is this going to make your regular upstreaming more difficult? Or
are you OK with upstreaming KFD changes that have dependencies on
amdgpu changes? Oded, would you be OK with Alex upstreaming KFD changes
along with amdgpu changes? Assuming they have your "Reviewed-by"?

I also haven't got any feedback from LKML on the addition of the chash
data structure to kernel/lib. I'm considering adding it in
drivers/gpu/drm/amd/chash as an interim step. It can be moved to lib
later, if other components are interested in using it. Any objections?

Regards,
  Felix


On 2017-08-29 06:25 PM, Felix Kuehling wrote:
> Rebased on the public drm-next-4.15-wip. Patch 8 from the WIP patch
> series did not apply at all, because upstream KFD doesn't support
> GPUVM yet.
>
> The "lib: Closed hash table ..." change is updated and the same as
> what I sent to LKML yesterday. Changes are mainly in the way the self
> test is hooked up, Kconfig options and some checkpatch fixes. If it
> takes too long to get accepted upstream, I could add it under
> drivers/gpu/drm/amd/chash in the interim.
>
> This is only compile tested on this branch. I can't do much more
> because the upstream KFD doesn't support Vega10 and GPUVM yet. Someone
> will have to add PASID support for graphics on top of this.
>
> TODO:
> * Finish upstreaming KFD
> * Allocate PASIDs for graphics contexts
> * Setup VMID-PASID mapping during graphics command submission
> * Confirm that graphics page faults have the correct PASID in the IV
>
> Felix Kuehling (8):
>   drm/amdgpu: Fix error handling in amdgpu_vm_init
>   drm/amdgpu: Add PASID management
>   drm/radeon: Add PASID manager for KFD
>   drm/amdkfd: Separate doorbell allocation from PASID
>   drm/amdkfd: Use PASID manager from KGD
>   drm/amdgpu: Add prescreening stage in IH processing
>   lib: Closed hash table with low overhead
>   drm/amdgpu: Track pending retry faults in IH and VM (v2)
>
>  drivers/gpu/drm/Kconfig                           |   1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h               |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c            |  82 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h            |  12 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |  84 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            |  21 +-
>  drivers/gpu/drm/amd/amdgpu/cik_ih.c               |  14 +
>  drivers/gpu/drm/amd/amdgpu/cz_ih.c                |  14 +
>  drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |  14 +
>  drivers/gpu/drm/amd/amdgpu/si_ih.c                |  14 +
>  drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |  14 +
>  drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |  90 ++++
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c           |   7 -
>  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c         |  50 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_module.c           |   6 -
>  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c            |  90 ++--
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h             |  10 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c          |   6 +
>  drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |   6 +
>  drivers/gpu/drm/radeon/radeon_kfd.c               |  31 ++
>  include/linux/chash.h                             | 358 +++++++++++++
>  lib/Kconfig                                       |  24 +
>  lib/Makefile                                      |   2 +
>  lib/chash.c                                       | 622 ++++++++++++++++++++++
>  27 files changed, 1489 insertions(+), 91 deletions(-)
>  create mode 100644 include/linux/chash.h
>  create mode 100644 lib/chash.c
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH 0/8] Retry page fault handling for Vega10
       [not found]     ` <0816a963-54cc-0041-4b09-4bf41ee46fbf-5C7GfCeVMHo@public.gmane.org>
@ 2017-09-11 19:29       ` Deucher, Alexander
       [not found]         ` <BN6PR12MB1652DBBD8E350972D2BC6C3AF7680-/b2+HYfkarQqUD6E6FAiowdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Deucher, Alexander @ 2017-09-11 19:29 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Oded Gabbay, Koenig, Christian

> -----Original Message-----
> From: Kuehling, Felix
> Sent: Wednesday, September 06, 2017 5:54 PM
> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander; Oded Gabbay;
> Koenig, Christian
> Subject: Re: [PATCH 0/8] Retry page fault handling for Vega10
> 
> I realized that the drm-next-4.15-wip branch isn't very useful for
> testing this, because it has no display support for Vega10 and no KFD
> support for Vega10. So you can't test graphics or compute on Vega10 with
> this branch. On the other hand, I need to make changes in both KFD and
> AMDGPU, so I tried to avoid an amd-internal branch. But it seems until
> either DAL or KFD is upstream, it'll have to remain on an AMD-internal
> branch (amd-staging-4.12 for now, to be changed soon). Christian, would
> this enable any of the work you were going to do?
> 
> Alex, is this going to make your regular upstreaming more difficult? Or
> are you OK with upstreaming KFD changes that have dependencies with
> amdgpu changes? Oded, would you be OK with Alex upstreaming KFD
> changes along with amdgpu changes? Assuming they have your
> "Reviewed-by"?

I'm fine to take the changes through my tree if Oded is ok with it.

> 
> I also haven't got any feedback from LKML on the addition of the chash
> data structure to kernel/lib. I'm considering adding it in
> drivers/gpu/drm/amd/chash as an interim step. It can be moved to lib
> later, if other components are interested in using it. Any objections?

Works for me.

Alex

> 
> Regards,
>   Felix
> 
> 
> On 2017-08-29 06:25 PM, Felix Kuehling wrote:
> > Rebased on the public drm-next-4.15-wip. Patch 8 from the WIP patch
> > series did not apply at all, because upstream KFD doesn't support
> > GPUVM yet.
> >
> > The "lib: Closed hash table ..." change is updated and the same as
> > what I sent to LKML yesterday. Changes are mainly in the way the self
> > test is hooked up, Kconfig options and some checkpatch fixes. If it
> > takes too long to get accepted upstream, I could add it under
> > drivers/gpu/drm/amd/chash in the interim.
> >
> > This is only compile tested on this branch. I can't do much more
> > because the upstream KFD doesn't support Vega10 and GPUVM yet. Someone
> > will have to add PASID support for graphics on top of this.
> >
> > TODO:
> > * Finish upstreaming KFD
> > * Allocate PASIDs for graphics contexts
> > * Setup VMID-PASID mapping during graphics command submission
> > * Confirm that graphics page faults have the correct PASID in the IV
> >
> > Felix Kuehling (8):
> >   drm/amdgpu: Fix error handling in amdgpu_vm_init
> >   drm/amdgpu: Add PASID management
> >   drm/radeon: Add PASID manager for KFD
> >   drm/amdkfd: Separate doorbell allocation from PASID
> >   drm/amdkfd: Use PASID manager from KGD
> >   drm/amdgpu: Add prescreening stage in IH processing
> >   lib: Closed hash table with low overhead
> >   drm/amdgpu: Track pending retry faults in IH and VM (v2)
> >
> >  drivers/gpu/drm/Kconfig                           |   1 +
> >  drivers/gpu/drm/amd/amdgpu/amdgpu.h               |   2 +
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   2 +
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   2 +
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c            |  82 +++
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h            |  12 +
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |   2 +-
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |  84 ++-
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            |  21 +-
> >  drivers/gpu/drm/amd/amdgpu/cik_ih.c               |  14 +
> >  drivers/gpu/drm/amd/amdgpu/cz_ih.c                |  14 +
> >  drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |  14 +
> >  drivers/gpu/drm/amd/amdgpu/si_ih.c                |  14 +
> >  drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |  14 +
> >  drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |  90 ++++
> >  drivers/gpu/drm/amd/amdkfd/kfd_device.c           |   7 -
> >  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c         |  50 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_module.c           |   6 -
> >  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c            |  90 ++--
> >  drivers/gpu/drm/amd/amdkfd/kfd_priv.h             |  10 +-
> >  drivers/gpu/drm/amd/amdkfd/kfd_process.c          |   6 +
> >  drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |   6 +
> >  drivers/gpu/drm/radeon/radeon_kfd.c               |  31 ++
> >  include/linux/chash.h                             | 358 +++++++++++++
> >  lib/Kconfig                                       |  24 +
> >  lib/Makefile                                      |   2 +
> >  lib/chash.c                                       | 622 ++++++++++++++++++++++
> >  27 files changed, 1489 insertions(+), 91 deletions(-)
> >  create mode 100644 include/linux/chash.h
> >  create mode 100644 lib/chash.c
> >

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/8] Retry page fault handling for Vega10
       [not found]         ` <BN6PR12MB1652DBBD8E350972D2BC6C3AF7680-/b2+HYfkarQqUD6E6FAiowdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2017-09-12  5:48           ` Oded Gabbay
       [not found]             ` <CAFCwf11o=4iizAbRVNE2w3FoacU3yutKsbMvFGfLNTscv1Ym+A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Oded Gabbay @ 2017-09-12  5:48 UTC (permalink / raw)
  To: Deucher, Alexander, Kuehling, Felix, John Bridgman
  Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Mon, Sep 11, 2017 at 10:29 PM, Deucher, Alexander
<Alexander.Deucher@amd.com> wrote:
>> -----Original Message-----
>> From: Kuehling, Felix
>> Sent: Wednesday, September 06, 2017 5:54 PM
>> To: amd-gfx@lists.freedesktop.org; Deucher, Alexander; Oded Gabbay;
>> Koenig, Christian
>> Subject: Re: [PATCH 0/8] Retry page fault handling for Vega10
>>
>> I realized that the drm-next-4.15-wip branch isn't very useful for
>> testing this, because it has no display support for Vega10 and no KFD
>> support for Vega10. So you can't test graphics or compute on Vega10 with
>> this branch. On the other hand, I need to make changes in both KFD and
>> AMDGPU, so I tried to avoid an amd-internal branch. But it seems until
>> either DAL or KFD is upstream, it'll have to remain on an AMD-internal
>> branch (amd-staging-4.12 for now, to be changed soon). Christian, would
>> this enable any of the work you were going to do?
>>
>> Alex, is this going to make your regular upstreaming more difficult? Or
>> are you OK with upstreaming KFD changes that have dependencies with
>> amdgpu changes? Oded, would you be OK with Alex upstreaming KFD
>> changes along with amdgpu changes? Assuming they have your
>> "Reviewed-by"?
>
> I'm fine to take the changes through my tree if Oded is ok with it.

+John,

If Alex is fine with it then I'm fine with it as well, as long as this
is a temporary solution until some point where you have some
convergence between your internal code and the upstream one.
And of course if you have amdkfd only changes then that can be
upstreamed through me directly.

Having said that, if you/John/Alex think that this is a more permanent
solution, then maybe a better plan is to first unify the drivers (as
was discussed many times) before starting to upstream changes. If
90% of the changes are in both drivers, then there is really no point
in keeping amdkfd as a separate driver.

Oded


>
>>
>> I also haven't got any feedback from LKML on the addition of the chash
>> data structure to kernel/lib. I'm considering adding it in
>> drivers/gpu/drm/amd/chash as an interim step. It can be moved to lib
>> later, if other components are interested in using it. Any objections?
>
> Works for me.
>
> Alex
>
>>
>> Regards,
>>   Felix
>>
>>
>> On 2017-08-29 06:25 PM, Felix Kuehling wrote:
>> > Rebased on the public drm-next-4.15-wip. Patch 8 from the WIP patch
>> > series did not apply at all, because upstream KFD doesn't support
>> > GPUVM yet.
>> >
>> > The "lib: Closed hash table ..." change is updated and the same as
>> > what I sent to LKML yesterday. Changes are mainly in the way the self
>> > test is hooked up, Kconfig options and some checkpatch fixes. If it
>> > takes too long to get accepted upstream, I could add it under
>> > drivers/gpu/drm/amd/chash in the interim.
>> >
>> > This is only compile tested on this branch. I can't do much more
>> > because the upstream KFD doesn't support Vega10 and GPUVM yet. Someone
>> > will have to add PASID support for graphics on top of this.
>> >
>> > TODO:
>> > * Finish upstreaming KFD
>> > * Allocate PASIDs for graphics contexts
>> > * Setup VMID-PASID mapping during graphics command submission
>> > * Confirm that graphics page faults have the correct PASID in the IV
>> >
>> > Felix Kuehling (8):
>> >   drm/amdgpu: Fix error handling in amdgpu_vm_init
>> >   drm/amdgpu: Add PASID management
>> >   drm/radeon: Add PASID manager for KFD
>> >   drm/amdkfd: Separate doorbell allocation from PASID
>> >   drm/amdkfd: Use PASID manager from KGD
>> >   drm/amdgpu: Add prescreening stage in IH processing
>> >   lib: Closed hash table with low overhead
>> >   drm/amdgpu: Track pending retry faults in IH and VM (v2)
>> >
>> >  drivers/gpu/drm/Kconfig                           |   1 +
>> >  drivers/gpu/drm/amd/amdgpu/amdgpu.h               |   2 +
>> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   2 +
>> >  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   2 +
>> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c            |  82 +++
>> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h            |  12 +
>> >  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |   2 +-
>> >  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |  84 ++-
>> >  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            |  21 +-
>> >  drivers/gpu/drm/amd/amdgpu/cik_ih.c               |  14 +
>> >  drivers/gpu/drm/amd/amdgpu/cz_ih.c                |  14 +
>> >  drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |  14 +
>> >  drivers/gpu/drm/amd/amdgpu/si_ih.c                |  14 +
>> >  drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |  14 +
>> >  drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |  90 ++++
>> >  drivers/gpu/drm/amd/amdkfd/kfd_device.c           |   7 -
>> >  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c         |  50 +-
>> >  drivers/gpu/drm/amd/amdkfd/kfd_module.c           |   6 -
>> >  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c            |  90 ++--
>> >  drivers/gpu/drm/amd/amdkfd/kfd_priv.h             |  10 +-
>> >  drivers/gpu/drm/amd/amdkfd/kfd_process.c          |   6 +
>> >  drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |   6 +
>> >  drivers/gpu/drm/radeon/radeon_kfd.c               |  31 ++
>> >  include/linux/chash.h                             | 358 +++++++++++++
>> >  lib/Kconfig                                       |  24 +
>> >  lib/Makefile                                      |   2 +
>> >  lib/chash.c                                       | 622 ++++++++++++++++++++++
>> >  27 files changed, 1489 insertions(+), 91 deletions(-)
>> >  create mode 100644 include/linux/chash.h
>> >  create mode 100644 lib/chash.c
>> >
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/8] Retry page fault handling for Vega10
       [not found]             ` <CAFCwf11o=4iizAbRVNE2w3FoacU3yutKsbMvFGfLNTscv1Ym+A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-09-12 20:26               ` Felix Kuehling
  0 siblings, 0 replies; 18+ messages in thread
From: Felix Kuehling @ 2017-09-12 20:26 UTC (permalink / raw)
  To: Oded Gabbay, Deucher, Alexander, John Bridgman
  Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2017-09-12 01:48 AM, Oded Gabbay wrote:
> +John,
>
> If Alex is fine with it then I'm fine with it as well, as long as this
> is a temporary solution until some point where you have some
> convergence between your internal code and the upstream one.

Agreed. I want to converge those branches as soon as possible.

> And of course if you have amdkfd only changes then that can be
> upstreamed through me directly.

Yes.

>
> Having said that, if you/John/Alex think that this is a more permanent
> solution, then maybe a better plan is to first unify the drivers (as
> was discussed many times) before starting to upstream changes. If
> 90% of the changes are in both drivers, then there is really no point
> in keeping amdkfd as a separate driver.

If we unify the drivers first, it would further diverge the branches.
That would be counter-productive in my opinion. I'd rather get the
branches to converge first and then unify the drivers so we don't have
to do it twice.

Regards,
  Felix

>
> Oded

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/8] Retry page fault handling for Vega10
       [not found]     ` <5f932a0d-7425-46ff-2800-f1b868495f06-5C7GfCeVMHo@public.gmane.org>
@ 2017-09-19 12:12       ` Christian König
  0 siblings, 0 replies; 18+ messages in thread
From: Christian König @ 2017-09-19 12:12 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Deucher, Alexander, Oded Gabbay

On 19.09.2017 at 03:05, Felix Kuehling wrote:
> Thanks for the reviews. I rebased this on amd-staging-drm-next, retested
> and submitted.
>
> Christian, do you want to do some graphics PASID and VMFault work on top
> of that? I think I'll be working on more KFD upstreaming this week and
> maybe look at this subject again next week.

Yeah, that's on my TODO list together with quite a bunch of other things.

Going to give that a try when I have time, but don't expect anything 
before xmas.

Regards,
Christian.

>
> Regards,
>    Felix
>
>
> On 2017-09-12 07:05 PM, Felix Kuehling wrote:
>> Rebased on adeucher/amd-staging-4.13 and tested on Vega10 (graphics)
>> and Kaveri (KFD). Meaningful graphics tests with retry faults enabled
>> will only be possible after PASID support is added to amdgpu_cs.
>>
>> The chash table was moved to drivers/gpu/drm/amd/lib for now but is
>> ready to move to lib if needed. I have not got any feedback on LKML
>> and I don't want that to hold up the patch series.
>>
>> TODO:
>> * Finish upstreaming KFD
>> * Allocate PASIDs for graphics contexts
>> * Setup VMID-PASID mapping during graphics command submission
>> * Confirm that graphics page faults have the correct PASID in the IV
>>
>>
>> Felix Kuehling (8):
>>    drm/amdgpu: Fix error handling in amdgpu_vm_init
>>    drm/amdgpu: Add PASID management
>>    drm/radeon: Add PASID manager for KFD
>>    drm/amdkfd: Separate doorbell allocation from PASID
>>    drm/amdkfd: Use PASID manager from KGD
>>    drm/amdgpu: Add prescreening stage in IH processing
>>    drm/amd: Closed hash table with low overhead
>>    drm/amdgpu: Track pending retry faults in IH and VM (v2)
>>
>>   drivers/gpu/drm/Kconfig                           |   3 +
>>   drivers/gpu/drm/Makefile                          |   1 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h               |   2 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   2 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   2 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c            |  82 +++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h            |  12 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |   2 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |  84 ++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            |  21 +-
>>   drivers/gpu/drm/amd/amdgpu/cik_ih.c               |  14 +
>>   drivers/gpu/drm/amd/amdgpu/cz_ih.c                |  14 +
>>   drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |  14 +
>>   drivers/gpu/drm/amd/amdgpu/si_ih.c                |  14 +
>>   drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |  14 +
>>   drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |  90 ++++
>>   drivers/gpu/drm/amd/amdkfd/kfd_device.c           |   7 -
>>   drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c         |  50 +-
>>   drivers/gpu/drm/amd/amdkfd/kfd_module.c           |   6 -
>>   drivers/gpu/drm/amd/amdkfd/kfd_pasid.c            |  90 ++--
>>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h             |  10 +-
>>   drivers/gpu/drm/amd/amdkfd/kfd_process.c          |   6 +
>>   drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |   6 +
>>   drivers/gpu/drm/amd/include/linux/chash.h         | 358 +++++++++++++
>>   drivers/gpu/drm/amd/lib/Kconfig                   |  27 +
>>   drivers/gpu/drm/amd/lib/Makefile                  |  11 +
>>   drivers/gpu/drm/amd/lib/chash.c                   | 622 ++++++++++++++++++++++
>>   drivers/gpu/drm/radeon/radeon_kfd.c               |  31 ++
>>   28 files changed, 1504 insertions(+), 91 deletions(-)
>>   create mode 100644 drivers/gpu/drm/amd/include/linux/chash.h
>>   create mode 100644 drivers/gpu/drm/amd/lib/Kconfig
>>   create mode 100644 drivers/gpu/drm/amd/lib/Makefile
>>   create mode 100644 drivers/gpu/drm/amd/lib/chash.c
>>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/8] Retry page fault handling for Vega10
       [not found] ` <1505257545-28000-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-09-19  1:05   ` Felix Kuehling
       [not found]     ` <5f932a0d-7425-46ff-2800-f1b868495f06-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Felix Kuehling @ 2017-09-19  1:05 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Deucher, Alexander,
	Oded Gabbay, Christian König

Thanks for the reviews. I rebased this on amd-staging-drm-next, retested
and submitted.

Christian, do you want to do some graphics PASID and VMFault work on top
of that? I think I'll be working on more KFD upstreaming this week and
maybe look at this subject again next week.

Regards,
  Felix


On 2017-09-12 07:05 PM, Felix Kuehling wrote:
> Rebased on adeucher/amd-staging-4.13 and tested on Vega10 (graphics)
> and Kaveri (KFD). Meaningful graphics tests with retry faults enabled
> will only be possible after PASID support is added to amdgpu_cs.
>
> The chash table was moved to drivers/gpu/drm/amd/lib for now but is
> ready to move to lib if needed. I have not got any feedback on LKML
> and I don't want that to hold up the patch series.
>
> TODO:
> * Finish upstreaming KFD
> * Allocate PASIDs for graphics contexts
> * Setup VMID-PASID mapping during graphics command submission
> * Confirm that graphics page faults have the correct PASID in the IV
>
>
> Felix Kuehling (8):
>   drm/amdgpu: Fix error handling in amdgpu_vm_init
>   drm/amdgpu: Add PASID management
>   drm/radeon: Add PASID manager for KFD
>   drm/amdkfd: Separate doorbell allocation from PASID
>   drm/amdkfd: Use PASID manager from KGD
>   drm/amdgpu: Add prescreening stage in IH processing
>   drm/amd: Closed hash table with low overhead
>   drm/amdgpu: Track pending retry faults in IH and VM (v2)
>
>  drivers/gpu/drm/Kconfig                           |   3 +
>  drivers/gpu/drm/Makefile                          |   1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h               |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c            |  82 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h            |  12 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |  84 ++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            |  21 +-
>  drivers/gpu/drm/amd/amdgpu/cik_ih.c               |  14 +
>  drivers/gpu/drm/amd/amdgpu/cz_ih.c                |  14 +
>  drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |  14 +
>  drivers/gpu/drm/amd/amdgpu/si_ih.c                |  14 +
>  drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |  14 +
>  drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |  90 ++++
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c           |   7 -
>  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c         |  50 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_module.c           |   6 -
>  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c            |  90 ++--
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h             |  10 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c          |   6 +
>  drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |   6 +
>  drivers/gpu/drm/amd/include/linux/chash.h         | 358 +++++++++++++
>  drivers/gpu/drm/amd/lib/Kconfig                   |  27 +
>  drivers/gpu/drm/amd/lib/Makefile                  |  11 +
>  drivers/gpu/drm/amd/lib/chash.c                   | 622 ++++++++++++++++++++++
>  drivers/gpu/drm/radeon/radeon_kfd.c               |  31 ++
>  28 files changed, 1504 insertions(+), 91 deletions(-)
>  create mode 100644 drivers/gpu/drm/amd/include/linux/chash.h
>  create mode 100644 drivers/gpu/drm/amd/lib/Kconfig
>  create mode 100644 drivers/gpu/drm/amd/lib/Makefile
>  create mode 100644 drivers/gpu/drm/amd/lib/chash.c
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 0/8] Retry page fault handling for Vega10
@ 2017-09-12 23:05 Felix Kuehling
       [not found] ` <1505257545-28000-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 18+ messages in thread
From: Felix Kuehling @ 2017-09-12 23:05 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

Rebased on adeucher/amd-staging-4.13 and tested on Vega10 (graphics)
and Kaveri (KFD). Meaningful graphics tests with retry faults enabled
will only be possible after PASID support is added to amdgpu_cs.

The chash table was moved to drivers/gpu/drm/amd/lib for now but is
ready to move to lib if needed. I have not got any feedback on LKML
and I don't want that to hold up the patch series.

TODO:
* Finish upstreaming KFD
* Allocate PASIDs for graphics contexts
* Setup VMID-PASID mapping during graphics command submission
* Confirm that graphics page faults have the correct PASID in the IV


Felix Kuehling (8):
  drm/amdgpu: Fix error handling in amdgpu_vm_init
  drm/amdgpu: Add PASID management
  drm/radeon: Add PASID manager for KFD
  drm/amdkfd: Separate doorbell allocation from PASID
  drm/amdkfd: Use PASID manager from KGD
  drm/amdgpu: Add prescreening stage in IH processing
  drm/amd: Closed hash table with low overhead
  drm/amdgpu: Track pending retry faults in IH and VM (v2)

 drivers/gpu/drm/Kconfig                           |   3 +
 drivers/gpu/drm/Makefile                          |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu.h               |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c            |  82 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h            |  12 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |  84 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            |  21 +-
 drivers/gpu/drm/amd/amdgpu/cik_ih.c               |  14 +
 drivers/gpu/drm/amd/amdgpu/cz_ih.c                |  14 +
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |  14 +
 drivers/gpu/drm/amd/amdgpu/si_ih.c                |  14 +
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |  14 +
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |  90 ++++
 drivers/gpu/drm/amd/amdkfd/kfd_device.c           |   7 -
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c         |  50 +-
 drivers/gpu/drm/amd/amdkfd/kfd_module.c           |   6 -
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c            |  90 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h             |  10 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c          |   6 +
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |   6 +
 drivers/gpu/drm/amd/include/linux/chash.h         | 358 +++++++++++++
 drivers/gpu/drm/amd/lib/Kconfig                   |  27 +
 drivers/gpu/drm/amd/lib/Makefile                  |  11 +
 drivers/gpu/drm/amd/lib/chash.c                   | 622 ++++++++++++++++++++++
 drivers/gpu/drm/radeon/radeon_kfd.c               |  31 ++
 28 files changed, 1504 insertions(+), 91 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/include/linux/chash.h
 create mode 100644 drivers/gpu/drm/amd/lib/Kconfig
 create mode 100644 drivers/gpu/drm/amd/lib/Makefile
 create mode 100644 drivers/gpu/drm/amd/lib/chash.c

-- 
2.7.4

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-09-19 12:12 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-29 22:25 [PATCH 0/8] Retry page fault handling for Vega10 Felix Kuehling
     [not found] ` <1504045524-23853-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-29 22:25   ` [PATCH 1/8] drm/amdgpu: Fix error handling in amdgpu_vm_init Felix Kuehling
2017-08-29 22:25   ` [PATCH 2/8] drm/amdgpu: Add PASID management Felix Kuehling
2017-08-29 22:25   ` [PATCH 3/8] drm/radeon: Add PASID manager for KFD Felix Kuehling
2017-08-29 22:25   ` [PATCH 4/8] drm/amdkfd: Separate doorbell allocation from PASID Felix Kuehling
2017-08-29 22:25   ` [PATCH 5/8] drm/amdkfd: Use PASID manager from KGD Felix Kuehling
     [not found]     ` <1504045524-23853-6-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-09-03 11:54       ` Oded Gabbay
     [not found]         ` <CAFCwf13voQSyLFp8smtgMa=ZRRgrf+7H3wzfnF0cP+ak4tMhGw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-09-05 15:39           ` Felix Kuehling
2017-08-29 22:25   ` [PATCH 6/8] drm/amdgpu: Add prescreening stage in IH processing Felix Kuehling
2017-08-29 22:25   ` [PATCH 7/8] lib: Closed hash table with low overhead Felix Kuehling
2017-08-29 22:25   ` [PATCH 8/8] drm/amdgpu: Track pending retry faults in IH and VM (v2) Felix Kuehling
2017-09-06 21:53   ` [PATCH 0/8] Retry page fault handling for Vega10 Felix Kuehling
     [not found]     ` <0816a963-54cc-0041-4b09-4bf41ee46fbf-5C7GfCeVMHo@public.gmane.org>
2017-09-11 19:29       ` Deucher, Alexander
     [not found]         ` <BN6PR12MB1652DBBD8E350972D2BC6C3AF7680-/b2+HYfkarQqUD6E6FAiowdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2017-09-12  5:48           ` Oded Gabbay
     [not found]             ` <CAFCwf11o=4iizAbRVNE2w3FoacU3yutKsbMvFGfLNTscv1Ym+A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-09-12 20:26               ` Felix Kuehling
2017-09-12 23:05 Felix Kuehling
     [not found] ` <1505257545-28000-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-09-19  1:05   ` Felix Kuehling
     [not found]     ` <5f932a0d-7425-46ff-2800-f1b868495f06-5C7GfCeVMHo@public.gmane.org>
2017-09-19 12:12       ` Christian König
