* [PATCH 0/9] WIP: Retry page fault handling for Vega10
@ 2017-08-26  7:19 Felix Kuehling
       [not found] ` <1503731949-22742-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Felix Kuehling @ 2017-08-26  7:19 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

This is based on amd-kfd-staging, because that's easier for me to test.
I'm planning to port to amd-staging-4.x for submission upstream.

With this patch series, I'm able to turn retry faults on and handle the
interrupt storm from VM faults. Only the first VM fault interrupt per
process and address gets handled the usual way. Retry interrupts are
filtered in a new prescreening stage in amdgpu_ih_process.

Pending faults are tracked in a hash table in IH to detect retry faults
and a FIFO in the VM for later processing.
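
As a rough userspace illustration of the prescreening idea (not code from
this series), the sketch below passes the first fault for a given PASID and
page on for normal handling and filters repeats while that fault is pending.
The table size, the key layout and the names fault_table, fault_key and
prescreen_fault are assumptions made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical table size; must stay at 256 to match the 8-bit hash below. */
#define FAULT_SLOTS 256u

struct fault_table {
	uint64_t keys[FAULT_SLOTS];	/* 0 means "slot empty" */
};

/* Pack PASID and page frame number into one non-zero 64-bit key. */
static uint64_t fault_key(uint32_t pasid, uint64_t addr)
{
	/* Assumption: PASID 0 is reserved and PASIDs fit in 16 bits. */
	assert(pasid != 0 && pasid <= 0xffffu);
	return ((uint64_t)pasid << 48) | ((addr >> 12) & 0xffffffffffffULL);
}

/*
 * Returns true if the fault should be handled (first occurrence),
 * false if it is a retry of a fault that is already pending.
 */
static bool prescreen_fault(struct fault_table *t, uint32_t pasid,
			    uint64_t addr)
{
	uint64_t key = fault_key(pasid, addr);
	/* Multiplicative hash, top 8 bits select the starting slot. */
	unsigned int slot = (unsigned int)((key * 0x9E3779B97F4A7C15ULL) >> 56);
	unsigned int i;

	for (i = 0; i < FAULT_SLOTS; i++) {
		unsigned int idx = (slot + i) & (FAULT_SLOTS - 1);

		if (t->keys[idx] == key)
			return false;	/* retry, filter it */
		if (t->keys[idx] == 0) {
			t->keys[idx] = key;
			return true;	/* first fault, track it */
		}
	}

	/* Table full: the fault cannot be tracked, so let it through. */
	return true;
}
```

Removing an entry once its fault has been resolved is left out to keep the
sketch short. A real table needs that step, otherwise a later fault on the
same page would be filtered forever.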

Looking up the VM from the fault interrupt depends on the PASID.
Currently only KFD VMs have proper PASIDs.

TODO (need some help with these):
* Allocate PASIDs for graphics contexts
* Set up VMID-PASID mapping during graphics command submission
* Confirm that graphics page faults have the correct PASID in the IV

Once that's done, we should have a foundation to start working on HMM
and proper SVM memory management with demand paging.

Felix Kuehling (9):
  drm/amdgpu: Fix error handling in amdgpu_vm_init
  drm/amdgpu: Add PASID management
  drm/radeon: Add PASID manager for KFD
  drm/amdkfd: Separate doorbell allocation from PASID
  drm/amdkfd: Use PASID manager from KGD
  drm/amd: Set the PASID for KFD VMs
  drm/amdgpu: Add prescreening stage in IH processing
  lib: Closed hash table with low overhead
  drm/amdgpu: Track pending retry faults in IH and VM

 drivers/gpu/drm/Kconfig                           |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu.h               |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h        |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c            |  82 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h            |  12 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |  88 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            |  21 +-
 drivers/gpu/drm/amd/amdgpu/cik_ih.c               |  14 +
 drivers/gpu/drm/amd/amdgpu/cz_ih.c                |  14 +
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |  14 +
 drivers/gpu/drm/amd/amdgpu/si_ih.c                |  14 +
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |  14 +
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |  90 ++++
 drivers/gpu/drm/amd/amdkfd/kfd_device.c           |  18 +-
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c         |  48 +-
 drivers/gpu/drm/amd/amdkfd/kfd_module.c           |   6 -
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c            |  84 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h             |  10 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c          |   8 +-
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |   8 +-
 drivers/gpu/drm/radeon/radeon_kfd.c               |  36 +-
 include/linux/chash.h                             | 349 +++++++++++++++
 lib/Kconfig                                       |   8 +
 lib/Makefile                                      |   2 +
 lib/chash.c                                       | 521 ++++++++++++++++++++++
 30 files changed, 1376 insertions(+), 105 deletions(-)
 create mode 100644 include/linux/chash.h
 create mode 100644 lib/chash.c

-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* [PATCH 1/9] drm/amdgpu: Fix error handling in amdgpu_vm_init
       [not found] ` <1503731949-22742-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-26  7:19   ` Felix Kuehling
       [not found]     ` <1503731949-22742-2-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-26  7:19   ` [PATCH 2/9] drm/amdgpu: Add PASID management Felix Kuehling
                     ` (8 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Felix Kuehling @ 2017-08-26  7:19 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

Make sure vm->root.bo is not left reserved if amdgpu_bo_kmap fails.
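
The pattern behind the fix can be shown with a small userspace mock. The
names mock_bo, mock_map and mock_init are hypothetical and stand in for the
buffer object, the kmap step and the init function. The point is only the
ordering: release the reservation first, then act on the error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer object with a reservation flag. */
struct mock_bo {
	bool reserved;
	bool map_fails;
};

/* Stand-in for the mapping step, which may fail. */
static int mock_map(struct mock_bo *bo)
{
	return bo->map_fails ? -ENOMEM : 0;
}

static int mock_init(struct mock_bo *bo, bool use_cpu_for_update)
{
	int r = 0;

	bo->reserved = true;		/* reserve */
	if (use_cpu_for_update)
		r = mock_map(bo);
	bo->reserved = false;		/* unreserve on success and on failure */
	if (r)
		return r;		/* error path no longer leaks the reservation */

	return 0;
}
```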

Change-Id: If3687b39a50b0ffe7f8be2ea6e927fa2ca0f9e45
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index e57a72e..70d7632 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2556,14 +2556,11 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 		goto error_free_root;
 
 	vm->last_eviction_counter = atomic64_read(&adev->num_evictions);
-
-	if (vm->use_cpu_for_update) {
+	if (vm->use_cpu_for_update)
 		r = amdgpu_bo_kmap(vm->root.bo, NULL);
-		if (r)
-			goto error_free_root;
-	}
-
 	amdgpu_bo_unreserve(vm->root.bo);
+	if (r)
+		goto error_free_root;
 
 	vm->vm_context = vm_context;
 	if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
-- 
2.7.4


* [PATCH 2/9] drm/amdgpu: Add PASID management
       [not found] ` <1503731949-22742-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-26  7:19   ` [PATCH 1/9] drm/amdgpu: Fix error handling in amdgpu_vm_init Felix Kuehling
@ 2017-08-26  7:19   ` Felix Kuehling
       [not found]     ` <1503731949-22742-3-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-26  7:19   ` [PATCH 3/9] drm/radeon: Add PASID manager for KFD Felix Kuehling
                     ` (7 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Felix Kuehling @ 2017-08-26  7:19 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

Allows assigning a PASID to a VM for identifying VMs involved in page
faults. The global PASID manager is also exported in the KFD
interface so that AMDGPU and KFD can share the PASID space.

PASIDs of different sizes can be requested. On APUs, the PASID size
is determined by the capabilities of the IOMMU. So KFD must be able
to allocate PASIDs in a smaller range.
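
The allocation strategy can be tried out in userspace with the sketch below.
It mirrors the loop in amdgpu_vm_alloc_pasid from the diff in this patch, but
replaces the kernel IDA with a plain array, so sketch_range_get,
sketch_alloc_pasid, sketch_free_pasid and the 8-bit limit are assumptions for
illustration only. A request for N bits first tries the upper half of the
N-bit range and only falls back to narrower ranges when that is full, which
keeps small PASIDs free for callers that can only use a small range.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical cap on the PASID width for this sketch. */
#define SKETCH_MAX_BITS 8u

static bool sketch_used[1u << SKETCH_MAX_BITS];

/* Stand-in for an ID allocator: lowest free ID in [start, end). */
static int sketch_range_get(unsigned int start, unsigned int end)
{
	unsigned int id;

	for (id = start; id < end; id++) {
		if (!sketch_used[id]) {
			sketch_used[id] = true;
			return (int)id;
		}
	}
	return -ENOSPC;
}

/* Allocate a PASID that fits in @bits bits, preferring wide PASIDs. */
static int sketch_alloc_pasid(unsigned int bits)
{
	int pasid = -EINVAL;	/* returned unchanged if bits == 0 */

	if (bits > SKETCH_MAX_BITS)
		bits = SKETCH_MAX_BITS;

	for (; bits > 0; bits--) {
		/* Try the range of IDs that need exactly @bits bits. */
		pasid = sketch_range_get(1u << (bits - 1), 1u << bits);
		if (pasid != -ENOSPC)
			break;
	}
	return pasid;
}

static void sketch_free_pasid(unsigned int pasid)
{
	sketch_used[pasid] = false;
}
```

Because the narrowest range starts at 1, PASID 0 is never handed out.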

TODO:
* Actually assign PASIDs to VMs
* Update the PASID-VMID mapping registers during CS

Change-Id: I88c9357a7c584f10e84b5607ac09eba77c833393
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            | 76 ++++++++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            | 14 ++++-
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |  6 ++
 8 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index 3e28d2b..0807d52 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -188,6 +188,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.get_local_mem_info = get_local_mem_info,
 	.get_gpu_clock_counter = get_gpu_clock_counter,
 	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
+	.alloc_pasid = amdgpu_vm_alloc_pasid,
+	.free_pasid = amdgpu_vm_free_pasid,
 	.create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
 	.destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
 	.get_process_page_dir = amdgpu_amdkfd_gpuvm_get_process_page_dir,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 3b6b4d9..c20c000 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -162,6 +162,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.get_local_mem_info = get_local_mem_info,
 	.get_gpu_clock_counter = get_gpu_clock_counter,
 	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
+	.alloc_pasid = amdgpu_vm_alloc_pasid,
+	.free_pasid = amdgpu_vm_free_pasid,
 	.create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
 	.destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
 	.create_process_gpumem = create_process_gpumem,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
index 961369d..bb99c64 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
@@ -209,6 +209,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.get_local_mem_info = get_local_mem_info,
 	.get_gpu_clock_counter = get_gpu_clock_counter,
 	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
+	.alloc_pasid = amdgpu_vm_alloc_pasid,
+	.free_pasid = amdgpu_vm_free_pasid,
 	.create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
 	.destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
 	.create_process_gpumem = create_process_gpumem,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 35f7d77..462011c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1397,7 +1397,7 @@ int amdgpu_amdkfd_gpuvm_create_process_vm(struct kgd_dev *kgd, void **vm,
 		return -ENOMEM;
 
 	/* Initialize the VM context, allocate the page directory and zero it */
-	ret = amdgpu_vm_init(adev, &new_vm->base, AMDGPU_VM_CONTEXT_COMPUTE);
+	ret = amdgpu_vm_init(adev, &new_vm->base, AMDGPU_VM_CONTEXT_COMPUTE, 0);
 	if (ret != 0) {
 		pr_err("Failed init vm ret %d\n", ret);
 		/* Undo everything related to the new VM context */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index e390c01..ba3dc4d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -825,7 +825,7 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
 	}
 
 	r = amdgpu_vm_init(adev, &fpriv->vm,
-			   AMDGPU_VM_CONTEXT_GFX);
+			   AMDGPU_VM_CONTEXT_GFX, 0);
 	if (r) {
 		kfree(fpriv);
 		goto out_suspend;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 70d7632..c635699 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -27,12 +27,59 @@
  */
 #include <linux/dma-fence-array.h>
 #include <linux/interval_tree_generic.h>
+#include <linux/idr.h>
 #include <drm/drmP.h>
 #include <drm/amdgpu_drm.h>
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 
 /*
+ * PASID manager
+ *
+ * PASIDs are global address space identifiers that can be shared
+ * between the GPU, an IOMMU and the driver. VMs on different devices
+ * may use the same PASID if they share the same address
+ * space. Therefore PASIDs are allocated using a global IDA. VMs are
+ * looked up from the PASID per amdgpu_device.
+ */
+static DEFINE_IDA(amdgpu_vm_pasid_ida);
+
+/**
+ * amdgpu_vm_alloc_pasid - Allocate a PASID
+ * @bits: Maximum width of the PASID in bits, must be at least 1
+ *
+ * Allocates a PASID of the given width while keeping smaller PASIDs
+ * available if possible.
+ *
+ * Returns a positive integer on success. Returns %-EINVAL if bits==0.
+ * Returns %-ENOSPC if no PASID was available. Returns %-ENOMEM on
+ * memory allocation failure.
+ */
+int amdgpu_vm_alloc_pasid(unsigned int bits)
+{
+	int pasid = -EINVAL;
+
+	for (bits = min(bits, 31U); bits > 0; bits--) {
+		pasid = ida_simple_get(&amdgpu_vm_pasid_ida,
+				       1U << (bits - 1), 1U << bits,
+				       GFP_KERNEL);
+		if (pasid != -ENOSPC)
+			break;
+	}
+
+	return pasid;
+}
+
+/**
+ * amdgpu_vm_free_pasid - Free a PASID
+ * @pasid: PASID to free
+ */
+void amdgpu_vm_free_pasid(unsigned int pasid)
+{
+	ida_simple_remove(&amdgpu_vm_pasid_ida, pasid);
+}
+
+/*
  * GPUVM
  * GPUVM is similar to the legacy gart on older asics, however
  * rather than there being a single global gart table
@@ -2482,7 +2529,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size, uint32_
  * Init @vm fields.
  */
 int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
-		   int vm_context)
+		   int vm_context, unsigned int pasid)
 {
 	const unsigned align = min(AMDGPU_VM_PTB_ALIGN_SIZE,
 		AMDGPU_VM_PTE_COUNT(adev) * 8);
@@ -2562,6 +2609,19 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	if (r)
 		goto error_free_root;
 
+	if (pasid) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
+		r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
+			      GFP_ATOMIC);
+		spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
+		if (r < 0)
+			goto error_free_root;
+
+		vm->pasid = pasid;
+	}
+
 	vm->vm_context = vm_context;
 	if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
 		mutex_lock(&id_mgr->lock);
@@ -2650,6 +2710,14 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 		mutex_unlock(&id_mgr->lock);
 	}
 
+	if (vm->pasid) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
+		idr_remove(&adev->vm_manager.pasid_idr, vm->pasid);
+		spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
+	}
+
 	amd_sched_entity_fini(vm->entity.sched, &vm->entity);
 
 	if (!RB_EMPTY_ROOT(&vm->va)) {
@@ -2729,6 +2797,9 @@ void amdgpu_vm_manager_init(struct amdgpu_device *adev)
 	adev->vm_manager.vm_update_mode = 0;
 #endif
 
+	idr_init(&adev->vm_manager.pasid_idr);
+	spin_lock_init(&adev->vm_manager.pasid_lock);
+
 	adev->vm_manager.n_compute_vms = 0;
 }
 
@@ -2743,6 +2814,9 @@ void amdgpu_vm_manager_fini(struct amdgpu_device *adev)
 {
 	unsigned i, j;
 
+	WARN_ON(!idr_is_empty(&adev->vm_manager.pasid_idr));
+	idr_destroy(&adev->vm_manager.pasid_idr);
+
 	for (i = 0; i < AMDGPU_MAX_VMHUBS; ++i) {
 		struct amdgpu_vm_id_manager *id_mgr =
 			&adev->vm_manager.id_mgr[i];
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 49e15d7..692b05c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -25,6 +25,7 @@
 #define __AMDGPU_VM_H__
 
 #include <linux/rbtree.h>
+#include <linux/idr.h>
 
 #include "gpu_scheduler.h"
 #include "amdgpu_sync.h"
@@ -143,8 +144,9 @@ struct amdgpu_vm {
 	/* Scheduler entity for page table updates */
 	struct amd_sched_entity	entity;
 
-	/* client id */
+	/* client id and PASID (TODO: replace client_id with PASID) */
 	u64                     client_id;
+	unsigned int		pasid;
 	/* dedicated to vm */
 	struct amdgpu_vm_id	*reserved_vmid[AMDGPU_MAX_VMHUBS];
 
@@ -219,14 +221,22 @@ struct amdgpu_vm_manager {
 	 */
 	int					vm_update_mode;
 
+	/* PASID to VM mapping, will be used in interrupt context to
+	 * look up VM of a page fault
+	 */
+	struct idr				pasid_idr;
+	spinlock_t				pasid_lock;
+
 	/* Number of Compute VMs, used for detecting Compute activity */
 	unsigned                                n_compute_vms;
 };
 
+int amdgpu_vm_alloc_pasid(unsigned int bits);
+void amdgpu_vm_free_pasid(unsigned int pasid);
 void amdgpu_vm_manager_init(struct amdgpu_device *adev);
 void amdgpu_vm_manager_fini(struct amdgpu_device *adev);
 int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
-		   int vm_context);
+		   int vm_context, unsigned int pasid);
 void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm);
 void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
 			 struct list_head *validated,
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index e8027b3..5833ef7 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -188,6 +188,9 @@ struct tile_config {
  *
  * @get_max_engine_clock_in_mhz: Retrieves maximum GPU clock in MHz
  *
+ * @alloc_pasid: Allocate a PASID
+ * @free_pasid: Free a PASID
+ *
  * @program_sh_mem_settings: A function that should initiate the memory
  * properties such as main aperture memory type (cache / non cached) and
  * secondary aperture base address, size and memory type.
@@ -264,6 +267,9 @@ struct kfd2kgd_calls {
 
 	uint32_t (*get_max_engine_clock_in_mhz)(struct kgd_dev *kgd);
 
+	int (*alloc_pasid)(unsigned int bits);
+	void (*free_pasid)(unsigned int pasid);
+
 	int (*create_process_vm)(struct kgd_dev *kgd, void **vm,
 				 void **process_info);
 	void (*destroy_process_vm)(struct kgd_dev *kgd, void *vm);
-- 
2.7.4


* [PATCH 3/9] drm/radeon: Add PASID manager for KFD
       [not found] ` <1503731949-22742-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-26  7:19   ` [PATCH 1/9] drm/amdgpu: Fix error handling in amdgpu_vm_init Felix Kuehling
  2017-08-26  7:19   ` [PATCH 2/9] drm/amdgpu: Add PASID management Felix Kuehling
@ 2017-08-26  7:19   ` Felix Kuehling
       [not found]     ` <1503731949-22742-4-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-26  7:19   ` [PATCH 4/9] drm/amdkfd: Separate doorbell allocation from PASID Felix Kuehling
                     ` (6 subsequent siblings)
  9 siblings, 1 reply; 23+ messages in thread
From: Felix Kuehling @ 2017-08-26  7:19 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

Change-Id: I101a5ac8e0ebf0dbbe6dbd1f61cd8236d499c5b8
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/radeon/radeon_kfd.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index 81fb94b..acfe34e 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -79,6 +79,9 @@ static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd);
 
 static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);
 
+static int alloc_pasid(unsigned int bits);
+static void free_pasid(unsigned int pasid);
+
 static int create_process_vm(struct kgd_dev *kgd, void **vm, void **info);
 static void destroy_process_vm(struct kgd_dev *kgd, void *vm);
 
@@ -161,6 +164,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.get_local_mem_info = get_local_mem_info,
 	.get_gpu_clock_counter = get_gpu_clock_counter,
 	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
+	.alloc_pasid = alloc_pasid,
+	.free_pasid = free_pasid,
 	.create_process_vm = create_process_vm,
 	.destroy_process_vm = destroy_process_vm,
 	.get_process_page_dir = get_process_page_dir,
@@ -415,6 +420,31 @@ static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd)
 }
 
 /*
+ * PASID manager
+ */
+static DEFINE_IDA(pasid_ida);
+
+static int alloc_pasid(unsigned int bits)
+{
+	int pasid = -EINVAL;
+
+	for (bits = min(bits, 31U); bits > 0; bits--) {
+		pasid = ida_simple_get(&pasid_ida,
+				       1U << (bits - 1), 1U << bits,
+				       GFP_KERNEL);
+		if (pasid != -ENOSPC)
+			break;
+	}
+
+	return pasid;
+}
+
+static void free_pasid(unsigned int pasid)
+{
+	ida_simple_remove(&pasid_ida, pasid);
+}
+
+/*
  * Creates a VM context for HSA process
  */
 static int create_process_vm(struct kgd_dev *kgd, void **vm, void **info)
-- 
2.7.4


* [PATCH 4/9] drm/amdkfd: Separate doorbell allocation from PASID
       [not found] ` <1503731949-22742-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (2 preceding siblings ...)
  2017-08-26  7:19   ` [PATCH 3/9] drm/radeon: Add PASID manager for KFD Felix Kuehling
@ 2017-08-26  7:19   ` Felix Kuehling
  2017-08-26  7:19   ` [PATCH 5/9] drm/amdkfd: Use PASID manager from KGD Felix Kuehling
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Felix Kuehling @ 2017-08-26  7:19 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

PASID management is moving into KGD. Limiting the PASID range to the
number of doorbell pages is no longer practical.
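
To make the new addressing concrete, here is a standalone sketch of the
doorbell offset calculation once it is keyed on a per-process doorbell index
instead of the PASID. The struct sketch_dev and its field values are
hypothetical; the arithmetic follows kfd_doorbell_id_to_offset in the diff
in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of the per-device doorbell layout. */
struct sketch_dev {
	unsigned int doorbell_id_offset;	/* dwords taken ahead of KFD's doorbells */
	size_t process_slice;			/* bytes of doorbell space per process */
	size_t doorbell_size;			/* bytes per doorbell, ASIC dependent */
};

/*
 * Offset of one doorbell in dword units. Index 0 is the slice used by
 * the kernel; process indices start at 1.
 */
static unsigned int sketch_doorbell_offset(const struct sketch_dev *dev,
					   unsigned int doorbell_index,
					   unsigned int doorbell_id)
{
	size_t off = dev->doorbell_id_offset;

	off += doorbell_index * (dev->process_slice / sizeof(uint32_t));
	off += doorbell_id * (dev->doorbell_size / sizeof(uint32_t));

	return (unsigned int)off;
}
```

With a 4096-byte slice and 8-byte doorbells, the process at index 2 finds
its doorbell 3 at dword 2 * 1024 + 3 * 2 past the device offset, whatever
PASID that process was given.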

Change-Id: I2ddee0c481d3cb54c779ef6f180e0188a9ca53ca
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |  7 -----
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c | 48 ++++++++++++++++++++-----------
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h     | 10 +++----
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  6 ++++
 4 files changed, 43 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index d102134..378a9cf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -413,13 +413,6 @@ static bool device_iommu_pasid_init(struct kfd_dev *kfd)
 	pasid_limit = min_t(unsigned int,
 			(unsigned int)(1 << kfd->device_info->max_pasid_bits),
 			iommu_info.max_pasids);
-	/*
-	 * last pasid is used for kernel queues doorbells
-	 * in the future the last pasid might be used for a kernel thread.
-	 */
-	pasid_limit = min_t(unsigned int,
-				pasid_limit,
-				kfd->doorbell_process_limit - 1);
 
 	if (!kfd_set_pasid_limit(pasid_limit)) {
 		dev_err(kfd_device, "error setting pasid limit\n");
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index 008d258..17a0b97 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -26,14 +26,12 @@
 #include <linux/io.h>
 
 /*
- * This extension supports a kernel level doorbells management for
- * the kernel queues.
- * Basically the last doorbells page is devoted to kernel queues
- * and that's assures that any user process won't get access to the
- * kernel doorbells page
+ * This extension supports a kernel level doorbells management for the
+ * kernel queues using the first doorbell page reserved for the kernel.
  */
 
-#define KERNEL_DOORBELL_PASID 1
+static DEFINE_IDA(doorbell_ida);
+static unsigned int max_doorbell_slices;
 
 /*
  * Each device exposes a doorbell aperture, a PCI MMIO aperture that
@@ -83,13 +81,15 @@ int kfd_doorbell_init(struct kfd_dev *kfd)
 			(doorbell_aperture_size - doorbell_start_offset) /
 						kfd_doorbell_process_slice(kfd);
 	else
-		doorbell_process_limit = 0;
+		return -ENOSPC;
+
+	if (!max_doorbell_slices || doorbell_process_limit < max_doorbell_slices)
+		max_doorbell_slices = doorbell_process_limit;
 
 	kfd->doorbell_base = kfd->shared_resources.doorbell_physical_address +
 				doorbell_start_offset;
 
 	kfd->doorbell_id_offset = doorbell_start_offset / sizeof(u32);
-	kfd->doorbell_process_limit = doorbell_process_limit - 1;
 
 	kfd->doorbell_kernel_ptr = ioremap(kfd->doorbell_base,
 					   kfd_doorbell_process_slice(kfd));
@@ -181,12 +181,11 @@ void __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
 	inx *= kfd->device_info->doorbell_size / sizeof(u32);
 
 	/*
-	 * Calculating the kernel doorbell offset using "faked" kernel
-	 * pasid that allocated for kernel queues only. Offset is in
-	 * dword units regardless of the ASIC-dependent doorbell size.
+	 * Calculating the kernel doorbell offset using the first
+	 * doorbell page. Offset is in dword units regardless of the
+	 * ASIC-dependent doorbell size.
 	 */
-	*doorbell_off = KERNEL_DOORBELL_PASID *
-		(kfd_doorbell_process_slice(kfd) / sizeof(u32)) + inx;
+	*doorbell_off = kfd->doorbell_id_offset + inx;
 
 	pr_debug("Get kernel queue doorbell\n"
 			 "     doorbell offset   == 0x%08X\n"
@@ -235,12 +234,13 @@ unsigned int kfd_doorbell_id_to_offset(struct kfd_dev *kfd,
 {
 	/*
 	 * doorbell_id_offset accounts for doorbells taken by KGD.
-	 * pasid * kfd_doorbell_process_slice/sizeof(u32) adjusts to
+	 * index * kfd_doorbell_process_slice/sizeof(u32) adjusts to
 	 * the process's doorbells. The offset returned is in dword
 	 * units regardless of the ASIC-dependent doorbell size.
 	 */
 	return kfd->doorbell_id_offset +
-		process->pasid * (kfd_doorbell_process_slice(kfd)/sizeof(u32)) +
+		process->doorbell_index
+		* kfd_doorbell_process_slice(kfd) / sizeof(u32) +
 		doorbell_id * kfd->device_info->doorbell_size / sizeof(u32);
 }
 
@@ -258,5 +258,21 @@ phys_addr_t kfd_get_process_doorbells(struct kfd_dev *dev,
 					struct kfd_process *process)
 {
 	return dev->doorbell_base +
-		process->pasid * kfd_doorbell_process_slice(dev);
+		process->doorbell_index * kfd_doorbell_process_slice(dev);
+}
+
+int kfd_alloc_process_doorbells(struct kfd_process *process)
+{
+	int r = ida_simple_get(&doorbell_ida, 1, max_doorbell_slices,
+				GFP_KERNEL);
+	if (r > 0)
+		process->doorbell_index = r;
+
+	return r;
+}
+
+void kfd_free_process_doorbells(struct kfd_process *process)
+{
+	if (process->doorbell_index)
+		ida_simple_remove(&doorbell_ida, process->doorbell_index);
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index f0d0995..325ef81 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -225,9 +225,6 @@ struct kfd_dev {
 					 * to HW doorbell, GFX reserved some
 					 * at the start)
 					 */
-	size_t doorbell_process_limit;	/* Number of processes we have doorbell
-					 * space for.
-					 */
 	u32 __iomem *doorbell_kernel_ptr; /* This is a pointer for a doorbells
 					   * page used by kernel queue
 					   */
@@ -692,6 +689,7 @@ struct kfd_process {
 	struct rcu_head	rcu;
 
 	unsigned int pasid;
+	unsigned int doorbell_index;
 
 	/*
 	 * List of kfd_process_device structures,
@@ -826,6 +824,10 @@ void write_kernel_doorbell64(void __iomem *db, u64 value);
 unsigned int kfd_doorbell_id_to_offset(struct kfd_dev *kfd,
 					struct kfd_process *process,
 					unsigned int doorbell_id);
+phys_addr_t kfd_get_process_doorbells(struct kfd_dev *dev,
+					struct kfd_process *process);
+int kfd_alloc_process_doorbells(struct kfd_process *process);
+void kfd_free_process_doorbells(struct kfd_process *process);
 
 /* GTT Sub-Allocator */
 
@@ -1020,8 +1022,6 @@ void kfd_pm_func_init_v9(struct packet_manager *pm, uint16_t fw_ver);
 
 
 uint64_t kfd_get_number_elems(struct kfd_dev *kfd);
-phys_addr_t kfd_get_process_doorbells(struct kfd_dev *dev,
-					struct kfd_process *process);
 int amdkfd_fence_wait_timeout(unsigned int *fence_addr,
 				unsigned int fence_value,
 				unsigned long timeout_ms);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index e3ecc9a..a20ced0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -386,6 +386,7 @@ static void kfd_process_wq_release(struct work_struct *work)
 	kfd_event_free_process(p);
 
 	kfd_pasid_free(p->pasid);
+	kfd_free_process_doorbells(p);
 
 	mutex_destroy(&p->mutex);
 
@@ -562,6 +563,9 @@ static struct kfd_process *create_process(const struct task_struct *thread,
 	if (process->pasid == 0)
 		goto err_alloc_pasid;
 
+	if (kfd_alloc_process_doorbells(process) < 0)
+		goto err_alloc_doorbells;
+
 	kref_init(&process->ref);
 	mutex_init(&process->mutex);
 
@@ -623,6 +627,8 @@ static struct kfd_process *create_process(const struct task_struct *thread,
 	mmu_notifier_unregister_no_release(&process->mmu_notifier, process->mm);
 err_mmu_notifier:
 	mutex_destroy(&process->mutex);
+	kfd_free_process_doorbells(process);
+err_alloc_doorbells:
 	kfd_pasid_free(process->pasid);
 err_alloc_pasid:
 	kfree(process);
-- 
2.7.4


* [PATCH 5/9] drm/amdkfd: Use PASID manager from KGD
       [not found] ` <1503731949-22742-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (3 preceding siblings ...)
  2017-08-26  7:19   ` [PATCH 4/9] drm/amdkfd: Separate doorbell allocation from PASID Felix Kuehling
@ 2017-08-26  7:19   ` Felix Kuehling
  2017-08-26  7:19   ` [PATCH 6/9] drm/amd: Set the PASID for KFD VMs Felix Kuehling
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Felix Kuehling @ 2017-08-26  7:19 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

Also fixed PASID limit setting for dGPUs.
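
The reworked kfd_set_pasid_limit in the diff in this patch stores the limit
as a bit width rather than a count. The narrowing step can be sketched in
isolation as follows; sketch_limit_to_bits is a hypothetical helper written
for this example, not a function from the patch.

```c
#include <assert.h>

/*
 * Shrink @bits until 2^bits no longer exceeds @limit. The width is
 * never raised, so a limit above 2^bits leaves @bits unchanged.
 * Assumes bits < 32 so the shift is well defined.
 */
static unsigned int sketch_limit_to_bits(unsigned int limit, unsigned int bits)
{
	assert(limit >= 2);	/* smaller limits are rejected by the caller */

	while (limit < (1u << bits))
		bits--;

	return bits;
}
```

A limit that is not a power of two is rounded down, so a limit of 1000
yields 9 bits (512 PASIDs).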

Change-Id: If3bb8c0742019745bedf09077a271b1ed14deea1
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_module.c |  6 ---
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c  | 84 +++++++++++++--------------------
 3 files changed, 39 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 378a9cf..1915e93 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -391,7 +391,6 @@ static bool device_iommu_pasid_init(struct kfd_dev *kfd)
 					AMD_IOMMU_DEVICE_FLAG_PASID_SUP;
 
 	struct amd_iommu_device_info iommu_info;
-	unsigned int pasid_limit;
 	int err;
 
 	err = amd_iommu_device_info(kfd->pdev, &iommu_info);
@@ -410,11 +409,7 @@ static bool device_iommu_pasid_init(struct kfd_dev *kfd)
 		return false;
 	}
 
-	pasid_limit = min_t(unsigned int,
-			(unsigned int)(1 << kfd->device_info->max_pasid_bits),
-			iommu_info.max_pasids);
-
-	if (!kfd_set_pasid_limit(pasid_limit)) {
+	if (!kfd_set_pasid_limit(iommu_info.max_pasids)) {
 		dev_err(kfd_device, "error setting pasid limit\n");
 		return false;
 	}
@@ -605,6 +600,10 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 		goto device_queue_manager_error;
 	}
 
+	if (!kfd_set_pasid_limit(1U << kfd->device_info->max_pasid_bits)) {
+		dev_err(kfd_device, "Error setting pasid limit\n");
+		goto device_iommu_pasid_error;
+	}
 #if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
 	if (kfd->device_info->is_need_iommu_device) {
 		if (!device_iommu_pasid_init(kfd)) {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index aba3e9d..9846d58 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -131,10 +131,6 @@ static int __init kfd_module_init(void)
 		return -1;
 	}
 
-	err = kfd_pasid_init();
-	if (err < 0)
-		return err;
-
 	err = kfd_chardev_init();
 	if (err < 0)
 		goto err_ioctl;
@@ -162,7 +158,6 @@ static int __init kfd_module_init(void)
 err_topology:
 	kfd_chardev_exit();
 err_ioctl:
-	kfd_pasid_exit();
 	return err;
 }
 
@@ -175,7 +170,6 @@ static void __exit kfd_module_exit(void)
 	kfd_process_destroy_wq();
 	kfd_topology_shutdown();
 	kfd_chardev_exit();
-	kfd_pasid_exit();
 	dev_info(kfd_device, "Removed module\n");
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
index 1e06de0..ed78937 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
@@ -20,78 +20,62 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */
 
-#include <linux/slab.h>
 #include <linux/types.h>
 #include "kfd_priv.h"
 
-static unsigned long *pasid_bitmap;
-static unsigned int pasid_limit;
-static DEFINE_MUTEX(pasid_mutex);
-
-int kfd_pasid_init(void)
-{
-	pasid_limit = KFD_MAX_NUM_OF_PROCESSES;
-
-	pasid_bitmap = kcalloc(BITS_TO_LONGS(pasid_limit), sizeof(long),
-				GFP_KERNEL);
-	if (!pasid_bitmap)
-		return -ENOMEM;
-
-	set_bit(0, pasid_bitmap); /* PASID 0 is reserved. */
-
-	return 0;
-}
-
-void kfd_pasid_exit(void)
-{
-	kfree(pasid_bitmap);
-}
+static unsigned int pasid_bits = 16;
+static const struct kfd2kgd_calls *kfd2kgd;
 
 bool kfd_set_pasid_limit(unsigned int new_limit)
 {
-	if (new_limit < pasid_limit) {
-		bool ok;
-
-		mutex_lock(&pasid_mutex);
-
-		/* ensure that no pasids >= new_limit are in-use */
-		ok = (find_next_bit(pasid_bitmap, pasid_limit, new_limit) ==
-								pasid_limit);
-		if (ok)
-			pasid_limit = new_limit;
-
-		mutex_unlock(&pasid_mutex);
-
-		return ok;
+	if (new_limit < 2)
+		return false;
+
+	if (new_limit < (1U << pasid_bits)) {
+		if (kfd2kgd)
+			/* We've already allocated user PASIDs, too late to
+			 * change the limit
+			 */
+			return false;
+
+		while (new_limit < (1U << pasid_bits))
+			pasid_bits--;
 	}
 
 	return true;
 }
 
-inline unsigned int kfd_get_pasid_limit(void)
+unsigned int kfd_get_pasid_limit(void)
 {
-	return pasid_limit;
+	return 1U << pasid_bits;
 }
 
 unsigned int kfd_pasid_alloc(void)
 {
-	unsigned int found;
+	int r;
+
+	/* Find the first usable KFD device for calling KGD */
+	if (!kfd2kgd) {
+		struct kfd_dev *dev = NULL;
+		unsigned int i;
 
-	mutex_lock(&pasid_mutex);
+		for (i = 0; kfd_topology_enum_kfd_devices(i, &dev) == 0; i++)
+			if (dev && dev->kfd2kgd) {
+				kfd2kgd = dev->kfd2kgd;
+				break;
+			}
 
-	found = find_first_zero_bit(pasid_bitmap, pasid_limit);
-	if (found == pasid_limit)
-		found = 0;
-	else
-		set_bit(found, pasid_bitmap);
+		if (!kfd2kgd)
+			return 0;
+	}
 
-	mutex_unlock(&pasid_mutex);
+	r = kfd2kgd->alloc_pasid(pasid_bits);
 
-	return found;
+	return r > 0 ? r : 0;
 }
 
 void kfd_pasid_free(unsigned int pasid)
 {
-	if (!WARN_ON(pasid == 0 || pasid >= pasid_limit))
-		clear_bit(pasid, pasid_bitmap);
+	if (kfd2kgd)
+		kfd2kgd->free_pasid(pasid);
 }
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* [PATCH 6/9] drm/amd: Set the PASID for KFD VMs
       [not found] ` <1503731949-22742-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (4 preceding siblings ...)
  2017-08-26  7:19   ` [PATCH 5/9] drm/amdkfd: Use PASID manager from KGD Felix Kuehling
@ 2017-08-26  7:19   ` Felix Kuehling
  2017-08-26  7:19   ` [PATCH 7/9] drm/amdgpu: Add prescreening stage in IH processing Felix Kuehling
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Felix Kuehling @ 2017-08-26  7:19 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

Change-Id: I5d949d61fa9d828b453187698c250633188386e0
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h       | 3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 6 ++++--
 drivers/gpu/drm/amd/amdkfd/kfd_process.c         | 2 +-
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h  | 2 +-
 drivers/gpu/drm/radeon/radeon_kfd.c              | 6 ++++--
 5 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index d9e3734..8bd34da 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -202,7 +202,8 @@ int amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu(
 		struct kgd_dev *kgd, struct kgd_mem *mem, void *vm);
 
 int amdgpu_amdkfd_gpuvm_create_process_vm(struct kgd_dev *kgd, void **vm,
-					  void **process_info);
+					  void **process_info,
+					  unsigned int pasid);
 void amdgpu_amdkfd_gpuvm_destroy_process_vm(struct kgd_dev *kgd, void *vm);
 
 uint32_t amdgpu_amdkfd_gpuvm_get_process_page_dir(void *vm);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 462011c..902c05d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1385,7 +1385,8 @@ static u64 get_vm_pd_gpu_offset(void *vm)
 }
 
 int amdgpu_amdkfd_gpuvm_create_process_vm(struct kgd_dev *kgd, void **vm,
-					  void **process_info)
+					  void **process_info,
+					  unsigned int pasid)
 {
 	int ret;
 	struct amdkfd_vm *new_vm;
@@ -1397,7 +1398,8 @@ int amdgpu_amdkfd_gpuvm_create_process_vm(struct kgd_dev *kgd, void **vm,
 		return -ENOMEM;
 
 	/* Initialize the VM context, allocate the page directory and zero it */
-	ret = amdgpu_vm_init(adev, &new_vm->base, AMDGPU_VM_CONTEXT_COMPUTE, 0);
+	ret = amdgpu_vm_init(adev, &new_vm->base, AMDGPU_VM_CONTEXT_COMPUTE,
+			     pasid);
 	if (ret != 0) {
 		pr_err("Failed init vm ret %d\n", ret);
 		/* Undo everything related to the new VM context */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index a20ced0..3b11e35 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -702,7 +702,7 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
 
 	/* Create the GPUVM context for this specific device */
 	if (dev->kfd2kgd->create_process_vm(dev->kgd, &pdd->vm,
-					&p->process_info)) {
+					    &p->process_info, p->pasid)) {
 		pr_err("Failed to create process VM object\n");
 		goto err_create_pdd;
 	}
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index 5833ef7..f0c6e11 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -271,7 +271,7 @@ struct kfd2kgd_calls {
 	void (*free_pasid)(unsigned int pasid);
 
 	int (*create_process_vm)(struct kgd_dev *kgd, void **vm,
-				 void **process_info);
+				 void **process_info, unsigned int pasid);
 	void (*destroy_process_vm)(struct kgd_dev *kgd, void *vm);
 
 	int (*create_process_gpumem)(struct kgd_dev *kgd, uint64_t va, size_t size, void *vm, struct kgd_mem **mem);
diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index acfe34e..9daf661 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -82,7 +82,8 @@ static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);
 static int alloc_pasid(unsigned int bits);
 static void free_pasid(unsigned int pasid);
 
-static int create_process_vm(struct kgd_dev *kgd, void **vm, void **info);
+static int create_process_vm(struct kgd_dev *kgd, void **vm, void **info,
+			     unsigned int pasid);
 static void destroy_process_vm(struct kgd_dev *kgd, void *vm);
 
 static uint32_t get_process_page_dir(void *vm);
@@ -447,7 +448,8 @@ void free_pasid(unsigned int pasid)
 /*
  * Creates a VM context for HSA process
  */
-static int create_process_vm(struct kgd_dev *kgd, void **vm, void **info)
+static int create_process_vm(struct kgd_dev *kgd, void **vm, void **info,
+			     unsigned int pasid)
 {
 	int ret;
 	struct radeon_vm *new_vm;
-- 
2.7.4



* [PATCH 7/9] drm/amdgpu: Add prescreening stage in IH processing
       [not found] ` <1503731949-22742-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (5 preceding siblings ...)
  2017-08-26  7:19   ` [PATCH 6/9] drm/amd: Set the PASID for KFD VMs Felix Kuehling
@ 2017-08-26  7:19   ` Felix Kuehling
  2017-08-26  7:19   ` [PATCH 8/9] lib: Closed hash table with low overhead Felix Kuehling
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 23+ messages in thread
From: Felix Kuehling @ 2017-08-26  7:19 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

Add a prescreening stage to IH processing to filter out high-frequency
interrupts that can be safely ignored.
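
A minimal user-space sketch of the idea, not the driver code: a prescreen
callback looks at the entry at the read pointer and either lets it through
or skips it, in which case the processing loop only re-masks the read
pointer and moves on. The names (demo_ih, demo_prescreen_iv,
demo_ih_process), the one-word ring entries and the "same source as the
previous entry" filter are all assumptions made for illustration; the real
filter criteria are left to the per-ASIC prescreen_iv implementations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_ENTRIES 8u			/* hypothetical ring size, power of two */
#define PTR_MASK (RING_ENTRIES - 1u)

struct demo_ih {
	uint32_t ring[RING_ENTRIES];	/* one word per interrupt vector */
	uint32_t rptr;			/* read index, counted in entries */
	uint32_t last_seen;		/* source of the last entry passed on */
	unsigned int dispatched;	/* entries that reached the dispatch step */
};

/* Returns true if the entry at rptr should be processed further.
 * When it returns false, it has already stepped over the entry.
 */
static bool demo_prescreen_iv(struct demo_ih *ih)
{
	uint32_t src = ih->ring[ih->rptr];

	if (src == ih->last_seen) {
		ih->rptr++;		/* drop the repeat; caller re-masks rptr */
		return false;
	}
	ih->last_seen = src;
	return true;
}

static void demo_ih_process(struct demo_ih *ih, uint32_t wptr)
{
	assert(wptr < RING_ENTRIES);

	while (ih->rptr != wptr) {
		/* Prescreening of high-frequency interrupts */
		if (!demo_prescreen_iv(ih)) {
			ih->rptr &= PTR_MASK;
			continue;
		}

		ih->dispatched++;	/* stand-in for decode and dispatch */
		ih->rptr = (ih->rptr + 1) & PTR_MASK;
	}
}
```

With a ring holding the sources 5, 5, 5, 7, 7, 5, only the first 5, the
first 7 and the final 5 reach the dispatch step; the other three entries
are dropped in the prescreen.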

Change-Id: Ia2f449a3eefbb3c134460b9e8ba9a6bfc1842223
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h     |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c  |  6 ++++++
 drivers/gpu/drm/amd/amdgpu/cik_ih.c     | 14 ++++++++++++++
 drivers/gpu/drm/amd/amdgpu/cz_ih.c      | 14 ++++++++++++++
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c | 14 ++++++++++++++
 drivers/gpu/drm/amd/amdgpu/si_ih.c      | 14 ++++++++++++++
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c   | 14 ++++++++++++++
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c  | 14 ++++++++++++++
 8 files changed, 92 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 6675498..7f26aac 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -335,6 +335,7 @@ struct amdgpu_gart_funcs {
 struct amdgpu_ih_funcs {
 	/* ring read/write ptr handling, called from interrupt context */
 	u32 (*get_wptr)(struct amdgpu_device *adev);
+	bool (*prescreen_iv)(struct amdgpu_device *adev);
 	void (*decode_iv)(struct amdgpu_device *adev,
 			  struct amdgpu_iv_entry *entry);
 	void (*set_rptr)(struct amdgpu_device *adev);
@@ -1774,6 +1775,7 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
 #define amdgpu_ring_init_cond_exec(r) (r)->funcs->init_cond_exec((r))
 #define amdgpu_ring_patch_cond_exec(r,o) (r)->funcs->patch_cond_exec((r),(o))
 #define amdgpu_ih_get_wptr(adev) (adev)->irq.ih_funcs->get_wptr((adev))
+#define amdgpu_ih_prescreen_iv(adev) (adev)->irq.ih_funcs->prescreen_iv((adev))
 #define amdgpu_ih_decode_iv(adev, iv) (adev)->irq.ih_funcs->decode_iv((adev), (iv))
 #define amdgpu_ih_set_rptr(adev) (adev)->irq.ih_funcs->set_rptr((adev))
 #define amdgpu_display_vblank_get_counter(adev, crtc) (adev)->mode_info.funcs->vblank_get_counter((adev), (crtc))
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
index 3ab4c65..c834a40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
@@ -169,6 +169,12 @@ int amdgpu_ih_process(struct amdgpu_device *adev)
 	while (adev->irq.ih.rptr != wptr) {
 		u32 ring_index = adev->irq.ih.rptr >> 2;
 
+		/* Prescreening of high-frequency interrupts */
+		if (!amdgpu_ih_prescreen_iv(adev)) {
+			adev->irq.ih.rptr &= adev->irq.ih.ptr_mask;
+			continue;
+		}
+
 		/* Before dispatching irq to IP blocks, send it to amdkfd */
 		amdgpu_amdkfd_interrupt(adev,
 				(const void *) &adev->irq.ih.ring[ring_index]);
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_ih.c b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
index c57c3f1..5c93da5 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_ih.c
@@ -228,6 +228,19 @@ static u32 cik_ih_get_wptr(struct amdgpu_device *adev)
  * [127:96] - reserved
  */
 
+/**
+ * cik_ih_prescreen_iv - prescreen an interrupt vector
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns true if the interrupt vector should be further processed.
+ */
+static bool cik_ih_prescreen_iv(struct amdgpu_device *adev)
+{
+	/* Process all interrupts */
+	return true;
+}
+
  /**
  * cik_ih_decode_iv - decode an interrupt vector
  *
@@ -433,6 +446,7 @@ static const struct amd_ip_funcs cik_ih_ip_funcs = {
 
 static const struct amdgpu_ih_funcs cik_ih_funcs = {
 	.get_wptr = cik_ih_get_wptr,
+	.prescreen_iv = cik_ih_prescreen_iv,
 	.decode_iv = cik_ih_decode_iv,
 	.set_rptr = cik_ih_set_rptr
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/cz_ih.c b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
index a5f294e..2a9452b 100644
--- a/drivers/gpu/drm/amd/amdgpu/cz_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/cz_ih.c
@@ -208,6 +208,19 @@ static u32 cz_ih_get_wptr(struct amdgpu_device *adev)
 }
 
 /**
+ * cz_ih_prescreen_iv - prescreen an interrupt vector
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns true if the interrupt vector should be further processed.
+ */
+static bool cz_ih_prescreen_iv(struct amdgpu_device *adev)
+{
+	/* Process all interrupts */
+	return true;
+}
+
+/**
  * cz_ih_decode_iv - decode an interrupt vector
  *
  * @adev: amdgpu_device pointer
@@ -414,6 +427,7 @@ static const struct amd_ip_funcs cz_ih_ip_funcs = {
 
 static const struct amdgpu_ih_funcs cz_ih_funcs = {
 	.get_wptr = cz_ih_get_wptr,
+	.prescreen_iv = cz_ih_prescreen_iv,
 	.decode_iv = cz_ih_decode_iv,
 	.set_rptr = cz_ih_set_rptr
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
index cb622ad..77ed599 100644
--- a/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/iceland_ih.c
@@ -208,6 +208,19 @@ static u32 iceland_ih_get_wptr(struct amdgpu_device *adev)
 }
 
 /**
+ * iceland_ih_prescreen_iv - prescreen an interrupt vector
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns true if the interrupt vector should be further processed.
+ */
+static bool iceland_ih_prescreen_iv(struct amdgpu_device *adev)
+{
+	/* Process all interrupts */
+	return true;
+}
+
+/**
  * iceland_ih_decode_iv - decode an interrupt vector
  *
  * @adev: amdgpu_device pointer
@@ -412,6 +425,7 @@ static const struct amd_ip_funcs iceland_ih_ip_funcs = {
 
 static const struct amdgpu_ih_funcs iceland_ih_funcs = {
 	.get_wptr = iceland_ih_get_wptr,
+	.prescreen_iv = iceland_ih_prescreen_iv,
 	.decode_iv = iceland_ih_decode_iv,
 	.set_rptr = iceland_ih_set_rptr
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/si_ih.c b/drivers/gpu/drm/amd/amdgpu/si_ih.c
index e660842..e7e4d2dd 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_ih.c
@@ -118,6 +118,19 @@ static u32 si_ih_get_wptr(struct amdgpu_device *adev)
 	return (wptr & adev->irq.ih.ptr_mask);
 }
 
+/**
+ * si_ih_prescreen_iv - prescreen an interrupt vector
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns true if the interrupt vector should be further processed.
+ */
+static bool si_ih_prescreen_iv(struct amdgpu_device *adev)
+{
+	/* Process all interrupts */
+	return true;
+}
+
 static void si_ih_decode_iv(struct amdgpu_device *adev,
 			     struct amdgpu_iv_entry *entry)
 {
@@ -288,6 +301,7 @@ static const struct amd_ip_funcs si_ih_ip_funcs = {
 
 static const struct amdgpu_ih_funcs si_ih_funcs = {
 	.get_wptr = si_ih_get_wptr,
+	.prescreen_iv = si_ih_prescreen_iv,
 	.decode_iv = si_ih_decode_iv,
 	.set_rptr = si_ih_set_rptr
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
index 3a5097a..bb2be6a 100644
--- a/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/tonga_ih.c
@@ -219,6 +219,19 @@ static u32 tonga_ih_get_wptr(struct amdgpu_device *adev)
 }
 
 /**
+ * tonga_ih_prescreen_iv - prescreen an interrupt vector
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns true if the interrupt vector should be further processed.
+ */
+static bool tonga_ih_prescreen_iv(struct amdgpu_device *adev)
+{
+	/* Process all interrupts */
+	return true;
+}
+
+/**
  * tonga_ih_decode_iv - decode an interrupt vector
  *
  * @adev: amdgpu_device pointer
@@ -478,6 +491,7 @@ static const struct amd_ip_funcs tonga_ih_ip_funcs = {
 
 static const struct amdgpu_ih_funcs tonga_ih_funcs = {
 	.get_wptr = tonga_ih_get_wptr,
+	.prescreen_iv = tonga_ih_prescreen_iv,
 	.decode_iv = tonga_ih_decode_iv,
 	.set_rptr = tonga_ih_set_rptr
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
index 67610f7..d14a2d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
@@ -227,6 +227,19 @@ static u32 vega10_ih_get_wptr(struct amdgpu_device *adev)
 }
 
 /**
+ * vega10_ih_prescreen_iv - prescreen an interrupt vector
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Returns true if the interrupt vector should be further processed.
+ */
+static bool vega10_ih_prescreen_iv(struct amdgpu_device *adev)
+{
+	/* TODO: Filter known pending page faults */
+	return true;
+}
+
+/**
  * vega10_ih_decode_iv - decode an interrupt vector
  *
  * @adev: amdgpu_device pointer
@@ -410,6 +423,7 @@ const struct amd_ip_funcs vega10_ih_ip_funcs = {
 
 static const struct amdgpu_ih_funcs vega10_ih_funcs = {
 	.get_wptr = vega10_ih_get_wptr,
+	.prescreen_iv = vega10_ih_prescreen_iv,
 	.decode_iv = vega10_ih_decode_iv,
 	.set_rptr = vega10_ih_set_rptr
 };
-- 
2.7.4



* [PATCH 8/9] lib: Closed hash table with low overhead
       [not found] ` <1503731949-22742-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (6 preceding siblings ...)
  2017-08-26  7:19   ` [PATCH 7/9] drm/amdgpu: Add prescreening stage in IH processing Felix Kuehling
@ 2017-08-26  7:19   ` Felix Kuehling
       [not found]     ` <1503731949-22742-9-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-26  7:19   ` [PATCH 9/9] drm/amdgpu: Track pending retry faults in IH and VM Felix Kuehling
  2017-08-27 22:22   ` [PATCH 0/9] WIP: Retry page fault handling for Vega10 Oded Gabbay
  9 siblings, 1 reply; 23+ messages in thread
From: Felix Kuehling @ 2017-08-26  7:19 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

This adds a closed hash table implementation with low memory and CPU
overhead. The API is inspired by kfifo.

Storing, retrieving and deleting data does not involve any dynamic
memory management, which makes it ideal for use in interrupt context.
Memory overhead per entry is the 32 or 64 bit hash key, two bits for
free/used tracking and whatever value size is stored in the table.
No list heads or pointers, therefore this data structure should be
quite cache-friendly, too.
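
As a rough worked example of the per-entry cost described above, the
following sketch adds up key bits, the two tracking bits and value bits.
The helper names are hypothetical, and the calculation ignores the
rounding of each array up to whole longs, so it slightly underestimates
the real allocation for small tables.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bits stored per entry: key, two free/used tracking bits, value. */
static size_t demo_chash_bits_per_entry(size_t key_size, size_t value_size)
{
	assert(key_size == 4 || key_size == 8);
	return key_size * CHAR_BIT + 2 + value_size * CHAR_BIT;
}

/* Approximate data size in bytes for a table of 2^bits entries. */
static size_t demo_chash_table_bytes(unsigned int bits, size_t key_size,
				     size_t value_size)
{
	size_t entries = (size_t)1 << bits;

	return entries * demo_chash_bits_per_entry(key_size, value_size) /
		CHAR_BIT;
}
```

For a table of 2^10 entries with 8-byte keys and 4-byte values this gives
98 bits per entry and 12544 bytes in total, of which only 256 bytes are
tracking bits.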

After entries are removed, free space maintenance is necessary. At
the same time, entries that had hash collisions on insertion can be
relocated to speed up future lookups. This is done incrementally and
opportunistically to avoid long stalls.

CPU overhead is very small as long as the table is no more than about
50% full, and it is still quite efficient up to 90% full. The less free
space there is in the table, the more likely collisions become, and the
more maintenance work is needed to keep free space available and
lookups efficient.
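
To illustrate the general technique, here is a minimal user-space sketch
of a closed hash table with linear probing and two flags per slot. It is
not the chash implementation: the names, the fixed 8-slot size, the hash
function and the return values are assumptions, and the incremental
relocation and free-space maintenance described above are left out. A
removed entry stays marked as occupied so that lookups keep probing past
it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_BITS 3u
#define DEMO_SIZE (1u << DEMO_BITS)
#define DEMO_MASK (DEMO_SIZE - 1u)

struct demo_table {
	uint32_t keys[DEMO_SIZE];
	bool occupied[DEMO_SIZE];	/* slot was used; probing continues past it */
	bool valid[DEMO_SIZE];		/* slot currently holds a live key */
};

/* Hypothetical multiplicative hash, keeping the top DEMO_BITS bits. */
static uint32_t demo_hash(uint32_t key)
{
	return (key * 0x61C88647u) >> (32 - DEMO_BITS);
}

/* Returns the slot index of key, or -1 if it is not in the table. */
static int demo_find(const struct demo_table *t, uint32_t key)
{
	uint32_t slot = demo_hash(key);

	for (uint32_t i = 0; i < DEMO_SIZE; i++, slot = (slot + 1) & DEMO_MASK) {
		if (!t->occupied[slot])
			return -1;	/* a never-used slot ends the probe */
		if (t->valid[slot] && t->keys[slot] == key)
			return (int)slot;
	}
	return -1;
}

/* Returns 1 if key already existed, 0 if it was added, -1 if the table is full. */
static int demo_add(struct demo_table *t, uint32_t key)
{
	uint32_t slot;

	if (demo_find(t, key) >= 0)
		return 1;

	slot = demo_hash(key);
	for (uint32_t i = 0; i < DEMO_SIZE; i++, slot = (slot + 1) & DEMO_MASK) {
		if (!t->valid[slot]) {	/* empty or previously removed: reuse */
			t->keys[slot] = key;
			t->occupied[slot] = true;
			t->valid[slot] = true;
			return 0;
		}
	}
	return -1;
}

/* Returns true if key was found and removed. */
static bool demo_remove(struct demo_table *t, uint32_t key)
{
	int slot = demo_find(t, key);

	if (slot < 0)
		return false;
	t->valid[slot] = false;		/* keep occupied set so probes continue */
	return true;
}
```

None of the three operations allocates memory, which is the property that
makes this kind of table usable from interrupt context. The cost of
skipping the maintenance step is that occupied-but-invalid slots pile up
and lengthen unsuccessful lookups.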

Change-Id: I86e72510941969e7523df11f9e68926f46ef7af1
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 include/linux/chash.h | 349 +++++++++++++++++++++++++++++++++
 lib/Kconfig           |   8 +
 lib/Makefile          |   2 +
 lib/chash.c           | 521 ++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 880 insertions(+)
 create mode 100644 include/linux/chash.h
 create mode 100644 lib/chash.c

diff --git a/include/linux/chash.h b/include/linux/chash.h
new file mode 100644
index 0000000..3835575
--- /dev/null
+++ b/include/linux/chash.h
@@ -0,0 +1,349 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef _LINUX_CHASH_H
+#define _LINUX_CHASH_H
+
+#include <linux/types.h>
+#include <linux/hash.h>
+#include <linux/bug.h>
+#include <linux/bitops.h>
+
+struct __chash_table {
+	u8 bits;
+	u8 key_size;
+	unsigned int value_size;
+	u32 size_mask;
+	unsigned long *occup_bitmap, *valid_bitmap;
+	union {
+		u32 *keys32;
+		u64 *keys64;
+	};
+	u8 *values;
+
+#define CHASH_STATS
+#ifdef CHASH_STATS
+	u64 total_add_calls, total_add_steps;
+	u64 total_find_calls, total_find_steps;
+	u64 total_not_find_calls, total_not_find_steps;
+	u64 total_relocations, total_relocation_distance;
+#endif
+};
+
+#define __CHASH_BITMAP_SIZE(bits)				\
+	(((1 << (bits)) + BITS_PER_LONG - 1) / BITS_PER_LONG)
+#define __CHASH_ARRAY_SIZE(bits, size)				\
+	((((size) << (bits)) + sizeof(long) - 1) / sizeof(long))
+
+#define __CHASH_DATA_SIZE(bits, key_size, value_size)	\
+	(__CHASH_BITMAP_SIZE(bits) * 2 +		\
+	 __CHASH_ARRAY_SIZE(bits, key_size) +		\
+	 __CHASH_ARRAY_SIZE(bits, value_size))
+
+#define STRUCT_CHASH_TABLE(bits, key_size, value_size)			\
+	struct {							\
+		struct __chash_table table;				\
+		unsigned long data[					\
+			__CHASH_DATA_SIZE(bits, key_size, value_size)];	\
+	}
+
+/**
+ * struct chash_table - Dynamically allocated closed hash table
+ *
+ * Use this struct for dynamically allocated hash tables (using
+ * chash_table_alloc and chash_table_free), where the size is
+ * determined at runtime.
+ */
+struct chash_table {
+	struct __chash_table table;
+	unsigned long *data;
+};
+
+/**
+ * DECLARE_CHASH_TABLE - macro to declare a closed hash table
+ * @table: name of the declared hash table
+ * @bts: Table size will be 2^@bts entries
+ * @key_sz: Size of hash keys in bytes, 4 or 8
+ * @val_sz: Size of data values in bytes, can be 0
+ *
+ * This declares the hash table variable with a static size.
+ *
+ * The closed hash table stores key-value pairs with low memory and
+ * lookup overhead. In operation it performs no dynamic memory
+ * management. The data being stored does not require any
+ * list_heads. The hash table performs best with small @val_sz and as
+ * long as some space (about 50%) is left free in the table. But the
+ * table can still work reasonably efficiently even when filled up to
+ * about 90%. If bigger data items need to be stored and looked up,
+ * store the pointer to it as value in the hash table.
+ *
+ * @val_sz may be 0. This can be useful when all the stored
+ * information is contained in the key itself and the fact that it is
+ * in the hash table (or not).
+ */
+#define DECLARE_CHASH_TABLE(table, bts, key_sz, val_sz)		\
+	STRUCT_CHASH_TABLE(bts, key_sz, val_sz) table
+
+#ifdef CHASH_STATS
+#define __CHASH_STATS_INIT(prefix) ,			\
+		prefix.total_add_calls = 0,		\
+		prefix.total_add_steps = 0,		\
+		prefix.total_find_calls = 0,		\
+		prefix.total_find_steps = 0,		\
+		prefix.total_not_find_calls = 0,	\
+		prefix.total_not_find_steps = 0,	\
+		prefix.total_relocations = 0,		\
+		prefix.total_relocation_distance = 0
+#else
+#define __CHASH_STATS_INIT(prefix)
+#endif
+
+#define __CHASH_TABLE_INIT(prefix, data, bts, key_sz, val_sz)	\
+	prefix.bits = (bts),					\
+		prefix.key_size = (key_sz),			\
+		prefix.value_size = (val_sz),			\
+		prefix.size_mask = ((1 << (bts)) - 1),		\
+		prefix.occup_bitmap = &data[0],			\
+		prefix.valid_bitmap = &data[			\
+			__CHASH_BITMAP_SIZE(bts)],		\
+		prefix.keys64 = (u64 *)&data[			\
+			__CHASH_BITMAP_SIZE(bts) * 2],		\
+		prefix.values = (u8 *)&data[			\
+			__CHASH_BITMAP_SIZE(bts) * 2 +		\
+			__CHASH_ARRAY_SIZE(bts, key_sz)]	\
+		__CHASH_STATS_INIT(prefix)
+
+/**
+ * DEFINE_CHASH_TABLE - macro to define and initialize a closed hash table
+ * @tbl: name of the declared hash table
+ * @bts: Table size will be 2^@bts entries
+ * @key_sz: Size of hash keys in bytes, 4 or 8
+ * @val_sz: Size of data values in bytes, can be 0
+ *
+ * Note: the macro can be used for global and local hash table variables.
+ */
+#define DEFINE_CHASH_TABLE(tbl, bts, key_sz, val_sz)			\
+	DECLARE_CHASH_TABLE(tbl, bts, key_sz, val_sz) = {		\
+		.table = {						\
+			__CHASH_TABLE_INIT(, (tbl).data, bts, key_sz, val_sz) \
+		},							\
+		.data = {0}						\
+	}
+
+/**
+ * INIT_CHASH_TABLE - Initialize a hash table declared by DECLARE_CHASH_TABLE
+ * @tbl: name of the declared hash table
+ * @bts: Table size will be 2^@bts entries
+ * @key_sz: Size of hash keys in bytes, 4 or 8
+ * @val_sz: Size of data values in bytes, can be 0
+ */
+#define INIT_CHASH_TABLE(tbl, bts, key_sz, val_sz)			\
+	__CHASH_TABLE_INIT(((tbl).table), (tbl).data, bts, key_sz, val_sz)
+
+int chash_table_alloc(struct chash_table *table, u8 bits, u8 key_size,
+		      unsigned int value_size, gfp_t gfp_mask);
+void chash_table_free(struct chash_table *table);
+
+/**
+ * chash_table_dump_stats - Dump statistics of a closed hash table
+ * @tbl: Pointer to the table structure
+ *
+ * Dumps some performance statistics of the table gathered in operation.
+ */
+#ifdef CHASH_STATS
+#define chash_table_dump_stats(tbl) __chash_table_dump_stats(&(*tbl).table)
+
+void __chash_table_dump_stats(struct __chash_table *table);
+#else
+#define chash_table_dump_stats(tbl)
+#endif
+
+/**
+ * chash_table_copy_in - Copy a new value into the hash table
+ * @tbl: Pointer to the table structure
+ * @key: Key of the entry to add or update
+ * @value: Pointer to value to copy, may be NULL
+ *
+ * If @key already has an entry, its value is replaced. Otherwise a
+ * new entry is added. If @value is NULL, the value is left unchanged
+ * or uninitialized. Returns 1 if an entry already existed, 0 if a new
+ * entry was added or %-ENOMEM if there was no free space in the
+ * table.
+ */
+#define chash_table_copy_in(tbl, key, value)			\
+	__chash_table_copy_in(&(*tbl).table, key, value)
+
+int __chash_table_copy_in(struct __chash_table *table, u64 key,
+			  const void *value);
+
+/**
+ * chash_table_copy_out - Copy a value out of the hash table
+ * @tbl: Pointer to the table structure
+ * @key: Key of the entry to find
+ * @value: Pointer to value to copy, may be NULL
+ *
+ * If @value is not NULL and the table has a non-0 value_size, the
+ * value at @key is copied to @value. Returns the slot index of the
+ * entry or %-EINVAL if @key was not found.
+ */
+#define chash_table_copy_out(tbl, key, value)			\
+	__chash_table_copy_out(&(*tbl).table, key, value, false)
+
+int __chash_table_copy_out(struct __chash_table *table, u64 key,
+			   void *value, bool remove);
+
+/**
+ * chash_table_remove - Remove an entry from the hash table
+ * @tbl: Pointer to the table structure
+ * @key: Key of the entry to find
+ * @value: Pointer to value to copy, may be NULL
+ *
+ * If @value is not NULL and the table has a non-0 value_size, the
+ * value at @key is copied to @value. The entry is removed from the
+ * table. Returns the slot index of the removed entry or %-EINVAL if
+ * @key was not found.
+ */
+#define chash_table_remove(tbl, key, value)			\
+	__chash_table_copy_out(&(*tbl).table, key, value, true)
+
+#define CHASH_SELF_TEST
+#ifdef CHASH_SELF_TEST
+int chash_self_test(u8 bits, u8 key_size, int min_fill, int max_fill,
+		    u64 iterations);
+#endif
+
+/*
+ * Low level iterator API used internally by the above functions.
+ */
+struct chash_iter {
+	struct __chash_table *table;
+	unsigned long mask;
+	int slot;
+};
+
+/**
+ * CHASH_ITER_INIT - Initialize a hash table iterator
+ * @table: Pointer to hash table to iterate over
+ * @s: Initial slot number
+ */
+#define CHASH_ITER_INIT(table, s) {			\
+		table,					\
+		1UL << ((s) & (BITS_PER_LONG - 1)),	\
+		s					\
+	}
+/**
+ * CHASH_ITER_SET - Set hash table iterator to new slot
+ * @iter: Iterator
+ * @s: Slot number
+ */
+#define CHASH_ITER_SET(iter, s)					\
+	(iter).mask = 1UL << ((s) & (BITS_PER_LONG - 1)),	\
+	(iter).slot = (s)
+/**
+ * CHASH_ITER_INC - Increment hash table iterator
+ * @iter: Iterator
+ *
+ * Wraps around at the end.
+ */
+#define CHASH_ITER_INC(iter) do {					\
+		(iter).mask = (iter).mask << 1 |			\
+			(iter).mask >> (BITS_PER_LONG - 1);		\
+		(iter).slot = ((iter).slot + 1) & (iter).table->size_mask; \
+	} while (0)
+
+static inline bool chash_iter_is_valid(const struct chash_iter iter)
+{
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	return !!(iter.table->valid_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &
+		  iter.mask);
+}
+static inline bool chash_iter_is_empty(const struct chash_iter iter)
+{
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	return !(iter.table->occup_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &
+		 iter.mask);
+}
+
+static inline void chash_iter_set_valid(const struct chash_iter iter)
+{
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	iter.table->valid_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] |= iter.mask;
+	iter.table->occup_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] |= iter.mask;
+}
+static inline void chash_iter_set_invalid(const struct chash_iter iter)
+{
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	iter.table->valid_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &= ~iter.mask;
+}
+static inline void chash_iter_set_empty(const struct chash_iter iter)
+{
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	iter.table->occup_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &= ~iter.mask;
+}
+
+static inline u32 chash_iter_key32(const struct chash_iter iter)
+{
+	BUG_ON(iter.table->key_size != 4);
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	return iter.table->keys32[iter.slot];
+}
+static inline u64 chash_iter_key64(const struct chash_iter iter)
+{
+	BUG_ON(iter.table->key_size != 8);
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	return iter.table->keys64[iter.slot];
+}
+static inline u64 chash_iter_key(const struct chash_iter iter)
+{
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	return (iter.table->key_size == 4) ?
+		iter.table->keys32[iter.slot] : iter.table->keys64[iter.slot];
+}
+
+static inline u32 chash_iter_hash32(const struct chash_iter iter)
+{
+	BUG_ON(iter.table->key_size != 4);
+	return hash_32(chash_iter_key32(iter), iter.table->bits);
+}
+
+static inline u32 chash_iter_hash64(const struct chash_iter iter)
+{
+	BUG_ON(iter.table->key_size != 8);
+	return hash_64(chash_iter_key64(iter), iter.table->bits);
+}
+
+static inline u32 chash_iter_hash(const struct chash_iter iter)
+{
+	return (iter.table->key_size == 4) ?
+		hash_32(chash_iter_key32(iter), iter.table->bits) :
+		hash_64(chash_iter_key64(iter), iter.table->bits);
+}
+
+static inline void *chash_iter_value(const struct chash_iter iter)
+{
+	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
+	return iter.table->values +
+		((unsigned long)iter.slot * iter.table->value_size);
+}
+
+#endif /* _LINUX_CHASH_H */
diff --git a/lib/Kconfig b/lib/Kconfig
index 0c8b78a..e5e1438 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -564,4 +564,12 @@ config PARMAN
 config PRIME_NUMBERS
 	tristate
 
+#
+# Closed hash table
+#
+config CHASH
+	tristate "Closed hash table"
+	help
+	 Closed hash table implementation with low memory and CPU overhead.
+
 endmenu
diff --git a/lib/Makefile b/lib/Makefile
index 0166fbc..a44ec9f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -243,3 +243,5 @@ UBSAN_SANITIZE_ubsan.o := n
 obj-$(CONFIG_SBITMAP) += sbitmap.o
 
 obj-$(CONFIG_PARMAN) += parman.o
+
+obj-$(CONFIG_CHASH) += chash.o
diff --git a/lib/chash.c b/lib/chash.c
new file mode 100644
index 0000000..08cdac1
--- /dev/null
+++ b/lib/chash.c
@@ -0,0 +1,521 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include <linux/types.h>
+#include <linux/hash.h>
+#include <linux/bug.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+#include <linux/chash.h>
+
+/**
+ * chash_table_alloc - Allocate closed hash table
+ * @table: Pointer to the table structure
+ * @bits: Table size will be 2^bits entries
+ * @key_size: Size of hash keys in bytes, 4 or 8
+ * @value_size: Size of data values in bytes, can be 0
+ */
+int chash_table_alloc(struct chash_table *table, u8 bits, u8 key_size,
+		      unsigned value_size, gfp_t gfp_mask)
+{
+	if (bits > 31)
+		return -EINVAL;
+
+	if (key_size != 4 && key_size != 8)
+		return -EINVAL;
+
+	table->data = kcalloc(__CHASH_DATA_SIZE(bits, key_size, value_size),
+		       sizeof(long), gfp_mask);
+	if (!table->data)
+		return -ENOMEM;
+
+	__CHASH_TABLE_INIT(table->table, table->data, bits, key_size, value_size);
+
+	return 0;
+}
+EXPORT_SYMBOL(chash_table_alloc);
+
+/**
+ * chash_table_free - Free closed hash table
+ * @table: Pointer to the table structure
+ */
+void chash_table_free(struct chash_table *table)
+{
+	kfree(table->data);
+}
+EXPORT_SYMBOL(chash_table_free);
+
+#ifdef CHASH_STATS
+
+#define DIV_FRAC(nom, denom, quot, frac, frac_digits) do {		\
+		(quot) = (nom) / (denom);				\
+		(frac) = ((nom) % (denom) * (frac_digits) +		\
+			  (denom) / 2) / (denom);			\
+	} while (0)
+
+void __chash_table_dump_stats(struct __chash_table *table)
+{
+	struct chash_iter iter = CHASH_ITER_INIT(table, 0);
+	u32 filled = 0, empty = 0, tombstones = 0;
+	u32 quot, frac;
+	u32 quot2, frac2;
+
+	do {
+		if (chash_iter_is_valid(iter))
+			filled++;
+		else if (chash_iter_is_empty(iter))
+			empty++;
+		else
+			tombstones++;
+		CHASH_ITER_INC(iter);
+	} while (iter.slot);
+
+	pr_info("Hash table key size %d, value size %d\n",
+		table->key_size, table->value_size);
+	pr_info("  Slots total/filled/empty/tombstones: %u / %u / %u / %u\n",
+		1 << table->bits, filled, empty, tombstones);
+	pr_info("  Avg number of search steps for:\n");
+	if (table->total_add_calls > 0)
+		DIV_FRAC(table->total_add_steps, table->total_add_calls,
+			 quot, frac, 1000);
+	else
+		quot = frac = 0;
+	pr_info("    Add           : %u.%03u\n", quot, frac);
+	if (table->total_find_calls > 0)
+		DIV_FRAC(table->total_find_steps,
+			 table->total_find_calls, quot, frac, 1000);
+	else
+		quot = frac = 0;
+	if (table->total_not_find_calls > 0)
+		DIV_FRAC(table->total_not_find_steps,
+			 table->total_not_find_calls, quot2, frac2, 1000);
+	else
+		quot2 = frac2 = 0;
+	pr_info("    Find(hit/miss): %u.%03u / %u.%03u\n", quot, frac, quot2, frac2);
+	if (table->total_relocations) {
+		u64 quot64;
+
+		DIV_FRAC(table->total_find_calls + table->total_not_find_calls,
+			 table->total_relocations, quot64, frac, 1000);
+		DIV_FRAC(table->total_relocation_distance,
+			 table->total_relocations, quot2, frac2, 1000);
+		pr_info("  Relocations (freq/avg.dist): 1:%llu.%03u / %u.%03u\n",
+			quot64, frac, quot2, frac2);
+	} else {
+		pr_info("  No relocations\n");
+	}
+}
+EXPORT_SYMBOL(__chash_table_dump_stats);
+
+#undef DIV_FRAC
+#endif
+
+#define CHASH_INC(table, a) ((a) = ((a) + 1) & (table)->size_mask)
+#define CHASH_ADD(table, a, b) (((a) + (b)) & (table)->size_mask)
+#define CHASH_SUB(table, a, b) (((a) - (b)) & (table)->size_mask)
+#define CHASH_IN_RANGE(table, slot, first, last) \
+	(CHASH_SUB(table, slot, first) <= CHASH_SUB(table, last, first))
+
+/*#define CHASH_DEBUG Uncomment this to enable verbose debug output*/
+#ifdef CHASH_DEBUG
+static void chash_table_dump(struct __chash_table *table)
+{
+	struct chash_iter iter = CHASH_ITER_INIT(table, 0);
+
+	do {
+		if ((iter.slot & 3) == 0)
+			pr_debug("%04x: ", iter.slot);
+
+		if (chash_iter_is_valid(iter))
+			pr_debug("[%016llx] ", chash_iter_key(iter));
+		else if (chash_iter_is_empty(iter))
+			pr_debug("[    <empty>     ] ");
+		else
+			pr_debug("[  <tombstone>   ] ");
+
+		if ((iter.slot & 3) == 3)
+			pr_debug("\n");
+
+		CHASH_ITER_INC(iter);
+	} while (iter.slot);
+
+	if ((iter.slot & 3) != 0)
+		pr_debug("\n");
+}
+
+static int chash_table_check(struct __chash_table *table)
+{
+	u32 hash;
+	struct chash_iter iter = CHASH_ITER_INIT(table, 0);
+	struct chash_iter cur = CHASH_ITER_INIT(table, 0);
+
+	do {
+		if (!chash_iter_is_valid(iter)) {
+			CHASH_ITER_INC(iter);
+			continue;
+		}
+
+		hash = chash_iter_hash(iter);
+		CHASH_ITER_SET(cur, hash);
+		while (cur.slot != iter.slot) {
+			if (chash_iter_is_empty(cur)) {
+				pr_err("Path to element at %x with hash %x broken at slot %x\n",
+				       iter.slot, hash, cur.slot);
+				chash_table_dump(table);
+				return -EINVAL;
+			}
+			CHASH_ITER_INC(cur);
+		}
+
+		CHASH_ITER_INC(iter);
+	} while (iter.slot);
+
+	return 0;
+}
+#endif
+
+static void chash_iter_relocate(struct chash_iter dst, struct chash_iter src)
+{
+	BUG_ON(src.table == dst.table && src.slot == dst.slot);
+	BUG_ON(src.table->key_size != dst.table->key_size);
+	BUG_ON(src.table->value_size != dst.table->value_size);
+
+	if (dst.table->key_size == 4)
+		dst.table->keys32[dst.slot] = src.table->keys32[src.slot];
+	else
+		dst.table->keys64[dst.slot] = src.table->keys64[src.slot];
+
+	if (dst.table->value_size)
+		memcpy(chash_iter_value(dst), chash_iter_value(src),
+		       dst.table->value_size);
+
+	chash_iter_set_valid(dst);
+	chash_iter_set_invalid(src);
+
+#ifdef CHASH_STATS
+	if (src.table == dst.table) {
+		dst.table->total_relocations++;
+		dst.table->total_relocation_distance +=
+			CHASH_SUB(dst.table, src.slot, dst.slot);
+	}
+#endif
+}
+
+/**
+ * chash_table_find - Helper for looking up a hash table entry
+ * @iter: Pointer to hash table iterator
+ * @key: Key of the entry to find
+ * @for_removal: set to true if the element will be removed soon
+ *
+ * Searches for an entry in the hash table with a given key. iter must
+ * be initialized by the caller to point to the home position of the
+ * hypothetical entry, i.e. it must be initialized with the hash table
+ * and the key's hash as the initial slot for the search.
+ *
+ * This function also does some local clean-up to speed up future
+ * look-ups by relocating entries to better slots and removing
+ * tombstones that are no longer needed.
+ *
+ * If @for_removal is true, the function avoids relocating the entry
+ * that is being returned.
+ *
+ * Returns 0 if the search is successful. In this case iter is updated
+ * to point to the found entry. Otherwise %-EINVAL is returned and the
+ * iter is updated to point to the first available slot for the given
+ * key. If the table is full, the slot is set to -1.
+ */
+static int chash_table_find(struct chash_iter *iter, u64 key,
+			    bool for_removal)
+{
+	u32 hash = iter->slot;
+	struct chash_iter first_redundant = CHASH_ITER_INIT(iter->table, -1);
+	int first_avail = (for_removal ? -2 : -1);
+
+	while (!chash_iter_is_valid(*iter) || chash_iter_key(*iter) != key) {
+		if (chash_iter_is_empty(*iter)) {
+			/* Found an empty slot, which ends the
+			 * search. Clean up any preceding tombstones
+			 * that are no longer needed because they lead
+			 * to nowhere
+			 */
+			if ((int)first_redundant.slot < 0)
+				goto not_found;
+			while (first_redundant.slot != iter->slot) {
+				if (!chash_iter_is_valid(first_redundant))
+					chash_iter_set_empty(first_redundant);
+				CHASH_ITER_INC(first_redundant);
+			}
+#ifdef CHASH_DEBUG
+			chash_table_check(iter->table);
+#endif
+			goto not_found;
+		} else if (!chash_iter_is_valid(*iter)) {
+			/* Found a tombstone. Remember it as candidate
+			 * for relocating the entry we're looking for
+			 * or for adding a new entry with the given key
+			 */
+			if (first_avail == -1)
+				first_avail = iter->slot;
+			/* Or mark it as the start of a series of
+			 * potentially redundant tombstones
+			 */
+			else if (first_redundant.slot == -1)
+				CHASH_ITER_SET(first_redundant, iter->slot);
+		} else if (first_redundant.slot >= 0) {
+			/* Found a valid, occupied slot with a
+			 * preceding series of tombstones. Relocate it
+			 * to a better position that no longer depends
+			 * on those tombstones
+			 */
+			u32 cur_hash = chash_iter_hash(*iter);
+
+			if (!CHASH_IN_RANGE(iter->table, cur_hash,
+					    first_redundant.slot + 1,
+					    iter->slot)) {
+				/* This entry has a hash at or before
+				 * the first tombstone we found. We
+				 * can relocate it to that tombstone
+				 * and advance to the next tombstone
+				 */
+				chash_iter_relocate(first_redundant, *iter);
+				do {
+					CHASH_ITER_INC(first_redundant);
+				} while (chash_iter_is_valid(first_redundant));
+			} else if (cur_hash != iter->slot) {
+				/* Relocate entry to its home position
+				 * or as close as possible so it no
+				 * longer depends on any preceding
+				 * tombstones
+				 */
+				struct chash_iter new_iter =
+					CHASH_ITER_INIT(iter->table, cur_hash);
+
+				while (new_iter.slot != iter->slot &&
+				       chash_iter_is_valid(new_iter))
+					CHASH_ITER_INC(new_iter);
+
+				if (new_iter.slot != iter->slot)
+					chash_iter_relocate(new_iter, *iter);
+			}
+		}
+
+		CHASH_ITER_INC(*iter);
+		if (iter->slot == hash) {
+			iter->slot = -1;
+			goto not_found;
+		}
+	}
+
+#ifdef CHASH_STATS
+	iter->table->total_find_calls++;
+	iter->table->total_find_steps +=
+		CHASH_SUB(iter->table, iter->slot, hash) + 1;
+#endif
+
+	if (first_avail >= 0) {
+		CHASH_ITER_SET(first_redundant, first_avail);
+		chash_iter_relocate(first_redundant, *iter);
+		iter->slot = first_redundant.slot;
+		iter->mask = first_redundant.mask;
+	}
+
+	return 0;
+
+not_found:
+#ifdef CHASH_STATS
+	iter->table->total_not_find_calls++;
+	iter->table->total_not_find_steps += (iter->slot < 0) ?
+		(1 << iter->table->bits) :
+		CHASH_SUB(iter->table, iter->slot, hash) + 1;
+#endif
+	if (first_avail >= 0)
+		CHASH_ITER_SET(*iter, first_avail);
+	return -EINVAL;
+}
+
+int __chash_table_copy_in(struct __chash_table *table, u64 key,
+			  const void *value)
+{
+	u32 hash = (table->key_size == 4) ?
+		hash_32(key, table->bits) : hash_64(key, table->bits);
+	struct chash_iter iter = CHASH_ITER_INIT(table, hash);
+	int r = chash_table_find(&iter, key, false);
+
+	/* Found an existing entry */
+	if (!r) {
+		if (value && table->value_size)
+			memcpy(chash_iter_value(iter), value,
+			       table->value_size);
+		return 1;
+	}
+
+	/* Is there a place to add a new entry? */
+	if (iter.slot < 0) {
+		pr_err("Hash table overflow\n");
+		return -ENOMEM;
+	}
+
+	chash_iter_set_valid(iter);
+
+	if (table->key_size == 4)
+		table->keys32[iter.slot] = key;
+	else
+		table->keys64[iter.slot] = key;
+	if (value && table->value_size)
+		memcpy(chash_iter_value(iter), value, table->value_size);
+
+#ifdef CHASH_STATS
+	table->total_add_calls++;
+	table->total_add_steps += CHASH_SUB(table, iter.slot, hash) + 1;
+#endif
+	return 0;
+}
+EXPORT_SYMBOL(__chash_table_copy_in);
+
+int __chash_table_copy_out(struct __chash_table *table, u64 key,
+			   void *value, bool remove)
+{
+	u32 hash = (table->key_size == 4) ?
+		hash_32(key, table->bits) : hash_64(key, table->bits);
+	struct chash_iter iter = CHASH_ITER_INIT(table, hash);
+	int r = chash_table_find(&iter, key, remove);
+
+	if (r < 0)
+		return r;
+
+	if (value && table->value_size)
+		memcpy(value, chash_iter_value(iter), table->value_size);
+
+	if (remove)
+		chash_iter_set_invalid(iter);
+
+	return iter.slot;
+}
+EXPORT_SYMBOL(__chash_table_copy_out);
+
+#ifdef CHASH_SELF_TEST
+/**
+ * chash_self_test - Run a self-test of the hash table implementation
+ * @bits: Table size will be 2^bits entries
+ * @key_size: Size of hash keys in bytes, 4 or 8
+ * @min_fill: Minimum fill level during the test
+ * @max_fill: Maximum fill level during the test
+ * @iterations: Number of test iterations
+ *
+ * The test adds and removes entries from a hash table, cycling the
+ * fill level between min_fill and max_fill entries. Also tests lookup
+ * and value retrieval.
+ */
+int chash_self_test(u8 bits, u8 key_size, int min_fill, int max_fill,
+		    u64 iterations)
+{
+	struct chash_table table;
+	int ret;
+	u64 add_count, rmv_count;
+	u64 value;
+
+	if (key_size == 4 && iterations > 0xffffffff)
+		return -EINVAL;
+	if (min_fill >= max_fill)
+		return -EINVAL;
+
+	ret = chash_table_alloc(&table, bits, key_size, sizeof(u64),
+				GFP_KERNEL);
+	if (ret) {
+		pr_err("chash_table_alloc failed: %d\n", ret);
+		return ret;
+	}
+
+	for (add_count = 0, rmv_count = 0; add_count < iterations;
+	     add_count++) {
+		/* When we hit the max_fill level, remove entries down
+		 * to min_fill */
+		if (add_count - rmv_count == max_fill) {
+			u64 find_count = rmv_count;
+
+			/* First try to find all entries that we're
+			 * about to remove, confirm their value, test
+			 * writing them back a second time. */
+			for (; add_count - find_count > min_fill;
+			     find_count++) {
+				ret = chash_table_copy_out(&table, find_count,
+							   &value);
+				if (ret < 0) {
+					pr_err("chash_table_copy_out failed: %d\n",
+					       ret);
+					goto out;
+				}
+				if (value != ~find_count) {
+					pr_err("Wrong value retrieved for key 0x%llx, expected 0x%llx got 0x%llx\n",
+					       find_count, ~find_count, value);
+#ifdef CHASH_DEBUG
+					chash_table_dump(&table.table);
+#endif
+					ret = -EFAULT;
+					goto out;
+				}
+				ret = chash_table_copy_in(&table, find_count,
+							  &value);
+				if (ret != 1) {
+					pr_err("copy_in second time returned %d, expected 1\n",
+					       ret);
+					ret = -EFAULT;
+					goto out;
+				}
+			}
+			/* Remove them until we hit min_fill level */
+			for (; add_count - rmv_count > min_fill; rmv_count++) {
+				ret = chash_table_remove(&table, rmv_count, NULL);
+				if (ret < 0) {
+					pr_err("chash_table_remove failed: %d\n",
+					       ret);
+					goto out;
+				}
+			}
+		}
+
+		/* Add a new value */
+		value = ~add_count;
+		ret = chash_table_copy_in(&table, add_count, &value);
+		if (ret != 0) {
+			pr_err("copy_in first time returned %d, expected 0\n",
+			       ret);
+			ret = -EFAULT;
+			goto out;
+		}
+	}
+
+#ifdef CHASH_STATS
+	chash_table_dump_stats(&table);
+#endif
+
+out:
+	chash_table_free(&table);
+	return ret;
+}
+EXPORT_SYMBOL(chash_self_test);
+
+#endif /* CHASH_SELF_TEST */
+
+MODULE_DESCRIPTION("Closed hash table");
+MODULE_LICENSE("GPL and additional rights");
-- 
2.7.4
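
A note on the tombstone scheme described in the chash_table_find comment
above. The following is a minimal user-space sketch, not the kernel code:
it uses linear probing with three slot states and shows why a removed
entry must leave a tombstone behind so that later entries on the same
probe path stay reachable. The names (model_table, model_find,
model_insert, model_remove) and the trivial hash (key & mask) are
assumptions made for illustration; the patch uses hash_32()/hash_64()
and additionally relocates entries and clears redundant tombstones,
which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_BITS 3
#define MODEL_SIZE (1u << MODEL_BITS)
#define MODEL_MASK (MODEL_SIZE - 1)

enum slot_state { SLOT_EMPTY, SLOT_VALID, SLOT_TOMBSTONE };

struct model_table {
	enum slot_state state[MODEL_SIZE];
	uint32_t key[MODEL_SIZE];
};

/* Hypothetical hash for the model only: the low bits of the key. */
static unsigned model_hash(uint32_t key)
{
	return key & MODEL_MASK;
}

/* Return the slot holding key, or -1. An empty slot ends the search,
 * a tombstone does not.
 */
static int model_find(const struct model_table *t, uint32_t key)
{
	unsigned slot = model_hash(key), n;

	for (n = 0; n < MODEL_SIZE; n++, slot = (slot + 1) & MODEL_MASK) {
		if (t->state[slot] == SLOT_EMPTY)
			return -1;
		if (t->state[slot] == SLOT_VALID && t->key[slot] == key)
			return (int)slot;
	}
	return -1;
}

/* Return 0 if key was added, 1 if it already existed, -1 if full.
 * The first tombstone or empty slot on the probe path is reused.
 */
static int model_insert(struct model_table *t, uint32_t key)
{
	unsigned slot = model_hash(key), n;
	int avail = -1;

	assert(t != NULL);
	for (n = 0; n < MODEL_SIZE; n++, slot = (slot + 1) & MODEL_MASK) {
		if (t->state[slot] == SLOT_VALID) {
			if (t->key[slot] == key)
				return 1;
			continue;
		}
		if (avail < 0)
			avail = (int)slot;
		if (t->state[slot] == SLOT_EMPTY)
			break;
	}
	if (avail < 0)
		return -1;
	t->state[avail] = SLOT_VALID;
	t->key[avail] = key;
	return 0;
}

/* Return the slot that held key, or -1. The slot becomes a tombstone
 * rather than empty, so probe paths running through it stay intact.
 */
static int model_remove(struct model_table *t, uint32_t key)
{
	int slot = model_find(t, key);

	if (slot < 0)
		return -1;
	t->state[slot] = SLOT_TOMBSTONE;
	return slot;
}
```

In this model keys 1 and 9 share home slot 1. After key 1 is removed,
key 9 in slot 2 is still found because the search skips the tombstone
instead of stopping at it. A later insert of another key with home
slot 1 reuses the tombstone.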

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 9/9] drm/amdgpu: Track pending retry faults in IH and VM
       [not found] ` <1503731949-22742-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (7 preceding siblings ...)
  2017-08-26  7:19   ` [PATCH 8/9] lib: Closed hash table with low overhead Felix Kuehling
@ 2017-08-26  7:19   ` Felix Kuehling
       [not found]     ` <1503731949-22742-10-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-27 22:22   ` [PATCH 0/9] WIP: Retry page fault handling for Vega10 Oded Gabbay
  9 siblings, 1 reply; 23+ messages in thread
From: Felix Kuehling @ 2017-08-26  7:19 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Felix Kuehling

IH tracks pending retry faults in a hash table for fast lookup in
interrupt context. Each VM has a short FIFO of pending VM faults for
processing in a bottom half.

The IH prescreening stage adds retry faults and filters out repeated
retry interrupts to minimize the impact of interrupt storms.

It's the VM's responsibility to remove pending faults once they are
handled. For now this is only done when the VM is destroyed.

Change-Id: I0cf15bfc767d06d9d5c3b13ad1ba7bc6aa520947
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
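
The filtering described above can be modeled in user space roughly as
follows. This is a sketch under stated assumptions, not the driver code:
pending_set, MODEL_MAX_PENDING, pending_add, pending_clear and
prescreen_model are hypothetical names, a small linear array stands in
for the chash table, and the per-VM FIFO and locking are omitted. The
key packing and the address decode follow AMDGPU_VM_FAULT() and the IV
dword handling in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_PENDING 8	/* hypothetical cap, stands in for the 50% fill limit */

struct pending_set {
	uint64_t keys[MODEL_MAX_PENDING];
	size_t count;
};

/* Same packing as AMDGPU_VM_FAULT(): PASID in bits 63:48, address below. */
static uint64_t fault_key(uint16_t pasid, uint64_t addr)
{
	return ((uint64_t)pasid << 48) | addr;
}

/* Rebuild the page-aligned fault address from IV dwords 4 and 5. */
static uint64_t fault_addr(uint32_t dw4, uint32_t dw5)
{
	return ((uint64_t)(dw5 & 0xf) << 44) | ((uint64_t)dw4 << 12);
}

/* Return 0 if the fault is new, 1 if it is already pending, -1 if the
 * set is at capacity. The capacity check comes first, as in
 * amdgpu_ih_add_fault.
 */
static int pending_add(struct pending_set *s, uint64_t key)
{
	size_t i;

	assert(s != NULL);
	if (s->count >= MODEL_MAX_PENDING)
		return -1;
	for (i = 0; i < s->count; i++)
		if (s->keys[i] == key)
			return 1;
	s->keys[s->count++] = key;
	return 0;
}

/* Return 0 if the key was removed, -1 if it was not pending. */
static int pending_clear(struct pending_set *s, uint64_t key)
{
	size_t i;

	for (i = 0; i < s->count; i++) {
		if (s->keys[i] == key) {
			s->keys[i] = s->keys[--s->count];
			return 0;
		}
	}
	return -1;
}

/* Return true if the interrupt should be processed normally, false if
 * it is a repeated retry fault (or the set is full) and can be dropped.
 */
static bool prescreen_model(struct pending_set *s, uint16_t pasid,
			    uint32_t dw4, uint32_t dw5)
{
	if (!(dw5 & 0x80))	/* not a retry fault */
		return true;
	if (!pasid)		/* cannot identify the faulting process */
		return true;
	return pending_add(s, fault_key(pasid, fault_addr(dw4, dw5))) == 0;
}
```

With an empty set, the first retry fault for a given PASID and page
passes and every repeat is dropped until pending_clear() is called for
that key, which is the behaviour the VM is expected to trigger once the
fault has been handled.
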
 drivers/gpu/drm/Kconfig                |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 76 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 12 ++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  7 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  7 +++
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 78 +++++++++++++++++++++++++++++++++-
 6 files changed, 180 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 78d7fc0..f8902dc 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -184,6 +184,7 @@ config DRM_AMDGPU
 	select BACKLIGHT_CLASS_DEVICE
 	select BACKLIGHT_LCD_SUPPORT
 	select INTERVAL_TREE
+	select CHASH
 	help
 	  Choose this option if you have a recent AMD Radeon graphics card.
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
index c834a40..d4d3579 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
@@ -196,3 +196,79 @@ int amdgpu_ih_process(struct amdgpu_device *adev)
 
 	return IRQ_HANDLED;
 }
+
+/**
+ * amdgpu_ih_add_fault - Add a page fault record
+ *
+ * @adev: amdgpu device pointer
+ * @key: 64-bit encoding of PASID and address
+ *
+ * This should be called when a retry page fault interrupt is
+ * received. If this is a new page fault, it will be added to a hash
+ * table. The return value indicates whether this is a new fault, or
+ * a fault that was already known and is already being handled.
+ *
+ * If there are too many pending page faults, this will fail. Retry
+ * interrupts should be ignored in this case until there is enough
+ * free space.
+ *
+ * Returns 0 if the fault was added, 1 if the fault was already known,
+ * -ENOSPC if there are too many pending faults.
+ */
+int amdgpu_ih_add_fault(struct amdgpu_device *adev, u64 key)
+{
+	unsigned long flags;
+	int r = -ENOSPC;
+
+	if (WARN_ON_ONCE(!adev->irq.ih.faults))
+		/* Should be allocated in <IP>_ih_sw_init on GPUs that
+		 * support retry faults and require retry filtering.
+		 */
+		return r;
+
+	spin_lock_irqsave(&adev->irq.ih.faults->lock, flags);
+
+	/* Only let the hash table fill up to 50% for best performance */
+	if (adev->irq.ih.faults->count > (1 << (AMDGPU_PAGEFAULT_HASH_BITS-1)))
+		goto unlock_out;
+
+	r = chash_table_copy_in(&adev->irq.ih.faults->hash, key, NULL);
+	if (!r)
+		adev->irq.ih.faults->count++;
+
+	/* chash_table_copy_in should never fail unless we're losing count */
+	WARN_ON_ONCE(r < 0);
+
+unlock_out:
+	spin_unlock_irqrestore(&adev->irq.ih.faults->lock, flags);
+	return r;
+}
+
+/**
+ * amdgpu_ih_clear_fault - Remove a page fault record
+ *
+ * @adev: amdgpu device pointer
+ * @key: 64-bit encoding of PASID and address
+ *
+ * This should be called when a page fault has been handled. Any
+ * future interrupt with this key will be processed as a new
+ * page fault.
+ */
+void amdgpu_ih_clear_fault(struct amdgpu_device *adev, u64 key)
+{
+	unsigned long flags;
+	int r;
+
+	if (!adev->irq.ih.faults)
+		return;
+
+	spin_lock_irqsave(&adev->irq.ih.faults->lock, flags);
+
+	r = chash_table_remove(&adev->irq.ih.faults->hash, key, NULL);
+	if (!WARN_ON_ONCE(r < 0)) {
+		adev->irq.ih.faults->count--;
+		WARN_ON_ONCE(adev->irq.ih.faults->count < 0);
+	}
+
+	spin_unlock_irqrestore(&adev->irq.ih.faults->lock, flags);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
index 3de8e74..d107f1b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
@@ -24,6 +24,8 @@
 #ifndef __AMDGPU_IH_H__
 #define __AMDGPU_IH_H__
 
+#include <linux/chash.h>
+
 struct amdgpu_device;
  /*
   * vega10+ IH clients
@@ -69,6 +71,13 @@ enum amdgpu_ih_clientid
 
 #define AMDGPU_IH_CLIENTID_LEGACY 0
 
+#define AMDGPU_PAGEFAULT_HASH_BITS 10
+struct amdgpu_retryfault_hashtable {
+	DECLARE_CHASH_TABLE(hash, AMDGPU_PAGEFAULT_HASH_BITS, 8, 0);
+	spinlock_t	lock;
+	int		count;
+};
+
 /*
  * R6xx+ IH ring
  */
@@ -87,6 +96,7 @@ struct amdgpu_ih_ring {
 	bool			use_doorbell;
 	bool			use_bus_addr;
 	dma_addr_t		rb_dma_addr; /* only used when use_bus_addr = true */
+	struct amdgpu_retryfault_hashtable *faults;
 };
 
 #define AMDGPU_IH_SRC_DATA_MAX_SIZE_DW 4
@@ -109,5 +119,7 @@ int amdgpu_ih_ring_init(struct amdgpu_device *adev, unsigned ring_size,
 			bool use_bus_addr);
 void amdgpu_ih_ring_fini(struct amdgpu_device *adev);
 int amdgpu_ih_process(struct amdgpu_device *adev);
+int amdgpu_ih_add_fault(struct amdgpu_device *adev, u64 key);
+void amdgpu_ih_clear_fault(struct amdgpu_device *adev, u64 key);
 
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index c635699..8bdabb3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2622,6 +2622,8 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 		vm->pasid = pasid;
 	}
 
+	INIT_KFIFO(vm->faults);
+
 	vm->vm_context = vm_context;
 	if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
 		mutex_lock(&id_mgr->lock);
@@ -2688,6 +2690,7 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 {
 	struct amdgpu_bo_va_mapping *mapping, *tmp;
 	bool prt_fini_needed = !!adev->gart.gart_funcs->set_prt;
+	u64 fault;
 	int i;
 
 	if (vm->vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
@@ -2710,6 +2713,10 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 		mutex_unlock(&id_mgr->lock);
 	}
 
+	/* Clear pending page faults from IH when the VM is destroyed */
+	while (kfifo_get(&vm->faults, &fault))
+		amdgpu_ih_clear_fault(adev, fault);
+
 	if (vm->pasid) {
 		unsigned long flags;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 692b05c..51d3e35 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -117,6 +117,10 @@ struct amdgpu_vm_pt {
 	unsigned		last_entry_used;
 };
 
+#define AMDGPU_VM_FAULT(pasid, addr) (((u64)(pasid) << 48) | (addr))
+#define AMDGPU_VM_FAULT_PASID(fault) ((u64)(fault) >> 48)
+#define AMDGPU_VM_FAULT_ADDR(fault)  ((u64)(fault) & 0xfffffffff000ULL)
+
 struct amdgpu_vm {
 	/* tree of virtual addresses mapped */
 	struct rb_root		va;
@@ -158,6 +162,9 @@ struct amdgpu_vm {
 
 	/* Whether this is a Compute or GFX Context */
 	int			vm_context;
+
+	/* Up to 16 pending page faults */
+	DECLARE_KFIFO(faults, u64, 16);
 };
 
 struct amdgpu_vm_id {
diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
index d14a2d5..ae2b84a 100644
--- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
@@ -235,8 +235,73 @@ static u32 vega10_ih_get_wptr(struct amdgpu_device *adev)
  */
 static bool vega10_ih_prescreen_iv(struct amdgpu_device *adev)
 {
-	/* TODO: Filter known pending page faults */
+	u32 ring_index = adev->irq.ih.rptr >> 2;
+	u32 dw0, dw3, dw4, dw5;
+	u16 pasid;
+	u64 addr, key;
+	struct amdgpu_vm *vm;
+	int r;
+
+	dw0 = le32_to_cpu(adev->irq.ih.ring[ring_index + 0]);
+	dw3 = le32_to_cpu(adev->irq.ih.ring[ring_index + 3]);
+	dw4 = le32_to_cpu(adev->irq.ih.ring[ring_index + 4]);
+	dw5 = le32_to_cpu(adev->irq.ih.ring[ring_index + 5]);
+
+	/* Filter retry page faults, let only the first one pass. If
+	 * there are too many outstanding faults, ignore them until
+	 * some faults get cleared.
+	 */
+	switch (dw0 & 0xff) {
+	case AMDGPU_IH_CLIENTID_VMC:
+	case AMDGPU_IH_CLIENTID_UTCL2:
+		break;
+	default:
+		/* Not a VM fault */
+		return true;
+	}
+
+	/* Not a retry fault */
+	if (!(dw5 & 0x80))
+		return true;
+
+	pasid = dw3 & 0xffff;
+	/* No PASID, can't identify faulting process */
+	if (!pasid)
+		return true;
+
+	addr = ((u64)(dw5 & 0xf) << 44) | ((u64)dw4 << 12);
+	key = AMDGPU_VM_FAULT(pasid, addr);
+	r = amdgpu_ih_add_fault(adev, key);
+
+	/* Hash table is full or the fault is already being processed,
+	 * ignore further page faults
+	 */
+	if (r != 0)
+		goto ignore_iv;
+
+	/* Track retry faults in per-VM fault FIFO. */
+	spin_lock(&adev->vm_manager.pasid_lock);
+	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
+	spin_unlock(&adev->vm_manager.pasid_lock);
+	if (WARN_ON_ONCE(!vm)) {
+		/* VM not found, process it normally */
+		amdgpu_ih_clear_fault(adev, key);
+		return true;
+	}
+	/* No locking required with single writer and single reader */
+	r = kfifo_put(&vm->faults, key);
+	if (!r) {
+		/* FIFO is full. Ignore it until there is space */
+		amdgpu_ih_clear_fault(adev, key);
+		goto ignore_iv;
+	}
+
+	/* It's the first fault for this address, process it normally */
 	return true;
+
+ignore_iv:
+	adev->irq.ih.rptr += 32;
+	return false;
 }
 
 /**
@@ -323,6 +388,14 @@ static int vega10_ih_sw_init(void *handle)
 	adev->irq.ih.use_doorbell = true;
 	adev->irq.ih.doorbell_index = AMDGPU_DOORBELL64_IH << 1;
 
+	adev->irq.ih.faults = kmalloc(sizeof(*adev->irq.ih.faults), GFP_KERNEL);
+	if (!adev->irq.ih.faults)
+		return -ENOMEM;
+	INIT_CHASH_TABLE(adev->irq.ih.faults->hash,
+			 AMDGPU_PAGEFAULT_HASH_BITS, 8, 0);
+	spin_lock_init(&adev->irq.ih.faults->lock);
+	adev->irq.ih.faults->count = 0;
+
 	r = amdgpu_irq_init(adev);
 
 	return r;
@@ -335,6 +408,9 @@ static int vega10_ih_sw_fini(void *handle)
 	amdgpu_irq_fini(adev);
 	amdgpu_ih_ring_fini(adev);
 
+	kfree(adev->irq.ih.faults);
+	adev->irq.ih.faults = NULL;
+
 	return 0;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/9] drm/amdgpu: Fix error handling in amdgpu_vm_init
       [not found]     ` <1503731949-22742-2-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-26 13:22       ` Christian König
  2017-08-28  2:51       ` zhoucm1
  1 sibling, 0 replies; 23+ messages in thread
From: Christian König @ 2017-08-26 13:22 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 26.08.2017 um 09:19 schrieb Felix Kuehling:
> Make sure vm->root.bo is not left reserved if amdgpu_bo_kmap fails.
>
> Change-Id: If3687b39a50b0ffe7f8be2ea6e927fa2ca0f9e45
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++------
>   1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index e57a72e..70d7632 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2556,14 +2556,11 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>   		goto error_free_root;
>   
>   	vm->last_eviction_counter = atomic64_read(&adev->num_evictions);
> -
> -	if (vm->use_cpu_for_update) {
> +	if (vm->use_cpu_for_update)
>   		r = amdgpu_bo_kmap(vm->root.bo, NULL);
> -		if (r)
> -			goto error_free_root;
> -	}
> -
>   	amdgpu_bo_unreserve(vm->root.bo);
> +	if (r)
> +		goto error_free_root;
>   
>   	vm->vm_context = vm_context;
>   	if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/9] drm/amdgpu: Add PASID management
       [not found]     ` <1503731949-22742-3-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-26 13:27       ` Christian König
       [not found]         ` <994b23cd-67b3-4498-2c2b-d4fc2ea68be7-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
  2017-08-28  3:06       ` zhoucm1
  1 sibling, 1 reply; 23+ messages in thread
From: Christian König @ 2017-08-26 13:27 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Am 26.08.2017 um 09:19 schrieb Felix Kuehling:
> Allows assigning a PASID to a VM for identifying VMs involved in page
> faults. The global PASID manager is also exported in the KFD
> interface so that AMDGPU and KFD can share the PASID space.
>
> PASIDs of different sizes can be requested. On APUs, the PASID size
> is determined by the capabilities of the IOMMU. So KFD must be able
> to allocate PASIDs in a smaller range.
>
> TODO:
> * Actually assign PASIDs to VMs
> * Update the PASID-VMID mapping registers during CS
>
> Change-Id: I88c9357a7c584f10e84b5607ac09eba77c833393
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>

The patch itself is Reviewed-by: Christian König <christian.koenig@amd.com>.

But I'm a bit confused, doesn't this stuff belong in the IOMMU code?

Regards,
Christian.

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |  2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |  2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            | 76 ++++++++++++++++++++++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            | 14 ++++-
>   drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |  6 ++
>   8 files changed, 101 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> index 3e28d2b..0807d52 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> @@ -188,6 +188,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>   	.get_local_mem_info = get_local_mem_info,
>   	.get_gpu_clock_counter = get_gpu_clock_counter,
>   	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
> +	.alloc_pasid = amdgpu_vm_alloc_pasid,
> +	.free_pasid = amdgpu_vm_free_pasid,
>   	.create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>   	.destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>   	.get_process_page_dir = amdgpu_amdkfd_gpuvm_get_process_page_dir,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> index 3b6b4d9..c20c000 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> @@ -162,6 +162,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>   	.get_local_mem_info = get_local_mem_info,
>   	.get_gpu_clock_counter = get_gpu_clock_counter,
>   	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
> +	.alloc_pasid = amdgpu_vm_alloc_pasid,
> +	.free_pasid = amdgpu_vm_free_pasid,
>   	.create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>   	.destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>   	.create_process_gpumem = create_process_gpumem,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> index 961369d..bb99c64 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> @@ -209,6 +209,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>   	.get_local_mem_info = get_local_mem_info,
>   	.get_gpu_clock_counter = get_gpu_clock_counter,
>   	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
> +	.alloc_pasid = amdgpu_vm_alloc_pasid,
> +	.free_pasid = amdgpu_vm_free_pasid,
>   	.create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>   	.destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>   	.create_process_gpumem = create_process_gpumem,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 35f7d77..462011c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -1397,7 +1397,7 @@ int amdgpu_amdkfd_gpuvm_create_process_vm(struct kgd_dev *kgd, void **vm,
>   		return -ENOMEM;
>   
>   	/* Initialize the VM context, allocate the page directory and zero it */
> -	ret = amdgpu_vm_init(adev, &new_vm->base, AMDGPU_VM_CONTEXT_COMPUTE);
> +	ret = amdgpu_vm_init(adev, &new_vm->base, AMDGPU_VM_CONTEXT_COMPUTE, 0);
>   	if (ret != 0) {
>   		pr_err("Failed init vm ret %d\n", ret);
>   		/* Undo everything related to the new VM context */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index e390c01..ba3dc4d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -825,7 +825,7 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
>   	}
>   
>   	r = amdgpu_vm_init(adev, &fpriv->vm,
> -			   AMDGPU_VM_CONTEXT_GFX);
> +			   AMDGPU_VM_CONTEXT_GFX, 0);
>   	if (r) {
>   		kfree(fpriv);
>   		goto out_suspend;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 70d7632..c635699 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -27,12 +27,59 @@
>    */
>   #include <linux/dma-fence-array.h>
>   #include <linux/interval_tree_generic.h>
> +#include <linux/idr.h>
>   #include <drm/drmP.h>
>   #include <drm/amdgpu_drm.h>
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   
>   /*
> + * PASID manager
> + *
> + * PASIDs are global address space identifiers that can be shared
> + * between the GPU, an IOMMU and the driver. VMs on different devices
> + * may use the same PASID if they share the same address
> + * space. Therefore PASIDs are allocated using a global IDA. VMs are
> + * looked up from the PASID per amdgpu_device.
> + */
> +static DEFINE_IDA(amdgpu_vm_pasid_ida);
> +
> +/**
> + * amdgpu_vm_alloc_pasid - Allocate a PASID
> + * @bits: Maximum width of the PASID in bits, must be at least 1
> + *
> + * Allocates a PASID of the given width while keeping smaller PASIDs
> + * available if possible.
> + *
> + * Returns a positive integer on success. Returns %-EINVAL if bits==0.
> + * Returns %-ENOSPC if no PASID was available. Returns %-ENOMEM on
> + * memory allocation failure.
> + */
> +int amdgpu_vm_alloc_pasid(unsigned int bits)
> +{
> +	int pasid = -EINVAL;
> +
> +	for (bits = min(bits, 31U); bits > 0; bits--) {
> +		pasid = ida_simple_get(&amdgpu_vm_pasid_ida,
> +				       1U << (bits - 1), 1U << bits,
> +				       GFP_KERNEL);
> +		if (pasid != -ENOSPC)
> +			break;
> +	}
> +
> +	return pasid;
> +}
> +
> +/**
> + * amdgpu_vm_free_pasid - Free a PASID
> + * @pasid: PASID to free
> + */
> +void amdgpu_vm_free_pasid(unsigned int pasid)
> +{
> +	ida_simple_remove(&amdgpu_vm_pasid_ida, pasid);
> +}
> +
> +/*
>    * GPUVM
>    * GPUVM is similar to the legacy gart on older asics, however
>    * rather than there being a single global gart table
> @@ -2482,7 +2529,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size, uint32_
>    * Init @vm fields.
>    */
>   int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
> -		   int vm_context)
> +		   int vm_context, unsigned int pasid)
>   {
>   	const unsigned align = min(AMDGPU_VM_PTB_ALIGN_SIZE,
>   		AMDGPU_VM_PTE_COUNT(adev) * 8);
> @@ -2562,6 +2609,19 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>   	if (r)
>   		goto error_free_root;
>   
> +	if (pasid) {
> +		unsigned long flags;
> +
> +		spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
> +		r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
> +			      GFP_ATOMIC);
> +		spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
> +		if (r < 0)
> +			goto error_free_root;
> +
> +		vm->pasid = pasid;
> +	}
> +
>   	vm->vm_context = vm_context;
>   	if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
>   		mutex_lock(&id_mgr->lock);
> @@ -2650,6 +2710,14 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>   		mutex_unlock(&id_mgr->lock);
>   	}
>   
> +	if (vm->pasid) {
> +		unsigned long flags;
> +
> +		spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
> +		idr_remove(&adev->vm_manager.pasid_idr, vm->pasid);
> +		spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
> +	}
> +
>   	amd_sched_entity_fini(vm->entity.sched, &vm->entity);
>   
>   	if (!RB_EMPTY_ROOT(&vm->va)) {
> @@ -2729,6 +2797,9 @@ void amdgpu_vm_manager_init(struct amdgpu_device *adev)
>   	adev->vm_manager.vm_update_mode = 0;
>   #endif
>   
> +	idr_init(&adev->vm_manager.pasid_idr);
> +	spin_lock_init(&adev->vm_manager.pasid_lock);
> +
>   	adev->vm_manager.n_compute_vms = 0;
>   }
>   
> @@ -2743,6 +2814,9 @@ void amdgpu_vm_manager_fini(struct amdgpu_device *adev)
>   {
>   	unsigned i, j;
>   
> +	WARN_ON(!idr_is_empty(&adev->vm_manager.pasid_idr));
> +	idr_destroy(&adev->vm_manager.pasid_idr);
> +
>   	for (i = 0; i < AMDGPU_MAX_VMHUBS; ++i) {
>   		struct amdgpu_vm_id_manager *id_mgr =
>   			&adev->vm_manager.id_mgr[i];
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 49e15d7..692b05c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -25,6 +25,7 @@
>   #define __AMDGPU_VM_H__
>   
>   #include <linux/rbtree.h>
> +#include <linux/idr.h>
>   
>   #include "gpu_scheduler.h"
>   #include "amdgpu_sync.h"
> @@ -143,8 +144,9 @@ struct amdgpu_vm {
>   	/* Scheduler entity for page table updates */
>   	struct amd_sched_entity	entity;
>   
> -	/* client id */
> +	/* client id and PASID (TODO: replace client_id with PASID) */
>   	u64                     client_id;
> +	unsigned int		pasid;
>   	/* dedicated to vm */
>   	struct amdgpu_vm_id	*reserved_vmid[AMDGPU_MAX_VMHUBS];
>   
> @@ -219,14 +221,22 @@ struct amdgpu_vm_manager {
>   	 */
>   	int					vm_update_mode;
>   
> +	/* PASID to VM mapping, will be used in interrupt context to
> +	 * look up VM of a page fault
> +	 */
> +	struct idr				pasid_idr;
> +	spinlock_t				pasid_lock;
> +
>   	/* Number of Compute VMs, used for detecting Compute activity */
>   	unsigned                                n_compute_vms;
>   };
>   
> +int amdgpu_vm_alloc_pasid(unsigned int bits);
> +void amdgpu_vm_free_pasid(unsigned int pasid);
>   void amdgpu_vm_manager_init(struct amdgpu_device *adev);
>   void amdgpu_vm_manager_fini(struct amdgpu_device *adev);
>   int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
> -		   int vm_context);
> +		   int vm_context, unsigned int pasid);
>   void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm);
>   void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
>   			 struct list_head *validated,
> diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> index e8027b3..5833ef7 100644
> --- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> +++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> @@ -188,6 +188,9 @@ struct tile_config {
>    *
>    * @get_max_engine_clock_in_mhz: Retrieves maximum GPU clock in MHz
>    *
> + * @alloc_pasid: Allocate a PASID
> + * @free_pasid: Free a PASID
> + *
>    * @program_sh_mem_settings: A function that should initiate the memory
>    * properties such as main aperture memory type (cache / non cached) and
>    * secondary aperture base address, size and memory type.
> @@ -264,6 +267,9 @@ struct kfd2kgd_calls {
>   
>   	uint32_t (*get_max_engine_clock_in_mhz)(struct kgd_dev *kgd);
>   
> +	int (*alloc_pasid)(unsigned int bits);
> +	void (*free_pasid)(unsigned int pasid);
> +
>   	int (*create_process_vm)(struct kgd_dev *kgd, void **vm,
>   				 void **process_info);
>   	void (*destroy_process_vm)(struct kgd_dev *kgd, void *vm);
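
For readers following the search order in amdgpu_vm_alloc_pasid() above, here is a minimal userspace sketch; it is not part of the patch. It assumes a toy bitmap in place of the kernel IDA, and toy_range_get(), toy_alloc_pasid(), toy_free_pasid() and the 64-ID limit are hypothetical names and values chosen for illustration. Error codes are collapsed to -1. The only thing it tries to show is that a request for N bits is served from the upper half of the N-bit range first, so that smaller PASIDs stay free for callers that can only handle narrow ones.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel IDA: a small table of used IDs. */
#define TOY_MAX_ID 64
static bool toy_used[TOY_MAX_ID];

/* Return the lowest free ID in [start, end), or -1 if that range is full. */
static int toy_range_get(unsigned int start, unsigned int end)
{
	for (unsigned int id = start; id < end && id < TOY_MAX_ID; id++) {
		if (!toy_used[id]) {
			toy_used[id] = true;
			return (int)id;
		}
	}
	return -1;
}

/*
 * Same search order as the patch: try [2^(bits-1), 2^bits) first, then
 * fall back to successively narrower ranges. ID 0 is never handed out.
 * Returns -1 if bits == 0 or if every range is exhausted.
 */
static int toy_alloc_pasid(unsigned int bits)
{
	int pasid = -1;

	if (bits > 6)
		bits = 6;	/* toy limit: all IDs are below 64 */

	for (; bits > 0; bits--) {
		pasid = toy_range_get(1U << (bits - 1), 1U << bits);
		if (pasid >= 0)
			break;
	}

	return pasid;
}

/* Give an ID back to the toy allocator. */
static void toy_free_pasid(unsigned int pasid)
{
	assert(pasid < TOY_MAX_ID);
	toy_used[pasid] = false;
}
```

Starting from an empty allocator, four 3-bit requests return 4, 5, 6 and 7. A fifth 3-bit request finds that range full and falls back to the 2-bit range, returning 2. A following 2-bit request then gets 3.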


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: [PATCH 3/9] drm/radeon: Add PASID manager for KFD
       [not found]     ` <1503731949-22742-4-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-26 13:27       ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2017-08-26 13:27 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 26.08.2017 at 09:19, Felix Kuehling wrote:
> Change-Id: I101a5ac8e0ebf0dbbe6dbd1f61cd8236d499c5b8
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/radeon/radeon_kfd.c | 30 ++++++++++++++++++++++++++++++
>   1 file changed, 30 insertions(+)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
> index 81fb94b..acfe34e 100644
> --- a/drivers/gpu/drm/radeon/radeon_kfd.c
> +++ b/drivers/gpu/drm/radeon/radeon_kfd.c
> @@ -79,6 +79,9 @@ static uint64_t get_gpu_clock_counter(struct kgd_dev *kgd);
>   
>   static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);
>   
> +static int alloc_pasid(unsigned int bits);
> +static void free_pasid(unsigned int pasid);
> +
>   static int create_process_vm(struct kgd_dev *kgd, void **vm, void **info);
>   static void destroy_process_vm(struct kgd_dev *kgd, void *vm);
>   
> @@ -161,6 +164,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>   	.get_local_mem_info = get_local_mem_info,
>   	.get_gpu_clock_counter = get_gpu_clock_counter,
>   	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
> +	.alloc_pasid = alloc_pasid,
> +	.free_pasid = free_pasid,
>   	.create_process_vm = create_process_vm,
>   	.destroy_process_vm = destroy_process_vm,
>   	.get_process_page_dir = get_process_page_dir,
> @@ -415,6 +420,31 @@ static uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd)
>   }
>   
>   /*
> + * PASID manager
> + */
> +static DEFINE_IDA(pasid_ida);
> +
> +int alloc_pasid(unsigned int bits)
> +{
> +	int pasid = -EINVAL;
> +
> +	for (bits = min(bits, 31U); bits > 0; bits--) {
> +		pasid = ida_simple_get(&pasid_ida,
> +				       1U << (bits - 1), 1U << bits,
> +				       GFP_KERNEL);
> +		if (pasid != -ENOSPC)
> +			break;
> +	}
> +
> +	return pasid;
> +}
> +
> +void free_pasid(unsigned int pasid)
> +{
> +	ida_simple_remove(&pasid_ida, pasid);
> +}
> +
> +/*
>    * Creates a VM context for HSA process
>    */
>   static int create_process_vm(struct kgd_dev *kgd, void **vm, void **info)



* Re: [PATCH 8/9] lib: Closed hash table with low overhead
       [not found]     ` <1503731949-22742-9-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-26 13:32       ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2017-08-26 13:32 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 26.08.2017 at 09:19, Felix Kuehling wrote:
> This adds a closed hash table implementation with low memory and CPU
> overhead. The API is inspired by kfifo.
>
> Storing, retrieving and deleting data does not involve any dynamic
> memory management, which makes it ideal for use in interrupt context.
> Memory overhead per entry is the 32 or 64 bit hash key, two bits for
> free/used tracking and whatever value size is stored in the table.
> No list heads or pointers, therefore this data structure should be
> quite cache-friendly, too.
>
> After entries are removed, free space maintenance is necessary. At
> the same time, entries that had hash collisions on insertion can be
> relocated to speed up future lookups. This is done incrementally and
> opportunistically to avoid long stalls.
>
> CPU overhead is very small as long as the table doesn't fill up more
> than about 50%. It's still quite efficient up to 90% full. The less
> free space is in the table, the more likely collisions get, and the
> more maintenance overhead is required to maintain free space and
> efficiency.
>
> Change-Id: I86e72510941969e7523df11f9e68926f46ef7af1
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>

Acked-by: Christian König <christian.koenig@amd.com>

Please also send this to the LKML, because it really needs a wider audience.

Regards,
Christian.
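
To make the slot states in the commit message concrete, here is a minimal userspace sketch of a closed (open-addressing) hash table with linear probing. It is not the code from this patch. The toy_* names, the fixed 8-slot size and the trivial hash are assumptions made for illustration, and it assumes that removal leaves a tombstone (occupied bit set, valid bit clear), which is what the two bitmaps in the header suggest. The incremental relocation and free space maintenance described above are deliberately left out.

```c
#include <stdint.h>

#define TOY_BITS 3
#define TOY_SIZE (1u << TOY_BITS)
#define TOY_MASK (TOY_SIZE - 1)

struct toy_chash {
	uint8_t occupied;	/* bit set: slot holds an entry or a tombstone */
	uint8_t valid;		/* bit set: slot holds a live entry */
	uint32_t keys[TOY_SIZE];
	uint32_t values[TOY_SIZE];
};

/* Deliberately trivial hash so that collisions are easy to provoke. */
static unsigned int toy_hash(uint32_t key)
{
	return key & TOY_MASK;
}

/* Return the slot of @key or -1. Probing stops at the first empty slot. */
static int toy_find(const struct toy_chash *t, uint32_t key)
{
	unsigned int slot = toy_hash(key);

	for (unsigned int i = 0; i < TOY_SIZE; i++) {
		uint8_t bit = (uint8_t)(1u << slot);

		if (!(t->occupied & bit))
			return -1;	/* never used: key cannot be further on */
		if ((t->valid & bit) && t->keys[slot] == key)
			return (int)slot;
		slot = (slot + 1) & TOY_MASK;	/* skip tombstones and other keys */
	}
	return -1;
}

/* Return 1 if @key was updated, 0 if it was added, -1 if the table is full. */
static int toy_copy_in(struct toy_chash *t, uint32_t key, uint32_t value)
{
	int found = toy_find(t, key);
	unsigned int slot;

	if (found >= 0) {
		t->values[found] = value;
		return 1;
	}

	slot = toy_hash(key);
	for (unsigned int i = 0; i < TOY_SIZE; i++) {
		uint8_t bit = (uint8_t)(1u << slot);

		if (!(t->valid & bit)) {	/* empty slot or tombstone: reuse */
			t->occupied |= bit;
			t->valid |= bit;
			t->keys[slot] = key;
			t->values[slot] = value;
			return 0;
		}
		slot = (slot + 1) & TOY_MASK;
	}
	return -1;
}

/* Remove @key, leaving a tombstone. Return the old slot or -1. */
static int toy_remove(struct toy_chash *t, uint32_t key, uint32_t *value)
{
	int slot = toy_find(t, key);

	if (slot < 0)
		return -1;
	if (value)
		*value = t->values[slot];
	t->valid &= (uint8_t)~(1u << slot);	/* occupied stays set */
	return slot;
}
```

With a zero-initialized table, keys 1 and 9 both hash to slot 1, so 9 lands in slot 2. Removing key 1 leaves a tombstone in slot 1, and a lookup of 9 still succeeds because probing continues past it. The longer such tombstone chains get, the more steps a lookup takes, which is why the real implementation needs the maintenance pass the commit message describes.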

> ---
>   include/linux/chash.h | 349 +++++++++++++++++++++++++++++++++
>   lib/Kconfig           |   8 +
>   lib/Makefile          |   2 +
>   lib/chash.c           | 521 ++++++++++++++++++++++++++++++++++++++++++++++++++
>   4 files changed, 880 insertions(+)
>   create mode 100644 include/linux/chash.h
>   create mode 100644 lib/chash.c
>
> diff --git a/include/linux/chash.h b/include/linux/chash.h
> new file mode 100644
> index 0000000..3835575
> --- /dev/null
> +++ b/include/linux/chash.h
> @@ -0,0 +1,349 @@
> +/*
> + * Copyright 2017 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + *
> + */
> +
> +#ifndef _LINUX_CHASH_H
> +#define _LINUX_CHASH_H
> +
> +#include <linux/types.h>
> +#include <linux/hash.h>
> +#include <linux/bug.h>
> +#include <linux/bitops.h>
> +
> +struct __chash_table {
> +	u8 bits;
> +	u8 key_size;
> +	unsigned int value_size;
> +	u32 size_mask;
> +	unsigned long *occup_bitmap, *valid_bitmap;
> +	union {
> +		u32 *keys32;
> +		u64 *keys64;
> +	};
> +	u8 *values;
> +
> +#define CHASH_STATS
> +#ifdef CHASH_STATS
> +	u64 total_add_calls, total_add_steps;
> +	u64 total_find_calls, total_find_steps;
> +	u64 total_not_find_calls, total_not_find_steps;
> +	u64 total_relocations, total_relocation_distance;
> +#endif
> +};
> +
> +#define __CHASH_BITMAP_SIZE(bits)				\
> +	(((1 << (bits)) + BITS_PER_LONG - 1) / BITS_PER_LONG)
> +#define __CHASH_ARRAY_SIZE(bits, size)				\
> +	((((size) << (bits)) + sizeof(long) - 1) / sizeof(long))
> +
> +#define __CHASH_DATA_SIZE(bits, key_size, value_size)	\
> +	(__CHASH_BITMAP_SIZE(bits) * 2 +		\
> +	 __CHASH_ARRAY_SIZE(bits, key_size) +		\
> +	 __CHASH_ARRAY_SIZE(bits, value_size))
> +
> +#define STRUCT_CHASH_TABLE(bits, key_size, value_size)			\
> +	struct {							\
> +		struct __chash_table table;				\
> +		unsigned long data[					\
> +			__CHASH_DATA_SIZE(bits, key_size, value_size)];	\
> +	}
> +
> +/**
> + * struct chash_table - Dynamically allocated closed hash table
> + *
> + * Use this struct for dynamically allocated hash tables (using
> + * chash_table_alloc and chash_table_free), where the size is
> + * determined at runtime.
> + */
> +struct chash_table {
> +	struct __chash_table table;
> +	unsigned long *data;
> +};
> +
> +/**
> + * DECLARE_CHASH_TABLE - macro to declare a closed hash table
> + * @table: name of the declared hash table
> + * @bts: Table size will be 2^@bts entries
> + * @key_sz: Size of hash keys in bytes, 4 or 8
> + * @val_sz: Size of data values in bytes, can be 0
> + *
> + * This declares the hash table variable with a static size.
> + *
> + * The closed hash table stores key-value pairs with low memory and
> + * lookup overhead. In operation it performs no dynamic memory
> + * management. The data being stored does not require any
> + * list_heads. The hash table performs best with small @val_sz and as
> + * long as some space (about 50%) is left free in the table. But the
> + * table can still work reasonably efficiently even when filled up to
> + * about 90%. If bigger data items need to be stored and looked up,
> + * store the pointer to it as value in the hash table.
> + *
> + * @val_sz may be 0. This can be useful when all the stored
> + * information is contained in the key itself and the fact that it is
> + * in the hash table (or not).
> + */
> +#define DECLARE_CHASH_TABLE(table, bts, key_sz, val_sz)		\
> +	STRUCT_CHASH_TABLE(bts, key_sz, val_sz) table
> +
> +#ifdef CHASH_STATS
> +#define __CHASH_STATS_INIT(prefix) ,			\
> +		prefix.total_add_calls = 0,		\
> +		prefix.total_add_steps = 0,		\
> +		prefix.total_find_calls = 0,		\
> +		prefix.total_find_steps = 0,		\
> +		prefix.total_not_find_calls = 0,	\
> +		prefix.total_not_find_steps = 0,	\
> +		prefix.total_relocations = 0,		\
> +		prefix.total_relocation_distance = 0
> +#else
> +#define __CHASH_STATS_INIT(prefix)
> +#endif
> +
> +#define __CHASH_TABLE_INIT(prefix, data, bts, key_sz, val_sz)	\
> +	prefix.bits = (bts),					\
> +		prefix.key_size = (key_sz),			\
> +		prefix.value_size = (val_sz),			\
> +		prefix.size_mask = ((1 << bts) - 1),		\
> +		prefix.occup_bitmap = &data[0],			\
> +		prefix.valid_bitmap = &data[			\
> +			__CHASH_BITMAP_SIZE(bts)],		\
> +		prefix.keys64 = (u64 *)&data[			\
> +			__CHASH_BITMAP_SIZE(bts) * 2],		\
> +		prefix.values = (u8 *)&data[			\
> +			__CHASH_BITMAP_SIZE(bts) * 2 +		\
> +			__CHASH_ARRAY_SIZE(bts, key_sz)]	\
> +		__CHASH_STATS_INIT(prefix)
> +
> +/**
> + * DEFINE_CHASH_TABLE - macro to define and initialize a closed hash table
> + * @tbl: name of the declared hash table
> + * @bts: Table size will be 2^@bts entries
> + * @key_sz: Size of hash keys in bytes, 4 or 8
> + * @val_sz: Size of data values in bytes, can be 0
> + *
> + * Note: the macro can be used for global and local hash table variables.
> + */
> +#define DEFINE_CHASH_TABLE(tbl, bts, key_sz, val_sz)			\
> +	DECLARE_CHASH_TABLE(tbl, bts, key_sz, val_sz) = {		\
> +		.table = {						\
> +			__CHASH_TABLE_INIT(, (tbl).data, bts, key_sz, val_sz) \
> +		},							\
> +		.data = {0}						\
> +	}
> +
> +/**
> + * INIT_CHASH_TABLE - Initialize a hash table declared by DECLARE_CHASH_TABLE
> + * @tbl: name of the declared hash table
> + * @bts: Table size will be 2^@bts entries
> + * @key_sz: Size of hash keys in bytes, 4 or 8
> + * @val_sz: Size of data values in bytes, can be 0
> + */
> +#define INIT_CHASH_TABLE(tbl, bts, key_sz, val_sz)			\
> +	__CHASH_TABLE_INIT(((tbl).table), (tbl).data, bts, key_sz, val_sz)
> +
> +int chash_table_alloc(struct chash_table *table, u8 bits, u8 key_size,
> +		      unsigned int value_size, gfp_t gfp_mask);
> +void chash_table_free(struct chash_table *table);
> +
> +/**
> + * chash_table_dump_stats - Dump statistics of a closed hash table
> + * @tbl: Pointer to the table structure
> + *
> + * Dumps some performance statistics of the table gathered in operation.
> + */
> +#ifdef CHASH_STATS
> +#define chash_table_dump_stats(tbl) __chash_table_dump_stats(&(*tbl).table)
> +
> +void __chash_table_dump_stats(struct __chash_table *table);
> +#else
> +#define chash_table_dump_stats(tbl)
> +#endif
> +
> +/**
> + * chash_table_copy_in - Copy a new value into the hash table
> + * @tbl: Pointer to the table structure
> + * @key: Key of the entry to add or update
> + * @value: Pointer to value to copy, may be NULL
> + *
> + * If @key already has an entry, its value is replaced. Otherwise a
> + * new entry is added. If @value is NULL, the value is left unchanged
> + * or uninitialized. Returns 1 if an entry already existed, 0 if a new
> + * entry was added or %-ENOMEM if there was no free space in the
> + * table.
> + */
> +#define chash_table_copy_in(tbl, key, value)			\
> +	__chash_table_copy_in(&(*tbl).table, key, value)
> +
> +int __chash_table_copy_in(struct __chash_table *table, u64 key,
> +			  const void *value);
> +
> +/**
> + * chash_table_copy_out - Copy a value out of the hash table
> + * @tbl: Pointer to the table structure
> + * @key: Key of the entry to find
> + * @value: Pointer to value to copy, may be NULL
> + *
> + * If @value is not NULL and the table has a non-0 value_size, the
> + * value at @key is copied to @value. Returns the slot index of the
> + * entry or %-EINVAL if @key was not found.
> + */
> +#define chash_table_copy_out(tbl, key, value)			\
> +	__chash_table_copy_out(&(*tbl).table, key, value, false)
> +
> +int __chash_table_copy_out(struct __chash_table *table, u64 key,
> +			   void *value, bool remove);
> +
> +/**
> + * chash_table_remove - Remove an entry from the hash table
> + * @tbl: Pointer to the table structure
> + * @key: Key of the entry to find
> + * @value: Pointer to value to copy, may be NULL
> + *
> + * If @value is not NULL and the table has a non-0 value_size, the
> + * value at @key is copied to @value. The entry is removed from the
> + * table. Returns the slot index of the removed entry or %-EINVAL if
> + * @key was not found.
> + */
> +#define chash_table_remove(tbl, key, value)			\
> +	__chash_table_copy_out(&(*tbl).table, key, value, true)
> +
> +#define CHASH_SELF_TEST
> +#ifdef CHASH_SELF_TEST
> +int chash_self_test(u8 bits, u8 key_size, int min_fill, int max_fill,
> +		    u64 iterations);
> +#endif
> +
> +/*
> + * Low level iterator API used internally by the above functions.
> + */
> +struct chash_iter {
> +	struct __chash_table *table;
> +	unsigned long mask;
> +	int slot;
> +};
> +
> +/**
> + * CHASH_ITER_INIT - Initialize a hash table iterator
> + * @table: Pointer to hash table to iterate over
> + * @s: Initial slot number
> + */
> +#define CHASH_ITER_INIT(table, s) {			\
> +		table,					\
> +		1UL << ((s) & (BITS_PER_LONG - 1)),	\
> +		s					\
> +	}
> +/**
> + * CHASH_ITER_SET - Set hash table iterator to new slot
> + * @iter: Iterator
> + * @s: Slot number
> + */
> +#define CHASH_ITER_SET(iter, s)					\
> +	(iter).mask = 1UL << ((s) & (BITS_PER_LONG - 1)),	\
> +	(iter).slot = (s)
> +/**
> + * CHASH_ITER_INC - Increment hash table iterator
> + * @iter: Iterator
> + *
> + * Wraps around at the end.
> + */
> +#define CHASH_ITER_INC(iter) do {					\
> +		(iter).mask = (iter).mask << 1 |			\
> +			(iter).mask >> (BITS_PER_LONG - 1);		\
> +		(iter).slot = ((iter).slot + 1) & (iter).table->size_mask; \
> +	} while (0)
> +
> +static inline bool chash_iter_is_valid(const struct chash_iter iter)
> +{
> +	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
> +	return !!(iter.table->valid_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &
> +		  iter.mask);
> +}
> +static inline bool chash_iter_is_empty(const struct chash_iter iter)
> +{
> +	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
> +	return !(iter.table->occup_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &
> +		 iter.mask);
> +}
> +
> +static inline void chash_iter_set_valid(const struct chash_iter iter)
> +{
> +	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
> +	iter.table->valid_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] |= iter.mask;
> +	iter.table->occup_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] |= iter.mask;
> +}
> +static inline void chash_iter_set_invalid(const struct chash_iter iter)
> +{
> +	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
> +	iter.table->valid_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &= ~iter.mask;
> +}
> +static inline void chash_iter_set_empty(const struct chash_iter iter)
> +{
> +	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
> +	iter.table->occup_bitmap[iter.slot >> _BITOPS_LONG_SHIFT] &= ~iter.mask;
> +}
> +
> +static inline u32 chash_iter_key32(const struct chash_iter iter)
> +{
> +	BUG_ON(iter.table->key_size != 4);
> +	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
> +	return iter.table->keys32[iter.slot];
> +}
> +static inline u64 chash_iter_key64(const struct chash_iter iter)
> +{
> +	BUG_ON(iter.table->key_size != 8);
> +	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
> +	return iter.table->keys64[iter.slot];
> +}
> +static inline u64 chash_iter_key(const struct chash_iter iter)
> +{
> +	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
> +	return (iter.table->key_size == 4) ?
> +		iter.table->keys32[iter.slot] : iter.table->keys64[iter.slot];
> +}
> +
> +static inline u32 chash_iter_hash32(const struct chash_iter iter)
> +{
> +	BUG_ON(iter.table->key_size != 4);
> +	return hash_32(chash_iter_key32(iter), iter.table->bits);
> +}
> +
> +static inline u32 chash_iter_hash64(const struct chash_iter iter)
> +{
> +	BUG_ON(iter.table->key_size != 8);
> +	return hash_64(chash_iter_key64(iter), iter.table->bits);
> +}
> +
> +static inline u32 chash_iter_hash(const struct chash_iter iter)
> +{
> +	return (iter.table->key_size == 4) ?
> +		hash_32(chash_iter_key32(iter), iter.table->bits) :
> +		hash_64(chash_iter_key64(iter), iter.table->bits);
> +}
> +
> +static inline void *chash_iter_value(const struct chash_iter iter)
> +{
> +	BUG_ON((unsigned)iter.slot >= (1 << iter.table->bits));
> +	return iter.table->values +
> +		((unsigned long)iter.slot * iter.table->value_size);
> +}
> +
> +#endif /* _LINUX_CHASH_H */
> diff --git a/lib/Kconfig b/lib/Kconfig
> index 0c8b78a..e5e1438 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -564,4 +564,12 @@ config PARMAN
>   config PRIME_NUMBERS
>   	tristate
>   
> +#
> +# Closed hash table
> +#
> +config CHASH
> +	tristate "Closed hash table"
> +	help
> +	 Closed hash table implementation with low memory and CPU overhead.
> +
>   endmenu
> diff --git a/lib/Makefile b/lib/Makefile
> index 0166fbc..a44ec9f 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -243,3 +243,5 @@ UBSAN_SANITIZE_ubsan.o := n
>   obj-$(CONFIG_SBITMAP) += sbitmap.o
>   
>   obj-$(CONFIG_PARMAN) += parman.o
> +
> +obj-$(CONFIG_CHASH) += chash.o
> diff --git a/lib/chash.c b/lib/chash.c
> new file mode 100644
> index 0000000..08cdac1
> --- /dev/null
> +++ b/lib/chash.c
> @@ -0,0 +1,521 @@
> +/*
> + * Copyright 2017 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + *
> + */
> +
> +#include <linux/types.h>
> +#include <linux/hash.h>
> +#include <linux/bug.h>
> +#include <linux/slab.h>
> +#include <linux/module.h>
> +#include <linux/chash.h>
> +
> +/**
> + * chash_table_alloc - Allocate closed hash table
> + * @table: Pointer to the table structure
> + * @bits: Table size will be 2^bits entries
> + * @key_size: Size of hash keys in bytes, 4 or 8
> + * @value_size: Size of data values in bytes, can be 0
> + */
> +int chash_table_alloc(struct chash_table *table, u8 bits, u8 key_size,
> +		      unsigned value_size, gfp_t gfp_mask)
> +{
> +	if (bits > 31)
> +		return -EINVAL;
> +
> +	if (key_size != 4 && key_size != 8)
> +		return -EINVAL;
> +
> +	table->data = kcalloc(__CHASH_DATA_SIZE(bits, key_size, value_size),
> +		       sizeof(long), gfp_mask);
> +	if (!table->data)
> +		return -ENOMEM;
> +
> +	__CHASH_TABLE_INIT(table->table, table->data, bits, key_size, value_size);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL(chash_table_alloc);
> +
> +/**
> + * chash_table_free - Free closed hash table
> + * @table: Pointer to the table structure
> + */
> +void chash_table_free(struct chash_table *table)
> +{
> +	kfree(table->data);
> +}
> +EXPORT_SYMBOL(chash_table_free);
> +
> +#ifdef CHASH_STATS
> +
> +#define DIV_FRAC(nom, denom, quot, frac, frac_digits) do {		\
> +		(quot) = (nom) / (denom);				\
> +		(frac) = ((nom) % (denom) * (frac_digits) +		\
> +			  (denom) / 2) / (denom);			\
> +	} while (0)
> +
> +void __chash_table_dump_stats(struct __chash_table *table)
> +{
> +	struct chash_iter iter = CHASH_ITER_INIT(table, 0);
> +	u32 filled = 0, empty = 0, tombstones = 0;
> +	u32 quot, frac;
> +	u32 quot2, frac2;
> +
> +	do {
> +		if (chash_iter_is_valid(iter))
> +			filled++;
> +		else if (chash_iter_is_empty(iter))
> +			empty++;
> +		else
> +			tombstones++;
> +		CHASH_ITER_INC(iter);
> +	} while (iter.slot);
> +
> +	pr_info("Hash table key size %d, value size %d\n",
> +		table->key_size, table->value_size);
> +	pr_info("  Slots total/filled/empty/tombstones: %u / %u / %u / %u\n",
> +		1 << table->bits, filled, empty, tombstones);
> +	pr_info("  Avg number of search steps for:\n");
> +	if (table->total_add_calls > 0)
> +		DIV_FRAC(table->total_add_steps, table->total_add_calls,
> +			 quot, frac, 1000);
> +	else
> +		quot = frac = 0;
> +	pr_info("    Add           : %u.%03u\n", quot, frac);
> +	if (table->total_find_calls > 0)
> +		DIV_FRAC(table->total_find_steps,
> +			 table->total_find_calls, quot, frac, 1000);
> +	else
> +		quot = frac = 0;
> +	if (table->total_not_find_calls > 0)
> +		DIV_FRAC(table->total_not_find_steps,
> +			 table->total_not_find_calls, quot2, frac2, 1000);
> +	else
> +		quot2 = frac2 = 0;
> +	pr_info("    Find(hit/miss): %u.%03u / %u.%03u\n", quot, frac, quot2, frac2);
> +	if (table->total_relocations) {
> +		u64 quot64;
> +
> +		DIV_FRAC(table->total_find_calls + table->total_not_find_calls,
> +			 table->total_relocations, quot64, frac, 1000);
> +		DIV_FRAC(table->total_relocation_distance,
> +			 table->total_relocations, quot2, frac2, 1000);
> +		pr_info("  Relocations (freq/avg.dist): 1:%llu.%03u / %u.%03u\n",
> +			quot64, frac, quot2, frac2);
> +	} else {
> +		pr_info("  No relocations\n");
> +	}
> +}
> +EXPORT_SYMBOL(__chash_table_dump_stats);
> +
> +#undef DIV_FRAC
> +#endif
> +
> +#define CHASH_INC(table, a) ((a) = ((a) + 1) & (table)->size_mask)
> +#define CHASH_ADD(table, a, b) (((a) + (b)) & (table)->size_mask)
> +#define CHASH_SUB(table, a, b) (((a) - (b)) & (table)->size_mask)
> +#define CHASH_IN_RANGE(table, slot, first, last) \
> +	(CHASH_SUB(table, slot, first) <= CHASH_SUB(table, last, first))
> +
> +/*#define CHASH_DEBUG Uncomment this to enable verbose debug output*/
> +#ifdef CHASH_DEBUG
> +static void chash_table_dump(struct __chash_table *table)
> +{
> +	struct chash_iter iter = CHASH_ITER_INIT(table, 0);
> +
> +	do {
> +		if ((iter.slot & 3) == 0)
> +			pr_debug("%04x: ", iter.slot);
> +
> +		if (chash_iter_is_valid(iter))
> +			pr_debug("[%016llx] ", chash_iter_key(iter));
> +		else if (chash_iter_is_empty(iter))
> +			pr_debug("[    <empty>     ] ");
> +		else
> +			pr_debug("[  <tombstone>   ] ");
> +
> +		if ((iter.slot & 3) == 3)
> +			pr_debug("\n");
> +
> +		CHASH_ITER_INC(iter);
> +	} while (iter.slot);
> +
> +	if ((iter.slot & 3) != 0)
> +		pr_debug("\n");
> +}
> +
> +static int chash_table_check(struct __chash_table *table)
> +{
> +	u32 hash;
> +	struct chash_iter iter = CHASH_ITER_INIT(table, 0);
> +	struct chash_iter cur = CHASH_ITER_INIT(table, 0);
> +
> +	do {
> +		if (!chash_iter_is_valid(iter)) {
> +			CHASH_ITER_INC(iter);
> +			continue;
> +		}
> +
> +		hash = chash_iter_hash(iter);
> +		CHASH_ITER_SET(cur, hash);
> +		while (cur.slot != iter.slot) {
> +			if (chash_iter_is_empty(cur)) {
> +				pr_err("Path to element at %x with hash %x broken at slot %x\n",
> +				       iter.slot, hash, cur.slot);
> +				chash_table_dump(table);
> +				return -EINVAL;
> +			}
> +			CHASH_ITER_INC(cur);
> +		}
> +
> +		CHASH_ITER_INC(iter);
> +	} while (iter.slot);
> +
> +	return 0;
> +}
> +#endif
> +
> +static void chash_iter_relocate(struct chash_iter dst, struct chash_iter src)
> +{
> +	BUG_ON(src.table == dst.table && src.slot == dst.slot);
> +	BUG_ON(src.table->key_size != dst.table->key_size);
> +	BUG_ON(src.table->value_size != dst.table->value_size);
> +
> +	if (dst.table->key_size == 4)
> +		dst.table->keys32[dst.slot] = src.table->keys32[src.slot];
> +	else
> +		dst.table->keys64[dst.slot] = src.table->keys64[src.slot];
> +
> +	if (dst.table->value_size)
> +		memcpy(chash_iter_value(dst), chash_iter_value(src),
> +		       dst.table->value_size);
> +
> +	chash_iter_set_valid(dst);
> +	chash_iter_set_invalid(src);
> +
> +#ifdef CHASH_STATS
> +	if (src.table == dst.table) {
> +		dst.table->total_relocations++;
> +		dst.table->total_relocation_distance +=
> +			CHASH_SUB(dst.table, src.slot, dst.slot);
> +	}
> +#endif
> +}
> +
> +/**
> + * chash_table_find - Helper for looking up a hash table entry
> + * @iter: Pointer to hash table iterator
> + * @key: Key of the entry to find
> + * @for_removal: set to true if the element will be removed soon
> + *
> + * Searches for an entry in the hash table with a given key. iter must
> + * be initialized by the caller to point to the home position of the
> + * hypothetical entry, i.e. it must be initialized with the hash table
> + * and the key's hash as the initial slot for the search.
> + *
> + * This function also does some local clean-up to speed up future
> + * look-ups by relocating entries to better slots and removing
> + * tombstones that are no longer needed.
> + *
> + * If @for_removal is true, the function avoids relocating the entry
> + * that is being returned.
> + *
> + * Returns 0 if the search is successful. In this case iter is updated
> + * to point to the found entry. Otherwise %-EINVAL is returned and the
> + * iter is updated to point to the first available slot for the given
> + * key. If the table is full, the slot is set to -1.
> + */
> +static int chash_table_find(struct chash_iter *iter, u64 key,
> +			    bool for_removal)
> +{
> +	u32 hash = iter->slot;
> +	struct chash_iter first_redundant = CHASH_ITER_INIT(iter->table, -1);
> +	int first_avail = (for_removal ? -2 : -1);
> +
> +	while (!chash_iter_is_valid(*iter) || chash_iter_key(*iter) != key) {
> +		if (chash_iter_is_empty(*iter)) {
> +			/* Found an empty slot, which ends the
> +			 * search. Clean up any preceding tombstones
> +			 * that are no longer needed because they lead
> +			 * to nowhere
> +			 */
> +			if ((int)first_redundant.slot < 0)
> +				goto not_found;
> +			while (first_redundant.slot != iter->slot) {
> +				if (!chash_iter_is_valid(first_redundant))
> +					chash_iter_set_empty(first_redundant);
> +				CHASH_ITER_INC(first_redundant);
> +			}
> +#ifdef CHASH_DEBUG
> +			chash_table_check(iter->table);
> +#endif
> +			goto not_found;
> +		} else if (!chash_iter_is_valid(*iter)) {
> +			/* Found a tombstone. Remember it as candidate
> +			 * for relocating the entry we're looking for
> +			 * or for adding a new entry with the given key
> +			 */
> +			if (first_avail == -1)
> +				first_avail = iter->slot;
> +			/* Or mark it as the start of a series of
> +			 * potentially redundant tombstones
> +			 */
> +			else if (first_redundant.slot == -1)
> +				CHASH_ITER_SET(first_redundant, iter->slot);
> +		} else if (first_redundant.slot >= 0) {
> +			/* Found a valid, occupied slot with a
> +			 * preceding series of tombstones. Relocate it
> +			 * to a better position that no longer depends
> +			 * on those tombstones
> +			 */
> +			u32 cur_hash = chash_iter_hash(*iter);
> +
> +			if (!CHASH_IN_RANGE(iter->table, cur_hash,
> +					    first_redundant.slot + 1,
> +					    iter->slot)) {
> +				/* This entry has a hash at or before
> +				 * the first tombstone we found. We
> +				 * can relocate it to that tombstone
> +				 * and advance to the next tombstone
> +				 */
> +				chash_iter_relocate(first_redundant, *iter);
> +				do {
> +					CHASH_ITER_INC(first_redundant);
> +				} while (chash_iter_is_valid(first_redundant));
> +			} else if (cur_hash != iter->slot) {
> +				/* Relocate entry to its home position
> +				 * or as close as possible so it no
> +				 * longer depends on any preceding
> +				 * tombstones
> +				 */
> +				struct chash_iter new_iter =
> +					CHASH_ITER_INIT(iter->table, cur_hash);
> +
> +				while (new_iter.slot != iter->slot &&
> +				       chash_iter_is_valid(new_iter))
> +					CHASH_ITER_INC(new_iter);
> +
> +				if (new_iter.slot != iter->slot)
> +					chash_iter_relocate(new_iter, *iter);
> +			}
> +		}
> +
> +		CHASH_ITER_INC(*iter);
> +		if (iter->slot == hash) {
> +			iter->slot = -1;
> +			goto not_found;
> +		}
> +	}
> +
> +#ifdef CHASH_STATS
> +	iter->table->total_find_calls++;
> +	iter->table->total_find_steps +=
> +		CHASH_SUB(iter->table, iter->slot, hash) + 1;
> +#endif
> +
> +	if (first_avail >= 0) {
> +		CHASH_ITER_SET(first_redundant, first_avail);
> +		chash_iter_relocate(first_redundant, *iter);
> +		iter->slot = first_redundant.slot;
> +		iter->mask = first_redundant.mask;
> +	}
> +
> +	return 0;
> +
> +not_found:
> +#ifdef CHASH_STATS
> +	iter->table->total_not_find_calls++;
> +	iter->table->total_not_find_steps += (iter->slot < 0) ?
> +		(1 << iter->table->bits) :
> +		CHASH_SUB(iter->table, iter->slot, hash) + 1;
> +#endif
> +	if (first_avail >= 0)
> +		CHASH_ITER_SET(*iter, first_avail);
> +	return -EINVAL;
> +}
> +
> +int __chash_table_copy_in(struct __chash_table *table, u64 key,
> +			  const void *value)
> +{
> +	u32 hash = (table->key_size == 4) ?
> +		hash_32(key, table->bits) : hash_64(key, table->bits);
> +	struct chash_iter iter = CHASH_ITER_INIT(table, hash);
> +	int r = chash_table_find(&iter, key, false);
> +
> +	/* Found an existing entry */
> +	if (!r) {
> +		if (value && table->value_size)
> +			memcpy(chash_iter_value(iter), value,
> +			       table->value_size);
> +		return 1;
> +	}
> +
> +	/* Is there a place to add a new entry? */
> +	if (iter.slot < 0) {
> +		pr_err("Hash table overflow\n");
> +		return -ENOMEM;
> +	}
> +
> +	chash_iter_set_valid(iter);
> +
> +	if (table->key_size == 4)
> +		table->keys32[iter.slot] = key;
> +	else
> +		table->keys64[iter.slot] = key;
> +	if (value && table->value_size)
> +		memcpy(chash_iter_value(iter), value, table->value_size);
> +
> +#ifdef CHASH_STATS
> +	table->total_add_calls++;
> +	table->total_add_steps += CHASH_SUB(table, iter.slot, hash) + 1;
> +#endif
> +	return 0;
> +}
> +EXPORT_SYMBOL(__chash_table_copy_in);
> +
> +int __chash_table_copy_out(struct __chash_table *table, u64 key,
> +			   void *value, bool remove)
> +{
> +	u32 hash = (table->key_size == 4) ?
> +		hash_32(key, table->bits) : hash_64(key, table->bits);
> +	struct chash_iter iter = CHASH_ITER_INIT(table, hash);
> +	int r = chash_table_find(&iter, key, remove);
> +
> +	if (r < 0)
> +		return r;
> +
> +	if (value && table->value_size)
> +		memcpy(value, chash_iter_value(iter), table->value_size);
> +
> +	if (remove)
> +		chash_iter_set_invalid(iter);
> +
> +	return iter.slot;
> +}
> +EXPORT_SYMBOL(__chash_table_copy_out);
> +
> +#ifdef CHASH_SELF_TEST
> +/**
> + * chash_self_test - Run a self-test of the hash table implementation
> + * @bits: Table size will be 2^bits entries
> + * @key_size: Size of hash keys in bytes, 4 or 8
> + * @min_fill: Minimum fill level during the test
> + * @max_fill: Maximum fill level during the test
> + * @iterations: Number of test iterations
> + *
> + * The test adds and removes entries from a hash table, cycling the
> + * fill level between min_fill and max_fill entries. Also tests lookup
> + * and value retrieval.
> + */
> +int chash_self_test(u8 bits, u8 key_size, int min_fill, int max_fill,
> +		    u64 iterations)
> +{
> +	struct chash_table table;
> +	int ret;
> +	u64 add_count, rmv_count;
> +	u64 value;
> +
> +	if (key_size == 4 && iterations > 0xffffffff)
> +		return -EINVAL;
> +	if (min_fill >= max_fill)
> +		return -EINVAL;
> +
> +	ret = chash_table_alloc(&table, bits, key_size, sizeof(u64),
> +				GFP_KERNEL);
> +	if (ret) {
> +		pr_err("chash_table_alloc failed: %d\n", ret);
> +		return ret;
> +	}
> +
> +	for (add_count = 0, rmv_count = 0; add_count < iterations;
> +	     add_count++) {
> +		/* When we hit the max_fill level, remove entries down
> +		 * to min_fill */
> +		if (add_count - rmv_count == max_fill) {
> +			u64 find_count = rmv_count;
> +
> +			/* First try to find all entries that we're
> +			 * about to remove, confirm their value, test
> +			 * writing them back a second time. */
> +			for (; add_count - find_count > min_fill;
> +			     find_count++) {
> +				ret = chash_table_copy_out(&table, find_count,
> +							   &value);
> +				if (ret < 0) {
> +					pr_err("chash_table_copy_out failed: %d\n",
> +					       ret);
> +					goto out;
> +				}
> +				if (value != ~find_count) {
> +					pr_err("Wrong value retrieved for key 0x%llx, expected 0x%llx got 0x%llx\n",
> +					       find_count, ~find_count, value);
> +#ifdef CHASH_DEBUG
> +					chash_table_dump(&table.table);
> +#endif
> +					ret = -EFAULT;
> +					goto out;
> +				}
> +				ret = chash_table_copy_in(&table, find_count,
> +							  &value);
> +				if (ret != 1) {
> +					pr_err("copy_in second time returned %d, expected 1\n",
> +					       ret);
> +					ret = -EFAULT;
> +					goto out;
> +				}
> +			}
> +			/* Remove them until we hit min_fill level */
> +			for (; add_count - rmv_count > min_fill; rmv_count++) {
> +				ret = chash_table_remove(&table, rmv_count, NULL);
> +				if (ret < 0) {
> +					pr_err("chash_table_remove failed: %d\n",
> +					       ret);
> +					goto out;
> +				}
> +			}
> +		}
> +
> +		/* Add a new value */
> +		value = ~add_count;
> +		ret = chash_table_copy_in(&table, add_count, &value);
> +		if (ret != 0) {
> +			pr_err("copy_in first time returned %d, expected 0\n",
> +			       ret);
> +			ret = -EFAULT;
> +			goto out;
> +		}
> +	}
> +
> +#ifdef CHASH_STATS
> +	chash_table_dump_stats(&table);
> +#endif
> +
> +out:
> +	chash_table_free(&table);
> +	return ret;
> +}
> +EXPORT_SYMBOL(chash_self_test);
> +
> +#endif /* CHASH_SELF_TEST */
> +
> +MODULE_DESCRIPTION("Closed hash table");
> +MODULE_LICENSE("GPL and additional rights");
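
For readers less familiar with closed (open-addressing) hash tables, the
core idea in the patch above, linear probing with tombstones, can be
sketched in user-space C. This is a minimal illustration and not the
chash implementation: the th_* names, the fixed 8-slot table and the
trivial masking hash are all assumptions made for the example, and the
sketch leaves out the relocation and tombstone clean-up that
chash_table_find performs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TH_BITS 3
#define TH_SIZE (1u << TH_BITS)
#define TH_MASK (TH_SIZE - 1)

enum th_state { TH_EMPTY = 0, TH_VALID, TH_TOMBSTONE };

struct th_table {
	uint32_t keys[TH_SIZE];
	uint8_t state[TH_SIZE];
};

void th_init(struct th_table *t)
{
	assert(t);
	memset(t, 0, sizeof(*t));	/* all slots start out TH_EMPTY */
}

/* Trivial hash (an assumption) so collisions are easy to provoke. */
uint32_t th_hash(uint32_t key)
{
	return key & TH_MASK;
}

/*
 * Returns the slot holding key, or -1 if it is absent. *avail receives
 * the first reusable slot (tombstone or empty) seen on the probe path,
 * or -1 if none was seen.
 */
int th_find(const struct th_table *t, uint32_t key, int *avail)
{
	uint32_t slot = th_hash(key);
	uint32_t i;

	*avail = -1;
	for (i = 0; i < TH_SIZE; i++, slot = (slot + 1) & TH_MASK) {
		if (t->state[slot] == TH_VALID) {
			if (t->keys[slot] == key)
				return (int)slot;
			continue;
		}
		if (*avail < 0)
			*avail = (int)slot;
		/* An empty slot ends the probe path, a tombstone does not. */
		if (t->state[slot] == TH_EMPTY)
			break;
	}
	return -1;
}

/* Returns 0 if added, 1 if the key already existed, -1 if full. */
int th_insert(struct th_table *t, uint32_t key)
{
	int avail;

	if (th_find(t, key, &avail) >= 0)
		return 1;
	if (avail < 0)
		return -1;
	t->keys[avail] = key;
	t->state[avail] = TH_VALID;
	return 0;
}

/* Returns 0 if removed, -1 if the key was not present. */
int th_remove(struct th_table *t, uint32_t key)
{
	int avail;
	int slot = th_find(t, key, &avail);

	if (slot < 0)
		return -1;
	/* Leave a tombstone so later entries on the path stay reachable. */
	t->state[slot] = TH_TOMBSTONE;
	return 0;
}
```

With keys 1, 9 and 17 all hashing to slot 1, removing 9 leaves a
tombstone in slot 2, so a lookup of 17 still walks through to slot 3,
and a later insert of 25 reuses slot 2.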


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 9/9] drm/amdgpu: Track pending retry faults in IH and VM
       [not found]     ` <1503731949-22742-10-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-26 13:36       ` Christian König
  0 siblings, 0 replies; 23+ messages in thread
From: Christian König @ 2017-08-26 13:36 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 26.08.2017 at 09:19, Felix Kuehling wrote:
> IH tracks pending retry faults in a hash table for fast lookup in
> interrupt context. Each VM has a short FIFO of pending VM faults for
> processing in a bottom half.
>
> The IH prescreening stage adds retry faults and filters out repeated
> retry interrupts to minimize the impact of interrupt storms.
>
> It's the VM's responsibility to remove pending faults once they are
> handled. For now this is only done when the VM is destroyed.
>
> Change-Id: I0cf15bfc767d06d9d5c3b13ad1ba7bc6aa520947
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>

Acked-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/Kconfig                |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 76 +++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 12 ++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  7 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  7 +++
>   drivers/gpu/drm/amd/amdgpu/vega10_ih.c | 78 +++++++++++++++++++++++++++++++++-
>   6 files changed, 180 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 78d7fc0..f8902dc 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -184,6 +184,7 @@ config DRM_AMDGPU
>   	select BACKLIGHT_CLASS_DEVICE
>   	select BACKLIGHT_LCD_SUPPORT
>   	select INTERVAL_TREE
> +	select CHASH
>   	help
>   	  Choose this option if you have a recent AMD Radeon graphics card.
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> index c834a40..d4d3579 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> @@ -196,3 +196,79 @@ int amdgpu_ih_process(struct amdgpu_device *adev)
>   
>   	return IRQ_HANDLED;
>   }
> +
> +/**
> + * amdgpu_ih_add_fault - Add a page fault record
> + *
> + * @adev: amdgpu device pointer
> + * @key: 64-bit encoding of PASID and address
> + *
> + * This should be called when a retry page fault interrupt is
> + * received. If this is a new page fault, it will be added to a hash
> + * table. The return value indicates whether this is a new fault, or
> + * a fault that was already known and is already being handled.
> + *
> + * If there are too many pending page faults, this will fail. Retry
> + * interrupts should be ignored in this case until there is enough
> + * free space.
> + *
> + * Returns 0 if the fault was added, 1 if the fault was already known,
> + * -ENOSPC if there are too many pending faults.
> + */
> +int amdgpu_ih_add_fault(struct amdgpu_device *adev, u64 key)
> +{
> +	unsigned long flags;
> +	int r = -ENOSPC;
> +
> +	if (WARN_ON_ONCE(!adev->irq.ih.faults))
> +		/* Should be allocated in <IP>_ih_sw_init on GPUs that
> +		 * support retry faults and require retry filtering.
> +		 */
> +		return r;
> +
> +	spin_lock_irqsave(&adev->irq.ih.faults->lock, flags);
> +
> +	/* Only let the hash table fill up to 50% for best performance */
> +	if (adev->irq.ih.faults->count > (1 << (AMDGPU_PAGEFAULT_HASH_BITS-1)))
> +		goto unlock_out;
> +
> +	r = chash_table_copy_in(&adev->irq.ih.faults->hash, key, NULL);
> +	if (!r)
> +		adev->irq.ih.faults->count++;
> +
> +	/* chash_table_copy_in should never fail unless we're losing count */
> +	WARN_ON_ONCE(r < 0);
> +
> +unlock_out:
> +	spin_unlock_irqrestore(&adev->irq.ih.faults->lock, flags);
> +	return r;
> +}
> +
> +/**
> + * amdgpu_ih_clear_fault - Remove a page fault record
> + *
> + * @adev: amdgpu device pointer
> + * @key: 64-bit encoding of PASID and address
> + *
> + * This should be called when a page fault has been handled. Any
> + * future interrupt with this key will be processed as a new
> + * page fault.
> + */
> +void amdgpu_ih_clear_fault(struct amdgpu_device *adev, u64 key)
> +{
> +	unsigned long flags;
> +	int r;
> +
> +	if (!adev->irq.ih.faults)
> +		return;
> +
> +	spin_lock_irqsave(&adev->irq.ih.faults->lock, flags);
> +
> +	r = chash_table_remove(&adev->irq.ih.faults->hash, key, NULL);
> +	if (!WARN_ON_ONCE(r < 0)) {
> +		adev->irq.ih.faults->count--;
> +		WARN_ON_ONCE(adev->irq.ih.faults->count < 0);
> +	}
> +
> +	spin_unlock_irqrestore(&adev->irq.ih.faults->lock, flags);
> +}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> index 3de8e74..d107f1b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> @@ -24,6 +24,8 @@
>   #ifndef __AMDGPU_IH_H__
>   #define __AMDGPU_IH_H__
>   
> +#include <linux/chash.h>
> +
>   struct amdgpu_device;
>    /*
>     * vega10+ IH clients
> @@ -69,6 +71,13 @@ enum amdgpu_ih_clientid
>   
>   #define AMDGPU_IH_CLIENTID_LEGACY 0
>   
> +#define AMDGPU_PAGEFAULT_HASH_BITS 10
> +struct amdgpu_retryfault_hashtable {
> +	DECLARE_CHASH_TABLE(hash, AMDGPU_PAGEFAULT_HASH_BITS, 8, 0);
> +	spinlock_t	lock;
> +	int		count;
> +};
> +
>   /*
>    * R6xx+ IH ring
>    */
> @@ -87,6 +96,7 @@ struct amdgpu_ih_ring {
>   	bool			use_doorbell;
>   	bool			use_bus_addr;
>   	dma_addr_t		rb_dma_addr; /* only used when use_bus_addr = true */
> +	struct amdgpu_retryfault_hashtable *faults;
>   };
>   
>   #define AMDGPU_IH_SRC_DATA_MAX_SIZE_DW 4
> @@ -109,5 +119,7 @@ int amdgpu_ih_ring_init(struct amdgpu_device *adev, unsigned ring_size,
>   			bool use_bus_addr);
>   void amdgpu_ih_ring_fini(struct amdgpu_device *adev);
>   int amdgpu_ih_process(struct amdgpu_device *adev);
> +int amdgpu_ih_add_fault(struct amdgpu_device *adev, u64 key);
> +void amdgpu_ih_clear_fault(struct amdgpu_device *adev, u64 key);
>   
>   #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index c635699..8bdabb3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2622,6 +2622,8 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>   		vm->pasid = pasid;
>   	}
>   
> +	INIT_KFIFO(vm->faults);
> +
>   	vm->vm_context = vm_context;
>   	if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
>   		mutex_lock(&id_mgr->lock);
> @@ -2688,6 +2690,7 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>   {
>   	struct amdgpu_bo_va_mapping *mapping, *tmp;
>   	bool prt_fini_needed = !!adev->gart.gart_funcs->set_prt;
> +	u64 fault;
>   	int i;
>   
>   	if (vm->vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
> @@ -2710,6 +2713,10 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>   		mutex_unlock(&id_mgr->lock);
>   	}
>   
> +	/* Clear pending page faults from IH when the VM is destroyed */
> +	while (kfifo_get(&vm->faults, &fault))
> +		amdgpu_ih_clear_fault(adev, fault);
> +
>   	if (vm->pasid) {
>   		unsigned long flags;
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 692b05c..51d3e35 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -117,6 +117,10 @@ struct amdgpu_vm_pt {
>   	unsigned		last_entry_used;
>   };
>   
> +#define AMDGPU_VM_FAULT(pasid, addr) (((u64)(pasid) << 48) | (addr))
> +#define AMDGPU_VM_FAULT_PASID(fault) ((u64)(fault) >> 48)
> +#define AMDGPU_VM_FAULT_ADDR(fault)  ((u64)(fault) & 0xfffffffff000ULL)
> +
>   struct amdgpu_vm {
>   	/* tree of virtual addresses mapped */
>   	struct rb_root		va;
> @@ -158,6 +162,9 @@ struct amdgpu_vm {
>   
>   	/* Whether this is a Compute or GFX Context */
>   	int			vm_context;
> +
> +	/* Up to 16 pending page faults */
> +	DECLARE_KFIFO(faults, u64, 16);
>   };
>   
>   struct amdgpu_vm_id {
> diff --git a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> index d14a2d5..ae2b84a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vega10_ih.c
> @@ -235,8 +235,73 @@ static u32 vega10_ih_get_wptr(struct amdgpu_device *adev)
>    */
>   static bool vega10_ih_prescreen_iv(struct amdgpu_device *adev)
>   {
> -	/* TODO: Filter known pending page faults */
> +	u32 ring_index = adev->irq.ih.rptr >> 2;
> +	u32 dw0, dw3, dw4, dw5;
> +	u16 pasid;
> +	u64 addr, key;
> +	struct amdgpu_vm *vm;
> +	int r;
> +
> +	dw0 = le32_to_cpu(adev->irq.ih.ring[ring_index + 0]);
> +	dw3 = le32_to_cpu(adev->irq.ih.ring[ring_index + 3]);
> +	dw4 = le32_to_cpu(adev->irq.ih.ring[ring_index + 4]);
> +	dw5 = le32_to_cpu(adev->irq.ih.ring[ring_index + 5]);
> +
> +	/* Filter retry page faults, let only the first one pass. If
> +	 * there are too many outstanding faults, ignore them until
> +	 * some faults get cleared.
> +	 */
> +	switch(dw0 & 0xff) {
> +	case AMDGPU_IH_CLIENTID_VMC:
> +	case AMDGPU_IH_CLIENTID_UTCL2:
> +		break;
> +	default:
> +		/* Not a VM fault */
> +		return true;
> +	}
> +
> +	/* Not a retry fault */
> +	if (!(dw5 & 0x80))
> +		return true;
> +
> +	pasid = dw3 & 0xffff;
> +	/* No PASID, can't identify faulting process */
> +	if (!pasid)
> +		return true;
> +
> +	addr = ((u64)(dw5 & 0xf) << 44) | ((u64)dw4 << 12);
> +	key = AMDGPU_VM_FAULT(pasid, addr);
> +	r = amdgpu_ih_add_fault(adev, key);
> +
> +	/* Hash table is full or the fault is already being processed,
> +	 * ignore further page faults
> +	 */
> +	if (r != 0)
> +		goto ignore_iv;
> +
> +	/* Track retry faults in per-VM fault FIFO. */
> +	spin_lock(&adev->vm_manager.pasid_lock);
> +	vm = idr_find(&adev->vm_manager.pasid_idr, pasid);
> +	spin_unlock(&adev->vm_manager.pasid_lock);
> +	if (WARN_ON_ONCE(!vm)) {
> +		/* VM not found, process it normally */
> +		amdgpu_ih_clear_fault(adev, key);
> +		return true;
> +	}
> +	/* No locking required with single writer and single reader */
> +	r = kfifo_put(&vm->faults, key);
> +	if (!r) {
> +		/* FIFO is full. Ignore it until there is space */
> +		amdgpu_ih_clear_fault(adev, key);
> +		goto ignore_iv;
> +	}
> +
> +	/* It's the first fault for this address, process it normally */
>   	return true;
> +
> +ignore_iv:
> +	adev->irq.ih.rptr += 32;
> +	return false;
>   }
>   
>   /**
> @@ -323,6 +388,14 @@ static int vega10_ih_sw_init(void *handle)
>   	adev->irq.ih.use_doorbell = true;
>   	adev->irq.ih.doorbell_index = AMDGPU_DOORBELL64_IH << 1;
>   
> +	adev->irq.ih.faults = kmalloc(sizeof(*adev->irq.ih.faults), GFP_KERNEL);
> +	if (!adev->irq.ih.faults)
> +		return -ENOMEM;
> +	INIT_CHASH_TABLE(adev->irq.ih.faults->hash,
> +			 AMDGPU_PAGEFAULT_HASH_BITS, 8, 0);
> +	spin_lock_init(&adev->irq.ih.faults->lock);
> +	adev->irq.ih.faults->count = 0;
> +
>   	r = amdgpu_irq_init(adev);
>   
>   	return r;
> @@ -335,6 +408,9 @@ static int vega10_ih_sw_fini(void *handle)
>   	amdgpu_irq_fini(adev);
>   	amdgpu_ih_ring_fini(adev);
>   
> +	kfree(adev->irq.ih.faults);
> +	adev->irq.ih.faults = NULL;
> +
>   	return 0;
>   }
>   
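
The key layout used by the prescreening code above can be tried out in
user space. The sketch below mirrors the AMDGPU_VM_FAULT macros and the
address reconstruction from dw4/dw5; the function names are hypothetical
and plain stdint types stand in for the kernel's u16/u32/u64.

```c
#include <assert.h>
#include <stdint.h>

/* Same bit layout as AMDGPU_VM_FAULT(): PASID in bits 63:48. */
uint64_t fault_key(uint16_t pasid, uint64_t addr)
{
	/* The address must fit below bit 48 or it would clobber the PASID. */
	assert((addr >> 48) == 0);
	return ((uint64_t)pasid << 48) | addr;
}

/* Same as AMDGPU_VM_FAULT_PASID(). */
uint16_t fault_key_pasid(uint64_t key)
{
	return (uint16_t)(key >> 48);
}

/* Same mask as AMDGPU_VM_FAULT_ADDR(): bits 47:12, page aligned. */
uint64_t fault_key_addr(uint64_t key)
{
	return key & 0xfffffffff000ULL;
}

/* Rebuild the faulting address from IV dwords 4 and 5, as in the patch. */
uint64_t fault_addr_from_iv(uint32_t dw4, uint32_t dw5)
{
	return ((uint64_t)(dw5 & 0xf) << 44) | ((uint64_t)dw4 << 12);
}
```

For example, dw4 = 0x12345678 and dw5 = 0x83 give the address
0x312345678000, and packing that with PASID 0x8001 yields the key
0x8001312345678000.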



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/9] drm/amdgpu: Add PASID management
       [not found]         ` <994b23cd-67b3-4498-2c2b-d4fc2ea68be7-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
@ 2017-08-26 14:41           ` Kuehling, Felix
  0 siblings, 0 replies; 23+ messages in thread
From: Kuehling, Felix @ 2017-08-26 14:41 UTC (permalink / raw)
  To: Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

> But I'm a bit confused, doesn't this stuff belong in the IOMMU code?

PASIDs work on dGPUs without an IOMMU. They're just device-specific process identifiers. On APUs there is an extra step in KFD that registers a process+PASID with the IOMMU driver. That way the IOMMU knows what to do when it sees address translation requests for a specific PASID from a specific device. This is done by calling amd_iommu_bind_pasid:

    struct kfd_process_device *kfd_bind_process_to_device(struct kfd_dev *dev,
                                                     struct kfd_process *p)
    {
            ...
            err = amd_iommu_bind_pasid(dev->pdev, p->pasid, p->lead_thread);
            ...
    }

The PASID management is done by the device driver, not the IOMMU driver. I think different devices can use different PASIDs for the same process.
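
As an illustration of how the driver-side allocation in the quoted patch
behaves, here is a minimal user-space sketch of the same top-down loop.
It is not the kernel code: a small static bitmap stands in for the IDA,
range_get() is a hypothetical replacement for ida_simple_get(), and the
8-bit limit is only there to keep the demo small.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define PASID_DEMO_BITS 8	/* hypothetical demo limit, the patch uses 31 */
#define PASID_DEMO_SIZE (1u << PASID_DEMO_BITS)

static bool pasid_used[PASID_DEMO_SIZE];

/* Lowest free ID in [start, end), or -ENOSPC if the range is exhausted. */
static int range_get(unsigned int start, unsigned int end)
{
	unsigned int id;

	for (id = start; id < end; id++) {
		if (!pasid_used[id]) {
			pasid_used[id] = true;
			return (int)id;
		}
	}
	return -ENOSPC;
}

/*
 * Try the widest allowed range first and fall back to narrower ranges,
 * so that small PASIDs are only consumed when nothing else is left.
 */
int pasid_alloc(unsigned int bits)
{
	int pasid = -EINVAL;

	if (bits > PASID_DEMO_BITS)
		bits = PASID_DEMO_BITS;

	for (; bits > 0; bits--) {
		pasid = range_get(1u << (bits - 1), 1u << bits);
		if (pasid != -ENOSPC)
			break;
	}
	return pasid;
}

void pasid_free(unsigned int pasid)
{
	assert(pasid < PASID_DEMO_SIZE);
	pasid_used[pasid] = false;
}
```

Requesting 2-bit PASIDs hands out 2 and 3 first and only then falls back
to 1, so narrower PASIDs stay free for as long as possible for callers
(such as KFD on APUs) that cannot use wider ones.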

Regards,
  Felix

________________________________________
From: Christian König <deathsimple@vodafone.de>
Sent: Saturday, August 26, 2017 9:27 AM
To: Kuehling, Felix; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/9] drm/amdgpu: Add PASID management

On 26.08.2017 at 09:19, Felix Kuehling wrote:
> Allows assigning a PASID to a VM for identifying VMs involved in page
> faults. The global PASID manager is also exported in the KFD
> interface so that AMDGPU and KFD can share the PASID space.
>
> PASIDs of different sizes can be requested. On APUs, the PASID size
> is determined by the capabilities of the IOMMU. So KFD must be able
> to allocate PASIDs in a smaller range.
>
> TODO:
> * Actually assign PASIDs to VMs
> * Update the PASID-VMID mapping registers during CS
>
> Change-Id: I88c9357a7c584f10e84b5607ac09eba77c833393
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>

The patch itself is Reviewed-by: Christian König <christian.koenig@amd.com>.

But I'm a bit confused, doesn't this stuff belong in the IOMMU code?

Regards,
Christian.

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |  2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |  2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            | 76 ++++++++++++++++++++++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            | 14 ++++-
>   drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |  6 ++
>   8 files changed, 101 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> index 3e28d2b..0807d52 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> @@ -188,6 +188,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>       .get_local_mem_info = get_local_mem_info,
>       .get_gpu_clock_counter = get_gpu_clock_counter,
>       .get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
> +     .alloc_pasid = amdgpu_vm_alloc_pasid,
> +     .free_pasid = amdgpu_vm_free_pasid,
>       .create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>       .destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>       .get_process_page_dir = amdgpu_amdkfd_gpuvm_get_process_page_dir,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> index 3b6b4d9..c20c000 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> @@ -162,6 +162,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>       .get_local_mem_info = get_local_mem_info,
>       .get_gpu_clock_counter = get_gpu_clock_counter,
>       .get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
> +     .alloc_pasid = amdgpu_vm_alloc_pasid,
> +     .free_pasid = amdgpu_vm_free_pasid,
>       .create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>       .destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>       .create_process_gpumem = create_process_gpumem,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> index 961369d..bb99c64 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> @@ -209,6 +209,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>       .get_local_mem_info = get_local_mem_info,
>       .get_gpu_clock_counter = get_gpu_clock_counter,
>       .get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
> +     .alloc_pasid = amdgpu_vm_alloc_pasid,
> +     .free_pasid = amdgpu_vm_free_pasid,
>       .create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>       .destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>       .create_process_gpumem = create_process_gpumem,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 35f7d77..462011c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -1397,7 +1397,7 @@ int amdgpu_amdkfd_gpuvm_create_process_vm(struct kgd_dev *kgd, void **vm,
>               return -ENOMEM;
>
>       /* Initialize the VM context, allocate the page directory and zero it */
> -     ret = amdgpu_vm_init(adev, &new_vm->base, AMDGPU_VM_CONTEXT_COMPUTE);
> +     ret = amdgpu_vm_init(adev, &new_vm->base, AMDGPU_VM_CONTEXT_COMPUTE, 0);
>       if (ret != 0) {
>               pr_err("Failed init vm ret %d\n", ret);
>               /* Undo everything related to the new VM context */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index e390c01..ba3dc4d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -825,7 +825,7 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
>       }
>
>       r = amdgpu_vm_init(adev, &fpriv->vm,
> -                        AMDGPU_VM_CONTEXT_GFX);
> +                        AMDGPU_VM_CONTEXT_GFX, 0);
>       if (r) {
>               kfree(fpriv);
>               goto out_suspend;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 70d7632..c635699 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -27,12 +27,59 @@
>    */
>   #include <linux/dma-fence-array.h>
>   #include <linux/interval_tree_generic.h>
> +#include <linux/idr.h>
>   #include <drm/drmP.h>
>   #include <drm/amdgpu_drm.h>
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>
>   /*
> + * PASID manager
> + *
> + * PASIDs are global address space identifiers that can be shared
> + * between the GPU, an IOMMU and the driver. VMs on different devices
> + * may use the same PASID if they share the same address
> + * space. Therefore PASIDs are allocated using a global IDA. VMs are
> + * looked up from the PASID per amdgpu_device.
> + */
> +static DEFINE_IDA(amdgpu_vm_pasid_ida);
> +
> +/**
> + * amdgpu_vm_alloc_pasid - Allocate a PASID
> + * @bits: Maximum width of the PASID in bits, must be at least 1
> + *
> + * Allocates a PASID of the given width while keeping smaller PASIDs
> + * available if possible.
> + *
> + * Returns a positive integer on success. Returns %-EINVAL if bits==0.
> + * Returns %-ENOSPC if no PASID was available. Returns %-ENOMEM on
> + * memory allocation failure.
> + */
> +int amdgpu_vm_alloc_pasid(unsigned int bits)
> +{
> +     int pasid = -EINVAL;
> +
> +     for (bits = min(bits, 31U); bits > 0; bits--) {
> +             pasid = ida_simple_get(&amdgpu_vm_pasid_ida,
> +                                    1U << (bits - 1), 1U << bits,
> +                                    GFP_KERNEL);
> +             if (pasid != -ENOSPC)
> +                     break;
> +     }
> +
> +     return pasid;
> +}
> +
> +/**
> + * amdgpu_vm_free_pasid - Free a PASID
> + * @pasid: PASID to free
> + */
> +void amdgpu_vm_free_pasid(unsigned int pasid)
> +{
> +     ida_simple_remove(&amdgpu_vm_pasid_ida, pasid);
> +}
> +
> +/*
>    * GPUVM
>    * GPUVM is similar to the legacy gart on older asics, however
>    * rather than there being a single global gart table
> @@ -2482,7 +2529,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size, uint32_
>    * Init @vm fields.
>    */
>   int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
> -                int vm_context)
> +                int vm_context, unsigned int pasid)
>   {
>       const unsigned align = min(AMDGPU_VM_PTB_ALIGN_SIZE,
>               AMDGPU_VM_PTE_COUNT(adev) * 8);
> @@ -2562,6 +2609,19 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>       if (r)
>               goto error_free_root;
>
> +     if (pasid) {
> +             unsigned long flags;
> +
> +             spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
> +             r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
> +                           GFP_ATOMIC);
> +             spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
> +             if (r < 0)
> +                     goto error_free_root;
> +
> +             vm->pasid = pasid;
> +     }
> +
>       vm->vm_context = vm_context;
>       if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
>               mutex_lock(&id_mgr->lock);
> @@ -2650,6 +2710,14 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>               mutex_unlock(&id_mgr->lock);
>       }
>
> +     if (vm->pasid) {
> +             unsigned long flags;
> +
> +             spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
> +             idr_remove(&adev->vm_manager.pasid_idr, vm->pasid);
> +             spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
> +     }
> +
>       amd_sched_entity_fini(vm->entity.sched, &vm->entity);
>
>       if (!RB_EMPTY_ROOT(&vm->va)) {
> @@ -2729,6 +2797,9 @@ void amdgpu_vm_manager_init(struct amdgpu_device *adev)
>       adev->vm_manager.vm_update_mode = 0;
>   #endif
>
> +     idr_init(&adev->vm_manager.pasid_idr);
> +     spin_lock_init(&adev->vm_manager.pasid_lock);
> +
>       adev->vm_manager.n_compute_vms = 0;
>   }
>
> @@ -2743,6 +2814,9 @@ void amdgpu_vm_manager_fini(struct amdgpu_device *adev)
>   {
>       unsigned i, j;
>
> +     WARN_ON(!idr_is_empty(&adev->vm_manager.pasid_idr));
> +     idr_destroy(&adev->vm_manager.pasid_idr);
> +
>       for (i = 0; i < AMDGPU_MAX_VMHUBS; ++i) {
>               struct amdgpu_vm_id_manager *id_mgr =
>                       &adev->vm_manager.id_mgr[i];
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 49e15d7..692b05c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -25,6 +25,7 @@
>   #define __AMDGPU_VM_H__
>
>   #include <linux/rbtree.h>
> +#include <linux/idr.h>
>
>   #include "gpu_scheduler.h"
>   #include "amdgpu_sync.h"
> @@ -143,8 +144,9 @@ struct amdgpu_vm {
>       /* Scheduler entity for page table updates */
>       struct amd_sched_entity entity;
>
> -     /* client id */
> +     /* client id and PASID (TODO: replace client_id with PASID) */
>       u64                     client_id;
> +     unsigned int            pasid;
>       /* dedicated to vm */
>       struct amdgpu_vm_id     *reserved_vmid[AMDGPU_MAX_VMHUBS];
>
> @@ -219,14 +221,22 @@ struct amdgpu_vm_manager {
>        */
>       int                                     vm_update_mode;
>
> +     /* PASID to VM mapping, will be used in interrupt context to
> +      * look up VM of a page fault
> +      */
> +     struct idr                              pasid_idr;
> +     spinlock_t                              pasid_lock;
> +
>       /* Number of Compute VMs, used for detecting Compute activity */
>       unsigned                                n_compute_vms;
>   };
>
> +int amdgpu_vm_alloc_pasid(unsigned int bits);
> +void amdgpu_vm_free_pasid(unsigned int pasid);
>   void amdgpu_vm_manager_init(struct amdgpu_device *adev);
>   void amdgpu_vm_manager_fini(struct amdgpu_device *adev);
>   int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
> -                int vm_context);
> +                int vm_context, unsigned int pasid);
>   void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm);
>   void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
>                        struct list_head *validated,
> diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> index e8027b3..5833ef7 100644
> --- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> +++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> @@ -188,6 +188,9 @@ struct tile_config {
>    *
>    * @get_max_engine_clock_in_mhz: Retrieves maximum GPU clock in MHz
>    *
> + * @alloc_pasid: Allocate a PASID
> + * @free_pasid: Free a PASID
> + *
>    * @program_sh_mem_settings: A function that should initiate the memory
>    * properties such as main aperture memory type (cache / non cached) and
>    * secondary aperture base address, size and memory type.
> @@ -264,6 +267,9 @@ struct kfd2kgd_calls {
>
>       uint32_t (*get_max_engine_clock_in_mhz)(struct kgd_dev *kgd);
>
> +     int (*alloc_pasid)(unsigned int bits);
> +     void (*free_pasid)(unsigned int pasid);
> +
>       int (*create_process_vm)(struct kgd_dev *kgd, void **vm,
>                                void **process_info);
>       void (*destroy_process_vm)(struct kgd_dev *kgd, void *vm);
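
A note on the allocation order in amdgpu_vm_alloc_pasid quoted above: the
loop tries the widest permitted range [2^(bits-1), 2^bits) first and only
falls back to narrower ranges on -ENOSPC, so small PASIDs stay free for
callers that can only use a few bits. What follows is a minimal userspace
sketch of that order, not the driver code. It assumes a tiny bitmap in
place of the kernel IDA, and every sketch_* name and SKETCH_* constant is
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the global IDA: one flag per ID. */
#define SKETCH_MAX_BITS 8U
#define SKETCH_MAX_IDS  (1U << SKETCH_MAX_BITS)

static bool sketch_used[SKETCH_MAX_IDS];

/* Claim the lowest free ID in [start, end), or return -ENOSPC. */
static int sketch_range_get(unsigned int start, unsigned int end)
{
	for (unsigned int id = start; id < end; id++) {
		if (!sketch_used[id]) {
			sketch_used[id] = true;
			return (int)id;
		}
	}
	return -ENOSPC;
}

/*
 * Same descending-width search as the quoted patch: start with IDs
 * that need exactly `bits` bits, then retry with one bit less each
 * time the current range is full. ID 0 is never handed out.
 */
static int sketch_alloc_pasid(unsigned int bits)
{
	int pasid = -EINVAL;	/* returned unchanged when bits == 0 */

	if (bits > SKETCH_MAX_BITS)
		bits = SKETCH_MAX_BITS;

	for (; bits > 0; bits--) {
		pasid = sketch_range_get(1U << (bits - 1), 1U << bits);
		if (pasid != -ENOSPC)
			break;
	}

	return pasid;
}

/* Release an ID previously returned by sketch_alloc_pasid(). */
static void sketch_free_pasid(unsigned int pasid)
{
	assert(pasid < SKETCH_MAX_IDS && sketch_used[pasid]);
	sketch_used[pasid] = false;
}
```

Starting from an empty table, four requests with bits=3 yield 4, 5, 6
and 7, and a fifth request falls back to 2.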


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/9] WIP: Retry page fault handling for Vega10
       [not found] ` <1503731949-22742-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (8 preceding siblings ...)
  2017-08-26  7:19   ` [PATCH 9/9] drm/amdgpu: Track pending retry faults in IH and VM Felix Kuehling
@ 2017-08-27 22:22   ` Oded Gabbay
       [not found]     ` <CAFCwf10G+4ra9UD6upxaBc5FwSu4efB9oLKKYSZHcHQ-w9TZgQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  9 siblings, 1 reply; 23+ messages in thread
From: Oded Gabbay @ 2017-08-27 22:22 UTC (permalink / raw)
  To: Kuehling, Felix; +Cc: amd-gfx list


Hi Felix,
I'm currently on vacation and I will return at the end of the week, so I
will not be able to review the patches until then.

Oded

On Aug 26, 2017 09:19, "Felix Kuehling" <Felix.Kuehling-5C7GfCeVMHo@public.gmane.org> wrote:

> This is based on amd-kfd-staging, because that's easier for me to test.
> I'm planning to port to amd-staging-4.x for submission upstream.
>
> With this patch series, I'm able to turn retry faults on and handle the
> interrupt storm from VM faults. Only the first VM fault interrupt per
> process and address gets handled the usual way. Retry interrupts are
> filtered in a new prescreening stage in amdgpu_ih_process.
>
> Pending faults are tracked in a hash table in IH to detect retry faults
> and a FIFO in the VM for later processing.
>
> Looking up the VM from the fault interrupt depends on the PASID.
> Currently only KFD VMs have proper PASIDs.
>
> TODO (need some help with these):
> * Allocate PASIDs for graphics contexts
> * Setup VMID-PASID mapping during graphics command submission
> * Confirm that graphics page faults have the correct PASID in the IV
>
> Once that's done, we should have a foundation to start working on HMM
> and proper SVM memory management with demand paging.
>
> Felix Kuehling (9):
>   drm/amdgpu: Fix error handling in amdgpu_vm_init
>   drm/amdgpu: Add PASID management
>   drm/radeon: Add PASID manager for KFD
>   drm/amdkfd: Separate doorbell allocation from PASID
>   drm/amdkfd: Use PASID manager from KGD
>   drm/amd: Set the PASID for KFD VMs
>   drm/amdgpu: Add prescreening stage in IH processing
>   lib: Closed hash table with low overhead
>   drm/amdgpu: Track pending retry faults in IH and VM
>
>  drivers/gpu/drm/Kconfig                           |   1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h               |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h        |   3 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   6 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c            |  82 ++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h            |  12 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |  88 +++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            |  21 +-
>  drivers/gpu/drm/amd/amdgpu/cik_ih.c               |  14 +
>  drivers/gpu/drm/amd/amdgpu/cz_ih.c                |  14 +
>  drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |  14 +
>  drivers/gpu/drm/amd/amdgpu/si_ih.c                |  14 +
>  drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |  14 +
>  drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |  90 ++++
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c           |  18 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c         |  48 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_module.c           |   6 -
>  drivers/gpu/drm/amd/amdkfd/kfd_pasid.c            |  84 ++--
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h             |  10 +-
>  drivers/gpu/drm/amd/amdkfd/kfd_process.c          |   8 +-
>  drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |   8 +-
>  drivers/gpu/drm/radeon/radeon_kfd.c               |  36 +-
>  include/linux/chash.h                             | 349 +++++++++++++++
>  lib/Kconfig                                       |   8 +
>  lib/Makefile                                      |   2 +
>  lib/chash.c                                       | 521 ++++++++++++++++++++++
>  30 files changed, 1376 insertions(+), 105 deletions(-)
>  create mode 100644 include/linux/chash.h
>  create mode 100644 lib/chash.c
>
> --
> 2.7.4
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/9] drm/amdgpu: Fix error handling in amdgpu_vm_init
       [not found]     ` <1503731949-22742-2-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-26 13:22       ` Christian König
@ 2017-08-28  2:51       ` zhoucm1
  1 sibling, 0 replies; 23+ messages in thread
From: zhoucm1 @ 2017-08-28  2:51 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



On 2017-08-26 15:19, Felix Kuehling wrote:
> Make sure vm->root.bo is not left reserved if amdgpu_bo_kmap fails.
>
> Change-Id: If3687b39a50b0ffe7f8be2ea6e927fa2ca0f9e45
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++------
>   1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index e57a72e..70d7632 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2556,14 +2556,11 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>   		goto error_free_root;
>   
>   	vm->last_eviction_counter = atomic64_read(&adev->num_evictions);
> -
> -	if (vm->use_cpu_for_update) {
> +	if (vm->use_cpu_for_update)
>   		r = amdgpu_bo_kmap(vm->root.bo, NULL);
> -		if (r)
> -			goto error_free_root;
> -	}
> -
>   	amdgpu_bo_unreserve(vm->root.bo);
> +	if (r)
> +		goto error_free_root;
>   
>   	vm->vm_context = vm_context;
>   	if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/9] drm/amdgpu: Add PASID management
       [not found]     ` <1503731949-22742-3-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2017-08-26 13:27       ` Christian König
@ 2017-08-28  3:06       ` zhoucm1
       [not found]         ` <c730cbbc-919c-23a5-8d10-3ab5fbfa3543-5C7GfCeVMHo@public.gmane.org>
  1 sibling, 1 reply; 23+ messages in thread
From: zhoucm1 @ 2017-08-28  3:06 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Could we separate the PASID manager into its own file, such as
amdgpu_pasid.c, like what was done for the context manager?

The VM code keeps growing and looks more and more complex.

BTW: I really like the many comments explaining PASIDs. :)

Regards,
David Zhou
On 2017-08-26 15:19, Felix Kuehling wrote:
> Allows assigning a PASID to a VM for identifying VMs involved in page
> faults. The global PASID manager is also exported in the KFD
> interface so that AMDGPU and KFD can share the PASID space.
>
> PASIDs of different sizes can be requested. On APUs, the PASID size
> is determined by the capabilities of the IOMMU. So KFD must be able
> to allocate PASIDs in a smaller range.
>
> TODO:
> * Actually assign PASIDs to VMs
> * Update the PASID-VMID mapping registers during CS
>
> Change-Id: I88c9357a7c584f10e84b5607ac09eba77c833393
> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |  2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |  2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            | 76 ++++++++++++++++++++++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            | 14 ++++-
>   drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |  6 ++
>   8 files changed, 101 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> index 3e28d2b..0807d52 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
> @@ -188,6 +188,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>   	.get_local_mem_info = get_local_mem_info,
>   	.get_gpu_clock_counter = get_gpu_clock_counter,
>   	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
> +	.alloc_pasid = amdgpu_vm_alloc_pasid,
> +	.free_pasid = amdgpu_vm_free_pasid,
>   	.create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>   	.destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>   	.get_process_page_dir = amdgpu_amdkfd_gpuvm_get_process_page_dir,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> index 3b6b4d9..c20c000 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
> @@ -162,6 +162,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>   	.get_local_mem_info = get_local_mem_info,
>   	.get_gpu_clock_counter = get_gpu_clock_counter,
>   	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
> +	.alloc_pasid = amdgpu_vm_alloc_pasid,
> +	.free_pasid = amdgpu_vm_free_pasid,
>   	.create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>   	.destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>   	.create_process_gpumem = create_process_gpumem,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> index 961369d..bb99c64 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
> @@ -209,6 +209,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>   	.get_local_mem_info = get_local_mem_info,
>   	.get_gpu_clock_counter = get_gpu_clock_counter,
>   	.get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
> +	.alloc_pasid = amdgpu_vm_alloc_pasid,
> +	.free_pasid = amdgpu_vm_free_pasid,
>   	.create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>   	.destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>   	.create_process_gpumem = create_process_gpumem,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 35f7d77..462011c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -1397,7 +1397,7 @@ int amdgpu_amdkfd_gpuvm_create_process_vm(struct kgd_dev *kgd, void **vm,
>   		return -ENOMEM;
>   
>   	/* Initialize the VM context, allocate the page directory and zero it */
> -	ret = amdgpu_vm_init(adev, &new_vm->base, AMDGPU_VM_CONTEXT_COMPUTE);
> +	ret = amdgpu_vm_init(adev, &new_vm->base, AMDGPU_VM_CONTEXT_COMPUTE, 0);
>   	if (ret != 0) {
>   		pr_err("Failed init vm ret %d\n", ret);
>   		/* Undo everything related to the new VM context */
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> index e390c01..ba3dc4d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
> @@ -825,7 +825,7 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
>   	}
>   
>   	r = amdgpu_vm_init(adev, &fpriv->vm,
> -			   AMDGPU_VM_CONTEXT_GFX);
> +			   AMDGPU_VM_CONTEXT_GFX, 0);
>   	if (r) {
>   		kfree(fpriv);
>   		goto out_suspend;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 70d7632..c635699 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -27,12 +27,59 @@
>    */
>   #include <linux/dma-fence-array.h>
>   #include <linux/interval_tree_generic.h>
> +#include <linux/idr.h>
>   #include <drm/drmP.h>
>   #include <drm/amdgpu_drm.h>
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
>   
>   /*
> + * PASID manager
> + *
> + * PASIDs are global address space identifiers that can be shared
> + * between the GPU, an IOMMU and the driver. VMs on different devices
> + * may use the same PASID if they share the same address
> + * space. Therefore PASIDs are allocated using a global IDA. VMs are
> + * looked up from the PASID per amdgpu_device.
> + */
> +static DEFINE_IDA(amdgpu_vm_pasid_ida);
> +
> +/**
> + * amdgpu_vm_alloc_pasid - Allocate a PASID
> + * @bits: Maximum width of the PASID in bits, must be at least 1
> + *
> + * Allocates a PASID of the given width while keeping smaller PASIDs
> + * available if possible.
> + *
> + * Returns a positive integer on success. Returns %-EINVAL if bits==0.
> + * Returns %-ENOSPC if no PASID was available. Returns %-ENOMEM on
> + * memory allocation failure.
> + */
> +int amdgpu_vm_alloc_pasid(unsigned int bits)
> +{
> +	int pasid = -EINVAL;
> +
> +	for (bits = min(bits, 31U); bits > 0; bits--) {
> +		pasid = ida_simple_get(&amdgpu_vm_pasid_ida,
> +				       1U << (bits - 1), 1U << bits,
> +				       GFP_KERNEL);
> +		if (pasid != -ENOSPC)
> +			break;
> +	}
> +
> +	return pasid;
> +}
> +
> +/**
> + * amdgpu_vm_free_pasid - Free a PASID
> + * @pasid: PASID to free
> + */
> +void amdgpu_vm_free_pasid(unsigned int pasid)
> +{
> +	ida_simple_remove(&amdgpu_vm_pasid_ida, pasid);
> +}
> +
> +/*
>    * GPUVM
>    * GPUVM is similar to the legacy gart on older asics, however
>    * rather than there being a single global gart table
> @@ -2482,7 +2529,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size, uint32_
>    * Init @vm fields.
>    */
>   int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
> -		   int vm_context)
> +		   int vm_context, unsigned int pasid)
>   {
>   	const unsigned align = min(AMDGPU_VM_PTB_ALIGN_SIZE,
>   		AMDGPU_VM_PTE_COUNT(adev) * 8);
> @@ -2562,6 +2609,19 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>   	if (r)
>   		goto error_free_root;
>   
> +	if (pasid) {
> +		unsigned long flags;
> +
> +		spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
> +		r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
> +			      GFP_ATOMIC);
> +		spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
> +		if (r < 0)
> +			goto error_free_root;
> +
> +		vm->pasid = pasid;
> +	}
> +
>   	vm->vm_context = vm_context;
>   	if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
>   		mutex_lock(&id_mgr->lock);
> @@ -2650,6 +2710,14 @@ void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm)
>   		mutex_unlock(&id_mgr->lock);
>   	}
>   
> +	if (vm->pasid) {
> +		unsigned long flags;
> +
> +		spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
> +		idr_remove(&adev->vm_manager.pasid_idr, vm->pasid);
> +		spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
> +	}
> +
>   	amd_sched_entity_fini(vm->entity.sched, &vm->entity);
>   
>   	if (!RB_EMPTY_ROOT(&vm->va)) {
> @@ -2729,6 +2797,9 @@ void amdgpu_vm_manager_init(struct amdgpu_device *adev)
>   	adev->vm_manager.vm_update_mode = 0;
>   #endif
>   
> +	idr_init(&adev->vm_manager.pasid_idr);
> +	spin_lock_init(&adev->vm_manager.pasid_lock);
> +
>   	adev->vm_manager.n_compute_vms = 0;
>   }
>   
> @@ -2743,6 +2814,9 @@ void amdgpu_vm_manager_fini(struct amdgpu_device *adev)
>   {
>   	unsigned i, j;
>   
> +	WARN_ON(!idr_is_empty(&adev->vm_manager.pasid_idr));
> +	idr_destroy(&adev->vm_manager.pasid_idr);
> +
>   	for (i = 0; i < AMDGPU_MAX_VMHUBS; ++i) {
>   		struct amdgpu_vm_id_manager *id_mgr =
>   			&adev->vm_manager.id_mgr[i];
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 49e15d7..692b05c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -25,6 +25,7 @@
>   #define __AMDGPU_VM_H__
>   
>   #include <linux/rbtree.h>
> +#include <linux/idr.h>
>   
>   #include "gpu_scheduler.h"
>   #include "amdgpu_sync.h"
> @@ -143,8 +144,9 @@ struct amdgpu_vm {
>   	/* Scheduler entity for page table updates */
>   	struct amd_sched_entity	entity;
>   
> -	/* client id */
> +	/* client id and PASID (TODO: replace client_id with PASID) */
>   	u64                     client_id;
> +	unsigned int		pasid;
>   	/* dedicated to vm */
>   	struct amdgpu_vm_id	*reserved_vmid[AMDGPU_MAX_VMHUBS];
>   
> @@ -219,14 +221,22 @@ struct amdgpu_vm_manager {
>   	 */
>   	int					vm_update_mode;
>   
> +	/* PASID to VM mapping, will be used in interrupt context to
> +	 * look up VM of a page fault
> +	 */
> +	struct idr				pasid_idr;
> +	spinlock_t				pasid_lock;
> +
>   	/* Number of Compute VMs, used for detecting Compute activity */
>   	unsigned                                n_compute_vms;
>   };
>   
> +int amdgpu_vm_alloc_pasid(unsigned int bits);
> +void amdgpu_vm_free_pasid(unsigned int pasid);
>   void amdgpu_vm_manager_init(struct amdgpu_device *adev);
>   void amdgpu_vm_manager_fini(struct amdgpu_device *adev);
>   int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
> -		   int vm_context);
> +		   int vm_context, unsigned int pasid);
>   void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm);
>   void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
>   			 struct list_head *validated,
> diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> index e8027b3..5833ef7 100644
> --- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> +++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> @@ -188,6 +188,9 @@ struct tile_config {
>    *
>    * @get_max_engine_clock_in_mhz: Retrieves maximum GPU clock in MHz
>    *
> + * @alloc_pasid: Allocate a PASID
> + * @free_pasid: Free a PASID
> + *
>    * @program_sh_mem_settings: A function that should initiate the memory
>    * properties such as main aperture memory type (cache / non cached) and
>    * secondary aperture base address, size and memory type.
> @@ -264,6 +267,9 @@ struct kfd2kgd_calls {
>   
>   	uint32_t (*get_max_engine_clock_in_mhz)(struct kgd_dev *kgd);
>   
> +	int (*alloc_pasid)(unsigned int bits);
> +	void (*free_pasid)(unsigned int pasid);
> +
>   	int (*create_process_vm)(struct kgd_dev *kgd, void **vm,
>   				 void **process_info);
>   	void (*destroy_process_vm)(struct kgd_dev *kgd, void *vm);
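
On the per-device registration in amdgpu_vm_init quoted above:
idr_alloc(..., pasid, pasid + 1, ...) asks for exactly one ID, so it
fails if another VM on the same device already holds that PASID. Below is
a rough userspace sketch of that behaviour. It assumes a fixed array
instead of the kernel IDR and leaves out the spinlock. All sketch_* names
are hypothetical, and the lookup helper is only an assumption about how
the mapping might later be used from the fault handler.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_PASID_LIMIT 16U	/* hypothetical table size */

struct sketch_vm {
	unsigned int pasid;	/* 0 means no PASID assigned */
};

/* Hypothetical per-device PASID to VM map (stand-in for pasid_idr). */
struct sketch_vm_manager {
	struct sketch_vm *by_pasid[SKETCH_PASID_LIMIT];
};

/*
 * Bind @vm to exactly @pasid. A PASID of 0 means "none" and always
 * succeeds; an occupied or out-of-range slot yields -ENOSPC, which
 * is what idr_alloc() reports when the requested range is full.
 */
static int sketch_vm_set_pasid(struct sketch_vm_manager *mgr,
			       struct sketch_vm *vm, unsigned int pasid)
{
	vm->pasid = 0;
	if (pasid == 0)
		return 0;
	if (pasid >= SKETCH_PASID_LIMIT || mgr->by_pasid[pasid] != NULL)
		return -ENOSPC;

	mgr->by_pasid[pasid] = vm;
	vm->pasid = pasid;
	return 0;
}

/* Look up the VM registered under @pasid, or NULL if there is none. */
static struct sketch_vm *sketch_vm_find(const struct sketch_vm_manager *mgr,
					unsigned int pasid)
{
	if (pasid == 0 || pasid >= SKETCH_PASID_LIMIT)
		return NULL;
	return mgr->by_pasid[pasid];
}

/* Drop the mapping again, mirroring the teardown in amdgpu_vm_fini. */
static void sketch_vm_clear_pasid(struct sketch_vm_manager *mgr,
				  struct sketch_vm *vm)
{
	if (vm->pasid != 0) {
		assert(mgr->by_pasid[vm->pasid] == vm);
		mgr->by_pasid[vm->pasid] = NULL;
		vm->pasid = 0;
	}
}
```

The same PASID can still be registered on a second device, because each
device would carry its own sketch_vm_manager.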

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/9] drm/amdgpu: Add PASID management
       [not found]         ` <c730cbbc-919c-23a5-8d10-3ab5fbfa3543-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-28  6:45           ` Christian König
       [not found]             ` <8e5428ef-5419-6241-369f-a48e63b77934-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: Christian König @ 2017-08-28  6:45 UTC (permalink / raw)
  To: zhoucm1, Felix Kuehling, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

I agree that the VM code is growing a bit too much, but the crux is that
the PASID handling is very close to VMID management, so we should keep
the code together.

I wanted to avoid it, but you suggested separating the VMID management
quite a while ago. How about moving both VMID and PASID management into
amdgpu_hwid_mgr.c?

Regards,
Christian.

On 28.08.2017 at 05:06, zhoucm1 wrote:
> Could we separate the PASID manager into its own file, such as
> amdgpu_pasid.c, like what was done for the context manager?
>
> The VM code keeps growing and looks more and more complex.
>
> BTW: I really like the many comments explaining PASIDs. :)
>
> Regards,
> David Zhou
> On 2017-08-26 15:19, Felix Kuehling wrote:
>> Allows assigning a PASID to a VM for identifying VMs involved in page
>> faults. The global PASID manager is also exported in the KFD
>> interface so that AMDGPU and KFD can share the PASID space.
>>
>> PASIDs of different sizes can be requested. On APUs, the PASID size
>> is determined by the capabilities of the IOMMU. So KFD must be able
>> to allocate PASIDs in a smaller range.
>>
>> TODO:
>> * Actually assign PASIDs to VMs
>> * Update the PASID-VMID mapping registers during CS
>>
>> Change-Id: I88c9357a7c584f10e84b5607ac09eba77c833393
>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |  2 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |  2 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  2 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  2 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  2 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            | 76 ++++++++++++++++++++++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            | 14 ++++-
>>   drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |  6 ++
>>   8 files changed, 101 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
>> index 3e28d2b..0807d52 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
>> @@ -188,6 +188,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>>       .get_local_mem_info = get_local_mem_info,
>>       .get_gpu_clock_counter = get_gpu_clock_counter,
>>       .get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
>> +    .alloc_pasid = amdgpu_vm_alloc_pasid,
>> +    .free_pasid = amdgpu_vm_free_pasid,
>>       .create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>>       .destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>>       .get_process_page_dir = amdgpu_amdkfd_gpuvm_get_process_page_dir,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
>> index 3b6b4d9..c20c000 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
>> @@ -162,6 +162,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>>       .get_local_mem_info = get_local_mem_info,
>>       .get_gpu_clock_counter = get_gpu_clock_counter,
>>       .get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
>> +    .alloc_pasid = amdgpu_vm_alloc_pasid,
>> +    .free_pasid = amdgpu_vm_free_pasid,
>>       .create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>>       .destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>>       .create_process_gpumem = create_process_gpumem,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
>> index 961369d..bb99c64 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
>> @@ -209,6 +209,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>>       .get_local_mem_info = get_local_mem_info,
>>       .get_gpu_clock_counter = get_gpu_clock_counter,
>>       .get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
>> +    .alloc_pasid = amdgpu_vm_alloc_pasid,
>> +    .free_pasid = amdgpu_vm_free_pasid,
>>       .create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>>       .destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>>       .create_process_gpumem = create_process_gpumem,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> index 35f7d77..462011c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> @@ -1397,7 +1397,7 @@ int amdgpu_amdkfd_gpuvm_create_process_vm(struct kgd_dev *kgd, void **vm,
>>           return -ENOMEM;
>>
>>       /* Initialize the VM context, allocate the page directory and zero it */
>> -    ret = amdgpu_vm_init(adev, &new_vm->base, AMDGPU_VM_CONTEXT_COMPUTE);
>> +    ret = amdgpu_vm_init(adev, &new_vm->base, AMDGPU_VM_CONTEXT_COMPUTE, 0);
>>       if (ret != 0) {
>>           pr_err("Failed init vm ret %d\n", ret);
>>           /* Undo everything related to the new VM context */
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> index e390c01..ba3dc4d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>> @@ -825,7 +825,7 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
>>       }
>>
>>       r = amdgpu_vm_init(adev, &fpriv->vm,
>> -               AMDGPU_VM_CONTEXT_GFX);
>> +               AMDGPU_VM_CONTEXT_GFX, 0);
>>       if (r) {
>>           kfree(fpriv);
>>           goto out_suspend;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 70d7632..c635699 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -27,12 +27,59 @@
>>    */
>>   #include <linux/dma-fence-array.h>
>>   #include <linux/interval_tree_generic.h>
>> +#include <linux/idr.h>
>>   #include <drm/drmP.h>
>>   #include <drm/amdgpu_drm.h>
>>   #include "amdgpu.h"
>>   #include "amdgpu_trace.h"
>>
>>   /*
>> + * PASID manager
>> + *
>> + * PASIDs are global address space identifiers that can be shared
>> + * between the GPU, an IOMMU and the driver. VMs on different devices
>> + * may use the same PASID if they share the same address
>> + * space. Therefore PASIDs are allocated using a global IDA. VMs are
>> + * looked up from the PASID per amdgpu_device.
>> + */
>> +static DEFINE_IDA(amdgpu_vm_pasid_ida);
>> +
>> +/**
>> + * amdgpu_vm_alloc_pasid - Allocate a PASID
>> + * @bits: Maximum width of the PASID in bits, must be at least 1
>> + *
>> + * Allocates a PASID of the given width while keeping smaller PASIDs
>> + * available if possible.
>> + *
>> + * Returns a positive integer on success. Returns %-EINVAL if bits==0.
>> + * Returns %-ENOSPC if no PASID was available. Returns %-ENOMEM on
>> + * memory allocation failure.
>> + */
>> +int amdgpu_vm_alloc_pasid(unsigned int bits)
>> +{
>> +    int pasid = -EINVAL;
>> +
>> +    for (bits = min(bits, 31U); bits > 0; bits--) {
>> +        pasid = ida_simple_get(&amdgpu_vm_pasid_ida,
>> +                       1U << (bits - 1), 1U << bits,
>> +                       GFP_KERNEL);
>> +        if (pasid != -ENOSPC)
>> +            break;
>> +    }
>> +
>> +    return pasid;
>> +}
>> +
>> +/**
>> + * amdgpu_vm_free_pasid - Free a PASID
>> + * @pasid: PASID to free
>> + */
>> +void amdgpu_vm_free_pasid(unsigned int pasid)
>> +{
>> +    ida_simple_remove(&amdgpu_vm_pasid_ida, pasid);
>> +}
>> +
>> +/*
>>    * GPUVM
>>    * GPUVM is similar to the legacy gart on older asics, however
>>    * rather than there being a single global gart table
>> @@ -2482,7 +2529,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device 
>> *adev, uint64_t vm_size, uint32_
>>    * Init @vm fields.
>>    */
>>   int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>> -           int vm_context)
>> +           int vm_context, unsigned int pasid)
>>   {
>>       const unsigned align = min(AMDGPU_VM_PTB_ALIGN_SIZE,
>>           AMDGPU_VM_PTE_COUNT(adev) * 8);
>> @@ -2562,6 +2609,19 @@ int amdgpu_vm_init(struct amdgpu_device *adev, 
>> struct amdgpu_vm *vm,
>>       if (r)
>>           goto error_free_root;
>>
>> +    if (pasid) {
>> +        unsigned long flags;
>> +
>> +        spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
>> +        r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
>> +                  GFP_ATOMIC);
>> +        spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
>> +        if (r < 0)
>> +            goto error_free_root;
>> +
>> +        vm->pasid = pasid;
>> +    }
>> +
>>       vm->vm_context = vm_context;
>>       if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
>>           mutex_lock(&id_mgr->lock);
>> @@ -2650,6 +2710,14 @@ void amdgpu_vm_fini(struct amdgpu_device 
>> *adev, struct amdgpu_vm *vm)
>>           mutex_unlock(&id_mgr->lock);
>>       }
>>
>> +    if (vm->pasid) {
>> +        unsigned long flags;
>> +
>> +        spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
>> +        idr_remove(&adev->vm_manager.pasid_idr, vm->pasid);
>> +        spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
>> +    }
>> +
>>       amd_sched_entity_fini(vm->entity.sched, &vm->entity);
>>         if (!RB_EMPTY_ROOT(&vm->va)) {
>> @@ -2729,6 +2797,9 @@ void amdgpu_vm_manager_init(struct 
>> amdgpu_device *adev)
>>       adev->vm_manager.vm_update_mode = 0;
>>   #endif
>>   +    idr_init(&adev->vm_manager.pasid_idr);
>> +    spin_lock_init(&adev->vm_manager.pasid_lock);
>> +
>>       adev->vm_manager.n_compute_vms = 0;
>>   }
>>   @@ -2743,6 +2814,9 @@ void amdgpu_vm_manager_fini(struct 
>> amdgpu_device *adev)
>>   {
>>       unsigned i, j;
>>
>> +    WARN_ON(!idr_is_empty(&adev->vm_manager.pasid_idr));
>> +    idr_destroy(&adev->vm_manager.pasid_idr);
>> +
>>       for (i = 0; i < AMDGPU_MAX_VMHUBS; ++i) {
>>           struct amdgpu_vm_id_manager *id_mgr =
>>               &adev->vm_manager.id_mgr[i];
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> index 49e15d7..692b05c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>> @@ -25,6 +25,7 @@
>>   #define __AMDGPU_VM_H__
>>     #include <linux/rbtree.h>
>> +#include <linux/idr.h>
>>     #include "gpu_scheduler.h"
>>   #include "amdgpu_sync.h"
>> @@ -143,8 +144,9 @@ struct amdgpu_vm {
>>       /* Scheduler entity for page table updates */
>>       struct amd_sched_entity    entity;
>>   -    /* client id */
>> +    /* client id and PASID (TODO: replace client_id with PASID) */
>>       u64                     client_id;
>> +    unsigned int        pasid;
>>       /* dedicated to vm */
>>       struct amdgpu_vm_id    *reserved_vmid[AMDGPU_MAX_VMHUBS];
>>   @@ -219,14 +221,22 @@ struct amdgpu_vm_manager {
>>        */
>>       int                    vm_update_mode;
>>   +    /* PASID to VM mapping, will be used in interrupt context to
>> +     * look up VM of a page fault
>> +     */
>> +    struct idr                pasid_idr;
>> +    spinlock_t                pasid_lock;
>> +
>>       /* Number of Compute VMs, used for detecting Compute activity */
>>       unsigned                                n_compute_vms;
>>   };
>>   +int amdgpu_vm_alloc_pasid(unsigned int bits);
>> +void amdgpu_vm_free_pasid(unsigned int pasid);
>>   void amdgpu_vm_manager_init(struct amdgpu_device *adev);
>>   void amdgpu_vm_manager_fini(struct amdgpu_device *adev);
>>   int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>> -           int vm_context);
>> +           int vm_context, unsigned int pasid);
>>   void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm *vm);
>>   void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
>>                struct list_head *validated,
>> diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h 
>> b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
>> index e8027b3..5833ef7 100644
>> --- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
>> +++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
>> @@ -188,6 +188,9 @@ struct tile_config {
>>    *
>>    * @get_max_engine_clock_in_mhz: Retrieves maximum GPU clock in MHz
>>    *
>> + * @alloc_pasid: Allocate a PASID
>> + * @free_pasid: Free a PASID
>> + *
>>    * @program_sh_mem_settings: A function that should initiate the 
>> memory
>>    * properties such as main aperture memory type (cache / non 
>> cached) and
>>    * secondary aperture base address, size and memory type.
>> @@ -264,6 +267,9 @@ struct kfd2kgd_calls {
>>         uint32_t (*get_max_engine_clock_in_mhz)(struct kgd_dev *kgd);
>>   +    int (*alloc_pasid)(unsigned int bits);
>> +    void (*free_pasid)(unsigned int pasid);
>> +
>>       int (*create_process_vm)(struct kgd_dev *kgd, void **vm,
>>                    void **process_info);
>>       void (*destroy_process_vm)(struct kgd_dev *kgd, void *vm);
>


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/9] drm/amdgpu: Add PASID management
       [not found]             ` <8e5428ef-5419-6241-369f-a48e63b77934-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
@ 2017-08-28  7:15               ` zhoucm1
       [not found]                 ` <c593059f-548d-340c-6bd5-7650b8830aad-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 23+ messages in thread
From: zhoucm1 @ 2017-08-28  7:15 UTC (permalink / raw)
  To: Christian König, Felix Kuehling,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



On 2017-08-28 14:45, Christian König wrote:
> I agree that the VM code is growing a bit too much, but the crux is that it
> is very close to VMID management, so we should keep the code together.
>
> I wanted to avoid it, but you suggested separating the VMID
> management quite a while ago. How about moving both VMID and
> PASID management into amdgpu_hwid_mgr.c?
If I could choose, I would prefer two files: amdgpu_vmid.c and
amdgpu_pasid.c. :)
Anyway, I have no strong opinion on that; it depends on what you guys prefer.

Cheers,
David Zhou
>
> Regards,
> Christian.
>
> On 28.08.2017 at 05:06, zhoucm1 wrote:
>> Could we separate the PASID manager into its own file, such as
>> amdgpu_pasid.c, like what was done for the context manager?
>>
>> The VM code keeps growing and looks more and more complex.
>>
>> btw: I really like the many comments explaining PASIDs. :)
>>
>> Regards,
>> David Zhou
>> On 2017-08-26 15:19, Felix Kuehling wrote:
>>> Allows assigning a PASID to a VM for identifying VMs involved in page
>>> faults. The global PASID manager is also exported in the KFD
>>> interface so that AMDGPU and KFD can share the PASID space.
>>>
>>> PASIDs of different sizes can be requested. On APUs, the PASID size
>>> is determined by the capabilities of the IOMMU. So KFD must be able
>>> to allocate PASIDs in a smaller range.
>>>
>>> TODO:
>>> * Actually assign PASIDs to VMs
>>> * Update the PASID-VMID mapping registers during CS
>>>
>>> Change-Id: I88c9357a7c584f10e84b5607ac09eba77c833393
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |  2 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |  2 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  2 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  2 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  2 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            | 76 
>>> ++++++++++++++++++++++-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            | 14 ++++-
>>>   drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |  6 ++
>>>   8 files changed, 101 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
>>> index 3e28d2b..0807d52 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
>>> @@ -188,6 +188,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>>>       .get_local_mem_info = get_local_mem_info,
>>>       .get_gpu_clock_counter = get_gpu_clock_counter,
>>>       .get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
>>> +    .alloc_pasid = amdgpu_vm_alloc_pasid,
>>> +    .free_pasid = amdgpu_vm_free_pasid,
>>>       .create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>>>       .destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>>>       .get_process_page_dir = amdgpu_amdkfd_gpuvm_get_process_page_dir,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
>>> index 3b6b4d9..c20c000 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
>>> @@ -162,6 +162,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>>>       .get_local_mem_info = get_local_mem_info,
>>>       .get_gpu_clock_counter = get_gpu_clock_counter,
>>>       .get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
>>> +    .alloc_pasid = amdgpu_vm_alloc_pasid,
>>> +    .free_pasid = amdgpu_vm_free_pasid,
>>>       .create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>>>       .destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>>>       .create_process_gpumem = create_process_gpumem,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
>>> index 961369d..bb99c64 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
>>> @@ -209,6 +209,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>>>       .get_local_mem_info = get_local_mem_info,
>>>       .get_gpu_clock_counter = get_gpu_clock_counter,
>>>       .get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
>>> +    .alloc_pasid = amdgpu_vm_alloc_pasid,
>>> +    .free_pasid = amdgpu_vm_free_pasid,
>>>       .create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>>>       .destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>>>       .create_process_gpumem = create_process_gpumem,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> index 35f7d77..462011c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> @@ -1397,7 +1397,7 @@ int 
>>> amdgpu_amdkfd_gpuvm_create_process_vm(struct kgd_dev *kgd, void **vm,
>>>           return -ENOMEM;
>>>         /* Initialize the VM context, allocate the page directory 
>>> and zero it */
>>> -    ret = amdgpu_vm_init(adev, &new_vm->base, 
>>> AMDGPU_VM_CONTEXT_COMPUTE);
>>> +    ret = amdgpu_vm_init(adev, &new_vm->base, 
>>> AMDGPU_VM_CONTEXT_COMPUTE, 0);
>>>       if (ret != 0) {
>>>           pr_err("Failed init vm ret %d\n", ret);
>>>           /* Undo everything related to the new VM context */
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>> index e390c01..ba3dc4d 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>> @@ -825,7 +825,7 @@ int amdgpu_driver_open_kms(struct drm_device 
>>> *dev, struct drm_file *file_priv)
>>>       }
>>>         r = amdgpu_vm_init(adev, &fpriv->vm,
>>> -               AMDGPU_VM_CONTEXT_GFX);
>>> +               AMDGPU_VM_CONTEXT_GFX, 0);
>>>       if (r) {
>>>           kfree(fpriv);
>>>           goto out_suspend;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 70d7632..c635699 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -27,12 +27,59 @@
>>>    */
>>>   #include <linux/dma-fence-array.h>
>>>   #include <linux/interval_tree_generic.h>
>>> +#include <linux/idr.h>
>>>   #include <drm/drmP.h>
>>>   #include <drm/amdgpu_drm.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_trace.h"
>>>     /*
>>> + * PASID manager
>>> + *
>>> + * PASIDs are global address space identifiers that can be shared
>>> + * between the GPU, an IOMMU and the driver. VMs on different devices
>>> + * may use the same PASID if they share the same address
>>> + * space. Therefore PASIDs are allocated using a global IDA. VMs are
>>> + * looked up from the PASID per amdgpu_device.
>>> + */
>>> +static DEFINE_IDA(amdgpu_vm_pasid_ida);
>>> +
>>> +/**
>>> + * amdgpu_vm_alloc_pasid - Allocate a PASID
>>> + * @bits: Maximum width of the PASID in bits, must be at least 1
>>> + *
>>> + * Allocates a PASID of the given width while keeping smaller PASIDs
>>> + * available if possible.
>>> + *
>>> + * Returns a positive integer on success. Returns %-EINVAL if bits==0.
>>> + * Returns %-ENOSPC if no PASID was available. Returns %-ENOMEM on
>>> + * memory allocation failure.
>>> + */
>>> +int amdgpu_vm_alloc_pasid(unsigned int bits)
>>> +{
>>> +    int pasid = -EINVAL;
>>> +
>>> +    for (bits = min(bits, 31U); bits > 0; bits--) {
>>> +        pasid = ida_simple_get(&amdgpu_vm_pasid_ida,
>>> +                       1U << (bits - 1), 1U << bits,
>>> +                       GFP_KERNEL);
>>> +        if (pasid != -ENOSPC)
>>> +            break;
>>> +    }
>>> +
>>> +    return pasid;
>>> +}
>>> +
>>> +/**
>>> + * amdgpu_vm_free_pasid - Free a PASID
>>> + * @pasid: PASID to free
>>> + */
>>> +void amdgpu_vm_free_pasid(unsigned int pasid)
>>> +{
>>> +    ida_simple_remove(&amdgpu_vm_pasid_ida, pasid);
>>> +}
>>> +
>>> +/*
>>>    * GPUVM
>>>    * GPUVM is similar to the legacy gart on older asics, however
>>>    * rather than there being a single global gart table
>>> @@ -2482,7 +2529,7 @@ void amdgpu_vm_adjust_size(struct 
>>> amdgpu_device *adev, uint64_t vm_size, uint32_
>>>    * Init @vm fields.
>>>    */
>>>   int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>>> -           int vm_context)
>>> +           int vm_context, unsigned int pasid)
>>>   {
>>>       const unsigned align = min(AMDGPU_VM_PTB_ALIGN_SIZE,
>>>           AMDGPU_VM_PTE_COUNT(adev) * 8);
>>> @@ -2562,6 +2609,19 @@ int amdgpu_vm_init(struct amdgpu_device 
>>> *adev, struct amdgpu_vm *vm,
>>>       if (r)
>>>           goto error_free_root;
>>>
>>> +    if (pasid) {
>>> +        unsigned long flags;
>>> +
>>> +        spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
>>> +        r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
>>> +                  GFP_ATOMIC);
>>> +        spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
>>> +        if (r < 0)
>>> +            goto error_free_root;
>>> +
>>> +        vm->pasid = pasid;
>>> +    }
>>> +
>>>       vm->vm_context = vm_context;
>>>       if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
>>>           mutex_lock(&id_mgr->lock);
>>> @@ -2650,6 +2710,14 @@ void amdgpu_vm_fini(struct amdgpu_device 
>>> *adev, struct amdgpu_vm *vm)
>>>           mutex_unlock(&id_mgr->lock);
>>>       }
>>>
>>> +    if (vm->pasid) {
>>> +        unsigned long flags;
>>> +
>>> +        spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
>>> +        idr_remove(&adev->vm_manager.pasid_idr, vm->pasid);
>>> +        spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
>>> +    }
>>> +
>>>       amd_sched_entity_fini(vm->entity.sched, &vm->entity);
>>>         if (!RB_EMPTY_ROOT(&vm->va)) {
>>> @@ -2729,6 +2797,9 @@ void amdgpu_vm_manager_init(struct 
>>> amdgpu_device *adev)
>>>       adev->vm_manager.vm_update_mode = 0;
>>>   #endif
>>>   +    idr_init(&adev->vm_manager.pasid_idr);
>>> +    spin_lock_init(&adev->vm_manager.pasid_lock);
>>> +
>>>       adev->vm_manager.n_compute_vms = 0;
>>>   }
>>>   @@ -2743,6 +2814,9 @@ void amdgpu_vm_manager_fini(struct 
>>> amdgpu_device *adev)
>>>   {
>>>       unsigned i, j;
>>>
>>> +    WARN_ON(!idr_is_empty(&adev->vm_manager.pasid_idr));
>>> +    idr_destroy(&adev->vm_manager.pasid_idr);
>>> +
>>>       for (i = 0; i < AMDGPU_MAX_VMHUBS; ++i) {
>>>           struct amdgpu_vm_id_manager *id_mgr =
>>>               &adev->vm_manager.id_mgr[i];
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> index 49e15d7..692b05c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> @@ -25,6 +25,7 @@
>>>   #define __AMDGPU_VM_H__
>>>     #include <linux/rbtree.h>
>>> +#include <linux/idr.h>
>>>     #include "gpu_scheduler.h"
>>>   #include "amdgpu_sync.h"
>>> @@ -143,8 +144,9 @@ struct amdgpu_vm {
>>>       /* Scheduler entity for page table updates */
>>>       struct amd_sched_entity    entity;
>>>   -    /* client id */
>>> +    /* client id and PASID (TODO: replace client_id with PASID) */
>>>       u64                     client_id;
>>> +    unsigned int        pasid;
>>>       /* dedicated to vm */
>>>       struct amdgpu_vm_id *reserved_vmid[AMDGPU_MAX_VMHUBS];
>>>   @@ -219,14 +221,22 @@ struct amdgpu_vm_manager {
>>>        */
>>>       int                    vm_update_mode;
>>>   +    /* PASID to VM mapping, will be used in interrupt context to
>>> +     * look up VM of a page fault
>>> +     */
>>> +    struct idr                pasid_idr;
>>> +    spinlock_t                pasid_lock;
>>> +
>>>       /* Number of Compute VMs, used for detecting Compute activity */
>>>       unsigned                                n_compute_vms;
>>>   };
>>>   +int amdgpu_vm_alloc_pasid(unsigned int bits);
>>> +void amdgpu_vm_free_pasid(unsigned int pasid);
>>>   void amdgpu_vm_manager_init(struct amdgpu_device *adev);
>>>   void amdgpu_vm_manager_fini(struct amdgpu_device *adev);
>>>   int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>>> -           int vm_context);
>>> +           int vm_context, unsigned int pasid);
>>>   void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm 
>>> *vm);
>>>   void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
>>>                struct list_head *validated,
>>> diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h 
>>> b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
>>> index e8027b3..5833ef7 100644
>>> --- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
>>> +++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
>>> @@ -188,6 +188,9 @@ struct tile_config {
>>>    *
>>>    * @get_max_engine_clock_in_mhz: Retrieves maximum GPU clock in MHz
>>>    *
>>> + * @alloc_pasid: Allocate a PASID
>>> + * @free_pasid: Free a PASID
>>> + *
>>>    * @program_sh_mem_settings: A function that should initiate the 
>>> memory
>>>    * properties such as main aperture memory type (cache / non 
>>> cached) and
>>>    * secondary aperture base address, size and memory type.
>>> @@ -264,6 +267,9 @@ struct kfd2kgd_calls {
>>>         uint32_t (*get_max_engine_clock_in_mhz)(struct kgd_dev *kgd);
>>>   +    int (*alloc_pasid)(unsigned int bits);
>>> +    void (*free_pasid)(unsigned int pasid);
>>> +
>>>       int (*create_process_vm)(struct kgd_dev *kgd, void **vm,
>>>                    void **process_info);
>>>       void (*destroy_process_vm)(struct kgd_dev *kgd, void *vm);
>>
>
>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/9] drm/amdgpu: Add PASID management
       [not found]                 ` <c593059f-548d-340c-6bd5-7650b8830aad-5C7GfCeVMHo@public.gmane.org>
@ 2017-08-28 13:26                   ` Kuehling, Felix
  0 siblings, 0 replies; 23+ messages in thread
From: Kuehling, Felix @ 2017-08-28 13:26 UTC (permalink / raw)
  To: Zhou, David(ChunMing),
	Christian König, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

I have no problem either way. I feel creating a new amdgpu_hwid_mgr is more work and more churn because it would probably come with a bunch of renaming.

The PASID manager by itself is very small and I don't expect it to grow. It's much less complex than the VMID manager because PASIDs don't get assigned to processes dynamically. Once a PASID is assigned to a process, it remains the same until that process terminates.

Therefore I'd prefer to leave it in amdgpu_vm.c. If you feel that file is growing too much in different directions, reorganizing it is out of the scope of this patch series.
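
As a rough illustration of why the PASID manager stays small, the allocation strategy in the quoted patch (try the widest permitted range first and fall back to narrower ranges only when it is full, so that small PASIDs stay available for callers with tighter limits such as an IOMMU) can be sketched in userspace C. The names range_get, alloc_pasid, free_pasid, the used[] table and the MAX_BITS limit of 8 are stand-ins invented for this sketch; the patch itself uses the kernel IDA and clamps to 31 bits.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel IDA: a fixed table of used IDs. */
#define MAX_BITS 8U                 /* sketch limit; the patch clamps to 31 */
#define MAX_IDS  (1U << MAX_BITS)

static bool used[MAX_IDS];

/* Return the lowest free ID in [start, end), or -ENOSPC if the range is full. */
int range_get(unsigned int start, unsigned int end)
{
    for (unsigned int id = start; id < end; id++) {
        if (!used[id]) {
            used[id] = true;
            return (int)id;
        }
    }
    return -ENOSPC;
}

/*
 * Allocate a PASID that fits in 'bits' bits. The range [2^(bits-1), 2^bits)
 * is tried first; narrower ranges are used only when the wider one is full.
 * PASID 0 is never handed out, and bits == 0 yields -EINVAL.
 */
int alloc_pasid(unsigned int bits)
{
    int pasid = -EINVAL;

    if (bits > MAX_BITS)
        bits = MAX_BITS;
    for (; bits > 0; bits--) {
        pasid = range_get(1U << (bits - 1), 1U << bits);
        if (pasid != -ENOSPC)
            break;
    }
    return pasid;
}

/* Release a PASID previously returned by alloc_pasid(). */
void free_pasid(unsigned int pasid)
{
    assert(pasid < MAX_IDS);
    used[pasid] = false;
}
```

With an empty table, a 4-bit request returns 8, the low end of [8, 16). Once that range is exhausted, further 4-bit requests fall back to [4, 8), while a caller limited to 1 bit can still get PASID 1.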

Regards,
  Felix
________________________________________
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of zhoucm1 <david1.zhou@amd.com>
Sent: Monday, August 28, 2017 3:15:23 AM
To: Christian König; Kuehling, Felix; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/9] drm/amdgpu: Add PASID management

On 2017-08-28 14:45, Christian König wrote:
> I agree that the VM code is growing a bit too much, but the crux is that it
> is very close to VMID management, so we should keep the code together.
>
> I wanted to avoid it, but you suggested separating the VMID
> management quite a while ago. How about moving both VMID and
> PASID management into amdgpu_hwid_mgr.c?
If I could choose, I would prefer two files: amdgpu_vmid.c and
amdgpu_pasid.c. :)
Anyway, I have no strong opinion on that; it depends on what you guys prefer.

Cheers,
David Zhou
>
> Regards,
> Christian.
>
> On 28.08.2017 at 05:06, zhoucm1 wrote:
>> Could we separate the PASID manager into its own file, such as
>> amdgpu_pasid.c, like what was done for the context manager?
>>
>> The VM code keeps growing and looks more and more complex.
>>
>> btw: I really like the many comments explaining PASIDs. :)
>>
>> Regards,
>> David Zhou
>> On 2017-08-26 15:19, Felix Kuehling wrote:
>>> Allows assigning a PASID to a VM for identifying VMs involved in page
>>> faults. The global PASID manager is also exported in the KFD
>>> interface so that AMDGPU and KFD can share the PASID space.
>>>
>>> PASIDs of different sizes can be requested. On APUs, the PASID size
>>> is determined by the capabilities of the IOMMU. So KFD must be able
>>> to allocate PASIDs in a smaller range.
>>>
>>> TODO:
>>> * Actually assign PASIDs to VMs
>>> * Update the PASID-VMID mapping registers during CS
>>>
>>> Change-Id: I88c9357a7c584f10e84b5607ac09eba77c833393
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |  2 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |  2 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |  2 +
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  2 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |  2 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            | 76
>>> ++++++++++++++++++++++-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            | 14 ++++-
>>>   drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |  6 ++
>>>   8 files changed, 101 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
>>> index 3e28d2b..0807d52 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
>>> @@ -188,6 +188,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>>>       .get_local_mem_info = get_local_mem_info,
>>>       .get_gpu_clock_counter = get_gpu_clock_counter,
>>>       .get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
>>> +    .alloc_pasid = amdgpu_vm_alloc_pasid,
>>> +    .free_pasid = amdgpu_vm_free_pasid,
>>>       .create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>>>       .destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>>>       .get_process_page_dir = amdgpu_amdkfd_gpuvm_get_process_page_dir,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
>>> index 3b6b4d9..c20c000 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
>>> @@ -162,6 +162,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>>>       .get_local_mem_info = get_local_mem_info,
>>>       .get_gpu_clock_counter = get_gpu_clock_counter,
>>>       .get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
>>> +    .alloc_pasid = amdgpu_vm_alloc_pasid,
>>> +    .free_pasid = amdgpu_vm_free_pasid,
>>>       .create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>>>       .destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>>>       .create_process_gpumem = create_process_gpumem,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
>>> index 961369d..bb99c64 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c
>>> @@ -209,6 +209,8 @@ static const struct kfd2kgd_calls kfd2kgd = {
>>>       .get_local_mem_info = get_local_mem_info,
>>>       .get_gpu_clock_counter = get_gpu_clock_counter,
>>>       .get_max_engine_clock_in_mhz = get_max_engine_clock_in_mhz,
>>> +    .alloc_pasid = amdgpu_vm_alloc_pasid,
>>> +    .free_pasid = amdgpu_vm_free_pasid,
>>>       .create_process_vm = amdgpu_amdkfd_gpuvm_create_process_vm,
>>>       .destroy_process_vm = amdgpu_amdkfd_gpuvm_destroy_process_vm,
>>>       .create_process_gpumem = create_process_gpumem,
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> index 35f7d77..462011c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>>> @@ -1397,7 +1397,7 @@ int
>>> amdgpu_amdkfd_gpuvm_create_process_vm(struct kgd_dev *kgd, void **vm,
>>>           return -ENOMEM;
>>>         /* Initialize the VM context, allocate the page directory
>>> and zero it */
>>> -    ret = amdgpu_vm_init(adev, &new_vm->base,
>>> AMDGPU_VM_CONTEXT_COMPUTE);
>>> +    ret = amdgpu_vm_init(adev, &new_vm->base,
>>> AMDGPU_VM_CONTEXT_COMPUTE, 0);
>>>       if (ret != 0) {
>>>           pr_err("Failed init vm ret %d\n", ret);
>>>           /* Undo everything related to the new VM context */
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>> index e390c01..ba3dc4d 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
>>> @@ -825,7 +825,7 @@ int amdgpu_driver_open_kms(struct drm_device
>>> *dev, struct drm_file *file_priv)
>>>       }
>>>         r = amdgpu_vm_init(adev, &fpriv->vm,
>>> -               AMDGPU_VM_CONTEXT_GFX);
>>> +               AMDGPU_VM_CONTEXT_GFX, 0);
>>>       if (r) {
>>>           kfree(fpriv);
>>>           goto out_suspend;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 70d7632..c635699 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -27,12 +27,59 @@
>>>    */
>>>   #include <linux/dma-fence-array.h>
>>>   #include <linux/interval_tree_generic.h>
>>> +#include <linux/idr.h>
>>>   #include <drm/drmP.h>
>>>   #include <drm/amdgpu_drm.h>
>>>   #include "amdgpu.h"
>>>   #include "amdgpu_trace.h"
>>>     /*
>>> + * PASID manager
>>> + *
>>> + * PASIDs are global address space identifiers that can be shared
>>> + * between the GPU, an IOMMU and the driver. VMs on different devices
>>> + * may use the same PASID if they share the same address
>>> + * space. Therefore PASIDs are allocated using a global IDA. VMs are
>>> + * looked up from the PASID per amdgpu_device.
>>> + */
>>> +static DEFINE_IDA(amdgpu_vm_pasid_ida);
>>> +
>>> +/**
>>> + * amdgpu_vm_alloc_pasid - Allocate a PASID
>>> + * @bits: Maximum width of the PASID in bits, must be at least 1
>>> + *
>>> + * Allocates a PASID of the given width while keeping smaller PASIDs
>>> + * available if possible.
>>> + *
>>> + * Returns a positive integer on success. Returns %-EINVAL if bits==0.
>>> + * Returns %-ENOSPC if no PASID was available. Returns %-ENOMEM on
>>> + * memory allocation failure.
>>> + */
>>> +int amdgpu_vm_alloc_pasid(unsigned int bits)
>>> +{
>>> +    int pasid = -EINVAL;
>>> +
>>> +    for (bits = min(bits, 31U); bits > 0; bits--) {
>>> +        pasid = ida_simple_get(&amdgpu_vm_pasid_ida,
>>> +                       1U << (bits - 1), 1U << bits,
>>> +                       GFP_KERNEL);
>>> +        if (pasid != -ENOSPC)
>>> +            break;
>>> +    }
>>> +
>>> +    return pasid;
>>> +}
>>> +
>>> +/**
>>> + * amdgpu_vm_free_pasid - Free a PASID
>>> + * @pasid: PASID to free
>>> + */
>>> +void amdgpu_vm_free_pasid(unsigned int pasid)
>>> +{
>>> +    ida_simple_remove(&amdgpu_vm_pasid_ida, pasid);
>>> +}
>>> +
>>> +/*
>>>    * GPUVM
>>>    * GPUVM is similar to the legacy gart on older asics, however
>>>    * rather than there being a single global gart table
>>> @@ -2482,7 +2529,7 @@ void amdgpu_vm_adjust_size(struct
>>> amdgpu_device *adev, uint64_t vm_size, uint32_
>>>    * Init @vm fields.
>>>    */
>>>   int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>>> -           int vm_context)
>>> +           int vm_context, unsigned int pasid)
>>>   {
>>>       const unsigned align = min(AMDGPU_VM_PTB_ALIGN_SIZE,
>>>           AMDGPU_VM_PTE_COUNT(adev) * 8);
>>> @@ -2562,6 +2609,19 @@ int amdgpu_vm_init(struct amdgpu_device
>>> *adev, struct amdgpu_vm *vm,
>>>       if (r)
>>>           goto error_free_root;
>>>
>>> +    if (pasid) {
>>> +        unsigned long flags;
>>> +
>>> +        spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
>>> +        r = idr_alloc(&adev->vm_manager.pasid_idr, vm, pasid, pasid + 1,
>>> +                  GFP_ATOMIC);
>>> +        spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
>>> +        if (r < 0)
>>> +            goto error_free_root;
>>> +
>>> +        vm->pasid = pasid;
>>> +    }
>>> +
>>>       vm->vm_context = vm_context;
>>>       if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
>>>           mutex_lock(&id_mgr->lock);
>>> @@ -2650,6 +2710,14 @@ void amdgpu_vm_fini(struct amdgpu_device
>>> *adev, struct amdgpu_vm *vm)
>>>           mutex_unlock(&id_mgr->lock);
>>>       }
>>>   +    if (vm->pasid) {
>>> +        unsigned long flags;
>>> +
>>> +        spin_lock_irqsave(&adev->vm_manager.pasid_lock, flags);
>>> +        idr_remove(&adev->vm_manager.pasid_idr, vm->pasid);
>>> +        spin_unlock_irqrestore(&adev->vm_manager.pasid_lock, flags);
>>> +    }
>>> +
>>>       amd_sched_entity_fini(vm->entity.sched, &vm->entity);
>>>         if (!RB_EMPTY_ROOT(&vm->va)) {
>>> @@ -2729,6 +2797,9 @@ void amdgpu_vm_manager_init(struct
>>> amdgpu_device *adev)
>>>       adev->vm_manager.vm_update_mode = 0;
>>>   #endif
>>>   +    idr_init(&adev->vm_manager.pasid_idr);
>>> +    spin_lock_init(&adev->vm_manager.pasid_lock);
>>> +
>>>       adev->vm_manager.n_compute_vms = 0;
>>>   }
>>>   @@ -2743,6 +2814,9 @@ void amdgpu_vm_manager_fini(struct
>>> amdgpu_device *adev)
>>>   {
>>>       unsigned i, j;
>>>   + WARN_ON(!idr_is_empty(&adev->vm_manager.pasid_idr));
>>> +    idr_destroy(&adev->vm_manager.pasid_idr);
>>> +
>>>       for (i = 0; i < AMDGPU_MAX_VMHUBS; ++i) {
>>>           struct amdgpu_vm_id_manager *id_mgr =
>>>               &adev->vm_manager.id_mgr[i];
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> index 49e15d7..692b05c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> @@ -25,6 +25,7 @@
>>>   #define __AMDGPU_VM_H__
>>>     #include <linux/rbtree.h>
>>> +#include <linux/idr.h>
>>>     #include "gpu_scheduler.h"
>>>   #include "amdgpu_sync.h"
>>> @@ -143,8 +144,9 @@ struct amdgpu_vm {
>>>       /* Scheduler entity for page table updates */
>>>       struct amd_sched_entity    entity;
>>>   -    /* client id */
>>> +    /* client id and PASID (TODO: replace client_id with PASID) */
>>>       u64                     client_id;
>>> +    unsigned int        pasid;
>>>       /* dedicated to vm */
>>>       struct amdgpu_vm_id *reserved_vmid[AMDGPU_MAX_VMHUBS];
>>>   @@ -219,14 +221,22 @@ struct amdgpu_vm_manager {
>>>        */
>>>       int                    vm_update_mode;
>>>   +    /* PASID to VM mapping, will be used in interrupt context to
>>> +     * look up VM of a page fault
>>> +     */
>>> +    struct idr                pasid_idr;
>>> +    spinlock_t                pasid_lock;
>>> +
>>>       /* Number of Compute VMs, used for detecting Compute activity */
>>>       unsigned                                n_compute_vms;
>>>   };
>>>   +int amdgpu_vm_alloc_pasid(unsigned int bits);
>>> +void amdgpu_vm_free_pasid(unsigned int pasid);
>>>   void amdgpu_vm_manager_init(struct amdgpu_device *adev);
>>>   void amdgpu_vm_manager_fini(struct amdgpu_device *adev);
>>>   int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
>>> -           int vm_context);
>>> +           int vm_context, unsigned int pasid);
>>>   void amdgpu_vm_fini(struct amdgpu_device *adev, struct amdgpu_vm
>>> *vm);
>>>   void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
>>>                struct list_head *validated,
>>> diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
>>> b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
>>> index e8027b3..5833ef7 100644
>>> --- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
>>> +++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
>>> @@ -188,6 +188,9 @@ struct tile_config {
>>>    *
>>>    * @get_max_engine_clock_in_mhz: Retrieves maximum GPU clock in MHz
>>>    *
>>> + * @alloc_pasid: Allocate a PASID
>>> + * @free_pasid: Free a PASID
>>> + *
>>>    * @program_sh_mem_settings: A function that should initiate the
>>> memory
>>>    * properties such as main aperture memory type (cache / non
>>> cached) and
>>>    * secondary aperture base address, size and memory type.
>>> @@ -264,6 +267,9 @@ struct kfd2kgd_calls {
>>>         uint32_t (*get_max_engine_clock_in_mhz)(struct kgd_dev *kgd);
>>>   +    int (*alloc_pasid)(unsigned int bits);
>>> +    void (*free_pasid)(unsigned int pasid);
>>> +
>>>       int (*create_process_vm)(struct kgd_dev *kgd, void **vm,
>>>                    void **process_info);
>>>       void (*destroy_process_vm)(struct kgd_dev *kgd, void *vm);
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
>
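
The allocation loop in the quoted amdgpu_vm_alloc_pasid() tries the largest size class first (IDs in [2^(bits-1), 2^bits)) and falls back to smaller classes only on -ENOSPC. A minimal user-space sketch of that loop shape follows. The toy_* names and the flag-array allocator are hypothetical stand-ins for the kernel IDA, not part of the patch, and the 64-ID limit is an assumption made only to keep the example small.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel IDA: one "in use" flag per ID. */
#define TOY_MAX_IDS 64
static bool toy_used[TOY_MAX_IDS];

/* Return the lowest free ID in [start, end), or -ENOSPC if none is free. */
static int toy_ida_get(unsigned int start, unsigned int end)
{
    for (unsigned int id = start; id < end && id < TOY_MAX_IDS; id++) {
        if (!toy_used[id]) {
            toy_used[id] = true;
            return (int)id;
        }
    }
    return -ENOSPC;
}

/*
 * Same loop shape as the quoted function: start with the range that
 * needs exactly `bits` bits, then fall back to smaller ranges.
 * ID 0 is never handed out because the smallest range is [1, 2).
 */
static int toy_alloc_pasid(unsigned int bits)
{
    int pasid = -EINVAL;

    if (bits > 6)               /* toy limit: 2^6 == TOY_MAX_IDS */
        bits = 6;
    for (; bits > 0; bits--) {
        pasid = toy_ida_get(1U << (bits - 1), 1U << bits);
        if (pasid != -ENOSPC)
            break;
    }
    return pasid;
}

/* Release an ID so a later allocation can reuse it. */
static void toy_free_pasid(unsigned int pasid)
{
    assert(pasid < TOY_MAX_IDS);
    toy_used[pasid] = false;
}
```

With bits == 2, the sketch hands out 2 and 3 from the top range, then 1 from the fallback range, and only then reports -ENOSPC.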

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/9] WIP: Retry page fault handling for Vega10
       [not found]     ` <CAFCwf10G+4ra9UD6upxaBc5FwSu4efB9oLKKYSZHcHQ-w9TZgQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-28 13:36       ` Kuehling, Felix
  0 siblings, 0 replies; 23+ messages in thread
From: Kuehling, Felix @ 2017-08-28 13:36 UTC (permalink / raw)
  To: Oded Gabbay; +Cc: amd-gfx list

Hi Oded,

Thanks for the heads up. Enjoy your vacation. I'll rebase this on Alex's drm-next for your review.

Regards,
  Felix
________________________________________
From: Oded Gabbay <oded.gabbay@gmail.com>
Sent: Sunday, August 27, 2017 6:22:37 PM
To: Kuehling, Felix
Cc: amd-gfx list
Subject: Re: [PATCH 0/9] WIP: Retry page fault handling for Vega10

Hi Felix,
I'm currently on vacation and I will return at the end of the week, so I will not be able to review the patches until then.

Oded

On Aug 26, 2017 09:19, "Felix Kuehling" <Felix.Kuehling@amd.com> wrote:
This is based on amd-kfd-staging, because that's easier for me to test.
I'm planning to port to amd-staging-4.x for submission upstream.

With this patch series, I'm able to turn retry faults on and handle the
interrupt storm from VM faults. Only the first VM fault interrupt per
process and address gets handled the usual way. Retry interrupts are
filtered in a new prescreening stage in amdgpu_ih_process.

Pending faults are tracked in a hash table in IH to detect retry faults
and a FIFO in the VM for later processing.
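
As a rough illustration of the prescreening idea (not the code from this series, and not the chash API from patch 8), the sketch below keeps pending faults in a small open-addressing table keyed by PASID and page address. Every name here is hypothetical. It assumes 4 KiB pages, 48-bit addresses and PASIDs of at most 20 bits, and when the table is full it simply filters the interrupt, which is a choice made for the sketch only. The per-VM FIFO is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FAULT_SLOTS 256u        /* hypothetical table size, power of two */
#define SLOT_EMPTY  0ULL        /* never used slot */
#define SLOT_TOMB   1ULL        /* slot whose fault was resolved */

struct fault_table {
    uint64_t keys[FAULT_SLOTS];
};

/* Pack PASID and page frame number; bit 63 keeps real keys distinct
 * from SLOT_EMPTY and SLOT_TOMB. */
static uint64_t fault_key(unsigned int pasid, uint64_t addr)
{
    uint64_t pfn = (addr >> 12) & ((1ULL << 36) - 1);

    return (1ULL << 63) | ((uint64_t)(pasid & 0xFFFFF) << 36) | pfn;
}

/* Multiplicative hash; the top 8 bits select one of 256 slots. */
static unsigned int fault_slot(uint64_t key)
{
    return (unsigned int)((key * 0x9E3779B97F4A7C15ULL) >> 56);
}

/*
 * Returns true for the first fault on (pasid, page), which should be
 * handled the usual way. Returns false for a retry of a pending fault,
 * or when the table is full, so the interrupt can be filtered.
 */
static bool fault_prescreen(struct fault_table *t, unsigned int pasid,
                            uint64_t addr)
{
    uint64_t key = fault_key(pasid, addr);
    unsigned int start = fault_slot(key);
    int free_slot = -1;

    for (unsigned int i = 0; i < FAULT_SLOTS; i++) {
        unsigned int s = (start + i) & (FAULT_SLOTS - 1);

        if (t->keys[s] == key)
            return false;               /* already pending: retry */
        if (t->keys[s] == SLOT_TOMB) {
            if (free_slot < 0)
                free_slot = (int)s;     /* reusable, keep probing */
            continue;
        }
        if (t->keys[s] == SLOT_EMPTY) {
            if (free_slot < 0)
                free_slot = (int)s;
            break;                      /* key cannot be further on */
        }
    }
    if (free_slot < 0)
        return false;                   /* table full */
    t->keys[free_slot] = key;
    return true;
}

/* Forget a pending fault once it has been handled. */
static void fault_resolved(struct fault_table *t, unsigned int pasid,
                           uint64_t addr)
{
    uint64_t key = fault_key(pasid, addr);
    unsigned int start = fault_slot(key);

    for (unsigned int i = 0; i < FAULT_SLOTS; i++) {
        unsigned int s = (start + i) & (FAULT_SLOTS - 1);

        if (t->keys[s] == key) {
            t->keys[s] = SLOT_TOMB;
            return;
        }
        if (t->keys[s] == SLOT_EMPTY)
            return;
    }
}
```

A zero-initialized struct fault_table is an empty table. Two faults in the same page from the same PASID map to the same key, so the second one is filtered until fault_resolved() is called for that page.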

Looking up the VM from the fault interrupt depends on the PASID.
Currently only KFD VMs have proper PASIDs.

TODO (need some help with these):
* Allocate PASIDs for graphics contexts
* Set up VMID-PASID mapping during graphics command submission
* Confirm that graphics page faults have the correct PASID in the IV

Once that's done, we should have a foundation to start working on HMM
and proper SVM memory management with demand paging.

Felix Kuehling (9):
  drm/amdgpu: Fix error handling in amdgpu_vm_init
  drm/amdgpu: Add PASID management
  drm/radeon: Add PASID manager for KFD
  drm/amdkfd: Separate doorbell allocation from PASID
  drm/amdkfd: Use PASID manager from KGD
  drm/amd: Set the PASID for KFD VMs
  drm/amdgpu: Add prescreening stage in IH processing
  lib: Closed hash table with low overhead
  drm/amdgpu: Track pending retry faults in IH and VM

 drivers/gpu/drm/Kconfig                           |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu.h               |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h        |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c            |  82 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h            |  12 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c           |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c            |  88 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h            |  21 +-
 drivers/gpu/drm/amd/amdgpu/cik_ih.c               |  14 +
 drivers/gpu/drm/amd/amdgpu/cz_ih.c                |  14 +
 drivers/gpu/drm/amd/amdgpu/iceland_ih.c           |  14 +
 drivers/gpu/drm/amd/amdgpu/si_ih.c                |  14 +
 drivers/gpu/drm/amd/amdgpu/tonga_ih.c             |  14 +
 drivers/gpu/drm/amd/amdgpu/vega10_ih.c            |  90 ++++
 drivers/gpu/drm/amd/amdkfd/kfd_device.c           |  18 +-
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c         |  48 +-
 drivers/gpu/drm/amd/amdkfd/kfd_module.c           |   6 -
 drivers/gpu/drm/amd/amdkfd/kfd_pasid.c            |  84 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h             |  10 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c          |   8 +-
 drivers/gpu/drm/amd/include/kgd_kfd_interface.h   |   8 +-
 drivers/gpu/drm/radeon/radeon_kfd.c               |  36 +-
 include/linux/chash.h                             | 349 +++++++++++++++
 lib/Kconfig                                       |   8 +
 lib/Makefile                                      |   2 +
 lib/chash.c                                       | 521 ++++++++++++++++++++++
 30 files changed, 1376 insertions(+), 105 deletions(-)
 create mode 100644 include/linux/chash.h
 create mode 100644 lib/chash.c

--
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2017-08-28 13:36 UTC | newest]

Thread overview: 23+ messages
2017-08-26  7:19 [PATCH 0/9] WIP: Retry page fault handling for Vega10 Felix Kuehling
     [not found] ` <1503731949-22742-1-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-26  7:19   ` [PATCH 1/9] drm/amdgpu: Fix error handling in amdgpu_vm_init Felix Kuehling
     [not found]     ` <1503731949-22742-2-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-26 13:22       ` Christian König
2017-08-28  2:51       ` zhoucm1
2017-08-26  7:19   ` [PATCH 2/9] drm/amdgpu: Add PASID management Felix Kuehling
     [not found]     ` <1503731949-22742-3-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-26 13:27       ` Christian König
     [not found]         ` <994b23cd-67b3-4498-2c2b-d4fc2ea68be7-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2017-08-26 14:41           ` Kuehling, Felix
2017-08-28  3:06       ` zhoucm1
     [not found]         ` <c730cbbc-919c-23a5-8d10-3ab5fbfa3543-5C7GfCeVMHo@public.gmane.org>
2017-08-28  6:45           ` Christian König
     [not found]             ` <8e5428ef-5419-6241-369f-a48e63b77934-ANTagKRnAhcb1SvskN2V4Q@public.gmane.org>
2017-08-28  7:15               ` zhoucm1
     [not found]                 ` <c593059f-548d-340c-6bd5-7650b8830aad-5C7GfCeVMHo@public.gmane.org>
2017-08-28 13:26                   ` Kuehling, Felix
2017-08-26  7:19   ` [PATCH 3/9] drm/radeon: Add PASID manager for KFD Felix Kuehling
     [not found]     ` <1503731949-22742-4-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-26 13:27       ` Christian König
2017-08-26  7:19   ` [PATCH 4/9] drm/amdkfd: Separate doorbell allocation from PASID Felix Kuehling
2017-08-26  7:19   ` [PATCH 5/9] drm/amdkfd: Use PASID manager from KGD Felix Kuehling
2017-08-26  7:19   ` [PATCH 6/9] drm/amd: Set the PASID for KFD VMs Felix Kuehling
2017-08-26  7:19   ` [PATCH 7/9] drm/amdgpu: Add prescreening stage in IH processing Felix Kuehling
2017-08-26  7:19   ` [PATCH 8/9] lib: Closed hash table with low overhead Felix Kuehling
     [not found]     ` <1503731949-22742-9-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-26 13:32       ` Christian König
2017-08-26  7:19   ` [PATCH 9/9] drm/amdgpu: Track pending retry faults in IH and VM Felix Kuehling
     [not found]     ` <1503731949-22742-10-git-send-email-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2017-08-26 13:36       ` Christian König
2017-08-27 22:22   ` [PATCH 0/9] WIP: Retry page fault handling for Vega10 Oded Gabbay
     [not found]     ` <CAFCwf10G+4ra9UD6upxaBc5FwSu4efB9oLKKYSZHcHQ-w9TZgQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-28 13:36       ` Kuehling, Felix
