* [PATCH] drm/amdkfd: move flushing TLBs from map to unmap
@ 2021-05-25 17:53 Eric Huang
  2021-05-25 19:16 ` Felix Kuehling
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Huang @ 2021-05-25 17:53 UTC (permalink / raw)
  To: amd-gfx

It is to optimize memory allocation latency.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 960913a35ee4..ab73741edb97 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1657,20 +1657,6 @@ static int kfd_ioctl_map_memory_to_gpu(struct file *filep,
                 goto sync_memory_failed;
         }

-       /* Flush TLBs after waiting for the page table updates to 
complete */
-       for (i = 0; i < args->n_devices; i++) {
-               peer = kfd_device_by_id(devices_arr[i]);
-               if (WARN_ON_ONCE(!peer))
-                       continue;
-               peer_pdd = kfd_get_process_device_data(peer, p);
-               if (WARN_ON_ONCE(!peer_pdd))
-                       continue;
-               if (!amdgpu_read_lock(peer->ddev, true)) {
-                       kfd_flush_tlb(peer_pdd);
-                       amdgpu_read_unlock(peer->ddev);
-               }
-       }
-
         kfree(devices_arr);

         trace_kfd_map_memory_to_gpu_end(p,
@@ -1766,6 +1752,7 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file *filep,
                         amdgpu_read_unlock(peer->ddev);
                         goto unmap_memory_from_gpu_failed;
                 }
+               kfd_flush_tlb(peer_pdd);
                 amdgpu_read_unlock(peer->ddev);
                 args->n_success = i+1;
         }
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: [PATCH] drm/amdkfd: move flushing TLBs from map to unmap
  2021-05-25 17:53 [PATCH] drm/amdkfd: move flushing TLBs from map to unmap Eric Huang
@ 2021-05-25 19:16 ` Felix Kuehling
  2021-05-26 19:21   ` Eric Huang
  0 siblings, 1 reply; 8+ messages in thread
From: Felix Kuehling @ 2021-05-25 19:16 UTC (permalink / raw)
  To: Eric Huang, amd-gfx

Similar to a recent fix by Philip Yang 76e08b37d0aa ("drm/amdgpu: flush
TLB if valid PDE turns into PTE"), there needs to be a conditional TLB
flush after map, if any PDEs were unmapped and turned into PTEs in the
process. This is currently returned by amdgpu_vm_bo_update_mapping in
the "table_freed" parameter. This needs to be also returned by
amdgpu_vm_bo_update and reported back to KFD, so KFD can do the TLB
flush after map, if needed.

kfd_flush_tlb probably needs a new parameter to determine the flush
type. The flush after map can be a "legacy" flush (type 0). The flush
after unmap must be a "heavy-weight" flush (type 2) to make sure we
don't evict cache lines into pages that we no longer own.

Finally, in the ticket I thought about possible optimizations using a
worker to minimize the impact of TLB flushes on unmap latency. That
could be a follow up commit.

Regards,
  Felix


Am 2021-05-25 um 1:53 p.m. schrieb Eric Huang:
> It is to optimize memory allocation latency.
>
> Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 960913a35ee4..ab73741edb97 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -1657,20 +1657,6 @@ static int kfd_ioctl_map_memory_to_gpu(struct
> file *filep,
>                 goto sync_memory_failed;
>         }
>
> -       /* Flush TLBs after waiting for the page table updates to
> complete */
> -       for (i = 0; i < args->n_devices; i++) {
> -               peer = kfd_device_by_id(devices_arr[i]);
> -               if (WARN_ON_ONCE(!peer))
> -                       continue;
> -               peer_pdd = kfd_get_process_device_data(peer, p);
> -               if (WARN_ON_ONCE(!peer_pdd))
> -                       continue;
> -               if (!amdgpu_read_lock(peer->ddev, true)) {
> -                       kfd_flush_tlb(peer_pdd);
> -                       amdgpu_read_unlock(peer->ddev);
> -               }
> -       }
> -
>         kfree(devices_arr);
>
>         trace_kfd_map_memory_to_gpu_end(p,
> @@ -1766,6 +1752,7 @@ static int
> kfd_ioctl_unmap_memory_from_gpu(struct file *filep,
>                         amdgpu_read_unlock(peer->ddev);
>                         goto unmap_memory_from_gpu_failed;
>                 }
> +               kfd_flush_tlb(peer_pdd);
>                 amdgpu_read_unlock(peer->ddev);
>                 args->n_success = i+1;
>         }


* Re: [PATCH] drm/amdkfd: move flushing TLBs from map to unmap
  2021-05-25 19:16 ` Felix Kuehling
@ 2021-05-26 19:21   ` Eric Huang
  2021-05-26 21:25     ` Felix Kuehling
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Huang @ 2021-05-26 19:21 UTC (permalink / raw)
  To: Felix Kuehling, amd-gfx



On 2021-05-25 3:16 p.m., Felix Kuehling wrote:
> Similar to a recent fix by Philip Yang 76e08b37d0aa ("drm/amdgpu: flush
> TLB if valid PDE turns into PTE"), there needs to be a conditional TLB
> flush after map, if any PDEs were unmapped and turned into PTEs in the
> process. This is currently returned by amdgpu_vm_bo_update_mapping in
> the "table_freed" parameter. This needs to be also returned by
> amdgpu_vm_bo_update and reported back to KFD, so KFD can do the TLB
> flush after map, if needed.
I followed your suggestion, created another patch (attached) and tested 
it. It doesn't seem to improve latency when the memory size is larger 
than a huge page (2M), because the table_freed parameter is always true 
when the mapping uses huge-page size. I think Philip's patch was meant 
to fix the case of remapping memory from small pages to a huge page in 
HMM, but it doesn't check whether the memory was actually remapped and 
arbitrarily flushes TLBs whenever a huge page is mapped.
> kfd_flush_tlb probably needs a new parameter to determine the flush
> type. The flush after map can be a "legacy" flush (type 0). The flush
> after unmap must be a "heavy-weight" flush (type 2) to make sure we
> don't evict cache lines into pages that we no longer own.
>
> Finally, in the ticket I thought about possible optimizations using a
> worker to minimize the impact of TLB flushes on unmap latency. That
> could be a follow up commit.
It is a good idea to use a worker, but how do we guarantee it is done 
before the memory is remapped? If remapping depends on it, then more 
latency will be introduced into map.

Regards,
Eric
> Regards,
>    Felix
>
>
> Am 2021-05-25 um 1:53 p.m. schrieb Eric Huang:
>> It is to optimize memory allocation latency.
>>
>> Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> index 960913a35ee4..ab73741edb97 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>> @@ -1657,20 +1657,6 @@ static int kfd_ioctl_map_memory_to_gpu(struct
>> file *filep,
>>                  goto sync_memory_failed;
>>          }
>>
>> -       /* Flush TLBs after waiting for the page table updates to
>> complete */
>> -       for (i = 0; i < args->n_devices; i++) {
>> -               peer = kfd_device_by_id(devices_arr[i]);
>> -               if (WARN_ON_ONCE(!peer))
>> -                       continue;
>> -               peer_pdd = kfd_get_process_device_data(peer, p);
>> -               if (WARN_ON_ONCE(!peer_pdd))
>> -                       continue;
>> -               if (!amdgpu_read_lock(peer->ddev, true)) {
>> -                       kfd_flush_tlb(peer_pdd);
>> -                       amdgpu_read_unlock(peer->ddev);
>> -               }
>> -       }
>> -
>>          kfree(devices_arr);
>>
>>          trace_kfd_map_memory_to_gpu_end(p,
>> @@ -1766,6 +1752,7 @@ static int
>> kfd_ioctl_unmap_memory_from_gpu(struct file *filep,
>>                          amdgpu_read_unlock(peer->ddev);
>>                          goto unmap_memory_from_gpu_failed;
>>                  }
>> +               kfd_flush_tlb(peer_pdd);
>>                  amdgpu_read_unlock(peer->ddev);
>>                  args->n_success = i+1;
>>          }


[-- Attachment #2: 0001-drm-amdkfd-conditionally-flush-TLBs-after-map.patch --]
[-- Type: text/x-patch, Size: 9991 bytes --]

From 6218597e7117ec2f18cecd9314e196d598497b62 Mon Sep 17 00:00:00 2001
From: Eric Huang <jinhuieric.huang@amd.com>
Date: Wed, 26 May 2021 14:50:52 -0400
Subject: [PATCH] drm/amdkfd: conditionally flush TLBs after map

---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  1 +
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |  6 ++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  8 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      | 27 ++++++++++---------
 .../drm/amd/amdkfd/kfd_device_queue_manager.c |  6 ++---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h         |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c      |  4 +--
 10 files changed, 32 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 2560977760b3..997258c24ef2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -85,6 +85,7 @@ struct kgd_mem {
 
 	bool aql_queue;
 	bool is_imported;
+	bool table_freed;
 };
 
 /* KFD Memory Eviction */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 385c33675227..e445ac7ff2ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1126,7 +1126,7 @@ static int update_gpuvm_pte(struct kgd_mem *mem,
 		return ret;
 
 	/* Update the page tables  */
-	ret = amdgpu_vm_bo_update(adev, bo_va, false);
+	ret = amdgpu_vm_bo_update(adev, bo_va, false, &mem->table_freed);
 	if (ret) {
 		pr_err("amdgpu_vm_bo_update failed\n");
 		return ret;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index e9f9f462a652..e3df132e53a5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -916,7 +916,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
 	if (r)
 		return r;
 
-	r = amdgpu_vm_bo_update(adev, fpriv->prt_va, false);
+	r = amdgpu_vm_bo_update(adev, fpriv->prt_va, false, NULL);
 	if (r)
 		return r;
 
@@ -927,7 +927,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
 	if (amdgpu_mcbp || amdgpu_sriov_vf(adev)) {
 		bo_va = fpriv->csa_va;
 		BUG_ON(!bo_va);
-		r = amdgpu_vm_bo_update(adev, bo_va, false);
+		r = amdgpu_vm_bo_update(adev, bo_va, false, NULL);
 		if (r)
 			return r;
 
@@ -946,7 +946,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
 		if (bo_va == NULL)
 			continue;
 
-		r = amdgpu_vm_bo_update(adev, bo_va, false);
+		r = amdgpu_vm_bo_update(adev, bo_va, false, NULL);
 		if (r)
 			return r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 2120a87a949f..eac2fd0048cc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -696,7 +696,7 @@ static void amdgpu_gem_va_update_vm(struct amdgpu_device *adev,
 
 	if (operation == AMDGPU_VA_OP_MAP ||
 	    operation == AMDGPU_VA_OP_REPLACE) {
-		r = amdgpu_vm_bo_update(adev, bo_va, false);
+		r = amdgpu_vm_bo_update(adev, bo_va, false, NULL);
 		if (r)
 			goto error;
 	}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 0dee2e8797c7..851d128609af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1790,7 +1790,7 @@ void amdgpu_vm_get_memory(struct amdgpu_vm *vm, uint64_t *vram_mem,
  * 0 for success, -EINVAL for failure.
  */
 int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
-			bool clear)
+			bool clear, bool *table_freed)
 {
 	struct amdgpu_bo *bo = bo_va->base.bo;
 	struct amdgpu_vm *vm = bo_va->base.vm;
@@ -1883,7 +1883,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
 						resv, mapping->start,
 						mapping->last, update_flags,
 						mapping->offset, mem,
-						pages_addr, last_update, NULL,
+						pages_addr, last_update, table_freed,
 						vram_base_offset);
 		if (r)
 			return r;
@@ -2137,7 +2137,7 @@ int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
 
 	list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status) {
 		/* Per VM BOs never need to bo cleared in the page tables */
-		r = amdgpu_vm_bo_update(adev, bo_va, false);
+		r = amdgpu_vm_bo_update(adev, bo_va, false, NULL);
 		if (r)
 			return r;
 	}
@@ -2156,7 +2156,7 @@ int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
 		else
 			clear = true;
 
-		r = amdgpu_vm_bo_update(adev, bo_va, clear);
+		r = amdgpu_vm_bo_update(adev, bo_va, clear, NULL);
 		if (r)
 			return r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 67bba8462e7d..a53f95958b49 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -419,7 +419,7 @@ int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 
 int amdgpu_vm_bo_update(struct amdgpu_device *adev,
 			struct amdgpu_bo_va *bo_va,
-			bool clear);
+			bool clear, bool *table_freed);
 bool amdgpu_vm_evictable(struct amdgpu_bo *bo);
 void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev,
 			     struct amdgpu_bo *bo, bool evicted);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 960913a35ee4..0c31ff62d0a9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1649,8 +1649,6 @@ static int kfd_ioctl_map_memory_to_gpu(struct file *filep,
 		args->n_success = i+1;
 	}
 
-	mutex_unlock(&p->mutex);
-
 	err = amdgpu_amdkfd_gpuvm_sync_memory(dev->kgd, (struct kgd_mem *) mem, true);
 	if (err) {
 		pr_debug("Sync memory failed, wait interrupted by user signal\n");
@@ -1658,19 +1656,23 @@ static int kfd_ioctl_map_memory_to_gpu(struct file *filep,
 	}
 
 	/* Flush TLBs after waiting for the page table updates to complete */
-	for (i = 0; i < args->n_devices; i++) {
-		peer = kfd_device_by_id(devices_arr[i]);
-		if (WARN_ON_ONCE(!peer))
-			continue;
-		peer_pdd = kfd_get_process_device_data(peer, p);
-		if (WARN_ON_ONCE(!peer_pdd))
-			continue;
-		if (!amdgpu_read_lock(peer->ddev, true)) {
-			kfd_flush_tlb(peer_pdd);
-			amdgpu_read_unlock(peer->ddev);
+	if (((struct kgd_mem *)mem)->table_freed) {
+		for (i = 0; i < args->n_devices; i++) {
+			peer = kfd_device_by_id(devices_arr[i]);
+			if (WARN_ON_ONCE(!peer))
+				continue;
+			peer_pdd = kfd_get_process_device_data(peer, p);
+			if (WARN_ON_ONCE(!peer_pdd))
+				continue;
+			if (!amdgpu_read_lock(peer->ddev, true)) {
+				kfd_flush_tlb(peer_pdd, TLB_FLUSH_LEGACY);
+				amdgpu_read_unlock(peer->ddev);
+			}
 		}
+		((struct kgd_mem *)mem)->table_freed = false;
 	}
 
+	mutex_unlock(&p->mutex);
 	kfree(devices_arr);
 
 	trace_kfd_map_memory_to_gpu_end(p,
@@ -1766,6 +1768,7 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file *filep,
 			amdgpu_read_unlock(peer->ddev);
 			goto unmap_memory_from_gpu_failed;
 		}
+		kfd_flush_tlb(peer_pdd, TLB_FLUSH_HEAVYWEIGHT);
 		amdgpu_read_unlock(peer->ddev);
 		args->n_success = i+1;
 	}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index c1bea1f7627b..a4920bc5cfbc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -278,7 +278,7 @@ static int allocate_vmid(struct device_queue_manager *dqm,
 			qpd->vmid,
 			qpd->page_table_base);
 	/* invalidate the VM context after pasid and vmid mapping is set up */
-	kfd_flush_tlb(qpd_to_pdd(qpd));
+	kfd_flush_tlb(qpd_to_pdd(qpd), TLB_FLUSH_LEGACY);
 
 	if (dqm->dev->kfd2kgd->set_scratch_backing_va)
 		dqm->dev->kfd2kgd->set_scratch_backing_va(dqm->dev->kgd,
@@ -314,7 +314,7 @@ static void deallocate_vmid(struct device_queue_manager *dqm,
 		if (flush_texture_cache_nocpsch(q->device, qpd))
 			pr_err("Failed to flush TC\n");
 
-	kfd_flush_tlb(qpd_to_pdd(qpd));
+	kfd_flush_tlb(qpd_to_pdd(qpd), TLB_FLUSH_LEGACY);
 
 	/* Release the vmid mapping */
 	set_pasid_vmid_mapping(dqm, 0, qpd->vmid);
@@ -885,7 +885,7 @@ static int restore_process_queues_nocpsch(struct device_queue_manager *dqm,
 				dqm->dev->kgd,
 				qpd->vmid,
 				qpd->page_table_base);
-		kfd_flush_tlb(pdd);
+		kfd_flush_tlb(pdd, TLB_FLUSH_LEGACY);
 	}
 
 	/* Take a safe reference to the mm_struct, which may otherwise
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index ecdd5e782b81..edce3ecf207d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -1338,7 +1338,7 @@ void kfd_signal_reset_event(struct kfd_dev *dev);
 
 void kfd_signal_poison_consumed_event(struct kfd_dev *dev, u32 pasid);
 
-void kfd_flush_tlb(struct kfd_process_device *pdd);
+void kfd_flush_tlb(struct kfd_process_device *pdd, enum TLB_FLUSH_TYPE type);
 
 int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p);
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 4ab9da288f90..a03373743a3d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -2161,7 +2161,7 @@ int kfd_reserved_mem_mmap(struct kfd_dev *dev, struct kfd_process *process,
 			       KFD_CWSR_TBA_TMA_SIZE, vma->vm_page_prot);
 }
 
-void kfd_flush_tlb(struct kfd_process_device *pdd)
+void kfd_flush_tlb(struct kfd_process_device *pdd, enum TLB_FLUSH_TYPE type)
 {
 	struct kfd_dev *dev = pdd->dev;
 
@@ -2174,7 +2174,7 @@ void kfd_flush_tlb(struct kfd_process_device *pdd)
 							pdd->qpd.vmid);
 	} else {
 		amdgpu_amdkfd_flush_gpu_tlb_pasid(dev->kgd,
-					pdd->process->pasid, TLB_FLUSH_LEGACY);
+					pdd->process->pasid, type);
 	}
 }
 
-- 
2.25.1




* Re: [PATCH] drm/amdkfd: move flushing TLBs from map to unmap
  2021-05-26 19:21   ` Eric Huang
@ 2021-05-26 21:25     ` Felix Kuehling
  2021-05-27 14:05       ` philip yang
  0 siblings, 1 reply; 8+ messages in thread
From: Felix Kuehling @ 2021-05-26 21:25 UTC (permalink / raw)
  To: Eric Huang, amd-gfx

Am 2021-05-26 um 3:21 p.m. schrieb Eric Huang:
>
> On 2021-05-25 3:16 p.m., Felix Kuehling wrote:
>> Similar to a recent fix by Philip Yang 76e08b37d0aa ("drm/amdgpu: flush
>> TLB if valid PDE turns into PTE"), there needs to be a conditional TLB
>> flush after map, if any PDEs were unmapped and turned into PTEs in the
>> process. This is currently returned by amdgpu_vm_bo_update_mapping in
>> the "table_freed" parameter. This needs to be also returned by
>> amdgpu_vm_bo_update and reported back to KFD, so KFD can do the TLB
>> flush after map, if needed.
> I followed your suggestion, created another patch (attached) and
> tested it. It doesn't seem to improve latency when the memory size is
> larger than a huge page (2M), because the table_freed parameter is
> always true when the mapping uses huge-page size. I think Philip's
> patch was meant to fix the case of remapping memory from small pages
> to a huge page in HMM, but it doesn't check whether the memory was
> actually remapped and arbitrarily flushes TLBs whenever a huge page is
> mapped.

That's unexpected. Turning an invalid PDE into a valid (huge) PTE should
not trigger a TLB flush.

Regards,
  Felix


>> kfd_flush_tlb probably needs a new parameter to determine the flush
>> type. The flush after map can be a "legacy" flush (type 0). The flush
>> after unmap must be a "heavy-weight" flush (type 2) to make sure we
>> don't evict cache lines into pages that we no longer own.
>>
>> Finally, in the ticket I thought about possible optimizations using a
>> worker to minimize the impact of TLB flushes on unmap latency. That
>> could be a follow up commit.
> It is a good idea to use a worker, but how do we guarantee it is done
> before the memory is remapped? If remapping depends on it, then more
> latency will be introduced into map.
>
> Regards,
> Eric
>> Regards,
>>    Felix
>>
>>
>> Am 2021-05-25 um 1:53 p.m. schrieb Eric Huang:
>>> It is to optimize memory allocation latency.
>>>
>>> Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>> index 960913a35ee4..ab73741edb97 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>> @@ -1657,20 +1657,6 @@ static int kfd_ioctl_map_memory_to_gpu(struct
>>> file *filep,
>>>                  goto sync_memory_failed;
>>>          }
>>>
>>> -       /* Flush TLBs after waiting for the page table updates to
>>> complete */
>>> -       for (i = 0; i < args->n_devices; i++) {
>>> -               peer = kfd_device_by_id(devices_arr[i]);
>>> -               if (WARN_ON_ONCE(!peer))
>>> -                       continue;
>>> -               peer_pdd = kfd_get_process_device_data(peer, p);
>>> -               if (WARN_ON_ONCE(!peer_pdd))
>>> -                       continue;
>>> -               if (!amdgpu_read_lock(peer->ddev, true)) {
>>> -                       kfd_flush_tlb(peer_pdd);
>>> -                       amdgpu_read_unlock(peer->ddev);
>>> -               }
>>> -       }
>>> -
>>>          kfree(devices_arr);
>>>
>>>          trace_kfd_map_memory_to_gpu_end(p,
>>> @@ -1766,6 +1752,7 @@ static int
>>> kfd_ioctl_unmap_memory_from_gpu(struct file *filep,
>>>                          amdgpu_read_unlock(peer->ddev);
>>>                          goto unmap_memory_from_gpu_failed;
>>>                  }
>>> +               kfd_flush_tlb(peer_pdd);
>>>                  amdgpu_read_unlock(peer->ddev);
>>>                  args->n_success = i+1;
>>>          }


* Re: [PATCH] drm/amdkfd: move flushing TLBs from map to unmap
  2021-05-26 21:25     ` Felix Kuehling
@ 2021-05-27 14:05       ` philip yang
  2021-05-28 15:23         ` Christian König
  0 siblings, 1 reply; 8+ messages in thread
From: philip yang @ 2021-05-27 14:05 UTC (permalink / raw)
  To: Felix Kuehling, Eric Huang, amd-gfx

[-- Attachment #1: Type: text/html, Size: 9508 bytes --]



* Re: [PATCH] drm/amdkfd: move flushing TLBs from map to unmap
  2021-05-27 14:05       ` philip yang
@ 2021-05-28 15:23         ` Christian König
  2021-05-28 16:39           ` Eric Huang
  0 siblings, 1 reply; 8+ messages in thread
From: Christian König @ 2021-05-28 15:23 UTC (permalink / raw)
  To: philip yang, Felix Kuehling, Eric Huang, amd-gfx





Am 27.05.21 um 16:05 schrieb philip yang:
>
>
> On 2021-05-26 5:25 p.m., Felix Kuehling wrote:
>> Am 2021-05-26 um 3:21 p.m. schrieb Eric Huang:
>>> On 2021-05-25 3:16 p.m., Felix Kuehling wrote:
>>>> Similar to a recent fix by Philip Yang 76e08b37d0aa ("drm/amdgpu: flush
>>>> TLB if valid PDE turns into PTE"), there needs to be a conditional TLB
>>>> flush after map, if any PDEs were unmapped and turned into PTEs in the
>>>> process. This is currently returned by amdgpu_vm_bo_update_mapping in
>>>> the "table_freed" parameter. This needs to be also returned by
>>>> amdgpu_vm_bo_update and reported back to KFD, so KFD can do the TLB
>>>> flush after map, if needed.
>>> I followed your suggestion, created another patch (attached) and
>>> tested it. It doesn't seem to improve latency when the memory size is
>>> larger than a huge page (2M), because the table_freed parameter is
>>> always true when the mapping uses huge-page size. I think Philip's
>>> patch was meant to fix the case of remapping memory from small pages
>>> to a huge page in HMM, but it doesn't check whether the memory was
>>> actually remapped and arbitrarily flushes TLBs whenever a huge page
>>> is mapped.
>> That's unexpected. Turning an invalid PDE into a valid (huge) PTE should
>> not trigger a TLB flush.
>
> table_freed will be true if the PDE has been used by a previous 
> mapping: unmapping the previous mapping clears the PTEs but leaves the 
> PDE unchanged as P=0, V=1 (in memory and in the TLB); the huge-page 
> mapping then turns the PDE into a PTE (P=1, V=1) in memory and frees 
> the PTE page.
>

I think there might be a little bug in your patch. We set 
params.table_freed to true when we call amdgpu_vm_free_pts(), but 
amdgpu_vm_free_pts() doesn't necessarily free anything.

It can be that all the subsequent page tables were never allocated before.

Christian.

> For example: map 0x7ffe37401000, unmap it, and then map 0x7ffe3740000 
> as a 2MB huge page; table_freed will be true, meaning a TLB flush is 
> needed after mapping the huge page.
>
> You can change the test: don't unmap the previous mapping, so the 2MB 
> huge page gets a new GPU virtual address, or close and reopen KFD to 
> create a new GPU VM.
>
> Regards,
>
> Philip
>
>> Regards,
>>    Felix
>>
>>
>>>> kfd_flush_tlb probably needs a new parameter to determine the flush
>>>> type. The flush after map can be a "legacy" flush (type 0). The flush
>>>> after unmap must be a "heavy-weight" flush (type 2) to make sure we
>>>> don't evict cache lines into pages that we no longer own.
>>>>
>>>> Finally, in the ticket I thought about possible optimizations using a
>>>> worker to minimize the impact of TLB flushes on unmap latency. That
>>>> could be a follow up commit.
>>> It is a good idea to use a worker, but how do we guarantee it is done
>>> before the memory is remapped? If remapping depends on it, then more
>>> latency will be introduced into map.
>>>
>>> Regards,
>>> Eric
>>>> Regards,
>>>>     Felix
>>>>
>>>>
>>>> Am 2021-05-25 um 1:53 p.m. schrieb Eric Huang:
>>>>> It is to optimize memory allocation latency.
>>>>>
>>>>> Signed-off-by: Eric Huang<jinhuieric.huang@amd.com>
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>>>> index 960913a35ee4..ab73741edb97 100644
>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>>>> @@ -1657,20 +1657,6 @@ static int kfd_ioctl_map_memory_to_gpu(struct
>>>>> file *filep,
>>>>>                   goto sync_memory_failed;
>>>>>           }
>>>>>
>>>>> -       /* Flush TLBs after waiting for the page table updates to
>>>>> complete */
>>>>> -       for (i = 0; i < args->n_devices; i++) {
>>>>> -               peer = kfd_device_by_id(devices_arr[i]);
>>>>> -               if (WARN_ON_ONCE(!peer))
>>>>> -                       continue;
>>>>> -               peer_pdd = kfd_get_process_device_data(peer, p);
>>>>> -               if (WARN_ON_ONCE(!peer_pdd))
>>>>> -                       continue;
>>>>> -               if (!amdgpu_read_lock(peer->ddev, true)) {
>>>>> -                       kfd_flush_tlb(peer_pdd);
>>>>> -                       amdgpu_read_unlock(peer->ddev);
>>>>> -               }
>>>>> -       }
>>>>> -
>>>>>           kfree(devices_arr);
>>>>>
>>>>>           trace_kfd_map_memory_to_gpu_end(p,
>>>>> @@ -1766,6 +1752,7 @@ static int
>>>>> kfd_ioctl_unmap_memory_from_gpu(struct file *filep,
>>>>>                           amdgpu_read_unlock(peer->ddev);
>>>>>                           goto unmap_memory_from_gpu_failed;
>>>>>                   }
>>>>> +               kfd_flush_tlb(peer_pdd);
>>>>>                   amdgpu_read_unlock(peer->ddev);
>>>>>                   args->n_success = i+1;
>>>>>           }




* Re: [PATCH] drm/amdkfd: move flushing TLBs from map to unmap
  2021-05-28 15:23         ` Christian König
@ 2021-05-28 16:39           ` Eric Huang
  2021-05-31 13:11             ` philip yang
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Huang @ 2021-05-28 16:39 UTC (permalink / raw)
  To: Christian König, philip yang, Felix Kuehling, amd-gfx




On 2021-05-28 11:23 a.m., Christian König wrote:
>
>
> Am 27.05.21 um 16:05 schrieb philip yang:
>>
>>
>> On 2021-05-26 5:25 p.m., Felix Kuehling wrote:
>>> Am 2021-05-26 um 3:21 p.m. schrieb Eric Huang:
>>>> On 2021-05-25 3:16 p.m., Felix Kuehling wrote:
>>>>> Similar to a recent fix by Philip Yang 76e08b37d0aa ("drm/amdgpu: flush
>>>>> TLB if valid PDE turns into PTE"), there needs to be a conditional TLB
>>>>> flush after map, if any PDEs were unmapped and turned into PTEs in the
>>>>> process. This is currently returned by amdgpu_vm_bo_update_mapping in
>>>>> the "table_freed" parameter. This needs to be also returned by
>>>>> amdgpu_vm_bo_update and reported back to KFD, so KFD can do the TLB
>>>>> flush after map, if needed.
>>>> I followed your suggestion, created another patch (attached), and
>>>> tested it. It doesn't seem to improve the latency when the memory
>>>> size is bigger than a huge page (2M), because the table_freed
>>>> parameter will always be true when the mapping uses huge-page size.
>>>> I think Philip's patch is meant to fix the case of remapping memory
>>>> from small pages to huge pages in HMM, but it doesn't check whether
>>>> the memory was actually remapped and arbitrarily flushes TLBs when
>>>> mapping a huge page.
>>> That's unexpected. Turning an invalid PDE into a valid (huge) PTE should
>>> not trigger a TLB flush.
>>
>> table_freed will be true if the PDE was used by a previous mapping:
>> unmapping the previous mapping clears the PTEs but leaves the PDE
>> unchanged as P=0, V=1 (in memory and in the TLB); the huge-page
>> mapping then turns the PDE into a PTE (P=1, V=1) in memory and frees
>> the PTE page.
>>
>
> I think there might be a little bug in your patch. We set 
> params.table_freed to true when we call amdgpu_vm_free_pts(), but 
> amdgpu_vm_free_pts() doesn't necessarily free anything.
>
> It can be that all the subsequent page tables were never allocated 
> before.
>
> Christian.

After adding some printouts in amdgpu_vm_update_ptes(), I found that 
when we map a 2M (huge page) region, the function first allocates 9 ptes 
(2M == PAGE_SIZE << 9) until the check "if (frag >= parent_shift)" hits; 
the cursor then goes up one level to PDE0 and frees all 9 ptes. That is 
why table_freed is always true when mapping memory bigger than 2M.

I will add some code to check whether the PDE entry is valid before 
amdgpu_vm_update_flags(), and set table_freed accordingly. That will 
also fix exactly the page-fault corner case Philip mentioned above.

Regards,
Eric

>
>> For example: map 0x7ffe37401000, unmap it, and then map 0x7ffe3740000 
>> as a 2MB huge page; table_freed will be true, meaning a TLB flush is 
>> needed after mapping the huge page.
>>
>> You can change the test: don't unmap the previous mapping, so the 2MB 
>> huge page gets a new GPU virtual address; or close KFD and open it 
>> again to create a new GPU VM.
>>
>> Regards,
>>
>> Philip
>>
>>> Regards,
>>>    Felix
>>>
>>>
>>>>> kfd_flush_tlb probably needs a new parameter to determine the flush
>>>>> type. The flush after map can be a "legacy" flush (type 0). The flush
>>>>> after unmap must be a "heavy-weight" flush (type 2) to make sure we
>>>>> don't evict cache lines into pages that we no longer own.
>>>>>
>>>>> Finally, in the ticket I thought about possible optimizations using a
>>>>> worker to minimize the impact of TLB flushes on unmap latency. That
>>>>> could be a follow up commit.
>>>> Using a worker is a good idea, but how do we guarantee it is done
>>>> before the memory is remapped? If remapping depends on it, more
>>>> latency will be introduced into the map path.
>>>>
>>>> Regards,
>>>> Eric
>>>>> Regards,
>>>>>     Felix
>>>>>
>>>>>
>>>>> Am 2021-05-25 um 1:53 p.m. schrieb Eric Huang:
>>>>>> It is to optimize memory allocation latency.
>>>>>>
>>>>>> Signed-off-by: Eric Huang<jinhuieric.huang@amd.com>
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>>>>> b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>>>>> index 960913a35ee4..ab73741edb97 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
>>>>>> @@ -1657,20 +1657,6 @@ static int kfd_ioctl_map_memory_to_gpu(struct
>>>>>> file *filep,
>>>>>>                   goto sync_memory_failed;
>>>>>>           }
>>>>>>
>>>>>> -       /* Flush TLBs after waiting for the page table updates to
>>>>>> complete */
>>>>>> -       for (i = 0; i < args->n_devices; i++) {
>>>>>> -               peer = kfd_device_by_id(devices_arr[i]);
>>>>>> -               if (WARN_ON_ONCE(!peer))
>>>>>> -                       continue;
>>>>>> -               peer_pdd = kfd_get_process_device_data(peer, p);
>>>>>> -               if (WARN_ON_ONCE(!peer_pdd))
>>>>>> -                       continue;
>>>>>> -               if (!amdgpu_read_lock(peer->ddev, true)) {
>>>>>> -                       kfd_flush_tlb(peer_pdd);
>>>>>> -                       amdgpu_read_unlock(peer->ddev);
>>>>>> -               }
>>>>>> -       }
>>>>>> -
>>>>>>           kfree(devices_arr);
>>>>>>
>>>>>>           trace_kfd_map_memory_to_gpu_end(p,
>>>>>> @@ -1766,6 +1752,7 @@ static int
>>>>>> kfd_ioctl_unmap_memory_from_gpu(struct file *filep,
>>>>>>                           amdgpu_read_unlock(peer->ddev);
>>>>>>                           goto unmap_memory_from_gpu_failed;
>>>>>>                   }
>>>>>> +               kfd_flush_tlb(peer_pdd);
>>>>>>                   amdgpu_read_unlock(peer->ddev);
>>>>>>                   args->n_success = i+1;
>>>>>>           }
>




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] drm/amdkfd: move flushing TLBs from map to unmap
  2021-05-28 16:39           ` Eric Huang
@ 2021-05-31 13:11             ` philip yang
  0 siblings, 0 replies; 8+ messages in thread
From: philip yang @ 2021-05-31 13:11 UTC (permalink / raw)
  To: Eric Huang, Christian König, Felix Kuehling, amd-gfx

[-- Attachment #1: Type: text/html, Size: 13973 bytes --]


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-05-31 13:11 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-25 17:53 [PATCH] drm/amdkfd: move flushing TLBs from map to unmap Eric Huang
2021-05-25 19:16 ` Felix Kuehling
2021-05-26 19:21   ` Eric Huang
2021-05-26 21:25     ` Felix Kuehling
2021-05-27 14:05       ` philip yang
2021-05-28 15:23         ` Christian König
2021-05-28 16:39           ` Eric Huang
2021-05-31 13:11             ` philip yang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.