* [PATCH] drm/amdkfd: Fix eviction fence handling
@ 2024-04-18  3:14 Felix Kuehling
  2024-04-18 16:33 ` Philip Yang
  2024-04-18 17:23 ` Ba, Gang
  0 siblings, 2 replies; 3+ messages in thread
From: Felix Kuehling @ 2024-04-18  3:14 UTC
  To: amd-gfx; +Cc: gang.ba, vitaly.prosyak

Handle the case where dma_fence_get_rcu_safe() returns NULL.

If restore work is already scheduled, only update its timer. The same
work item cannot be queued twice, so undo the extra queue eviction.

Fixes: 9a1c1339abf9 ("drm/amdkfd: Run restore_workers on freezable WQs")
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index b79986412cd8..aafdf064651f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1922,6 +1922,8 @@ static int signal_eviction_fence(struct kfd_process *p)
 	rcu_read_lock();
 	ef = dma_fence_get_rcu_safe(&p->ef);
 	rcu_read_unlock();
+	if (!ef)
+		return -EINVAL;
 
 	ret = dma_fence_signal(ef);
 	dma_fence_put(ef);
@@ -1949,10 +1951,9 @@ static void evict_process_worker(struct work_struct *work)
 		 * they are responsible stopping the queues and scheduling
 		 * the restore work.
 		 */
-		if (!signal_eviction_fence(p))
-			queue_delayed_work(kfd_restore_wq, &p->restore_work,
-				msecs_to_jiffies(PROCESS_RESTORE_TIME_MS));
-		else
+		if (signal_eviction_fence(p) ||
+		    mod_delayed_work(kfd_restore_wq, &p->restore_work,
+				     msecs_to_jiffies(PROCESS_RESTORE_TIME_MS)))
 			kfd_process_restore_queues(p);
 
 		pr_debug("Finished evicting pasid 0x%x\n", p->pasid);
-- 
2.34.1
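
The second hunk relies on the different return values of
queue_delayed_work() and mod_delayed_work(). The former returns false and
does nothing if the work is already pending. The latter always re-arms the
timer and returns true if the work was already pending. Below is a minimal
userspace sketch of that control flow, not the driver code. Every model_*
name, the has_fence flag and the evicted_count counter are hypothetical
stand-ins. The sketch also assumes the worker has already evicted the queues
once before it reaches the fence check, which the diff context does not show.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a kernel delayed work item. */
struct model_dwork {
	bool pending;          /* queued, timer not yet fired */
	unsigned long expires; /* delay used by the most recent arming */
};

/* Hypothetical stand-in for the parts of struct kfd_process used here. */
struct model_process {
	bool has_fence;        /* false models dma_fence_get_rcu_safe() == NULL */
	struct model_dwork restore_work;
	int evicted_count;     /* outstanding queue evictions */
};

/* Models queue_delayed_work(): no-op returning false if already pending. */
static bool model_queue_delayed_work(struct model_dwork *w, unsigned long delay)
{
	if (w->pending)
		return false;
	w->pending = true;
	w->expires = delay;
	return true;
}

/* Models mod_delayed_work(): always re-arms, returns true if it was pending. */
static bool model_mod_delayed_work(struct model_dwork *w, unsigned long delay)
{
	bool was_pending = w->pending;

	w->pending = true;
	w->expires = delay;
	return was_pending;
}

/* Models the patched signal_eviction_fence(): a missing fence is an error. */
static int model_signal_eviction_fence(const struct model_process *p)
{
	if (!p->has_fence)
		return -EINVAL;
	return 0;
}

/*
 * Models the patched branch in evict_process_worker(). Returns true when the
 * eviction was undone immediately, either because no fence could be signaled
 * or because restore work was already pending and only its timer was updated.
 */
static bool model_evict(struct model_process *p, unsigned long delay)
{
	assert(p != NULL);

	p->evicted_count++; /* assumed: queues were evicted before this check */
	if (model_signal_eviction_fence(p) ||
	    model_mod_delayed_work(&p->restore_work, delay)) {
		p->evicted_count--; /* stands in for kfd_process_restore_queues() */
		return true;
	}
	return false;
}
```

In this model a second eviction that arrives while restore work is pending
leaves evicted_count at 1 and pushes the timer out. A single later restore
therefore balances the count, which is the imbalance the commit message
describes as "undo the extra queue eviction".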



* Re: [PATCH] drm/amdkfd: Fix eviction fence handling
  2024-04-18  3:14 [PATCH] drm/amdkfd: Fix eviction fence handling Felix Kuehling
@ 2024-04-18 16:33 ` Philip Yang
  2024-04-18 17:23 ` Ba, Gang
  1 sibling, 0 replies; 3+ messages in thread
From: Philip Yang @ 2024-04-18 16:33 UTC
  To: Felix Kuehling, amd-gfx; +Cc: gang.ba, vitaly.prosyak

[-- Attachment #1: Type: text/html, Size: 2267 bytes --]


* Re: [PATCH] drm/amdkfd: Fix eviction fence handling
  2024-04-18  3:14 [PATCH] drm/amdkfd: Fix eviction fence handling Felix Kuehling
  2024-04-18 16:33 ` Philip Yang
@ 2024-04-18 17:23 ` Ba, Gang
  1 sibling, 0 replies; 3+ messages in thread
From: Ba, Gang @ 2024-04-18 17:23 UTC
  To: Kuehling, Felix, amd-gfx; +Cc: Prosyak, Vitaly

[-- Attachment #1: Type: text/plain, Size: 2214 bytes --]

[AMD Official Use Only - General]

Tested-by: Gang BA <Gang.Ba@amd.com>
Reviewed-by: Gang BA <Gang.Ba@amd.com>
