From: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
To: <amd-gfx@lists.freedesktop.org>
Cc: Zoy.Bai@amd.com, Andrey Grodzovsky <andrey.grodzovsky@amd.com>,
	lijo.lazar@amd.com, Christian.Koenig@amd.com
Subject: [PATCH v2 3/7] drm/amdgpu: Serialize RAS recovery work directly into reset domain queue.
Date: Tue, 17 May 2022 15:20:58 -0400
Message-ID: <20220517192102.238176-4-andrey.grodzovsky@amd.com>
In-Reply-To: <20220517192102.238176-1-andrey.grodzovsky@amd.com>

Queue the RAS recovery work directly on the reset domain's queue, which
saves the extra, useless work scheduling step. Also switch recovery_work
to delayed work.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 12 +++++++-----
 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h |  2 +-
 2 files changed, 8 insertions(+), 6 deletions(-)
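
Note on the container_of() change: once recovery_work becomes a
struct delayed_work, the handler still receives a pointer to the embedded
struct work_struct, so the member path has to name recovery_work.work.
The following is only a userspace sketch of that pointer arithmetic. The
types are simplified stand-ins rather than the kernel definitions, and
mock_ras, mock_do_recovery and mock_run_recovery are hypothetical names
used for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; simplified, not the real ones. */
struct work_struct {
	void (*func)(struct work_struct *work);
};

struct delayed_work {
	struct work_struct work;	/* embedded work item */
	unsigned long delay;		/* placeholder for the timer state */
};

/* Simplified container_of() without the kernel's type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down stand-in for struct amdgpu_ras. */
struct mock_ras {
	int id;
	struct delayed_work recovery_work;
	int in_recovery;
};

/* The handler is still called with a struct work_struct pointer ... */
static void mock_do_recovery(struct work_struct *work)
{
	/* ... so the member path must name the embedded work item. */
	struct mock_ras *ras =
		container_of(work, struct mock_ras, recovery_work.work);

	assert(&ras->recovery_work.work == work);
	ras->in_recovery = 0;
}

/* Invoke the handler the way a workqueue would; return the cleared flag. */
static int mock_run_recovery(struct mock_ras *ras)
{
	ras->in_recovery = 1;
	ras->recovery_work.work.func = mock_do_recovery;
	ras->recovery_work.work.func(&ras->recovery_work.work);
	return ras->in_recovery;
}
```

In this sketch, calling mock_run_recovery() on a mock_ras instance recovers
the enclosing structure from the embedded work pointer and clears
in_recovery, mirroring what amdgpu_ras_do_recovery() does at its end.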

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index a653cf3b3d13..7e8c7bcc7303 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -35,6 +35,8 @@
 #include "amdgpu_xgmi.h"
 #include "ivsrcid/nbio/irqsrcs_nbif_7_4.h"
 #include "atom.h"
+#include "amdgpu_reset.h"
+
 #ifdef CONFIG_X86_MCE_AMD
 #include <asm/mce.h>
 
@@ -1889,7 +1891,7 @@ static int amdgpu_ras_badpages_read(struct amdgpu_device *adev,
 static void amdgpu_ras_do_recovery(struct work_struct *work)
 {
 	struct amdgpu_ras *ras =
-		container_of(work, struct amdgpu_ras, recovery_work);
+		container_of(work, struct amdgpu_ras, recovery_work.work);
 	struct amdgpu_device *remote_adev = NULL;
 	struct amdgpu_device *adev = ras->adev;
 	struct list_head device_list, *device_list_handle =  NULL;
@@ -1916,7 +1918,7 @@ static void amdgpu_ras_do_recovery(struct work_struct *work)
 	}
 
 	if (amdgpu_device_should_recover_gpu(ras->adev))
-		amdgpu_device_gpu_recover(ras->adev, NULL);
+		amdgpu_device_gpu_recover_imp(ras->adev, NULL);
 	atomic_set(&ras->in_recovery, 0);
 }
 
@@ -2148,7 +2150,7 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev)
 	}
 
 	mutex_init(&con->recovery_lock);
-	INIT_WORK(&con->recovery_work, amdgpu_ras_do_recovery);
+	INIT_DELAYED_WORK(&con->recovery_work, amdgpu_ras_do_recovery);
 	atomic_set(&con->in_recovery, 0);
 	con->eeprom_control.bad_channel_bitmap = 0;
 
@@ -2217,7 +2219,7 @@ static int amdgpu_ras_recovery_fini(struct amdgpu_device *adev)
 	if (!data)
 		return 0;
 
-	cancel_work_sync(&con->recovery_work);
+	cancel_delayed_work_sync(&con->recovery_work);
 
 	mutex_lock(&con->recovery_lock);
 	con->eh_data = NULL;
@@ -2910,7 +2912,7 @@ int amdgpu_ras_reset_gpu(struct amdgpu_device *adev)
 	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
 
 	if (atomic_cmpxchg(&ras->in_recovery, 0, 1) == 0)
-		schedule_work(&ras->recovery_work);
+		amdgpu_reset_domain_schedule(ras->adev->reset_domain, &ras->recovery_work);
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
index b9a6fac2b8b2..f7e21c2abc61 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h
@@ -347,7 +347,7 @@ struct amdgpu_ras {
 	struct ras_manager *objs;
 
 	/* gpu recovery */
-	struct work_struct recovery_work;
+	struct delayed_work recovery_work;
 	atomic_t in_recovery;
 	struct amdgpu_device *adev;
 	/* error handler data */
-- 
2.25.1

