dri-devel.lists.freedesktop.org archive mirror
* [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu
@ 2022-01-25 22:37 Andrey Grodzovsky
  2022-01-25 22:37 ` [RFC v3 01/12] drm/amdgpu: Introduce reset domain Andrey Grodzovsky
                   ` (12 more replies)
  0 siblings, 13 replies; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-25 22:37 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu

This patchset is based on earlier work by Boris[1] that allowed having an
ordered workqueue at the driver level, used by the different schedulers
to queue their timeout work. On top of that I also serialized any GPU
reset we trigger from within amdgpu code through the same ordered wq,
which somewhat simplifies our GPU reset code since we no longer need to
protect against concurrency between multiple GPU reset triggers such as
TDR on one hand and sysfs or RAS triggers on the other.

As advised by Christian and Daniel, I defined a reset_domain struct such that
all the entities that go through reset together are serialized against
one another.

TDRs triggered by multiple entities within the same domain for the same reason
will be collapsed: the first such reset cancels all the pending ones. This is
relevant only to TDR timers and not to resets triggered from RAS or sysfs;
those will still run after the in-flight resets finish.
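
A minimal sketch of the serialization idea (hypothetical demo_* names, not the
exact driver code): every reset source submits a work item to the domain's
ordered workqueue, so resets execute strictly one at a time, and a synchronous
caller simply flushes its own item:

#include <linux/workqueue.h>

/* Sketch only: one ordered wq per reset domain serializes every source. */
struct demo_reset_domain {
	struct workqueue_struct *wq;	/* ordered: at most one active item */
};

static void demo_do_reset(struct work_struct *work)
{
	/* the real recovery sequence runs here, never concurrently */
}

static int demo_trigger_reset(struct demo_reset_domain *domain)
{
	struct work_struct work;

	INIT_WORK_ONSTACK(&work, demo_do_reset);
	if (!queue_work(domain->wq, &work)) {
		destroy_work_on_stack(&work);
		return -EAGAIN;		/* this item is already pending */
	}
	flush_work(&work);		/* block until our reset has run */
	destroy_work_on_stack(&work);
	return 0;
}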

v2:
Add handling for the SRIOV configuration: the reset notification coming from
the host already triggers a work queue in the driver to handle the reset, so
drop this intermediate wq and send the work directly to the timeout wq. (Shaoyun)

v3:
Lijo suggested putting 'adev->in_gpu_reset' in the amdgpu_reset_domain struct.
I followed his advice and also moved adev->reset_sem into the same place. This
in turn required some follow-up refactoring of the original patches, where I
decoupled the amdgpu_reset_domain life cycle from the XGMI hive, because the
hive is destroyed and reconstructed when the devices in the XGMI hive are
reset during probe for SRIOV (see [2]), while we need the reset sem and
gpu_reset flag to always be present. This was attained by adding a refcount to
amdgpu_reset_domain so each device can safely point to it for as long as it
needs to.
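
Roughly, the lifetime handling looks like this (a sketch with made-up demo_*
names): the domain is freed only when the last user, device or hive, drops its
reference, so it can outlive the hive object itself:

#include <linux/kref.h>
#include <linux/slab.h>

struct demo_domain {
	struct kref refcount;
	/* the reset wq, reset sem and in_gpu_reset flag would live here */
};

static void demo_domain_release(struct kref *ref)
{
	kfree(container_of(ref, struct demo_domain, refcount));
}

/* A device joining a hive adopts the hive's domain in place of its own. */
static int demo_adopt_domain(struct demo_domain **dev_slot,
			     struct demo_domain *hive_domain)
{
	if (!kref_get_unless_zero(&hive_domain->refcount))
		return -ENOENT;		/* hive domain already being freed */

	kref_put(&(*dev_slot)->refcount, demo_domain_release);
	*dev_slot = hive_domain;	/* device now pins the shared domain */
	return 0;
}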


[1] https://patchwork.kernel.org/project/dri-devel/patch/20210629073510.2764391-3-boris.brezillon@collabora.com/
[2] https://www.spinics.net/lists/amd-gfx/msg58836.html

P.S. Going through drm-misc-next and not amd-staging-drm-next, as Boris's work hasn't landed there yet.

P.P.S. Patches 8-12 are the refactor on top of the original v2 patchset.

P.P.P.S. I wasn't yet able to test the reworked code on an XGMI SRIOV system because drm-misc-next fails to load there.
I would appreciate it if jingwech could try it on his system, as he tested v2.

Andrey Grodzovsky (12):
  drm/amdgpu: Introduce reset domain
  drm/amdgpu: Move scheduler init to after XGMI is ready
  drm/amdgpu: Fix crash on modprobe
  drm/amdgpu: Serialize non TDR gpu recovery with TDRs
  drm/amd/virt: For SRIOV send GPU reset directly to TDR queue.
  drm/amdgpu: Drop hive->in_reset
  drm/amdgpu: Drop concurrent GPU reset protection for device
  drm/amdgpu: Rework reset domain to be refcounted.
  drm/amdgpu: Move reset sem into reset_domain
  drm/amdgpu: Move in_gpu_reset into reset_domain
  drm/amdgpu: Rework amdgpu_device_lock_adev
  Revert 'drm/amdgpu: annotate a false positive recursive locking'

 drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  15 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c   |  10 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 275 ++++++++++--------
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c     |  43 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       |   2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    |  18 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c     |  39 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h     |  12 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c      |  24 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h      |   3 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c        |   6 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c         |  14 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c         |  19 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c         |  19 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c         |  11 +-
 16 files changed, 313 insertions(+), 199 deletions(-)

-- 
2.25.1



* [RFC v3 01/12] drm/amdgpu: Introduce reset domain
  2022-01-25 22:37 [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
@ 2022-01-25 22:37 ` Andrey Grodzovsky
  2022-01-26 12:07   ` Christian König
  2022-01-25 22:37 ` [RFC v3 02/12] drm/amdgpu: Move scheduler init to after XGMI is ready Andrey Grodzovsky
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-25 22:37 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: Daniel Vetter, horace.chen, lijo.lazar, jingwech,
	Christian König, christian.koenig, Monk.Liu

Define a reset_domain struct such that
all the entities that go through reset
together are serialized against one
another. Do this for both the single
device and XGMI hive cases.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Suggested-by: Christian König <ckoenig.leichtzumerken@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  7 +++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 +++++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   |  9 +++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  2 ++
 4 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 9f017663ac50..b5ff76aae7e0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -812,6 +812,11 @@ struct amd_powerplay {
 
 #define AMDGPU_RESET_MAGIC_NUM 64
 #define AMDGPU_MAX_DF_PERFMONS 4
+
+struct amdgpu_reset_domain {
+	struct workqueue_struct *wq;
+};
+
 struct amdgpu_device {
 	struct device			*dev;
 	struct pci_dev			*pdev;
@@ -1096,6 +1101,8 @@ struct amdgpu_device {
 
 	struct amdgpu_reset_control     *reset_cntl;
 	uint32_t                        ip_versions[HW_ID_MAX][HWIP_MAX_INSTANCE];
+
+	struct amdgpu_reset_domain	reset_domain;
 };
 
 static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 90d22a376632..0f3e6c078f88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2391,9 +2391,27 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 	if (r)
 		goto init_failed;
 
-	if (adev->gmc.xgmi.num_physical_nodes > 1)
+	if (adev->gmc.xgmi.num_physical_nodes > 1) {
+		struct amdgpu_hive_info *hive;
+
 		amdgpu_xgmi_add_device(adev);
 
+		hive = amdgpu_get_xgmi_hive(adev);
+		if (!hive || !hive->reset_domain.wq) {
+			DRM_ERROR("Failed to obtain reset domain info for XGMI hive\n");
+			r = -EINVAL;
+			goto init_failed;
+		}
+
+		adev->reset_domain.wq = hive->reset_domain.wq;
+	} else {
+		adev->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-dev", 0);
+		if (!adev->reset_domain.wq) {
+			r = -ENOMEM;
+			goto init_failed;
+		}
+	}
+
 	/* Don't init kfd if whole hive need to be reset during init */
 	if (!adev->gmc.xgmi.pending_reset)
 		amdgpu_amdkfd_device_init(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index 567df2db23ac..a858e3457c5c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -392,6 +392,14 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
 		goto pro_end;
 	}
 
+	hive->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-hive", 0);
+	if (!hive->reset_domain.wq) {
+		dev_err(adev->dev, "XGMI: failed allocating wq for reset domain!\n");
+		kfree(hive);
+		hive = NULL;
+		goto pro_end;
+	}
+
 	hive->hive_id = adev->gmc.xgmi.hive_id;
 	INIT_LIST_HEAD(&hive->device_list);
 	INIT_LIST_HEAD(&hive->node);
@@ -401,6 +409,7 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
 	task_barrier_init(&hive->tb);
 	hive->pstate = AMDGPU_XGMI_PSTATE_UNKNOWN;
 	hive->hi_req_gpu = NULL;
+
 	/*
 	 * hive pstate on boot is high in vega20 so we have to go to low
 	 * pstate on after boot.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
index d2189bf7d428..6121aaa292cb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
@@ -42,6 +42,8 @@ struct amdgpu_hive_info {
 		AMDGPU_XGMI_PSTATE_MAX_VEGA20,
 		AMDGPU_XGMI_PSTATE_UNKNOWN
 	} pstate;
+
+	struct amdgpu_reset_domain reset_domain;
 };
 
 struct amdgpu_pcs_ras_field {
-- 
2.25.1



* [RFC v3 02/12] drm/amdgpu: Move scheduler init to after XGMI is ready
  2022-01-25 22:37 [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
  2022-01-25 22:37 ` [RFC v3 01/12] drm/amdgpu: Introduce reset domain Andrey Grodzovsky
@ 2022-01-25 22:37 ` Andrey Grodzovsky
  2022-01-25 22:37 ` [RFC v3 03/12] drm/amdgpu: Fix crash on modprobe Andrey Grodzovsky
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-25 22:37 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu

Before we initialize the schedulers we must know which reset
domain we are in: for a single device there is a single
domain per device and so a single wq per device, while for
XGMI the reset domain spans the entire hive and so the
reset wq is per hive.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 45 ++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 34 ++--------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  2 +
 3 files changed, 51 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 0f3e6c078f88..7c063fd37389 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2284,6 +2284,47 @@ static int amdgpu_device_fw_loading(struct amdgpu_device *adev)
 	return r;
 }
 
+static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
+{
+	long timeout;
+	int r, i;
+
+	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
+		struct amdgpu_ring *ring = adev->rings[i];
+
+		/* No need to setup the GPU scheduler for rings that don't need it */
+		if (!ring || ring->no_scheduler)
+			continue;
+
+		switch (ring->funcs->type) {
+		case AMDGPU_RING_TYPE_GFX:
+			timeout = adev->gfx_timeout;
+			break;
+		case AMDGPU_RING_TYPE_COMPUTE:
+			timeout = adev->compute_timeout;
+			break;
+		case AMDGPU_RING_TYPE_SDMA:
+			timeout = adev->sdma_timeout;
+			break;
+		default:
+			timeout = adev->video_timeout;
+			break;
+		}
+
+		r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
+				   ring->num_hw_submission, amdgpu_job_hang_limit,
+				   timeout, adev->reset_domain.wq, ring->sched_score, ring->name);
+		if (r) {
+			DRM_ERROR("Failed to create scheduler on ring %s.\n",
+				  ring->name);
+			return r;
+		}
+	}
+
+	return 0;
+}
+
+
 /**
  * amdgpu_device_ip_init - run init for hardware IPs
  *
@@ -2412,6 +2453,10 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 		}
 	}
 
+	r = amdgpu_device_init_schedulers(adev);
+	if (r)
+		goto init_failed;
+
 	/* Don't init kfd if whole hive need to be reset during init */
 	if (!adev->gmc.xgmi.pending_reset)
 		amdgpu_amdkfd_device_init(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 3b7e86ea7167..5527c68c51de 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -456,8 +456,6 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
 				  atomic_t *sched_score)
 {
 	struct amdgpu_device *adev = ring->adev;
-	long timeout;
-	int r;
 
 	if (!adev)
 		return -EINVAL;
@@ -477,36 +475,12 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring,
 	spin_lock_init(&ring->fence_drv.lock);
 	ring->fence_drv.fences = kcalloc(num_hw_submission * 2, sizeof(void *),
 					 GFP_KERNEL);
-	if (!ring->fence_drv.fences)
-		return -ENOMEM;
 
-	/* No need to setup the GPU scheduler for rings that don't need it */
-	if (ring->no_scheduler)
-		return 0;
+	ring->num_hw_submission = num_hw_submission;
+	ring->sched_score = sched_score;
 
-	switch (ring->funcs->type) {
-	case AMDGPU_RING_TYPE_GFX:
-		timeout = adev->gfx_timeout;
-		break;
-	case AMDGPU_RING_TYPE_COMPUTE:
-		timeout = adev->compute_timeout;
-		break;
-	case AMDGPU_RING_TYPE_SDMA:
-		timeout = adev->sdma_timeout;
-		break;
-	default:
-		timeout = adev->video_timeout;
-		break;
-	}
-
-	r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
-			   num_hw_submission, amdgpu_job_hang_limit,
-			   timeout, NULL, sched_score, ring->name);
-	if (r) {
-		DRM_ERROR("Failed to create scheduler on ring %s.\n",
-			  ring->name);
-		return r;
-	}
+	if (!ring->fence_drv.fences)
+		return -ENOMEM;
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 4d380e79752c..a4b8279e3011 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -253,6 +253,8 @@ struct amdgpu_ring {
 	bool			has_compute_vm_bug;
 	bool			no_scheduler;
 	int			hw_prio;
+	unsigned 		num_hw_submission;
+	atomic_t		*sched_score;
 };
 
 #define amdgpu_ring_parse_cs(r, p, ib) ((r)->funcs->parse_cs((p), (ib)))
-- 
2.25.1



* [RFC v3 03/12] drm/amdgpu: Fix crash on modprobe
  2022-01-25 22:37 [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
  2022-01-25 22:37 ` [RFC v3 01/12] drm/amdgpu: Introduce reset domain Andrey Grodzovsky
  2022-01-25 22:37 ` [RFC v3 02/12] drm/amdgpu: Move scheduler init to after XGMI is ready Andrey Grodzovsky
@ 2022-01-25 22:37 ` Andrey Grodzovsky
  2022-01-25 22:37 ` [RFC v3 04/12] drm/amdgpu: Serialize non TDR gpu recovery with TDRs Andrey Grodzovsky
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-25 22:37 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu

Restrict job resubmission to the suspend case
only, since the schedulers are not initialised
yet at probe time.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 5527c68c51de..8904c5a63dfa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -582,7 +582,14 @@ void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev)
 		if (!ring || !ring->fence_drv.initialized)
 			continue;
 
-		if (!ring->no_scheduler) {
+		/*
+		 * Restrict job resubmission to the suspend case
+		 * only, since the schedulers are not initialised
+		 * yet at probe time.
+		 *
+		 * TODO - restructure resume to make this unnecessary
+		 */
+		if (adev->in_suspend && !ring->no_scheduler) {
 			drm_sched_resubmit_jobs(&ring->sched);
 			drm_sched_start(&ring->sched, true);
 		}
-- 
2.25.1



* [RFC v3 04/12] drm/amdgpu: Serialize non TDR gpu recovery with TDRs
  2022-01-25 22:37 [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
                   ` (2 preceding siblings ...)
  2022-01-25 22:37 ` [RFC v3 03/12] drm/amdgpu: Fix crash on modprobe Andrey Grodzovsky
@ 2022-01-25 22:37 ` Andrey Grodzovsky
  2022-01-25 22:37 ` [RFC v3 05/12] drm/amd/virt: For SRIOV send GPU reset directly to TDR queue Andrey Grodzovsky
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-25 22:37 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu

Use the reset domain wq also for non-TDR GPU recovery triggers
such as sysfs and RAS. We must serialize all possible
GPU recoveries to guarantee there is no concurrency.
For TDR, call the original recovery function directly since
it already executes from within the wq. For the others, use
a wrapper to queue the work and wait on it to finish.

v2: Rename to amdgpu_recover_work_struct

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 33 +++++++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    |  2 +-
 3 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index b5ff76aae7e0..8e96b9a14452 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1296,6 +1296,8 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev);
 bool amdgpu_device_should_recover_gpu(struct amdgpu_device *adev);
 int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 			      struct amdgpu_job* job);
+int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
+			      struct amdgpu_job *job);
 void amdgpu_device_pci_config_reset(struct amdgpu_device *adev);
 int amdgpu_device_pci_reset(struct amdgpu_device *adev);
 bool amdgpu_device_need_post(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7c063fd37389..258ec3c0b2af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4979,7 +4979,7 @@ static void amdgpu_device_recheck_guilty_jobs(
  * Returns 0 for success or an error on failure.
  */
 
-int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
+int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
 			      struct amdgpu_job *job)
 {
 	struct list_head device_list, *device_list_handle =  NULL;
@@ -5237,6 +5237,37 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 	return r;
 }
 
+struct amdgpu_recover_work_struct {
+	struct work_struct base;
+	struct amdgpu_device *adev;
+	struct amdgpu_job *job;
+	int ret;
+};
+
+static void amdgpu_device_queue_gpu_recover_work(struct work_struct *work)
+{
+	struct amdgpu_recover_work_struct *recover_work = container_of(work, struct amdgpu_recover_work_struct, base);
+
+	recover_work->ret = amdgpu_device_gpu_recover_imp(recover_work->adev, recover_work->job);
+}
+/*
+ * Serialize GPU recovery into the reset domain's single-threaded wq
+ */
+int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
+				    struct amdgpu_job *job)
+{
+	struct amdgpu_recover_work_struct work = {.adev = adev, .job = job};
+
+	INIT_WORK(&work.base, amdgpu_device_queue_gpu_recover_work);
+
+	if (!queue_work(adev->reset_domain.wq, &work.base))
+		return -EAGAIN;
+
+	flush_work(&work.base);
+
+	return work.ret;
+}
+
 /**
  * amdgpu_device_get_pcie_info - fetch pcie info about the PCIE slot
  *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index bfc47bea23db..38c9fd7b7ad4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -63,7 +63,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)
 		  ti.process_name, ti.tgid, ti.task_name, ti.pid);
 
 	if (amdgpu_device_should_recover_gpu(ring->adev)) {
-		amdgpu_device_gpu_recover(ring->adev, job);
+		amdgpu_device_gpu_recover_imp(ring->adev, job);
 	} else {
 		drm_sched_suspend_timeout(&ring->sched);
 		if (amdgpu_sriov_vf(adev))
-- 
2.25.1



* [RFC v3 05/12] drm/amd/virt: For SRIOV send GPU reset directly to TDR queue.
  2022-01-25 22:37 [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
                   ` (3 preceding siblings ...)
  2022-01-25 22:37 ` [RFC v3 04/12] drm/amdgpu: Serialize non TDR gpu recovery with TDRs Andrey Grodzovsky
@ 2022-01-25 22:37 ` Andrey Grodzovsky
  2022-01-25 22:37 ` [RFC v3 06/12] drm/amdgpu: Drop hive->in_reset Andrey Grodzovsky
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-25 22:37 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu,
	Liu Shaoyun

No need to trigger another work queue from inside the work queue.

v3:

Problem:
An extra reset is caused by the host side FLR notification
that follows a guest side triggered reset.
Fix: Prevent queuing flr_work from the mailbox IRQ if the guest is
already executing a reset.
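
In sketch form (hypothetical demo_* names): the mailbox IRQ handler first
checks the in-reset flag, so a host FLR notification that races with a
guest-triggered reset does not queue a second recovery:

#include <linux/atomic.h>
#include <linux/workqueue.h>

static atomic_t demo_in_reset;		/* set while a reset is executing */

static void demo_handle_flr_notification(struct workqueue_struct *domain_wq,
					 struct work_struct *flr_work)
{
	/*
	 * If the guest is already running a reset, the host FLR belongs
	 * to that reset; queueing flr_work again would trigger a second,
	 * redundant recovery.
	 */
	if (atomic_read(&demo_in_reset))
		return;

	WARN_ONCE(!queue_work(domain_wq, flr_work),
		  "FLR work was already queued");
}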

Suggested-by: Liu Shaoyun <Shaoyun.Liu@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 9 ++++++---
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 9 ++++++---
 drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 9 ++++++---
 3 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index 23b066bcffb2..b2b40e169342 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -276,7 +276,7 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work)
 	if (amdgpu_device_should_recover_gpu(adev)
 		&& (!amdgpu_device_has_job_running(adev) ||
 		adev->sdma_timeout == MAX_SCHEDULE_TIMEOUT))
-		amdgpu_device_gpu_recover(adev, NULL);
+		amdgpu_device_gpu_recover_imp(adev, NULL);
 }
 
 static int xgpu_ai_set_mailbox_rcv_irq(struct amdgpu_device *adev,
@@ -301,8 +301,11 @@ static int xgpu_ai_mailbox_rcv_irq(struct amdgpu_device *adev,
 
 	switch (event) {
 		case IDH_FLR_NOTIFICATION:
-		if (amdgpu_sriov_runtime(adev))
-			schedule_work(&adev->virt.flr_work);
+		if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
+			WARN_ONCE(!queue_work(adev->reset_domain.wq,
+					      &adev->virt.flr_work),
+				  "Failed to queue work! at %s",
+				  __func__);
 		break;
 		case IDH_QUERY_ALIVE:
 			xgpu_ai_mailbox_send_ack(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index a35e6d87e537..08411924150d 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -308,7 +308,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
 		adev->gfx_timeout == MAX_SCHEDULE_TIMEOUT ||
 		adev->compute_timeout == MAX_SCHEDULE_TIMEOUT ||
 		adev->video_timeout == MAX_SCHEDULE_TIMEOUT))
-		amdgpu_device_gpu_recover(adev, NULL);
+		amdgpu_device_gpu_recover_imp(adev, NULL);
 }
 
 static int xgpu_nv_set_mailbox_rcv_irq(struct amdgpu_device *adev,
@@ -336,8 +336,11 @@ static int xgpu_nv_mailbox_rcv_irq(struct amdgpu_device *adev,
 
 	switch (event) {
 	case IDH_FLR_NOTIFICATION:
-		if (amdgpu_sriov_runtime(adev))
-			schedule_work(&adev->virt.flr_work);
+		if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
+			WARN_ONCE(!queue_work(adev->reset_domain.wq,
+					      &adev->virt.flr_work),
+				  "Failed to queue work! at %s",
+				  __func__);
 		break;
 		/* READY_TO_ACCESS_GPU is fetched by kernel polling, IRQ can ignore
 		 * it entirely since that polling thread will handle it,
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
index aef9d059ae52..02290febfcf4 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
@@ -521,7 +521,7 @@ static void xgpu_vi_mailbox_flr_work(struct work_struct *work)
 
 	/* Trigger recovery due to world switch failure */
 	if (amdgpu_device_should_recover_gpu(adev))
-		amdgpu_device_gpu_recover(adev, NULL);
+		amdgpu_device_gpu_recover_imp(adev, NULL);
 }
 
 static int xgpu_vi_set_mailbox_rcv_irq(struct amdgpu_device *adev,
@@ -550,8 +550,11 @@ static int xgpu_vi_mailbox_rcv_irq(struct amdgpu_device *adev,
 		r = xgpu_vi_mailbox_rcv_msg(adev, IDH_FLR_NOTIFICATION);
 
 		/* only handle FLR_NOTIFY now */
-		if (!r)
-			schedule_work(&adev->virt.flr_work);
+		if (!r && !amdgpu_in_reset(adev))
+			WARN_ONCE(!queue_work(adev->reset_domain.wq,
+					      &adev->virt.flr_work),
+				  "Failed to queue work! at %s",
+				  __func__);
 	}
 
 	return 0;
-- 
2.25.1



* [RFC v3 06/12] drm/amdgpu: Drop hive->in_reset
  2022-01-25 22:37 [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
                   ` (4 preceding siblings ...)
  2022-01-25 22:37 ` [RFC v3 05/12] drm/amd/virt: For SRIOV send GPU reset directly to TDR queue Andrey Grodzovsky
@ 2022-01-25 22:37 ` Andrey Grodzovsky
  2022-02-08  6:33   ` Lazar, Lijo
  2022-01-25 22:37 ` [RFC v3 07/12] drm/amdgpu: Drop concurrent GPU reset protection for device Andrey Grodzovsky
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-25 22:37 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu

Since we serialize all resets there is no need to protect against
concurrent resets.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 +------------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  1 -
 3 files changed, 1 insertion(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 258ec3c0b2af..107a393ebbfd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -5013,25 +5013,9 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
 	dev_info(adev->dev, "GPU %s begin!\n",
 		need_emergency_restart ? "jobs stop":"reset");
 
-	/*
-	 * Here we trylock to avoid chain of resets executing from
-	 * either trigger by jobs on different adevs in XGMI hive or jobs on
-	 * different schedulers for same device while this TO handler is running.
-	 * We always reset all schedulers for device and all devices for XGMI
-	 * hive so that should take care of them too.
-	 */
 	hive = amdgpu_get_xgmi_hive(adev);
-	if (hive) {
-		if (atomic_cmpxchg(&hive->in_reset, 0, 1) != 0) {
-			DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as another already in progress",
-				job ? job->base.id : -1, hive->hive_id);
-			amdgpu_put_xgmi_hive(hive);
-			if (job && job->vm)
-				drm_sched_increase_karma(&job->base);
-			return 0;
-		}
+	if (hive)
 		mutex_lock(&hive->hive_lock);
-	}
 
 	reset_context.method = AMD_RESET_METHOD_NONE;
 	reset_context.reset_req_dev = adev;
@@ -5227,7 +5211,6 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
 
 skip_recovery:
 	if (hive) {
-		atomic_set(&hive->in_reset, 0);
 		mutex_unlock(&hive->hive_lock);
 		amdgpu_put_xgmi_hive(hive);
 	}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index a858e3457c5c..9ad742039ac9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -404,7 +404,6 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
 	INIT_LIST_HEAD(&hive->device_list);
 	INIT_LIST_HEAD(&hive->node);
 	mutex_init(&hive->hive_lock);
-	atomic_set(&hive->in_reset, 0);
 	atomic_set(&hive->number_devices, 0);
 	task_barrier_init(&hive->tb);
 	hive->pstate = AMDGPU_XGMI_PSTATE_UNKNOWN;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
index 6121aaa292cb..2f2ce53645a5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
@@ -33,7 +33,6 @@ struct amdgpu_hive_info {
 	struct list_head node;
 	atomic_t number_devices;
 	struct mutex hive_lock;
-	atomic_t in_reset;
 	int hi_req_count;
 	struct amdgpu_device *hi_req_gpu;
 	struct task_barrier tb;
-- 
2.25.1



* [RFC v3 07/12] drm/amdgpu: Drop concurrent GPU reset protection for device
  2022-01-25 22:37 [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
                   ` (5 preceding siblings ...)
  2022-01-25 22:37 ` [RFC v3 06/12] drm/amdgpu: Drop hive->in_reset Andrey Grodzovsky
@ 2022-01-25 22:37 ` Andrey Grodzovsky
  2022-01-25 22:37 ` [RFC v3 08/12] drm/amdgpu: Rework reset domain to be refcounted Andrey Grodzovsky
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-25 22:37 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu

Since all GPU resets are now serialized there is no need for this.

This patch also reverts 'drm/amdgpu: race issue when jobs on 2 ring timeout'

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 89 ++--------------------
 1 file changed, 7 insertions(+), 82 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 107a393ebbfd..fef952ca8db5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4763,11 +4763,10 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle,
 	return r;
 }
 
-static bool amdgpu_device_lock_adev(struct amdgpu_device *adev,
+static void amdgpu_device_lock_adev(struct amdgpu_device *adev,
 				struct amdgpu_hive_info *hive)
 {
-	if (atomic_cmpxchg(&adev->in_gpu_reset, 0, 1) != 0)
-		return false;
+	atomic_set(&adev->in_gpu_reset, 1);
 
 	if (hive) {
 		down_write_nest_lock(&adev->reset_sem, &hive->hive_lock);
@@ -4786,8 +4785,6 @@ static bool amdgpu_device_lock_adev(struct amdgpu_device *adev,
 		adev->mp1_state = PP_MP1_STATE_NONE;
 		break;
 	}
-
-	return true;
 }
 
 static void amdgpu_device_unlock_adev(struct amdgpu_device *adev)
@@ -4798,46 +4795,6 @@ static void amdgpu_device_unlock_adev(struct amdgpu_device *adev)
 	up_write(&adev->reset_sem);
 }
 
-/*
- * to lockup a list of amdgpu devices in a hive safely, if not a hive
- * with multiple nodes, it will be similar as amdgpu_device_lock_adev.
- *
- * unlock won't require roll back.
- */
-static int amdgpu_device_lock_hive_adev(struct amdgpu_device *adev, struct amdgpu_hive_info *hive)
-{
-	struct amdgpu_device *tmp_adev = NULL;
-
-	if (adev->gmc.xgmi.num_physical_nodes > 1) {
-		if (!hive) {
-			dev_err(adev->dev, "Hive is NULL while device has multiple xgmi nodes");
-			return -ENODEV;
-		}
-		list_for_each_entry(tmp_adev, &hive->device_list, gmc.xgmi.head) {
-			if (!amdgpu_device_lock_adev(tmp_adev, hive))
-				goto roll_back;
-		}
-	} else if (!amdgpu_device_lock_adev(adev, hive))
-		return -EAGAIN;
-
-	return 0;
-roll_back:
-	if (!list_is_first(&tmp_adev->gmc.xgmi.head, &hive->device_list)) {
-		/*
-		 * if the lockup iteration break in the middle of a hive,
-		 * it may means there may has a race issue,
-		 * or a hive device locked up independently.
-		 * we may be in trouble and may not, so will try to roll back
-		 * the lock and give out a warnning.
-		 */
-		dev_warn(tmp_adev->dev, "Hive lock iteration broke in the middle. Rolling back to unlock");
-		list_for_each_entry_continue_reverse(tmp_adev, &hive->device_list, gmc.xgmi.head) {
-			amdgpu_device_unlock_adev(tmp_adev);
-		}
-	}
-	return -EAGAIN;
-}
-
 static void amdgpu_device_resume_display_audio(struct amdgpu_device *adev)
 {
 	struct pci_dev *p = NULL;
@@ -5023,22 +4980,6 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
 	reset_context.hive = hive;
 	clear_bit(AMDGPU_NEED_FULL_RESET, &reset_context.flags);
 
-	/*
-	 * lock the device before we try to operate the linked list
-	 * if didn't get the device lock, don't touch the linked list since
-	 * others may iterating it.
-	 */
-	r = amdgpu_device_lock_hive_adev(adev, hive);
-	if (r) {
-		dev_info(adev->dev, "Bailing on TDR for s_job:%llx, as another already in progress",
-					job ? job->base.id : -1);
-
-		/* even we skipped this reset, still need to set the job to guilty */
-		if (job && job->vm)
-			drm_sched_increase_karma(&job->base);
-		goto skip_recovery;
-	}
-
 	/*
 	 * Build list of devices to reset.
 	 * In case we are in XGMI hive mode, resort the device list
@@ -5058,6 +4999,9 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
 
 	/* block all schedulers and reset given job's ring */
 	list_for_each_entry(tmp_adev, device_list_handle, reset_list) {
+
+		amdgpu_device_lock_adev(tmp_adev, hive);
+
 		/*
 		 * Try to put the audio codec into suspend state
 		 * before gpu reset started.
@@ -5209,13 +5153,12 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
 		amdgpu_device_unlock_adev(tmp_adev);
 	}
 
-skip_recovery:
 	if (hive) {
 		mutex_unlock(&hive->hive_lock);
 		amdgpu_put_xgmi_hive(hive);
 	}
 
-	if (r && r != -EAGAIN)
+	if (r)
 		dev_info(adev->dev, "GPU reset end with ret = %d\n", r);
 	return r;
 }
@@ -5438,20 +5381,6 @@ int amdgpu_device_baco_exit(struct drm_device *dev)
 	return 0;
 }
 
-static void amdgpu_cancel_all_tdr(struct amdgpu_device *adev)
-{
-	int i;
-
-	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
-		struct amdgpu_ring *ring = adev->rings[i];
-
-		if (!ring || !ring->sched.thread)
-			continue;
-
-		cancel_delayed_work_sync(&ring->sched.work_tdr);
-	}
-}
-
 /**
  * amdgpu_pci_error_detected - Called when a PCI error is detected.
  * @pdev: PCI device struct
@@ -5482,14 +5411,10 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
 	/* Fatal error, prepare for slot reset */
 	case pci_channel_io_frozen:
 		/*
-		 * Cancel and wait for all TDRs in progress if failing to
-		 * set  adev->in_gpu_reset in amdgpu_device_lock_adev
-		 *
 		 * Locking adev->reset_sem will prevent any external access
 		 * to GPU during PCI error recovery
 		 */
-		while (!amdgpu_device_lock_adev(adev, NULL))
-			amdgpu_cancel_all_tdr(adev);
+		amdgpu_device_lock_adev(adev, NULL);
 
 		/*
 		 * Block any work scheduling as we do for regular GPU reset
-- 
2.25.1



* [RFC v3 08/12] drm/amdgpu: Rework reset domain to be refcounted.
  2022-01-25 22:37 [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
                   ` (6 preceding siblings ...)
  2022-01-25 22:37 ` [RFC v3 07/12] drm/amdgpu: Drop concurrent GPU reset protection for device Andrey Grodzovsky
@ 2022-01-25 22:37 ` Andrey Grodzovsky
  2022-01-26 12:12   ` Christian König
  2022-02-02 17:26   ` [RFC v4] " Andrey Grodzovsky
  2022-01-25 22:37 ` [RFC v3 09/12] drm/amdgpu: Move reset sem into reset_domain Andrey Grodzovsky
                   ` (4 subsequent siblings)
  12 siblings, 2 replies; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-25 22:37 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu

The reset domain now contains the register access semaphore
and so needs to stay alive for as long as any device in a
hive needs it; hence it cannot be bound to the XGMI hive
life cycle.
Address this by making the reset domain refcounted and
pointed to by each member of the hive and the hive itself.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  6 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 44 +++++++++++++---------
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c  | 36 ++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h  | 10 +++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   | 26 ++++++++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  2 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c      |  4 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c      |  4 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c      |  4 +-
 9 files changed, 105 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 8e96b9a14452..f2ba460bfd59 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -813,9 +813,7 @@ struct amd_powerplay {
 #define AMDGPU_RESET_MAGIC_NUM 64
 #define AMDGPU_MAX_DF_PERFMONS 4
 
-struct amdgpu_reset_domain {
-	struct workqueue_struct *wq;
-};
+struct amdgpu_reset_domain;
 
 struct amdgpu_device {
 	struct device			*dev;
@@ -1102,7 +1100,7 @@ struct amdgpu_device {
 	struct amdgpu_reset_control     *reset_cntl;
 	uint32_t                        ip_versions[HW_ID_MAX][HWIP_MAX_INSTANCE];
 
-	struct amdgpu_reset_domain	reset_domain;
+	struct amdgpu_reset_domain	*reset_domain;
 };
 
 static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index fef952ca8db5..b24829096359 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2313,7 +2313,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
 
 		r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
 				   ring->num_hw_submission, amdgpu_job_hang_limit,
-				   timeout, adev->reset_domain.wq, ring->sched_score, ring->name);
+				   timeout, adev->reset_domain->wq, ring->sched_score, ring->name);
 		if (r) {
 			DRM_ERROR("Failed to create scheduler on ring %s.\n",
 				  ring->name);
@@ -2432,24 +2432,22 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 	if (r)
 		goto init_failed;
 
+	/*
+	 * In case of XGMI, grab an extra reference on the reset domain for this device
+	 */
 	if (adev->gmc.xgmi.num_physical_nodes > 1) {
-		struct amdgpu_hive_info *hive;
-
-		amdgpu_xgmi_add_device(adev);
+		if (amdgpu_xgmi_add_device(adev) == 0) {
+			struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev);
 
-		hive = amdgpu_get_xgmi_hive(adev);
-		if (!hive || !hive->reset_domain.wq) {
-			DRM_ERROR("Failed to obtain reset domain info for XGMI hive:%llx", hive->hive_id);
-			r = -EINVAL;
-			goto init_failed;
-		}
+			if (!hive->reset_domain ||
+			    !kref_get_unless_zero(&hive->reset_domain->refcount)) {
+				r = -ENOENT;
+				goto init_failed;
+			}
 
-		adev->reset_domain.wq = hive->reset_domain.wq;
-	} else {
-		adev->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-dev", 0);
-		if (!adev->reset_domain.wq) {
-			r = -ENOMEM;
-			goto init_failed;
+			/* Drop the early temporary reset domain we created for device */
+			kref_put(&adev->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
+			adev->reset_domain = hive->reset_domain;
 		}
 	}
 
@@ -3599,6 +3597,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 		return r;
 	}
 
+	/*
+	 * The reset domain needs to be present early, before the XGMI hive
+	 * is discovered (if any) and initialized, to use the reset sem and
+	 * in_gpu_reset flag early on during init.
+	 */
+	adev->reset_domain = amdgpu_reset_create_reset_domain("amdgpu-reset-dev");
+	if (!adev->reset_domain)
+		return -ENOMEM;
+
 	/* early init functions */
 	r = amdgpu_device_ip_early_init(adev);
 	if (r)
@@ -3949,6 +3956,9 @@ void amdgpu_device_fini_sw(struct amdgpu_device *adev)
 	if (adev->mman.discovery_bin)
 		amdgpu_discovery_fini(adev);
 
+	kref_put(&adev->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
+	adev->reset_domain = NULL;
+
 	kfree(adev->pci_state);
 
 }
@@ -5186,7 +5196,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 
 	INIT_WORK(&work.base, amdgpu_device_queue_gpu_recover_work);
 
-	if (!queue_work(adev->reset_domain.wq, &work.base))
+	if (!queue_work(adev->reset_domain->wq, &work.base))
 		return -EAGAIN;
 
 	flush_work(&work.base);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
index 02afd4115675..f2d310cd8d40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
@@ -96,3 +96,39 @@ int amdgpu_reset_perform_reset(struct amdgpu_device *adev,
 	return reset_handler->restore_hwcontext(adev->reset_cntl,
 						reset_context);
 }
+
+
+void amdgpu_reset_destroy_reset_domain(struct kref *ref)
+{
+	struct amdgpu_reset_domain *reset_domain = container_of(ref,
+								struct amdgpu_reset_domain,
+								refcount);
+	destroy_workqueue(reset_domain->wq);
+	kvfree(reset_domain);
+}
+
+struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(char *wq_name)
+{
+	struct amdgpu_reset_domain *reset_domain;
+
+	reset_domain = kvzalloc(sizeof(struct amdgpu_reset_domain), GFP_KERNEL);
+	if (!reset_domain) {
+		DRM_ERROR("Failed to allocate amdgpu_reset_domain!");
+		return NULL;
+	}
+
+	kref_init(&reset_domain->refcount);
+
+	reset_domain->wq = create_singlethread_workqueue(wq_name);
+	if (!reset_domain->wq) {
+		DRM_ERROR("Failed to allocate wq for amdgpu_reset_domain!");
+		kref_put(&reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
+		return NULL;
+
+	}
+
+	return reset_domain;
+}
+
+
+
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
index e00d38d9160a..cd030e63e4c6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
@@ -70,6 +70,12 @@ struct amdgpu_reset_control {
 	void (*async_reset)(struct work_struct *work);
 };
 
+struct amdgpu_reset_domain {
+	struct kref refcount;
+	struct workqueue_struct *wq;
+};
+
+
 int amdgpu_reset_init(struct amdgpu_device *adev);
 int amdgpu_reset_fini(struct amdgpu_device *adev);
 
@@ -82,4 +88,8 @@ int amdgpu_reset_perform_reset(struct amdgpu_device *adev,
 int amdgpu_reset_add_handler(struct amdgpu_reset_control *reset_ctl,
 			     struct amdgpu_reset_handler *handler);
 
+struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(char *wq_name);
+
+void amdgpu_reset_destroy_reset_domain(struct kref *ref);
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index 9ad742039ac9..5908a3f8208a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -32,6 +32,8 @@
 #include "wafl/wafl2_4_0_0_smn.h"
 #include "wafl/wafl2_4_0_0_sh_mask.h"
 
+#include "amdgpu_reset.h"
+
 #define smnPCS_XGMI23_PCS_ERROR_STATUS   0x11a01210
 #define smnPCS_XGMI3X16_PCS_ERROR_STATUS 0x11a0020c
 #define smnPCS_GOPX1_PCS_ERROR_STATUS    0x12200210
@@ -226,6 +228,9 @@ static void amdgpu_xgmi_hive_release(struct kobject *kobj)
 	struct amdgpu_hive_info *hive = container_of(
 		kobj, struct amdgpu_hive_info, kobj);
 
+	kref_put(&hive->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
+	hive->reset_domain = NULL;
+
 	mutex_destroy(&hive->hive_lock);
 	kfree(hive);
 }
@@ -392,12 +397,21 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
 		goto pro_end;
 	}
 
-	hive->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-hive", 0);
-	if (!hive->reset_domain.wq) {
-		dev_err(adev->dev, "XGMI: failed allocating wq for reset domain!\n");
-		kfree(hive);
-		hive = NULL;
-		goto pro_end;
+	/*
+	 * Avoid recreating the reset domain when the hive is reconstructed
+	 * for the case of resetting the devices in the XGMI hive during
+	 * probe for SRIOV. See https://www.spinics.net/lists/amd-gfx/msg58836.html
+	 */
+	if (!adev->reset_domain) {
+		hive->reset_domain = amdgpu_reset_create_reset_domain("amdgpu-reset-hive");
+		if (!hive->reset_domain) {
+			dev_err(adev->dev, "XGMI: failed initializing reset domain for xgmi hive\n");
+			ret = -ENOMEM;
+			kobject_put(&hive->kobj);
+			kfree(hive);
+			hive = NULL;
+			goto pro_end;
+		}
 	}
 
 	hive->hive_id = adev->gmc.xgmi.hive_id;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
index 2f2ce53645a5..dcaad22be492 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
@@ -42,7 +42,7 @@ struct amdgpu_hive_info {
 		AMDGPU_XGMI_PSTATE_UNKNOWN
 	} pstate;
 
-	struct amdgpu_reset_domain reset_domain;
+	struct amdgpu_reset_domain *reset_domain;
 };
 
 struct amdgpu_pcs_ras_field {
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index b2b40e169342..05e98af30b48 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -32,6 +32,8 @@
 #include "soc15_common.h"
 #include "mxgpu_ai.h"
 
+#include "amdgpu_reset.h"
+
 static void xgpu_ai_mailbox_send_ack(struct amdgpu_device *adev)
 {
 	WREG8(AI_MAIBOX_CONTROL_RCV_OFFSET_BYTE, 2);
@@ -302,7 +304,7 @@ static int xgpu_ai_mailbox_rcv_irq(struct amdgpu_device *adev,
 	switch (event) {
 		case IDH_FLR_NOTIFICATION:
 		if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
-			WARN_ONCE(!queue_work(adev->reset_domain.wq,
+			WARN_ONCE(!queue_work(adev->reset_domain->wq,
 					      &adev->virt.flr_work),
 				  "Failed to queue work! at %s",
 				  __func__);
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index 08411924150d..6e12055f3f4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -31,6 +31,8 @@
 #include "soc15_common.h"
 #include "mxgpu_nv.h"
 
+#include "amdgpu_reset.h"
+
 static void xgpu_nv_mailbox_send_ack(struct amdgpu_device *adev)
 {
 	WREG8(NV_MAIBOX_CONTROL_RCV_OFFSET_BYTE, 2);
@@ -337,7 +339,7 @@ static int xgpu_nv_mailbox_rcv_irq(struct amdgpu_device *adev,
 	switch (event) {
 	case IDH_FLR_NOTIFICATION:
 		if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
-			WARN_ONCE(!queue_work(adev->reset_domain.wq,
+			WARN_ONCE(!queue_work(adev->reset_domain->wq,
 					      &adev->virt.flr_work),
 				  "Failed to queue work! at %s",
 				  __func__);
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
index 02290febfcf4..fe1570c2146e 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
@@ -42,6 +42,8 @@
 #include "smu/smu_7_1_3_d.h"
 #include "mxgpu_vi.h"
 
+#include "amdgpu_reset.h"
+
 /* VI golden setting */
 static const u32 xgpu_fiji_mgcg_cgcg_init[] = {
 	mmRLC_CGTT_MGCG_OVERRIDE, 0xffffffff, 0xffffffff,
@@ -551,7 +553,7 @@ static int xgpu_vi_mailbox_rcv_irq(struct amdgpu_device *adev,
 
 		/* only handle FLR_NOTIFY now */
 		if (!r && !amdgpu_in_reset(adev))
-			WARN_ONCE(!queue_work(adev->reset_domain.wq,
+			WARN_ONCE(!queue_work(adev->reset_domain->wq,
 					      &adev->virt.flr_work),
 				  "Failed to queue work! at %s",
 				  __func__);
-- 
2.25.1



* [RFC v3 09/12] drm/amdgpu: Move reset sem into reset_domain
  2022-01-25 22:37 [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
                   ` (7 preceding siblings ...)
  2022-01-25 22:37 ` [RFC v3 08/12] drm/amdgpu: Rework reset domain to be refcounted Andrey Grodzovsky
@ 2022-01-25 22:37 ` Andrey Grodzovsky
  2022-01-25 22:37 ` [RFC v3 10/12] drm/amdgpu: Move in_gpu_reset " Andrey Grodzovsky
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-25 22:37 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu

We want a single instance of the reset sem across all
reset clients because, in the XGMI case, we must stop
cross-device MMIO access while any device in the hive
could be in reset at that moment.
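
The intended access pattern, sketched with placeholder demo_* names: register
accessors take the shared semaphore for reading and back off while a reset
(the writer) holds it:

#include <linux/rwsem.h>

/*
 * Sketch only: the rwsem is shared by every device in the reset domain,
 * so a reset on any hive member blocks MMIO on all of them.
 */
static u32 demo_safe_rreg(struct rw_semaphore *reset_sem,
			  u32 (*mmio_read)(u32 reg), u32 reg)
{
	u32 val = 0;

	if (down_read_trylock(reset_sem)) {
		val = mmio_read(reg);	/* no reset in flight, MMIO is safe */
		up_read(reset_sem);
	}
	/* else: a device in the domain is resetting; skip the access */

	return val;
}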

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c   | 10 ++++----
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 23 +++++++++----------
 .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    | 18 ++++++++-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c     |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h     |  1 +
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c        |  6 +++--
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c         | 14 ++++++-----
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c         |  4 ++--
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c         |  4 ++--
 10 files changed, 46 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index f2ba460bfd59..f021cd3c9d34 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1058,7 +1058,6 @@ struct amdgpu_device {
 
 	atomic_t 			in_gpu_reset;
 	enum pp_mp1_state               mp1_state;
-	struct rw_semaphore reset_sem;
 	struct amdgpu_doorbell_index doorbell_index;
 
 	struct mutex			notifier_lock;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 164d6a9e9fbb..e7ccd5d16e9d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -37,6 +37,8 @@
 #include "amdgpu_fw_attestation.h"
 #include "amdgpu_umr.h"
 
+#include "amdgpu_reset.h"
+
 #if defined(CONFIG_DEBUG_FS)
 
 /**
@@ -1279,7 +1281,7 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused)
 	}
 
 	/* Avoid accidentally unparking the sched thread during GPU reset */
-	r = down_write_killable(&adev->reset_sem);
+	r = down_write_killable(&adev->reset_domain->sem);
 	if (r)
 		return r;
 
@@ -1308,7 +1310,7 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused)
 		kthread_unpark(ring->sched.thread);
 	}
 
-	up_write(&adev->reset_sem);
+	up_write(&adev->reset_domain->sem);
 
 	pm_runtime_mark_last_busy(dev->dev);
 	pm_runtime_put_autosuspend(dev->dev);
@@ -1517,7 +1519,7 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val)
 		return -ENOMEM;
 
 	/* Avoid accidentally unparking the sched thread during GPU reset */
-	r = down_read_killable(&adev->reset_sem);
+	r = down_read_killable(&adev->reset_domain->sem);
 	if (r)
 		goto pro_end;
 
@@ -1560,7 +1562,7 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val)
 	/* restart the scheduler */
 	kthread_unpark(ring->sched.thread);
 
-	up_read(&adev->reset_sem);
+	up_read(&adev->reset_domain->sem);
 
 	ttm_bo_unlock_delayed_workqueue(&adev->mman.bdev, resched);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b24829096359..6991ab4a8191 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -423,10 +423,10 @@ bool amdgpu_device_skip_hw_access(struct amdgpu_device *adev)
 	 * the lock.
 	 */
 	if (in_task()) {
-		if (down_read_trylock(&adev->reset_sem))
-			up_read(&adev->reset_sem);
+		if (down_read_trylock(&adev->reset_domain->sem))
+			up_read(&adev->reset_domain->sem);
 		else
-			lockdep_assert_held(&adev->reset_sem);
+			lockdep_assert_held(&adev->reset_domain->sem);
 	}
 #endif
 	return false;
@@ -452,9 +452,9 @@ uint32_t amdgpu_device_rreg(struct amdgpu_device *adev,
 	if ((reg * 4) < adev->rmmio_size) {
 		if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
 		    amdgpu_sriov_runtime(adev) &&
-		    down_read_trylock(&adev->reset_sem)) {
+		    down_read_trylock(&adev->reset_domain->sem)) {
 			ret = amdgpu_kiq_rreg(adev, reg);
-			up_read(&adev->reset_sem);
+			up_read(&adev->reset_domain->sem);
 		} else {
 			ret = readl(((void __iomem *)adev->rmmio) + (reg * 4));
 		}
@@ -537,9 +537,9 @@ void amdgpu_device_wreg(struct amdgpu_device *adev,
 	if ((reg * 4) < adev->rmmio_size) {
 		if (!(acc_flags & AMDGPU_REGS_NO_KIQ) &&
 		    amdgpu_sriov_runtime(adev) &&
-		    down_read_trylock(&adev->reset_sem)) {
+		    down_read_trylock(&adev->reset_domain->sem)) {
 			amdgpu_kiq_wreg(adev, reg, v);
-			up_read(&adev->reset_sem);
+			up_read(&adev->reset_domain->sem);
 		} else {
 			writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
 		}
@@ -3512,7 +3512,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	mutex_init(&adev->virt.vf_errors.lock);
 	hash_init(adev->mn_hash);
 	atomic_set(&adev->in_gpu_reset, 0);
-	init_rwsem(&adev->reset_sem);
 	mutex_init(&adev->psp.mutex);
 	mutex_init(&adev->notifier_lock);
 
@@ -4779,9 +4778,9 @@ static void amdgpu_device_lock_adev(struct amdgpu_device *adev,
 	atomic_set(&adev->in_gpu_reset, 1);
 
 	if (hive) {
-		down_write_nest_lock(&adev->reset_sem, &hive->hive_lock);
+		down_write_nest_lock(&adev->reset_domain->sem, &hive->hive_lock);
 	} else {
-		down_write(&adev->reset_sem);
+		down_write(&adev->reset_domain->sem);
 	}
 
 	switch (amdgpu_asic_reset_method(adev)) {
@@ -4802,7 +4801,7 @@ static void amdgpu_device_unlock_adev(struct amdgpu_device *adev)
 	amdgpu_vf_error_trans_all(adev);
 	adev->mp1_state = PP_MP1_STATE_NONE;
 	atomic_set(&adev->in_gpu_reset, 0);
-	up_write(&adev->reset_sem);
+	up_write(&adev->reset_domain->sem);
 }
 
 static void amdgpu_device_resume_display_audio(struct amdgpu_device *adev)
@@ -5421,7 +5420,7 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
 	/* Fatal error, prepare for slot reset */
 	case pci_channel_io_frozen:
 		/*
-		 * Locking adev->reset_sem will prevent any external access
+		 * Locking adev->reset_domain->sem will prevent any external access
 		 * to GPU during PCI error recovery
 		 */
 		amdgpu_device_lock_adev(adev, NULL);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
index 05117eda105b..d3e055314804 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c
@@ -31,6 +31,8 @@
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
 
+#include "amdgpu_reset.h"
+
 #define EEPROM_I2C_MADDR_VEGA20         0x0
 #define EEPROM_I2C_MADDR_ARCTURUS       0x40000
 #define EEPROM_I2C_MADDR_ARCTURUS_D342  0x0
@@ -193,12 +195,12 @@ static int __write_table_header(struct amdgpu_ras_eeprom_control *control)
 	__encode_table_header_to_buf(&control->tbl_hdr, buf);
 
 	/* i2c may be unstable in gpu reset */
-	down_read(&adev->reset_sem);
+	down_read(&adev->reset_domain->sem);
 	res = amdgpu_eeprom_write(&adev->pm.smu_i2c,
 				  control->i2c_address +
 				  control->ras_header_offset,
 				  buf, RAS_TABLE_HEADER_SIZE);
-	up_read(&adev->reset_sem);
+	up_read(&adev->reset_domain->sem);
 
 	if (res < 0) {
 		DRM_ERROR("Failed to write EEPROM table header:%d", res);
@@ -387,13 +389,13 @@ static int __amdgpu_ras_eeprom_write(struct amdgpu_ras_eeprom_control *control,
 	int res;
 
 	/* i2c may be unstable in gpu reset */
-	down_read(&adev->reset_sem);
+	down_read(&adev->reset_domain->sem);
 	buf_size = num * RAS_TABLE_RECORD_SIZE;
 	res = amdgpu_eeprom_write(&adev->pm.smu_i2c,
 				  control->i2c_address +
 				  RAS_INDEX_TO_OFFSET(control, fri),
 				  buf, buf_size);
-	up_read(&adev->reset_sem);
+	up_read(&adev->reset_domain->sem);
 	if (res < 0) {
 		DRM_ERROR("Writing %d EEPROM table records error:%d",
 			  num, res);
@@ -547,12 +549,12 @@ amdgpu_ras_eeprom_update_header(struct amdgpu_ras_eeprom_control *control)
 		goto Out;
 	}
 
-	down_read(&adev->reset_sem);
+	down_read(&adev->reset_domain->sem);
 	res = amdgpu_eeprom_read(&adev->pm.smu_i2c,
 				 control->i2c_address +
 				 control->ras_record_offset,
 				 buf, buf_size);
-	up_read(&adev->reset_sem);
+	up_read(&adev->reset_domain->sem);
 	if (res < 0) {
 		DRM_ERROR("EEPROM failed reading records:%d\n",
 			  res);
@@ -642,13 +644,13 @@ static int __amdgpu_ras_eeprom_read(struct amdgpu_ras_eeprom_control *control,
 	int res;
 
 	/* i2c may be unstable in gpu reset */
-	down_read(&adev->reset_sem);
+	down_read(&adev->reset_domain->sem);
 	buf_size = num * RAS_TABLE_RECORD_SIZE;
 	res = amdgpu_eeprom_read(&adev->pm.smu_i2c,
 				 control->i2c_address +
 				 RAS_INDEX_TO_OFFSET(control, fri),
 				 buf, buf_size);
-	up_read(&adev->reset_sem);
+	up_read(&adev->reset_domain->sem);
 	if (res < 0) {
 		DRM_ERROR("Reading %d EEPROM table records error:%d",
 			  num, res);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
index f2d310cd8d40..011585e330f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
@@ -127,6 +127,8 @@ struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(char *wq_name)
 
 	}
 
+	init_rwsem(&reset_domain->sem);
+
 	return reset_domain;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
index cd030e63e4c6..7451089b0c06 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
@@ -73,6 +73,7 @@ struct amdgpu_reset_control {
 struct amdgpu_reset_domain {
 	struct kref refcount;
 	struct workqueue_struct *wq;
+	struct rw_semaphore sem;
 };
 
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 3ec5ff5a6dbe..29d3a222181e 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -48,6 +48,8 @@
 #include "athub_v2_0.h"
 #include "athub_v2_1.h"
 
+#include "amdgpu_reset.h"
+
 #if 0
 static const struct soc15_reg_golden golden_settings_navi10_hdp[] =
 {
@@ -328,7 +330,7 @@ static void gmc_v10_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid,
 	 */
 	if (adev->gfx.kiq.ring.sched.ready &&
 	    (amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev)) &&
-	    down_read_trylock(&adev->reset_sem)) {
+	    down_read_trylock(&adev->reset_domain->sem)) {
 		struct amdgpu_vmhub *hub = &adev->vmhub[vmhub];
 		const unsigned eng = 17;
 		u32 inv_req = hub->vmhub_funcs->get_invalidate_req(vmid, flush_type);
@@ -338,7 +340,7 @@ static void gmc_v10_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid,
 		amdgpu_virt_kiq_reg_write_reg_wait(adev, req, ack, inv_req,
 				1 << vmid);
 
-		up_read(&adev->reset_sem);
+		up_read(&adev->reset_domain->sem);
 		return;
 	}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index cb82404df534..d5e8d4709d8b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -62,6 +62,8 @@
 #include "amdgpu_ras.h"
 #include "amdgpu_xgmi.h"
 
+#include "amdgpu_reset.h"
+
 /* add these here since we already include dce12 headers and these are for DCN */
 #define mmHUBP0_DCSURF_PRI_VIEWPORT_DIMENSION                                                          0x055d
 #define mmHUBP0_DCSURF_PRI_VIEWPORT_DIMENSION_BASE_IDX                                                 2
@@ -766,13 +768,13 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid,
 	 */
 	if (adev->gfx.kiq.ring.sched.ready &&
 	    (amdgpu_sriov_runtime(adev) || !amdgpu_sriov_vf(adev)) &&
-	    down_read_trylock(&adev->reset_sem)) {
+	    down_read_trylock(&adev->reset_domain->sem)) {
 		uint32_t req = hub->vm_inv_eng0_req + hub->eng_distance * eng;
 		uint32_t ack = hub->vm_inv_eng0_ack + hub->eng_distance * eng;
 
 		amdgpu_virt_kiq_reg_write_reg_wait(adev, req, ack, inv_req,
 						   1 << vmid);
-		up_read(&adev->reset_sem);
+		up_read(&adev->reset_domain->sem);
 		return;
 	}
 
@@ -868,7 +870,7 @@ static int gmc_v9_0_flush_gpu_tlb_pasid(struct amdgpu_device *adev,
 	if (amdgpu_in_reset(adev))
 		return -EIO;
 
-	if (ring->sched.ready && down_read_trylock(&adev->reset_sem)) {
+	if (ring->sched.ready && down_read_trylock(&adev->reset_domain->sem)) {
 		/* Vega20+XGMI caches PTEs in TC and TLB. Add a
 		 * heavy-weight TLB flush (type 2), which flushes
 		 * both. Due to a race condition with concurrent
@@ -895,7 +897,7 @@ static int gmc_v9_0_flush_gpu_tlb_pasid(struct amdgpu_device *adev,
 		if (r) {
 			amdgpu_ring_undo(ring);
 			spin_unlock(&adev->gfx.kiq.ring_lock);
-			up_read(&adev->reset_sem);
+			up_read(&adev->reset_domain->sem);
 			return -ETIME;
 		}
 
@@ -904,10 +906,10 @@ static int gmc_v9_0_flush_gpu_tlb_pasid(struct amdgpu_device *adev,
 		r = amdgpu_fence_wait_polling(ring, seq, adev->usec_timeout);
 		if (r < 1) {
 			dev_err(adev->dev, "wait for kiq fence error: %ld.\n", r);
-			up_read(&adev->reset_sem);
+			up_read(&adev->reset_domain->sem);
 			return -ETIME;
 		}
-		up_read(&adev->reset_sem);
+		up_read(&adev->reset_domain->sem);
 		return 0;
 	}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index 05e98af30b48..5dab06fce26a 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -254,7 +254,7 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work)
 	 * otherwise the mailbox msg will be ruined/reseted by
 	 * the VF FLR.
 	 */
-	if (!down_write_trylock(&adev->reset_sem))
+	if (!down_write_trylock(&adev->reset_domain->sem))
 		return;
 
 	amdgpu_virt_fini_data_exchange(adev);
@@ -272,7 +272,7 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work)
 
 flr_done:
 	atomic_set(&adev->in_gpu_reset, 0);
-	up_write(&adev->reset_sem);
+	up_write(&adev->reset_domain->sem);
 
 	/* Trigger recovery for world switch failure if no TDR */
 	if (amdgpu_device_should_recover_gpu(adev)
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index 6e12055f3f4a..868144fff16a 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -283,7 +283,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
 	 * otherwise the mailbox msg will be ruined/reseted by
 	 * the VF FLR.
 	 */
-	if (!down_write_trylock(&adev->reset_sem))
+	if (!down_write_trylock(&adev->reset_domain->sem))
 		return;
 
 	amdgpu_virt_fini_data_exchange(adev);
@@ -301,7 +301,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
 
 flr_done:
 	atomic_set(&adev->in_gpu_reset, 0);
-	up_write(&adev->reset_sem);
+	up_write(&adev->reset_domain->sem);
 
 	/* Trigger recovery for world switch failure if no TDR */
 	if (amdgpu_device_should_recover_gpu(adev)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC v3 10/12] drm/amdgpu: Move in_gpu_reset into reset_domain
  2022-01-25 22:37 [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
                   ` (8 preceding siblings ...)
  2022-01-25 22:37 ` [RFC v3 09/12] drm/amdgpu: Move reset sem into reset_domain Andrey Grodzovsky
@ 2022-01-25 22:37 ` Andrey Grodzovsky
  2022-02-08 10:49   ` Lazar, Lijo
  2022-01-25 22:37 ` [RFC v3 11/12] drm/amdgpu: Rework amdgpu_device_lock_adev Andrey Grodzovsky
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-25 22:37 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu

We should have a single instance per entire reset domain.
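
A minimal sketch of the effect (not part of the patch; dev_a and dev_b
are hypothetical devices in the same XGMI hive and therefore share one
refcounted reset_domain):

	/* marking the shared domain in reset via one device ... */
	atomic_set(&dev_a->reset_domain->in_gpu_reset, 1);

	/* ... is immediately visible through every other device */
	WARN_ON(!amdgpu_in_reset(dev_b));

With the per-device atomic this invariant had to be maintained by
setting the flag on every device of the hive separately.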

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  7 ++-----
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +++++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c  |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h  |  1 +
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c      |  4 ++--
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c      |  4 ++--
 6 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index f021cd3c9d34..087796e389ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1056,7 +1056,6 @@ struct amdgpu_device {
 	bool				in_s4;
 	bool				in_s0ix;
 
-	atomic_t 			in_gpu_reset;
 	enum pp_mp1_state               mp1_state;
 	struct amdgpu_doorbell_index doorbell_index;
 
@@ -1461,8 +1460,6 @@ static inline bool amdgpu_is_tmz(struct amdgpu_device *adev)
        return adev->gmc.tmz_enabled;
 }
 
-static inline int amdgpu_in_reset(struct amdgpu_device *adev)
-{
-	return atomic_read(&adev->in_gpu_reset);
-}
+int amdgpu_in_reset(struct amdgpu_device *adev);
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 6991ab4a8191..aa43af443ebe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3511,7 +3511,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	mutex_init(&adev->mn_lock);
 	mutex_init(&adev->virt.vf_errors.lock);
 	hash_init(adev->mn_hash);
-	atomic_set(&adev->in_gpu_reset, 0);
 	mutex_init(&adev->psp.mutex);
 	mutex_init(&adev->notifier_lock);
 
@@ -4775,7 +4774,7 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle,
 static void amdgpu_device_lock_adev(struct amdgpu_device *adev,
 				struct amdgpu_hive_info *hive)
 {
-	atomic_set(&adev->in_gpu_reset, 1);
+	atomic_set(&adev->reset_domain->in_gpu_reset, 1);
 
 	if (hive) {
 		down_write_nest_lock(&adev->reset_domain->sem, &hive->hive_lock);
@@ -4800,7 +4799,7 @@ static void amdgpu_device_unlock_adev(struct amdgpu_device *adev)
 {
 	amdgpu_vf_error_trans_all(adev);
 	adev->mp1_state = PP_MP1_STATE_NONE;
-	atomic_set(&adev->in_gpu_reset, 0);
+	atomic_set(&adev->reset_domain->in_gpu_reset, 0);
 	up_write(&adev->reset_domain->sem);
 }
 
@@ -5643,3 +5642,8 @@ void amdgpu_device_invalidate_hdp(struct amdgpu_device *adev,
 
 	amdgpu_asic_invalidate_hdp(adev, ring);
 }
+
+int amdgpu_in_reset(struct amdgpu_device *adev)
+{
+	return atomic_read(&adev->reset_domain->in_gpu_reset);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
index 011585e330f6..e9b804a89b34 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
@@ -127,6 +127,7 @@ struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(char *wq_name)
 
 	}
 
+	atomic_set(&reset_domain->in_gpu_reset, 0);
 	init_rwsem(&reset_domain->sem);
 
 	return reset_domain;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
index 7451089b0c06..413982f4e1ce 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
@@ -74,6 +74,7 @@ struct amdgpu_reset_domain {
 	struct kref refcount;
 	struct workqueue_struct *wq;
 	struct rw_semaphore sem;
+	atomic_t in_gpu_reset;
 };
 
 
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index 5dab06fce26a..6c79746d18db 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -258,7 +258,7 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work)
 		return;
 
 	amdgpu_virt_fini_data_exchange(adev);
-	atomic_set(&adev->in_gpu_reset, 1);
+	atomic_set(&adev->reset_domain->in_gpu_reset, 1);
 
 	xgpu_ai_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0);
 
@@ -271,7 +271,7 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work)
 	} while (timeout > 1);
 
 flr_done:
-	atomic_set(&adev->in_gpu_reset, 0);
+	atomic_set(&adev->reset_domain->in_gpu_reset, 0);
 	up_write(&adev->reset_domain->sem);
 
 	/* Trigger recovery for world switch failure if no TDR */
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index 868144fff16a..39f7e1e9ab81 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -287,7 +287,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
 		return;
 
 	amdgpu_virt_fini_data_exchange(adev);
-	atomic_set(&adev->in_gpu_reset, 1);
+	atomic_set(&adev->reset_domain->in_gpu_reset, 1);
 
 	xgpu_nv_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0);
 
@@ -300,7 +300,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
 	} while (timeout > 1);
 
 flr_done:
-	atomic_set(&adev->in_gpu_reset, 0);
+	atomic_set(&adev->reset_domain->in_gpu_reset, 0);
 	up_write(&adev->reset_domain->sem);
 
 	/* Trigger recovery for world switch failure if no TDR */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC v3 11/12] drm/amdgpu: Rework amdgpu_device_lock_adev
  2022-01-25 22:37 [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
                   ` (9 preceding siblings ...)
  2022-01-25 22:37 ` [RFC v3 10/12] drm/amdgpu: Move in_gpu_reset " Andrey Grodzovsky
@ 2022-01-25 22:37 ` Andrey Grodzovsky
  2022-01-25 22:37 ` [RFC v3 12/12] Revert 'drm/amdgpu: annotate a false positive recursive locking' Andrey Grodzovsky
  2022-01-28 19:36 ` [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
  12 siblings, 0 replies; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-25 22:37 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu

This function needs to be split into two parts: one that is
called only once per reset to lock the single instance of the
reset_domain's sem and reset flag, and another which handles the
MP1 states and should still be called for each device in the
XGMI hive.
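
A minimal sketch of the resulting call sequence for a hive (names as
introduced by the patch below; device_list_handle is assumed to already
hold the devices selected for reset):

	/* the shared domain is locked exactly once, via the first device */
	tmp_adev = list_first_entry(device_list_handle,
				    struct amdgpu_device, reset_list);
	amdgpu_device_lock_reset_domain(tmp_adev->reset_domain, hive);

	/* the MP1 state handling still runs for every hive member */
	list_for_each_entry(tmp_adev, device_list_handle, reset_list)
		amdgpu_device_set_mp1_state(tmp_adev);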

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 48 ++++++++++++++++------
 1 file changed, 35 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index aa43af443ebe..31310922b6bf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4771,16 +4771,20 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle,
 	return r;
 }
 
-static void amdgpu_device_lock_adev(struct amdgpu_device *adev,
-				struct amdgpu_hive_info *hive)
+static void amdgpu_device_lock_reset_domain(struct amdgpu_reset_domain *reset_domain,
+					    struct amdgpu_hive_info *hive)
 {
-	atomic_set(&adev->reset_domain->in_gpu_reset, 1);
+	atomic_set(&reset_domain->in_gpu_reset, 1);
 
 	if (hive) {
-		down_write_nest_lock(&adev->reset_domain->sem, &hive->hive_lock);
+		down_write_nest_lock(&reset_domain->sem, &hive->hive_lock);
 	} else {
-		down_write(&adev->reset_domain->sem);
+		down_write(&reset_domain->sem);
 	}
+}
+
+static void amdgpu_device_set_mp1_state(struct amdgpu_device *adev)
+{
 
 	switch (amdgpu_asic_reset_method(adev)) {
 	case AMD_RESET_METHOD_MODE1:
@@ -4795,14 +4799,19 @@ static void amdgpu_device_lock_adev(struct amdgpu_device *adev,
 	}
 }
 
-static void amdgpu_device_unlock_adev(struct amdgpu_device *adev)
+static void amdgpu_device_unset_mp1_state(struct amdgpu_device *adev)
 {
 	amdgpu_vf_error_trans_all(adev);
 	adev->mp1_state = PP_MP1_STATE_NONE;
-	atomic_set(&adev->reset_domain->in_gpu_reset, 0);
-	up_write(&adev->reset_domain->sem);
 }
 
+static void amdgpu_device_unlock_reset_domain(struct amdgpu_reset_domain *reset_domain)
+{
+	atomic_set(&reset_domain->in_gpu_reset, 0);
+	up_write(&reset_domain->sem);
+}
+
+
 static void amdgpu_device_resume_display_audio(struct amdgpu_device *adev)
 {
 	struct pci_dev *p = NULL;
@@ -5005,10 +5014,15 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
 		device_list_handle = &device_list;
 	}
 
+	/* We need to lock reset domain only once both for XGMI and single device */
+	tmp_adev = list_first_entry(device_list_handle, struct amdgpu_device,
+				    reset_list);
+	amdgpu_device_lock_reset_domain(tmp_adev->reset_domain, hive);
+
 	/* block all schedulers and reset given job's ring */
 	list_for_each_entry(tmp_adev, device_list_handle, reset_list) {
 
-		amdgpu_device_lock_adev(tmp_adev, hive);
+		amdgpu_device_set_mp1_state(tmp_adev);
 
 		/*
 		 * Try to put the audio codec into suspend state
@@ -5158,9 +5172,14 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
 
 		if (audio_suspended)
 			amdgpu_device_resume_display_audio(tmp_adev);
-		amdgpu_device_unlock_adev(tmp_adev);
+
+		amdgpu_device_unset_mp1_state(tmp_adev);
 	}
 
+	tmp_adev = list_first_entry(device_list_handle, struct amdgpu_device,
+					    reset_list);
+	amdgpu_device_unlock_reset_domain(tmp_adev->reset_domain);
+
 	if (hive) {
 		mutex_unlock(&hive->hive_lock);
 		amdgpu_put_xgmi_hive(hive);
@@ -5422,7 +5441,8 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
 		 * Locking adev->reset_domain->sem will prevent any external access
 		 * to GPU during PCI error recovery
 		 */
-		amdgpu_device_lock_adev(adev, NULL);
+		amdgpu_device_lock_reset_domain(adev->reset_domain, NULL);
+		amdgpu_device_set_mp1_state(adev);
 
 		/*
 		 * Block any work scheduling as we do for regular GPU reset
@@ -5529,7 +5549,8 @@ pci_ers_result_t amdgpu_pci_slot_reset(struct pci_dev *pdev)
 		DRM_INFO("PCIe error recovery succeeded\n");
 	} else {
 		DRM_ERROR("PCIe error recovery failed, err:%d", r);
-		amdgpu_device_unlock_adev(adev);
+		amdgpu_device_unset_mp1_state(adev);
+		amdgpu_device_unlock_reset_domain(adev->reset_domain);
 	}
 
 	return r ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
@@ -5566,7 +5587,8 @@ void amdgpu_pci_resume(struct pci_dev *pdev)
 		drm_sched_start(&ring->sched, true);
 	}
 
-	amdgpu_device_unlock_adev(adev);
+	amdgpu_device_unset_mp1_state(adev);
+	amdgpu_device_unlock_reset_domain(adev->reset_domain);
 }
 
 bool amdgpu_device_cache_pci_state(struct pci_dev *pdev)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [RFC v3 12/12] Revert 'drm/amdgpu: annotate a false positive recursive locking'
  2022-01-25 22:37 [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
                   ` (10 preceding siblings ...)
  2022-01-25 22:37 ` [RFC v3 11/12] drm/amdgpu: Rework amdgpu_device_lock_adev Andrey Grodzovsky
@ 2022-01-25 22:37 ` Andrey Grodzovsky
  2022-01-28 19:36 ` [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
  12 siblings, 0 replies; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-25 22:37 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu

Since we now have a single instance of the reset semaphore, which we
lock only once even for an XGMI hive, we no longer need the nested
locking hint.
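
For reference, a hand-written sketch of why the annotation existed and
why it can go (not taken from the patch): with one reset_sem per device,
all write-locked under hive->hive_lock, lockdep saw repeated
acquisitions of the same lock class and needed the nesting hint; with a
single per-domain semaphore taken exactly once, plain down_write()
suffices:

	/* old: one semaphore per device, N acquisitions of one class */
	list_for_each_entry(tmp_adev, device_list_handle, reset_list)
		down_write_nest_lock(&tmp_adev->reset_sem,
				     &hive->hive_lock);

	/* new: one semaphore per reset domain, taken exactly once */
	down_write(&reset_domain->sem);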

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 31310922b6bf..b97992d62db5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4771,16 +4771,10 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle,
 	return r;
 }
 
-static void amdgpu_device_lock_reset_domain(struct amdgpu_reset_domain *reset_domain,
-					    struct amdgpu_hive_info *hive)
+static void amdgpu_device_lock_reset_domain(struct amdgpu_reset_domain *reset_domain)
 {
 	atomic_set(&reset_domain->in_gpu_reset, 1);
-
-	if (hive) {
-		down_write_nest_lock(&reset_domain->sem, &hive->hive_lock);
-	} else {
-		down_write(&reset_domain->sem);
-	}
+	down_write(&reset_domain->sem);
 }
 
 static void amdgpu_device_set_mp1_state(struct amdgpu_device *adev)
@@ -5017,7 +5011,7 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
 	/* We need to lock reset domain only once both for XGMI and single device */
 	tmp_adev = list_first_entry(device_list_handle, struct amdgpu_device,
 				    reset_list);
-	amdgpu_device_lock_reset_domain(tmp_adev->reset_domain, hive);
+	amdgpu_device_lock_reset_domain(tmp_adev->reset_domain);
 
 	/* block all schedulers and reset given job's ring */
 	list_for_each_entry(tmp_adev, device_list_handle, reset_list) {
@@ -5441,7 +5435,7 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta
 		 * Locking adev->reset_domain->sem will prevent any external access
 		 * to GPU during PCI error recovery
 		 */
-		amdgpu_device_lock_reset_domain(adev->reset_domain, NULL);
+		amdgpu_device_lock_reset_domain(adev->reset_domain);
 		amdgpu_device_set_mp1_state(adev);
 
 		/*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [RFC v3 01/12] drm/amdgpu: Introduce reset domain
  2022-01-25 22:37 ` [RFC v3 01/12] drm/amdgpu: Introduce reset domain Andrey Grodzovsky
@ 2022-01-26 12:07   ` Christian König
  2022-01-26 15:47     ` Andrey Grodzovsky
  0 siblings, 1 reply; 27+ messages in thread
From: Christian König @ 2022-01-26 12:07 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx
  Cc: Daniel Vetter, horace.chen, lijo.lazar, jingwech,
	christian.koenig, Monk.Liu

Am 25.01.22 um 23:37 schrieb Andrey Grodzovsky:
> Defined a reset_domain struct such that
> all the entities that go through reset
> together will be serialized one against
> another. Do it for both single device and
> XGMI hive cases.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Suggested-by: Christian König <ckoenig.leichtzumerken@gmail.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  7 +++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 +++++++++++++++++++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   |  9 +++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  2 ++
>   4 files changed, 37 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 9f017663ac50..b5ff76aae7e0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -812,6 +812,11 @@ struct amd_powerplay {
>   
>   #define AMDGPU_RESET_MAGIC_NUM 64
>   #define AMDGPU_MAX_DF_PERFMONS 4
> +
> +struct amdgpu_reset_domain {
> +	struct workqueue_struct *wq;
> +};
> +
>   struct amdgpu_device {
>   	struct device			*dev;
>   	struct pci_dev			*pdev;
> @@ -1096,6 +1101,8 @@ struct amdgpu_device {
>   
>   	struct amdgpu_reset_control     *reset_cntl;
>   	uint32_t                        ip_versions[HW_ID_MAX][HWIP_MAX_INSTANCE];
> +
> +	struct amdgpu_reset_domain	reset_domain;

I'm a bit confused, shouldn't this be a pointer?

Regards,
Christian.

>   };
>   
>   static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 90d22a376632..0f3e6c078f88 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2391,9 +2391,27 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>   	if (r)
>   		goto init_failed;
>   
> -	if (adev->gmc.xgmi.num_physical_nodes > 1)
> +	if (adev->gmc.xgmi.num_physical_nodes > 1) {
> +		struct amdgpu_hive_info *hive;
> +
>   		amdgpu_xgmi_add_device(adev);
>   
> +		hive = amdgpu_get_xgmi_hive(adev);
> +		if (!hive || !hive->reset_domain.wq) {
> +			DRM_ERROR("Failed to obtain reset domain info for XGMI hive:%llx", hive->hive_id);
> +			r = -EINVAL;
> +			goto init_failed;
> +		}
> +
> +		adev->reset_domain.wq = hive->reset_domain.wq;
> +	} else {
> +		adev->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-dev", 0);
> +		if (!adev->reset_domain.wq) {
> +			r = -ENOMEM;
> +			goto init_failed;
> +		}
> +	}
> +
>   	/* Don't init kfd if whole hive need to be reset during init */
>   	if (!adev->gmc.xgmi.pending_reset)
>   		amdgpu_amdkfd_device_init(adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> index 567df2db23ac..a858e3457c5c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> @@ -392,6 +392,14 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
>   		goto pro_end;
>   	}
>   
> +	hive->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-hive", 0);
> +	if (!hive->reset_domain.wq) {
> +		dev_err(adev->dev, "XGMI: failed allocating wq for reset domain!\n");
> +		kfree(hive);
> +		hive = NULL;
> +		goto pro_end;
> +	}
> +
>   	hive->hive_id = adev->gmc.xgmi.hive_id;
>   	INIT_LIST_HEAD(&hive->device_list);
>   	INIT_LIST_HEAD(&hive->node);
> @@ -401,6 +409,7 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
>   	task_barrier_init(&hive->tb);
>   	hive->pstate = AMDGPU_XGMI_PSTATE_UNKNOWN;
>   	hive->hi_req_gpu = NULL;
> +
>   	/*
>   	 * hive pstate on boot is high in vega20 so we have to go to low
>   	 * pstate on after boot.
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> index d2189bf7d428..6121aaa292cb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> @@ -42,6 +42,8 @@ struct amdgpu_hive_info {
>   		AMDGPU_XGMI_PSTATE_MAX_VEGA20,
>   		AMDGPU_XGMI_PSTATE_UNKNOWN
>   	} pstate;
> +
> +	struct amdgpu_reset_domain reset_domain;
>   };
>   
>   struct amdgpu_pcs_ras_field {


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC v3 08/12] drm/amdgpu: Rework reset domain to be refcounted.
  2022-01-25 22:37 ` [RFC v3 08/12] drm/amdgpu: Rework reset domain to be refcounted Andrey Grodzovsky
@ 2022-01-26 12:12   ` Christian König
  2022-02-02 17:26   ` [RFC v4] " Andrey Grodzovsky
  1 sibling, 0 replies; 27+ messages in thread
From: Christian König @ 2022-01-26 12:12 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, Monk.Liu



Am 25.01.22 um 23:37 schrieb Andrey Grodzovsky:
> The reset domain now contains the register access semaphore
> and so needs to stay alive for as long as any device in a hive
> needs it; it therefore cannot be bound to the XGMI hive life
> cycle. Address this by making the reset domain refcounted and
> pointed to by each member of the hive as well as the hive itself.
>
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  6 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 44 +++++++++++++---------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c  | 36 ++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h  | 10 +++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   | 26 ++++++++++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  2 +-
>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c      |  4 +-
>   drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c      |  4 +-
>   drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c      |  4 +-
>   9 files changed, 105 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 8e96b9a14452..f2ba460bfd59 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -813,9 +813,7 @@ struct amd_powerplay {
>   #define AMDGPU_RESET_MAGIC_NUM 64
>   #define AMDGPU_MAX_DF_PERFMONS 4
>   
> -struct amdgpu_reset_domain {
> -	struct workqueue_struct *wq;
> -};
> +struct amdgpu_reset_domain;
>   
>   struct amdgpu_device {
>   	struct device			*dev;
> @@ -1102,7 +1100,7 @@ struct amdgpu_device {
>   	struct amdgpu_reset_control     *reset_cntl;
>   	uint32_t                        ip_versions[HW_ID_MAX][HWIP_MAX_INSTANCE];
>   
> -	struct amdgpu_reset_domain	reset_domain;
> +	struct amdgpu_reset_domain	*reset_domain;

Ah! Here it is, I was missing that on the initial patch.

>   };
>   
>   static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index fef952ca8db5..b24829096359 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2313,7 +2313,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
>   
>   		r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
>   				   ring->num_hw_submission, amdgpu_job_hang_limit,
> -				   timeout, adev->reset_domain.wq, ring->sched_score, ring->name);
> +				   timeout, adev->reset_domain->wq, ring->sched_score, ring->name);
>   		if (r) {
>   			DRM_ERROR("Failed to create scheduler on ring %s.\n",
>   				  ring->name);
> @@ -2432,24 +2432,22 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>   	if (r)
>   		goto init_failed;
>   
> +	/**
> +	 * In case of XGMI grab extra reference for reset domain for this device
> +	 */
>   	if (adev->gmc.xgmi.num_physical_nodes > 1) {
> -		struct amdgpu_hive_info *hive;
> -
> -		amdgpu_xgmi_add_device(adev);
> +		if (amdgpu_xgmi_add_device(adev) == 0) {
> +			struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev);
>   
> -		hive = amdgpu_get_xgmi_hive(adev);
> -		if (!hive || !hive->reset_domain.wq) {
> -			DRM_ERROR("Failed to obtain reset domain info for XGMI hive:%llx", hive->hive_id);
> -			r = -EINVAL;
> -			goto init_failed;
> -		}
> +			if (!hive->reset_domain ||
> +			    !kref_get_unless_zero(&hive->reset_domain->refcount)) {
> +				r = -ENOENT;
> +				goto init_failed;
> +			}
>   
> -		adev->reset_domain.wq = hive->reset_domain.wq;
> -	} else {
> -		adev->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-dev", 0);
> -		if (!adev->reset_domain.wq) {
> -			r = -ENOMEM;
> -			goto init_failed;
> +			/* Drop the early temporary reset domain we created for device */
> +			kref_put(&adev->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
> +			adev->reset_domain = hive->reset_domain;
>   		}
>   	}
>   
> @@ -3599,6 +3597,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   		return r;
>   	}
>   
> +	/*
> +	 * Reset domain needs to be present early, before XGMI hive discovered
> +	 * (if any) and intitialized to use reset sem and in_gpu reset flag
> +	 * early on during init.
> +	 */
> +	adev->reset_domain = amdgpu_reset_create_reset_domain("amdgpu-reset-dev");
> +	if (!adev->reset_domain)
> +		return -ENOMEM;
> +
>   	/* early init functions */
>   	r = amdgpu_device_ip_early_init(adev);
>   	if (r)
> @@ -3949,6 +3956,9 @@ void amdgpu_device_fini_sw(struct amdgpu_device *adev)
>   	if (adev->mman.discovery_bin)
>   		amdgpu_discovery_fini(adev);
>   
> +	kref_put(&adev->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
> +	adev->reset_domain = NULL;
> +
>   	kfree(adev->pci_state);
>   
>   }
> @@ -5186,7 +5196,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>   
>   	INIT_WORK(&work.base, amdgpu_device_queue_gpu_recover_work);
>   
> -	if (!queue_work(adev->reset_domain.wq, &work.base))
> +	if (!queue_work(adev->reset_domain->wq, &work.base))
>   		return -EAGAIN;
>   
>   	flush_work(&work.base);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> index 02afd4115675..f2d310cd8d40 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> @@ -96,3 +96,39 @@ int amdgpu_reset_perform_reset(struct amdgpu_device *adev,
>   	return reset_handler->restore_hwcontext(adev->reset_cntl,
>   						reset_context);
>   }
> +
> +
> +void amdgpu_reset_destroy_reset_domain(struct kref *ref)
> +{
> +	struct amdgpu_reset_domain *reset_domain = container_of(ref,
> +								struct amdgpu_reset_domain,
> +								refcount);
> +	destroy_workqueue(reset_domain->wq);
> +	kvfree(reset_domain);
> +}
> +
> +struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(char *wq_name)

Maybe just amdgpu_reset_domain_create()/amdgpu_reset_domain_destroy() ?
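
A sketch of how the suggested names would look:

	struct amdgpu_reset_domain *amdgpu_reset_domain_create(char *wq_name);
	void amdgpu_reset_domain_destroy(struct kref *ref);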

Apart from that nit looks good to me on first glance.

Regards,
Christian.

> +{
> +	struct amdgpu_reset_domain *reset_domain;
> +
> +	reset_domain = kvzalloc(sizeof(struct amdgpu_reset_domain), GFP_KERNEL);
> +	if (!reset_domain) {
> +		DRM_ERROR("Failed to allocate amdgpu_reset_domain!");
> +		return NULL;
> +	}
> +
> +	kref_init(&reset_domain->refcount);
> +
> +	reset_domain->wq = create_singlethread_workqueue(wq_name);
> +	if (!reset_domain->wq) {
> +		DRM_ERROR("Failed to allocate wq for amdgpu_reset_domain!");
> +		kref_put(&reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
> +		return NULL;
> +
> +	}
> +
> +	return reset_domain;
> +}
> +
> +
> +
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> index e00d38d9160a..cd030e63e4c6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> @@ -70,6 +70,12 @@ struct amdgpu_reset_control {
>   	void (*async_reset)(struct work_struct *work);
>   };
>   
> +struct amdgpu_reset_domain {
> +	struct kref refcount;
> +	struct workqueue_struct *wq;
> +};
> +
> +
>   int amdgpu_reset_init(struct amdgpu_device *adev);
>   int amdgpu_reset_fini(struct amdgpu_device *adev);
>   
> @@ -82,4 +88,8 @@ int amdgpu_reset_perform_reset(struct amdgpu_device *adev,
>   int amdgpu_reset_add_handler(struct amdgpu_reset_control *reset_ctl,
>   			     struct amdgpu_reset_handler *handler);
>   
> +struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(char *wq_name);
> +
> +void amdgpu_reset_destroy_reset_domain(struct kref *ref);
> +
>   #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> index 9ad742039ac9..5908a3f8208a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> @@ -32,6 +32,8 @@
>   #include "wafl/wafl2_4_0_0_smn.h"
>   #include "wafl/wafl2_4_0_0_sh_mask.h"
>   
> +#include "amdgpu_reset.h"
> +
>   #define smnPCS_XGMI23_PCS_ERROR_STATUS   0x11a01210
>   #define smnPCS_XGMI3X16_PCS_ERROR_STATUS 0x11a0020c
>   #define smnPCS_GOPX1_PCS_ERROR_STATUS    0x12200210
> @@ -226,6 +228,9 @@ static void amdgpu_xgmi_hive_release(struct kobject *kobj)
>   	struct amdgpu_hive_info *hive = container_of(
>   		kobj, struct amdgpu_hive_info, kobj);
>   
> +	kref_put(&hive->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
> +	hive->reset_domain = NULL;
> +
>   	mutex_destroy(&hive->hive_lock);
>   	kfree(hive);
>   }
> @@ -392,12 +397,21 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
>   		goto pro_end;
>   	}
>   
> -	hive->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-hive", 0);
> -	if (!hive->reset_domain.wq) {
> -		dev_err(adev->dev, "XGMI: failed allocating wq for reset domain!\n");
> -		kfree(hive);
> -		hive = NULL;
> -		goto pro_end;
> +	/**
> +	 * Avoid recreating reset domain when hive is reconstructed for the case
> +	 * of reset the devices in the XGMI hive during probe for SRIOV
> +	 * See https://www.spinics.net/lists/amd-gfx/msg58836.html
> +	 */
> +	if (!adev->reset_domain) {
> +		hive->reset_domain = amdgpu_reset_create_reset_domain("amdgpu-reset-hive");
> +			if (!hive->reset_domain) {
> +				dev_err(adev->dev, "XGMI: failed initializing reset domain for xgmi hive\n");
> +				ret = -ENOMEM;
> +				kobject_put(&hive->kobj);
> +				kfree(hive);
> +				hive = NULL;
> +				goto pro_end;
> +			}
>   	}
>   
>   	hive->hive_id = adev->gmc.xgmi.hive_id;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> index 2f2ce53645a5..dcaad22be492 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> @@ -42,7 +42,7 @@ struct amdgpu_hive_info {
>   		AMDGPU_XGMI_PSTATE_UNKNOWN
>   	} pstate;
>   
> -	struct amdgpu_reset_domain reset_domain;
> +	struct amdgpu_reset_domain *reset_domain;
>   };
>   
>   struct amdgpu_pcs_ras_field {
> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> index b2b40e169342..05e98af30b48 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> @@ -32,6 +32,8 @@
>   #include "soc15_common.h"
>   #include "mxgpu_ai.h"
>   
> +#include "amdgpu_reset.h"
> +
>   static void xgpu_ai_mailbox_send_ack(struct amdgpu_device *adev)
>   {
>   	WREG8(AI_MAIBOX_CONTROL_RCV_OFFSET_BYTE, 2);
> @@ -302,7 +304,7 @@ static int xgpu_ai_mailbox_rcv_irq(struct amdgpu_device *adev,
>   	switch (event) {
>   		case IDH_FLR_NOTIFICATION:
>   		if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
> -			WARN_ONCE(!queue_work(adev->reset_domain.wq,
> +			WARN_ONCE(!queue_work(adev->reset_domain->wq,
>   					      &adev->virt.flr_work),
>   				  "Failed to queue work! at %s",
>   				  __func__);
> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
> index 08411924150d..6e12055f3f4a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
> @@ -31,6 +31,8 @@
>   #include "soc15_common.h"
>   #include "mxgpu_nv.h"
>   
> +#include "amdgpu_reset.h"
> +
>   static void xgpu_nv_mailbox_send_ack(struct amdgpu_device *adev)
>   {
>   	WREG8(NV_MAIBOX_CONTROL_RCV_OFFSET_BYTE, 2);
> @@ -337,7 +339,7 @@ static int xgpu_nv_mailbox_rcv_irq(struct amdgpu_device *adev,
>   	switch (event) {
>   	case IDH_FLR_NOTIFICATION:
>   		if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
> -			WARN_ONCE(!queue_work(adev->reset_domain.wq,
> +			WARN_ONCE(!queue_work(adev->reset_domain->wq,
>   					      &adev->virt.flr_work),
>   				  "Failed to queue work! at %s",
>   				  __func__);
> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
> index 02290febfcf4..fe1570c2146e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
> @@ -42,6 +42,8 @@
>   #include "smu/smu_7_1_3_d.h"
>   #include "mxgpu_vi.h"
>   
> +#include "amdgpu_reset.h"
> +
>   /* VI golden setting */
>   static const u32 xgpu_fiji_mgcg_cgcg_init[] = {
>   	mmRLC_CGTT_MGCG_OVERRIDE, 0xffffffff, 0xffffffff,
> @@ -551,7 +553,7 @@ static int xgpu_vi_mailbox_rcv_irq(struct amdgpu_device *adev,
>   
>   		/* only handle FLR_NOTIFY now */
>   		if (!r && !amdgpu_in_reset(adev))
> -			WARN_ONCE(!queue_work(adev->reset_domain.wq,
> +			WARN_ONCE(!queue_work(adev->reset_domain->wq,
>   					      &adev->virt.flr_work),
>   				  "Failed to queue work! at %s",
>   				  __func__);


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC v3 01/12] drm/amdgpu: Introduce reset domain
  2022-01-26 12:07   ` Christian König
@ 2022-01-26 15:47     ` Andrey Grodzovsky
  0 siblings, 0 replies; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-26 15:47 UTC (permalink / raw)
  To: Christian König, dri-devel, amd-gfx
  Cc: Daniel Vetter, horace.chen, lijo.lazar, jingwech,
	christian.koenig, Monk.Liu


On 2022-01-26 07:07, Christian König wrote:
> Am 25.01.22 um 23:37 schrieb Andrey Grodzovsky:
>> Defined a reset_domain struct such that
>> all the entities that go through reset
>> together will be serialized one against
>> another. Do it for both single device and
>> XGMI hive cases.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>> Suggested-by: Christian König <ckoenig.leichtzumerken@gmail.com>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  7 +++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 +++++++++++++++++++-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   |  9 +++++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  2 ++
>>   4 files changed, 37 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 9f017663ac50..b5ff76aae7e0 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -812,6 +812,11 @@ struct amd_powerplay {
>>     #define AMDGPU_RESET_MAGIC_NUM 64
>>   #define AMDGPU_MAX_DF_PERFMONS 4
>> +
>> +struct amdgpu_reset_domain {
>> +    struct workqueue_struct *wq;
>> +};
>> +
>>   struct amdgpu_device {
>>       struct device            *dev;
>>       struct pci_dev            *pdev;
>> @@ -1096,6 +1101,8 @@ struct amdgpu_device {
>>         struct amdgpu_reset_control     *reset_cntl;
>>       uint32_t ip_versions[HW_ID_MAX][HWIP_MAX_INSTANCE];
>> +
>> +    struct amdgpu_reset_domain    reset_domain;
>
> I'm a bit confused, shouldn't this be a pointer?
>
> Regards,
> Christian.


Yeah, I see you already noticed that in the follow-up patch. I had
trouble reworking it starting from the first patch (a lot of merge
conflicts), so I just added the rework on top of the last patch set.


Andrey


>
>>   };
>>     static inline struct amdgpu_device *drm_to_adev(struct drm_device 
>> *ddev)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 90d22a376632..0f3e6c078f88 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2391,9 +2391,27 @@ static int amdgpu_device_ip_init(struct 
>> amdgpu_device *adev)
>>       if (r)
>>           goto init_failed;
>>   -    if (adev->gmc.xgmi.num_physical_nodes > 1)
>> +    if (adev->gmc.xgmi.num_physical_nodes > 1) {
>> +        struct amdgpu_hive_info *hive;
>> +
>>           amdgpu_xgmi_add_device(adev);
>>   +        hive = amdgpu_get_xgmi_hive(adev);
>> +        if (!hive || !hive->reset_domain.wq) {
>> +            DRM_ERROR("Failed to obtain reset domain info for XGMI 
>> hive:%llx", hive->hive_id);
>> +            r = -EINVAL;
>> +            goto init_failed;
>> +        }
>> +
>> +        adev->reset_domain.wq = hive->reset_domain.wq;
>> +    } else {
>> +        adev->reset_domain.wq = 
>> alloc_ordered_workqueue("amdgpu-reset-dev", 0);
>> +        if (!adev->reset_domain.wq) {
>> +            r = -ENOMEM;
>> +            goto init_failed;
>> +        }
>> +    }
>> +
>>       /* Don't init kfd if whole hive need to be reset during init */
>>       if (!adev->gmc.xgmi.pending_reset)
>>           amdgpu_amdkfd_device_init(adev);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> index 567df2db23ac..a858e3457c5c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> @@ -392,6 +392,14 @@ struct amdgpu_hive_info 
>> *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
>>           goto pro_end;
>>       }
>>   +    hive->reset_domain.wq = 
>> alloc_ordered_workqueue("amdgpu-reset-hive", 0);
>> +    if (!hive->reset_domain.wq) {
>> +        dev_err(adev->dev, "XGMI: failed allocating wq for reset 
>> domain!\n");
>> +        kfree(hive);
>> +        hive = NULL;
>> +        goto pro_end;
>> +    }
>> +
>>       hive->hive_id = adev->gmc.xgmi.hive_id;
>>       INIT_LIST_HEAD(&hive->device_list);
>>       INIT_LIST_HEAD(&hive->node);
>> @@ -401,6 +409,7 @@ struct amdgpu_hive_info 
>> *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
>>       task_barrier_init(&hive->tb);
>>       hive->pstate = AMDGPU_XGMI_PSTATE_UNKNOWN;
>>       hive->hi_req_gpu = NULL;
>> +
>>       /*
>>        * hive pstate on boot is high in vega20 so we have to go to low
>>        * pstate on after boot.
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> index d2189bf7d428..6121aaa292cb 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> @@ -42,6 +42,8 @@ struct amdgpu_hive_info {
>>           AMDGPU_XGMI_PSTATE_MAX_VEGA20,
>>           AMDGPU_XGMI_PSTATE_UNKNOWN
>>       } pstate;
>> +
>> +    struct amdgpu_reset_domain reset_domain;
>>   };
>>     struct amdgpu_pcs_ras_field {
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu
  2022-01-25 22:37 [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
                   ` (11 preceding siblings ...)
  2022-01-25 22:37 ` [RFC v3 12/12] Revert 'drm/amdgpu: annotate a false positive recursive locking' Andrey Grodzovsky
@ 2022-01-28 19:36 ` Andrey Grodzovsky
  2022-02-02 18:57   ` Andrey Grodzovsky
  12 siblings, 1 reply; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-01-28 19:36 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu

Just a gentle ping in case people have more comments on this patch set,
especially the last 5 patches, as the first 7 are exactly the same as in
V2 and we have already gone over them for the most part.

Andrey

On 2022-01-25 17:37, Andrey Grodzovsky wrote:
> This patchset is based on earlier work by Boris[1] that allowed to have an
> ordered workqueue at the driver level that will be used by the different
> schedulers to queue their timeout work. On top of that I also serialized
> any GPU reset we trigger from within amdgpu code to also go through the same
> ordered wq and in this way simplify somewhat our GPU reset code so we don't need
> to protect from concurrency by multiple GPU reset triggeres such as TDR on one
> hand and sysfs trigger or RAS trigger on the other hand.
>
> As advised by Christian and Daniel I defined a reset_domain struct such that
> all the entities that go through reset together will be serialized one against
> another.
>
> TDR triggered by multiple entities within the same domain due to the same reason will not
> be triggered as the first such reset will cancel all the pending resets. This is
> relevant only to TDR timers and not to triggered resets coming from RAS or SYSFS,
> those will still happen after the in flight resets finishes.
>
> v2:
> Add handling on SRIOV configuration, the reset notify coming from host
> and driver already trigger a work queue to handle the reset so drop this
> intermediate wq and send directly to timeout wq. (Shaoyun)
>
> v3:
> Lijo suggested puting 'adev->in_gpu_reset' in amdgpu_reset_domain struct.
> I followed his advise and also moved adev->reset_sem into same place. This
> in turn caused to do some follow-up refactor of the original patches
> where i decoupled amdgpu_reset_domain life cycle frolm XGMI hive because hive is destroyed and
> reconstructed for the case of reset the devices in the XGMI hive during probe for SRIOV See [2]
> while we need the reset sem and gpu_reset flag to always be present. This was attained
> by adding refcount to amdgpu_reset_domain so each device can safely point to it as long as
> it needs.
>
>
> [1] https://patchwork.kernel.org/project/dri-devel/patch/20210629073510.2764391-3-boris.brezillon@collabora.com/
> [2] https://www.spinics.net/lists/amd-gfx/msg58836.html
>
> P.S Going through drm-misc-next and not amd-staging-drm-next as Boris work hasn't landed yet there.
>
> P.P.S Patches 8-12 are the refactor on top of the original V2 patchset.
>
> P.P.P.S I wasn't able yet to test the reworked code on XGMI SRIOV system because drm-misc-next fails to load there.
> Would appriciate if maybe jingwech can try it on his system like he tested V2.
>
> Andrey Grodzovsky (12):
>    drm/amdgpu: Introduce reset domain
>    drm/amdgpu: Move scheduler init to after XGMI is ready
>    drm/amdgpu: Fix crash on modprobe
>    drm/amdgpu: Serialize non TDR gpu recovery with TDRs
>    drm/amd/virt: For SRIOV send GPU reset directly to TDR queue.
>    drm/amdgpu: Drop hive->in_reset
>    drm/amdgpu: Drop concurrent GPU reset protection for device
>    drm/amdgpu: Rework reset domain to be refcounted.
>    drm/amdgpu: Move reset sem into reset_domain
>    drm/amdgpu: Move in_gpu_reset into reset_domain
>    drm/amdgpu: Rework amdgpu_device_lock_adev
>    Revert 'drm/amdgpu: annotate a false positive recursive locking'
>
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  15 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c   |  10 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 275 ++++++++++--------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c     |  43 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       |   2 +-
>   .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    |  18 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c     |  39 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h     |  12 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      |   2 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c      |  24 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h      |   3 +-
>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c        |   6 +-
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c         |  14 +-
>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c         |  19 +-
>   drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c         |  19 +-
>   drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c         |  11 +-
>   16 files changed, 313 insertions(+), 199 deletions(-)
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [RFC v4] drm/amdgpu: Rework reset domain to be refcounted.
  2022-01-25 22:37 ` [RFC v3 08/12] drm/amdgpu: Rework reset domain to be refcounted Andrey Grodzovsky
  2022-01-26 12:12   ` Christian König
@ 2022-02-02 17:26   ` Andrey Grodzovsky
  2022-02-08 11:25     ` Lazar, Lijo
  1 sibling, 1 reply; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-02-02 17:26 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu

The reset domain now contains the register access semaphore
and so needs to stay alive for as long as any device in a hive
needs it; it therefore cannot be bound to the XGMI hive life
cycle. Address this by making the reset domain refcounted and
pointed to by each member of the hive as well as the hive itself.

v4:
Fix a crash on boot with an XGMI hive by adding a type to reset_domain.
XGMI will only create a new reset_domain if the previous one was of
single-device type, meaning it is the first boot. Otherwise it will
take a refcount on the existing reset_domain from the amdgpu device.
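
A sketch of the intended refcount flow (condensed from the diff below,
error handling omitted):

	/* boot: every device starts out with its own single-device domain */
	adev->reset_domain =
		amdgpu_reset_create_reset_domain(SINGLE_DEVICE,
						 "amdgpu-reset-dev");

	/* first construction of the hive promotes to an XGMI_HIVE domain;
	 * a reconstruction finds that type already set and only takes
	 * another reference */
	if (adev->reset_domain->type != XGMI_HIVE) {
		hive->reset_domain =
			amdgpu_reset_create_reset_domain(XGMI_HIVE,
							 "amdgpu-reset-hive");
	} else {
		kref_get_unless_zero(&adev->reset_domain->refcount);
		hive->reset_domain = adev->reset_domain;
	}

	/* every holder drops its reference independently */
	kref_put(&hive->reset_domain->refcount,
		 amdgpu_reset_destroy_reset_domain);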

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  6 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 44 +++++++++++++---------
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c  | 38 +++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h  | 18 +++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   | 29 +++++++++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  2 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c      |  4 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c      |  4 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c      |  4 +-
 9 files changed, 118 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 8e96b9a14452..f2ba460bfd59 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -813,9 +813,7 @@ struct amd_powerplay {
 #define AMDGPU_RESET_MAGIC_NUM 64
 #define AMDGPU_MAX_DF_PERFMONS 4
 
-struct amdgpu_reset_domain {
-	struct workqueue_struct *wq;
-};
+struct amdgpu_reset_domain;
 
 struct amdgpu_device {
 	struct device			*dev;
@@ -1102,7 +1100,7 @@ struct amdgpu_device {
 	struct amdgpu_reset_control     *reset_cntl;
 	uint32_t                        ip_versions[HW_ID_MAX][HWIP_MAX_INSTANCE];
 
-	struct amdgpu_reset_domain	reset_domain;
+	struct amdgpu_reset_domain	*reset_domain;
 };
 
 static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index fef952ca8db5..cd1b7af69c35 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2313,7 +2313,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
 
 		r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
 				   ring->num_hw_submission, amdgpu_job_hang_limit,
-				   timeout, adev->reset_domain.wq, ring->sched_score, ring->name);
+				   timeout, adev->reset_domain->wq, ring->sched_score, ring->name);
 		if (r) {
 			DRM_ERROR("Failed to create scheduler on ring %s.\n",
 				  ring->name);
@@ -2432,24 +2432,22 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
 	if (r)
 		goto init_failed;
 
+	/**
+	 * In case of XGMI grab extra reference for reset domain for this device
+	 */
 	if (adev->gmc.xgmi.num_physical_nodes > 1) {
-		struct amdgpu_hive_info *hive;
-
-		amdgpu_xgmi_add_device(adev);
+		if (amdgpu_xgmi_add_device(adev) == 0) {
+			struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev);
 
-		hive = amdgpu_get_xgmi_hive(adev);
-		if (!hive || !hive->reset_domain.wq) {
-			DRM_ERROR("Failed to obtain reset domain info for XGMI hive:%llx", hive->hive_id);
-			r = -EINVAL;
-			goto init_failed;
-		}
+			if (!hive->reset_domain ||
+			    !kref_get_unless_zero(&hive->reset_domain->refcount)) {
+				r = -ENOENT;
+				goto init_failed;
+			}
 
-		adev->reset_domain.wq = hive->reset_domain.wq;
-	} else {
-		adev->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-dev", 0);
-		if (!adev->reset_domain.wq) {
-			r = -ENOMEM;
-			goto init_failed;
+			/* Drop the early temporary reset domain we created for device */
+			kref_put(&adev->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
+			adev->reset_domain = hive->reset_domain;
 		}
 	}
 
@@ -3599,6 +3597,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 		return r;
 	}
 
+	/*
+	 * Reset domain needs to be present early, before XGMI hive discovered
+	 * (if any) and intitialized to use reset sem and in_gpu reset flag
+	 * early on during init.
+	 */
+	adev->reset_domain = amdgpu_reset_create_reset_domain(SINGLE_DEVICE, "amdgpu-reset-dev");
+	if (!adev->reset_domain)
+		return -ENOMEM;
+
 	/* early init functions */
 	r = amdgpu_device_ip_early_init(adev);
 	if (r)
@@ -3949,6 +3956,9 @@ void amdgpu_device_fini_sw(struct amdgpu_device *adev)
 	if (adev->mman.discovery_bin)
 		amdgpu_discovery_fini(adev);
 
+	kref_put(&adev->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
+	adev->reset_domain = NULL;
+
 	kfree(adev->pci_state);
 
 }
@@ -5186,7 +5196,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 
 	INIT_WORK(&work.base, amdgpu_device_queue_gpu_recover_work);
 
-	if (!queue_work(adev->reset_domain.wq, &work.base))
+	if (!queue_work(adev->reset_domain->wq, &work.base))
 		return -EAGAIN;
 
 	flush_work(&work.base);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
index 02afd4115675..14f003d5ebe8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
@@ -96,3 +96,41 @@ int amdgpu_reset_perform_reset(struct amdgpu_device *adev,
 	return reset_handler->restore_hwcontext(adev->reset_cntl,
 						reset_context);
 }
+
+
+void amdgpu_reset_destroy_reset_domain(struct kref *ref)
+{
+	struct amdgpu_reset_domain *reset_domain = container_of(ref,
+								struct amdgpu_reset_domain,
+								refcount);
+	destroy_workqueue(reset_domain->wq);
+	kvfree(reset_domain);
+}
+
+struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(enum amdgpu_reset_domain_type type,
+							     char *wq_name)
+{
+	struct amdgpu_reset_domain *reset_domain;
+
+	reset_domain = kvzalloc(sizeof(struct amdgpu_reset_domain), GFP_KERNEL);
+	if (!reset_domain) {
+		DRM_ERROR("Failed to allocate amdgpu_reset_domain!");
+		return NULL;
+	}
+
+	reset_domain->type = type;
+	kref_init(&reset_domain->refcount);
+
+	reset_domain->wq = create_singlethread_workqueue(wq_name);
+	if (!reset_domain->wq) {
+		DRM_ERROR("Failed to allocate wq for amdgpu_reset_domain!");
+		kref_put(&reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
+		return NULL;
+
+	}
+
+	return reset_domain;
+}
+
+
+
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
index e00d38d9160a..0b310cd963d9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
@@ -70,6 +70,19 @@ struct amdgpu_reset_control {
 	void (*async_reset)(struct work_struct *work);
 };
 
+
+enum amdgpu_reset_domain_type {
+	SINGLE_DEVICE,
+	XGMI_HIVE
+};
+
+struct amdgpu_reset_domain {
+	struct kref refcount;
+	struct workqueue_struct *wq;
+	enum amdgpu_reset_domain_type type;
+};
+
+
 int amdgpu_reset_init(struct amdgpu_device *adev);
 int amdgpu_reset_fini(struct amdgpu_device *adev);
 
@@ -82,4 +95,9 @@ int amdgpu_reset_perform_reset(struct amdgpu_device *adev,
 int amdgpu_reset_add_handler(struct amdgpu_reset_control *reset_ctl,
 			     struct amdgpu_reset_handler *handler);
 
+struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(enum amdgpu_reset_domain_type type,
+							     char *wq_name);
+
+void amdgpu_reset_destroy_reset_domain(struct kref *ref);
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index 9ad742039ac9..a216e88a7b54 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -32,6 +32,8 @@
 #include "wafl/wafl2_4_0_0_smn.h"
 #include "wafl/wafl2_4_0_0_sh_mask.h"
 
+#include "amdgpu_reset.h"
+
 #define smnPCS_XGMI23_PCS_ERROR_STATUS   0x11a01210
 #define smnPCS_XGMI3X16_PCS_ERROR_STATUS 0x11a0020c
 #define smnPCS_GOPX1_PCS_ERROR_STATUS    0x12200210
@@ -226,6 +228,9 @@ static void amdgpu_xgmi_hive_release(struct kobject *kobj)
 	struct amdgpu_hive_info *hive = container_of(
 		kobj, struct amdgpu_hive_info, kobj);
 
+	kref_put(&hive->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
+	hive->reset_domain = NULL;
+
 	mutex_destroy(&hive->hive_lock);
 	kfree(hive);
 }
@@ -392,12 +397,24 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
 		goto pro_end;
 	}
 
-	hive->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-hive", 0);
-	if (!hive->reset_domain.wq) {
-		dev_err(adev->dev, "XGMI: failed allocating wq for reset domain!\n");
-		kfree(hive);
-		hive = NULL;
-		goto pro_end;
+	/**
+	 * Avoid recreating reset domain when hive is reconstructed for the case
+	 * of resetting the devices in the XGMI hive during probe for SRIOV
+	 * See https://www.spinics.net/lists/amd-gfx/msg58836.html
+	 */
+	if (adev->reset_domain->type != XGMI_HIVE) {
+		hive->reset_domain = amdgpu_reset_create_reset_domain(XGMI_HIVE, "amdgpu-reset-hive");
+		if (!hive->reset_domain) {
+			dev_err(adev->dev, "XGMI: failed initializing reset domain for xgmi hive\n");
+			ret = -ENOMEM;
+			kobject_put(&hive->kobj);
+			kfree(hive);
+			hive = NULL;
+			goto pro_end;
+		}
+	} else {
+		kref_get_unless_zero(&adev->reset_domain->refcount);
+		hive->reset_domain = adev->reset_domain;
 	}
 
 	hive->hive_id = adev->gmc.xgmi.hive_id;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
index 2f2ce53645a5..dcaad22be492 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
@@ -42,7 +42,7 @@ struct amdgpu_hive_info {
 		AMDGPU_XGMI_PSTATE_UNKNOWN
 	} pstate;
 
-	struct amdgpu_reset_domain reset_domain;
+	struct amdgpu_reset_domain *reset_domain;
 };
 
 struct amdgpu_pcs_ras_field {
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index b2b40e169342..05e98af30b48 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -32,6 +32,8 @@
 #include "soc15_common.h"
 #include "mxgpu_ai.h"
 
+#include "amdgpu_reset.h"
+
 static void xgpu_ai_mailbox_send_ack(struct amdgpu_device *adev)
 {
 	WREG8(AI_MAIBOX_CONTROL_RCV_OFFSET_BYTE, 2);
@@ -302,7 +304,7 @@ static int xgpu_ai_mailbox_rcv_irq(struct amdgpu_device *adev,
 	switch (event) {
 		case IDH_FLR_NOTIFICATION:
 		if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
-			WARN_ONCE(!queue_work(adev->reset_domain.wq,
+			WARN_ONCE(!queue_work(adev->reset_domain->wq,
 					      &adev->virt.flr_work),
 				  "Failed to queue work! at %s",
 				  __func__);
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index 08411924150d..6e12055f3f4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -31,6 +31,8 @@
 #include "soc15_common.h"
 #include "mxgpu_nv.h"
 
+#include "amdgpu_reset.h"
+
 static void xgpu_nv_mailbox_send_ack(struct amdgpu_device *adev)
 {
 	WREG8(NV_MAIBOX_CONTROL_RCV_OFFSET_BYTE, 2);
@@ -337,7 +339,7 @@ static int xgpu_nv_mailbox_rcv_irq(struct amdgpu_device *adev,
 	switch (event) {
 	case IDH_FLR_NOTIFICATION:
 		if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
-			WARN_ONCE(!queue_work(adev->reset_domain.wq,
+			WARN_ONCE(!queue_work(adev->reset_domain->wq,
 					      &adev->virt.flr_work),
 				  "Failed to queue work! at %s",
 				  __func__);
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
index 02290febfcf4..fe1570c2146e 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
@@ -42,6 +42,8 @@
 #include "smu/smu_7_1_3_d.h"
 #include "mxgpu_vi.h"
 
+#include "amdgpu_reset.h"
+
 /* VI golden setting */
 static const u32 xgpu_fiji_mgcg_cgcg_init[] = {
 	mmRLC_CGTT_MGCG_OVERRIDE, 0xffffffff, 0xffffffff,
@@ -551,7 +553,7 @@ static int xgpu_vi_mailbox_rcv_irq(struct amdgpu_device *adev,
 
 		/* only handle FLR_NOTIFY now */
 		if (!r && !amdgpu_in_reset(adev))
-			WARN_ONCE(!queue_work(adev->reset_domain.wq,
+			WARN_ONCE(!queue_work(adev->reset_domain->wq,
 					      &adev->virt.flr_work),
 				  "Failed to queue work! at %s",
 				  __func__);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu
  2022-01-28 19:36 ` [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
@ 2022-02-02 18:57   ` Andrey Grodzovsky
  2022-02-09  6:06     ` JingWen Chen
  0 siblings, 1 reply; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-02-02 18:57 UTC (permalink / raw)
  To: dri-devel, amd-gfx
  Cc: horace.chen, lijo.lazar, jingwech, christian.koenig, Monk.Liu

Just another ping. With Shyun's help I was able to do some smoke testing 
on an XGMI SRIOV system (booting and triggering hive reset),
and so far it looks good.

Andrey

On 2022-01-28 14:36, Andrey Grodzovsky wrote:
> Just a gentle ping in case people have more comments on this patch set, 
> especially the last 5 patches,
> as the first 7 are exactly the same as in V2 and we already went over them.
>
> Andrey
>
> On 2022-01-25 17:37, Andrey Grodzovsky wrote:
>> This patchset is based on earlier work by Boris[1] that allowed to 
>> have an
>> ordered workqueue at the driver level that will be used by the different
>> schedulers to queue their timeout work. On top of that I also serialized
>> any GPU reset we trigger from within amdgpu code to also go through 
>> the same
>> ordered wq and in this way simplify somewhat our GPU reset code so we 
>> don't need
>> to protect from concurrency by multiple GPU reset triggeres such as 
>> TDR on one
>> hand and sysfs trigger or RAS trigger on the other hand.
>>
>> As advised by Christian and Daniel I defined a reset_domain struct 
>> such that
>> all the entities that go through reset together will be serialized 
>> one against
>> another.
>>
>> TDR triggered by multiple entities within the same domain due to the 
>> same reason will not
>> be triggered as the first such reset will cancel all the pending 
>> resets. This is
>> relevant only to TDR timers and not to triggered resets coming from 
>> RAS or SYSFS,
>> those will still happen after the in flight resets finishes.
>>
>> v2:
>> Add handling on SRIOV configuration, the reset notify coming from host
>> and driver already trigger a work queue to handle the reset so drop this
>> intermediate wq and send directly to timeout wq. (Shaoyun)
>>
>> v3:
>> Lijo suggested puting 'adev->in_gpu_reset' in amdgpu_reset_domain 
>> struct.
>> I followed his advise and also moved adev->reset_sem into same place. 
>> This
>> in turn caused to do some follow-up refactor of the original patches
>> where i decoupled amdgpu_reset_domain life cycle frolm XGMI hive 
>> because hive is destroyed and
>> reconstructed for the case of reset the devices in the XGMI hive 
>> during probe for SRIOV See [2]
>> while we need the reset sem and gpu_reset flag to always be present. 
>> This was attained
>> by adding refcount to amdgpu_reset_domain so each device can safely 
>> point to it as long as
>> it needs.
>>
>>
>> [1] 
>> https://patchwork.kernel.org/project/dri-devel/patch/20210629073510.2764391-3-boris.brezillon@collabora.com/
>> [2] https://www.spinics.net/lists/amd-gfx/msg58836.html
>>
>> P.S Going through drm-misc-next and not amd-staging-drm-next as Boris 
>> work hasn't landed yet there.
>>
>> P.P.S Patches 8-12 are the refactor on top of the original V2 patchset.
>>
>> P.P.P.S I wasn't able yet to test the reworked code on XGMI SRIOV 
>> system because drm-misc-next fails to load there.
>> Would appriciate if maybe jingwech can try it on his system like he 
>> tested V2.
>>
>> Andrey Grodzovsky (12):
>>    drm/amdgpu: Introduce reset domain
>>    drm/amdgpu: Move scheduler init to after XGMI is ready
>>    drm/amdgpu: Fix crash on modprobe
>>    drm/amdgpu: Serialize non TDR gpu recovery with TDRs
>>    drm/amd/virt: For SRIOV send GPU reset directly to TDR queue.
>>    drm/amdgpu: Drop hive->in_reset
>>    drm/amdgpu: Drop concurrent GPU reset protection for device
>>    drm/amdgpu: Rework reset domain to be refcounted.
>>    drm/amdgpu: Move reset sem into reset_domain
>>    drm/amdgpu: Move in_gpu_reset into reset_domain
>>    drm/amdgpu: Rework amdgpu_device_lock_adev
>>    Revert 'drm/amdgpu: annotate a false positive recursive locking'
>>
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h           |  15 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c   |  10 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    | 275 ++++++++++--------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c     |  43 +--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       |   2 +-
>>   .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c    |  18 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c     |  39 +++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h     |  12 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h      |   2 +
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c      |  24 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h      |   3 +-
>>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c        |   6 +-
>>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c         |  14 +-
>>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c         |  19 +-
>>   drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c         |  19 +-
>>   drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c         |  11 +-
>>   16 files changed, 313 insertions(+), 199 deletions(-)
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC v3 06/12] drm/amdgpu: Drop hive->in_reset
  2022-01-25 22:37 ` [RFC v3 06/12] drm/amdgpu: Drop hive->in_reset Andrey Grodzovsky
@ 2022-02-08  6:33   ` Lazar, Lijo
  2022-02-08 15:39     ` Andrey Grodzovsky
  0 siblings, 1 reply; 27+ messages in thread
From: Lazar, Lijo @ 2022-02-08  6:33 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx
  Cc: Monk.Liu, horace.chen, jingwech, christian.koenig



On 1/26/2022 4:07 AM, Andrey Grodzovsky wrote:
> Since we serialize all resets, there is no need to protect against
> concurrent resets.
> 
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Reviewed-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 +------------------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   |  1 -
>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  1 -
>   3 files changed, 1 insertion(+), 20 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 258ec3c0b2af..107a393ebbfd 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -5013,25 +5013,9 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
>   	dev_info(adev->dev, "GPU %s begin!\n",
>   		need_emergency_restart ? "jobs stop":"reset");
>   
> -	/*
> -	 * Here we trylock to avoid chain of resets executing from
> -	 * either trigger by jobs on different adevs in XGMI hive or jobs on
> -	 * different schedulers for same device while this TO handler is running.
> -	 * We always reset all schedulers for device and all devices for XGMI
> -	 * hive so that should take care of them too.
> -	 */
>   	hive = amdgpu_get_xgmi_hive(adev);
> -	if (hive) {
> -		if (atomic_cmpxchg(&hive->in_reset, 0, 1) != 0) {
> -			DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as another already in progress",
> -				job ? job->base.id : -1, hive->hive_id);
> -			amdgpu_put_xgmi_hive(hive);
> -			if (job && job->vm)
> -				drm_sched_increase_karma(&job->base);
> -			return 0;
> -		}

This function in general will reset all devices in a hive.

In a situation where GPU0 in hive0 gets to this function first and GPU1 
in hive0 also hangs shortly after (before GPU0's recovery process starts 
resetting the other devices in the hive), we don't want to execute the 
work queued as part of GPU1's recovery as well. Both GPU0's and GPU1's 
recovery processes will try to reset all the devices in the hive.

In short - if a reset domain is already active, we probably don't need 
to queue another work item to the domain, since all the devices in the 
domain are expected to be reset shortly.
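
Roughly what I mean, as a hypothetical sketch at the queuing site (the
'active' flag here is made up and stands for whatever would track an
in-flight reset in the domain):

	/* hypothetical early-out before queuing recovery work */
	if (atomic_read(&adev->reset_domain->active))
		return 0;	/* an in-flight reset will take care of this device */

	if (!queue_work(adev->reset_domain->wq, &work.base))
		return -EAGAIN;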

Thanks,
Lijo

> +	if (hive)
>   		mutex_lock(&hive->hive_lock);
> -	}
>   
>   	reset_context.method = AMD_RESET_METHOD_NONE;
>   	reset_context.reset_req_dev = adev;
> @@ -5227,7 +5211,6 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev,
>   
>   skip_recovery:
>   	if (hive) {
> -		atomic_set(&hive->in_reset, 0);
>   		mutex_unlock(&hive->hive_lock);
>   		amdgpu_put_xgmi_hive(hive);
>   	}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> index a858e3457c5c..9ad742039ac9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> @@ -404,7 +404,6 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
>   	INIT_LIST_HEAD(&hive->device_list);
>   	INIT_LIST_HEAD(&hive->node);
>   	mutex_init(&hive->hive_lock);
> -	atomic_set(&hive->in_reset, 0);
>   	atomic_set(&hive->number_devices, 0);
>   	task_barrier_init(&hive->tb);
>   	hive->pstate = AMDGPU_XGMI_PSTATE_UNKNOWN;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> index 6121aaa292cb..2f2ce53645a5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> @@ -33,7 +33,6 @@ struct amdgpu_hive_info {
>   	struct list_head node;
>   	atomic_t number_devices;
>   	struct mutex hive_lock;
> -	atomic_t in_reset;
>   	int hi_req_count;
>   	struct amdgpu_device *hi_req_gpu;
>   	struct task_barrier tb;
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC v3 10/12] drm/amdgpu: Move in_gpu_reset into reset_domain
  2022-01-25 22:37 ` [RFC v3 10/12] drm/amdgpu: Move in_gpu_reset " Andrey Grodzovsky
@ 2022-02-08 10:49   ` Lazar, Lijo
  0 siblings, 0 replies; 27+ messages in thread
From: Lazar, Lijo @ 2022-02-08 10:49 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx
  Cc: Monk.Liu, horace.chen, jingwech, christian.koenig



On 1/26/2022 4:07 AM, Andrey Grodzovsky wrote:
> We should have a single instance per entire reset domain.
> 
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  7 ++-----
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +++++++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c  |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h  |  1 +
>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c      |  4 ++--
>   drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c      |  4 ++--
>   6 files changed, 15 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index f021cd3c9d34..087796e389ab 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -1056,7 +1056,6 @@ struct amdgpu_device {
>   	bool				in_s4;
>   	bool				in_s0ix;
>   
> -	atomic_t 			in_gpu_reset;
>   	enum pp_mp1_state               mp1_state;
>   	struct amdgpu_doorbell_index doorbell_index;
>   
> @@ -1461,8 +1460,6 @@ static inline bool amdgpu_is_tmz(struct amdgpu_device *adev)
>          return adev->gmc.tmz_enabled;
>   }
>   
> -static inline int amdgpu_in_reset(struct amdgpu_device *adev)
> -{
> -	return atomic_read(&adev->in_gpu_reset);
> -}
> +int amdgpu_in_reset(struct amdgpu_device *adev);
> +
>   #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 6991ab4a8191..aa43af443ebe 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3511,7 +3511,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   	mutex_init(&adev->mn_lock);
>   	mutex_init(&adev->virt.vf_errors.lock);
>   	hash_init(adev->mn_hash);
> -	atomic_set(&adev->in_gpu_reset, 0);
>   	mutex_init(&adev->psp.mutex);
>   	mutex_init(&adev->notifier_lock);
>   
> @@ -4775,7 +4774,7 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle,
>   static void amdgpu_device_lock_adev(struct amdgpu_device *adev,
>   				struct amdgpu_hive_info *hive)
>   {
> -	atomic_set(&adev->in_gpu_reset, 1);
> +	atomic_set(&adev->reset_domain->in_gpu_reset, 1);
>   
>   	if (hive) {
>   		down_write_nest_lock(&adev->reset_domain->sem, &hive->hive_lock);
> @@ -4800,7 +4799,7 @@ static void amdgpu_device_unlock_adev(struct amdgpu_device *adev)
>   {
>   	amdgpu_vf_error_trans_all(adev);
>   	adev->mp1_state = PP_MP1_STATE_NONE;
> -	atomic_set(&adev->in_gpu_reset, 0);
> +	atomic_set(&adev->reset_domain->in_gpu_reset, 0);
>   	up_write(&adev->reset_domain->sem);
>   }
>   
> @@ -5643,3 +5642,8 @@ void amdgpu_device_invalidate_hdp(struct amdgpu_device *adev,
>   
>   	amdgpu_asic_invalidate_hdp(adev, ring);
>   }
> +
> +int amdgpu_in_reset(struct amdgpu_device *adev)
> +{
> +	return atomic_read(&adev->reset_domain->in_gpu_reset);
> +}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> index 011585e330f6..e9b804a89b34 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> @@ -127,6 +127,7 @@ struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(char *wq_name)
>   
>   	}
>   
> +	atomic_set(&reset_domain->in_gpu_reset, 0);
>   	init_rwsem(&reset_domain->sem);
>   
>   	return reset_domain;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> index 7451089b0c06..413982f4e1ce 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> @@ -74,6 +74,7 @@ struct amdgpu_reset_domain {
>   	struct kref refcount;
>   	struct workqueue_struct *wq;
>   	struct rw_semaphore sem;
> +	atomic_t in_gpu_reset;

Maybe 'active' (independent of GPU), just to indicate that a reset is 
ongoing in the domain?
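
i.e. something like this (hypothetical sketch of the rename):

	struct amdgpu_reset_domain {
		struct kref refcount;
		struct workqueue_struct *wq;
		struct rw_semaphore sem;
		atomic_t active;	/* a reset is in flight somewhere in this domain */
	};

	static inline int amdgpu_reset_domain_active(struct amdgpu_reset_domain *domain)
	{
		return atomic_read(&domain->active);
	}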

Thanks,
Lijo

>   };
>   
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> index 5dab06fce26a..6c79746d18db 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> @@ -258,7 +258,7 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work)
>   		return;
>   
>   	amdgpu_virt_fini_data_exchange(adev);
> -	atomic_set(&adev->in_gpu_reset, 1);
> +	atomic_set(&adev->reset_domain->in_gpu_reset, 1);
>   
>   	xgpu_ai_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0);
>   
> @@ -271,7 +271,7 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work)
>   	} while (timeout > 1);
>   
>   flr_done:
> -	atomic_set(&adev->in_gpu_reset, 0);
> +	atomic_set(&adev->reset_domain->in_gpu_reset, 0);
>   	up_write(&adev->reset_domain->sem);
>   
>   	/* Trigger recovery for world switch failure if no TDR */
> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
> index 868144fff16a..39f7e1e9ab81 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
> @@ -287,7 +287,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
>   		return;
>   
>   	amdgpu_virt_fini_data_exchange(adev);
> -	atomic_set(&adev->in_gpu_reset, 1);
> +	atomic_set(&adev->reset_domain->in_gpu_reset, 1);
>   
>   	xgpu_nv_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0);
>   
> @@ -300,7 +300,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
>   	} while (timeout > 1);
>   
>   flr_done:
> -	atomic_set(&adev->in_gpu_reset, 0);
> +	atomic_set(&adev->reset_domain->in_gpu_reset, 0);
>   	up_write(&adev->reset_domain->sem);
>   
>   	/* Trigger recovery for world switch failure if no TDR */
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC v4] drm/amdgpu: Rework reset domain to be refcounted.
  2022-02-02 17:26   ` [RFC v4] " Andrey Grodzovsky
@ 2022-02-08 11:25     ` Lazar, Lijo
  2022-02-08 16:19       ` Andrey Grodzovsky
  0 siblings, 1 reply; 27+ messages in thread
From: Lazar, Lijo @ 2022-02-08 11:25 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx
  Cc: Monk.Liu, horace.chen, jingwech, christian.koenig



On 2/2/2022 10:56 PM, Andrey Grodzovsky wrote:
> The reset domain now contains the register access semaphore
> and so needs to be present as long as each device
> in a hive needs it; it therefore cannot be bound to the XGMI
> hive life cycle.
> Address this by making the reset domain refcounted and pointed
> to by each member of the hive and by the hive itself.
> 
> v4:
> Fix crash on boot with XGMI hive by adding a type to reset_domain.
> XGMI will only create a new reset_domain if the previous one was of
> single device type, meaning it's the first boot. Otherwise it will
> take a refcount to the existing reset_domain from the amdgpu device.
> 
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  6 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 44 +++++++++++++---------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c  | 38 +++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h  | 18 +++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   | 29 +++++++++++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  2 +-
>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c      |  4 +-
>   drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c      |  4 +-
>   drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c      |  4 +-
>   9 files changed, 118 insertions(+), 31 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 8e96b9a14452..f2ba460bfd59 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -813,9 +813,7 @@ struct amd_powerplay {
>   #define AMDGPU_RESET_MAGIC_NUM 64
>   #define AMDGPU_MAX_DF_PERFMONS 4
>   
> -struct amdgpu_reset_domain {
> -	struct workqueue_struct *wq;
> -};
> +struct amdgpu_reset_domain;
>   
>   struct amdgpu_device {
>   	struct device			*dev;
> @@ -1102,7 +1100,7 @@ struct amdgpu_device {
>   	struct amdgpu_reset_control     *reset_cntl;
>   	uint32_t                        ip_versions[HW_ID_MAX][HWIP_MAX_INSTANCE];
>   
> -	struct amdgpu_reset_domain	reset_domain;
> +	struct amdgpu_reset_domain	*reset_domain;
>   };
>   
>   static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index fef952ca8db5..cd1b7af69c35 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2313,7 +2313,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
>   
>   		r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
>   				   ring->num_hw_submission, amdgpu_job_hang_limit,
> -				   timeout, adev->reset_domain.wq, ring->sched_score, ring->name);
> +				   timeout, adev->reset_domain->wq, ring->sched_score, ring->name);
>   		if (r) {
>   			DRM_ERROR("Failed to create scheduler on ring %s.\n",
>   				  ring->name);
> @@ -2432,24 +2432,22 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>   	if (r)
>   		goto init_failed;
>   
> +	/**
> +	 * In case of XGMI grab extra reference for reset domain for this device
> +	 */
>   	if (adev->gmc.xgmi.num_physical_nodes > 1) {
> -		struct amdgpu_hive_info *hive;
> -
> -		amdgpu_xgmi_add_device(adev);
> +		if (amdgpu_xgmi_add_device(adev) == 0) {
> +			struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev);
>   
> -		hive = amdgpu_get_xgmi_hive(adev);
> -		if (!hive || !hive->reset_domain.wq) {
> -			DRM_ERROR("Failed to obtain reset domain info for XGMI hive:%llx", hive->hive_id);
> -			r = -EINVAL;
> -			goto init_failed;
> -		}
> +			if (!hive->reset_domain ||
> +			    !kref_get_unless_zero(&hive->reset_domain->refcount)) {
> +				r = -ENOENT;
> +				goto init_failed;
> +			}
>   
> -		adev->reset_domain.wq = hive->reset_domain.wq;
> -	} else {
> -		adev->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-dev", 0);
> -		if (!adev->reset_domain.wq) {
> -			r = -ENOMEM;
> -			goto init_failed;
> +			/* Drop the early temporary reset domain we created for device */
> +			kref_put(&adev->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
> +			adev->reset_domain = hive->reset_domain;
>   		}
>   	}
>   
> @@ -3599,6 +3597,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   		return r;
>   	}
>   
> +	/*
> +	 * Reset domain needs to be present early, before XGMI hive discovered
> +	 * (if any) and initialized, to use the reset sem and in_gpu_reset flag
> +	 * early on during init.
> +	 */
> +	adev->reset_domain = amdgpu_reset_create_reset_domain(SINGLE_DEVICE, "amdgpu-reset-dev");
> +	if (!adev->reset_domain)
> +		return -ENOMEM;
> +
>   	/* early init functions */
>   	r = amdgpu_device_ip_early_init(adev);
>   	if (r)
> @@ -3949,6 +3956,9 @@ void amdgpu_device_fini_sw(struct amdgpu_device *adev)
>   	if (adev->mman.discovery_bin)
>   		amdgpu_discovery_fini(adev);
>   
> +	kref_put(&adev->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
> +	adev->reset_domain = NULL;
> +
>   	kfree(adev->pci_state);
>   
>   }
> @@ -5186,7 +5196,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>   
>   	INIT_WORK(&work.base, amdgpu_device_queue_gpu_recover_work);
>   
> -	if (!queue_work(adev->reset_domain.wq, &work.base))
> +	if (!queue_work(adev->reset_domain->wq, &work.base))
>   		return -EAGAIN;
>   
>   	flush_work(&work.base);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> index 02afd4115675..14f003d5ebe8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> @@ -96,3 +96,41 @@ int amdgpu_reset_perform_reset(struct amdgpu_device *adev,
>   	return reset_handler->restore_hwcontext(adev->reset_cntl,
>   						reset_context);
>   }
> +
> +
> +void amdgpu_reset_destroy_reset_domain(struct kref *ref)
> +{
> +	struct amdgpu_reset_domain *reset_domain = container_of(ref,
> +								struct amdgpu_reset_domain,
> +								refcount);
> +	destroy_workqueue(reset_domain->wq);
> +	kvfree(reset_domain);
> +}
> +
> +struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(enum amdgpu_reset_domain_type type,
> +							     char *wq_name)
> +{
> +	struct amdgpu_reset_domain *reset_domain;
> +
> +	reset_domain = kvzalloc(sizeof(struct amdgpu_reset_domain), GFP_KERNEL);
> +	if (!reset_domain) {
> +		DRM_ERROR("Failed to allocate amdgpu_reset_domain!");
> +		return NULL;
> +	}
> +
> +	reset_domain->type = type;
> +	kref_init(&reset_domain->refcount);
> +
> +	reset_domain->wq = create_singlethread_workqueue(wq_name);
> +	if (!reset_domain->wq) {
> +		DRM_ERROR("Failed to allocate wq for amdgpu_reset_domain!");
> +		kref_put(&reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
> +		return NULL;
> +
> +	}
> +
> +	return reset_domain;
> +}
> +
> +
> +
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> index e00d38d9160a..0b310cd963d9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
> @@ -70,6 +70,19 @@ struct amdgpu_reset_control {
>   	void (*async_reset)(struct work_struct *work);
>   };
>   
> +
> +enum amdgpu_reset_domain_type {
> +	SINGLE_DEVICE,
> +	XGMI_HIVE
> +};
> +
> +struct amdgpu_reset_domain {
> +	struct kref refcount;
> +	struct workqueue_struct *wq;
> +	enum amdgpu_reset_domain_type type;
> +};
> +
> +
>   int amdgpu_reset_init(struct amdgpu_device *adev);
>   int amdgpu_reset_fini(struct amdgpu_device *adev);
>   
> @@ -82,4 +95,9 @@ int amdgpu_reset_perform_reset(struct amdgpu_device *adev,
>   int amdgpu_reset_add_handler(struct amdgpu_reset_control *reset_ctl,
>   			     struct amdgpu_reset_handler *handler);
>   
> +struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(enum amdgpu_reset_domain_type type,
> +							     char *wq_name);
> +
> +void amdgpu_reset_destroy_reset_domain(struct kref *ref);
> +
>   #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> index 9ad742039ac9..a216e88a7b54 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> @@ -32,6 +32,8 @@
>   #include "wafl/wafl2_4_0_0_smn.h"
>   #include "wafl/wafl2_4_0_0_sh_mask.h"
>   
> +#include "amdgpu_reset.h"
> +
>   #define smnPCS_XGMI23_PCS_ERROR_STATUS   0x11a01210
>   #define smnPCS_XGMI3X16_PCS_ERROR_STATUS 0x11a0020c
>   #define smnPCS_GOPX1_PCS_ERROR_STATUS    0x12200210
> @@ -226,6 +228,9 @@ static void amdgpu_xgmi_hive_release(struct kobject *kobj)
>   	struct amdgpu_hive_info *hive = container_of(
>   		kobj, struct amdgpu_hive_info, kobj);
>   
> +	kref_put(&hive->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
> +	hive->reset_domain = NULL;
> +
>   	mutex_destroy(&hive->hive_lock);
>   	kfree(hive);
>   }
> @@ -392,12 +397,24 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
>   		goto pro_end;
>   	}
>   
> -	hive->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-hive", 0);
> -	if (!hive->reset_domain.wq) {
> -		dev_err(adev->dev, "XGMI: failed allocating wq for reset domain!\n");
> -		kfree(hive);
> -		hive = NULL;
> -		goto pro_end;
> +	/**
> +	 * Avoid recreating reset domain when hive is reconstructed for the case
> +	 * of resetting the devices in the XGMI hive during probe for SRIOV
> +	 * See https://www.spinics.net/lists/amd-gfx/msg58836.html
> +	 */
> +	if (adev->reset_domain->type != XGMI_HIVE) {
> +		hive->reset_domain = amdgpu_reset_create_reset_domain(XGMI_HIVE, "amdgpu-reset-hive");
> +		if (!hive->reset_domain) {
> +			dev_err(adev->dev, "XGMI: failed initializing reset domain for xgmi hive\n");
> +			ret = -ENOMEM;
> +			kobject_put(&hive->kobj);
> +			kfree(hive);
> +			hive = NULL;
> +			goto pro_end;
> +		}
> +	} else {
> +		kref_get_unless_zero(&adev->reset_domain->refcount);
> +		hive->reset_domain = adev->reset_domain;

Suggest wrapping this like -
	amdgpu_reset_domain_get(reset_domain)

and another like
	amdgpu_reset_domain_put(reset_domain)

Even smaller wrappers like
	amdgpu_reset_domain_schedule(reset_domain, reset_work)
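
Something like this, as a rough sketch on top of the kref-based domain in
this patch (exact signatures are up to you):

	static inline bool amdgpu_reset_domain_get(struct amdgpu_reset_domain *domain)
	{
		/* take a reference unless the domain is already being torn down */
		return kref_get_unless_zero(&domain->refcount);
	}

	static inline void amdgpu_reset_domain_put(struct amdgpu_reset_domain *domain)
	{
		/* the final put destroys the wq and frees the domain */
		kref_put(&domain->refcount, amdgpu_reset_destroy_reset_domain);
	}

	static inline bool amdgpu_reset_domain_schedule(struct amdgpu_reset_domain *domain,
							struct work_struct *work)
	{
		/* funnel all reset work through the domain's single ordered wq */
		return queue_work(domain->wq, work);
	}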

Other than that, looks good to me (this needs to be combined with the 
earlier series, as it misses a few other members of the reset domain).

Thanks,
Lijo
	
>   	}
>   
>   	hive->hive_id = adev->gmc.xgmi.hive_id;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> index 2f2ce53645a5..dcaad22be492 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
> @@ -42,7 +42,7 @@ struct amdgpu_hive_info {
>   		AMDGPU_XGMI_PSTATE_UNKNOWN
>   	} pstate;
>   
> -	struct amdgpu_reset_domain reset_domain;
> +	struct amdgpu_reset_domain *reset_domain;
>   };
>   
>   struct amdgpu_pcs_ras_field {
> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> index b2b40e169342..05e98af30b48 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
> @@ -32,6 +32,8 @@
>   #include "soc15_common.h"
>   #include "mxgpu_ai.h"
>   
> +#include "amdgpu_reset.h"
> +
>   static void xgpu_ai_mailbox_send_ack(struct amdgpu_device *adev)
>   {
>   	WREG8(AI_MAIBOX_CONTROL_RCV_OFFSET_BYTE, 2);
> @@ -302,7 +304,7 @@ static int xgpu_ai_mailbox_rcv_irq(struct amdgpu_device *adev,
>   	switch (event) {
>   		case IDH_FLR_NOTIFICATION:
>   		if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
> -			WARN_ONCE(!queue_work(adev->reset_domain.wq,
> +			WARN_ONCE(!queue_work(adev->reset_domain->wq,
>   					      &adev->virt.flr_work),
>   				  "Failed to queue work! at %s",
>   				  __func__);
> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
> index 08411924150d..6e12055f3f4a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
> @@ -31,6 +31,8 @@
>   #include "soc15_common.h"
>   #include "mxgpu_nv.h"
>   
> +#include "amdgpu_reset.h"
> +
>   static void xgpu_nv_mailbox_send_ack(struct amdgpu_device *adev)
>   {
>   	WREG8(NV_MAIBOX_CONTROL_RCV_OFFSET_BYTE, 2);
> @@ -337,7 +339,7 @@ static int xgpu_nv_mailbox_rcv_irq(struct amdgpu_device *adev,
>   	switch (event) {
>   	case IDH_FLR_NOTIFICATION:
>   		if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
> -			WARN_ONCE(!queue_work(adev->reset_domain.wq,
> +			WARN_ONCE(!queue_work(adev->reset_domain->wq,
>   					      &adev->virt.flr_work),
>   				  "Failed to queue work! at %s",
>   				  __func__);
> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
> index 02290febfcf4..fe1570c2146e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
> @@ -42,6 +42,8 @@
>   #include "smu/smu_7_1_3_d.h"
>   #include "mxgpu_vi.h"
>   
> +#include "amdgpu_reset.h"
> +
>   /* VI golden setting */
>   static const u32 xgpu_fiji_mgcg_cgcg_init[] = {
>   	mmRLC_CGTT_MGCG_OVERRIDE, 0xffffffff, 0xffffffff,
> @@ -551,7 +553,7 @@ static int xgpu_vi_mailbox_rcv_irq(struct amdgpu_device *adev,
>   
>   		/* only handle FLR_NOTIFY now */
>   		if (!r && !amdgpu_in_reset(adev))
> -			WARN_ONCE(!queue_work(adev->reset_domain.wq,
> +			WARN_ONCE(!queue_work(adev->reset_domain->wq,
>   					      &adev->virt.flr_work),
>   				  "Failed to queue work! at %s",
>   				  __func__);
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC v3 06/12] drm/amdgpu: Drop hive->in_reset
  2022-02-08  6:33   ` Lazar, Lijo
@ 2022-02-08 15:39     ` Andrey Grodzovsky
  0 siblings, 0 replies; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-02-08 15:39 UTC (permalink / raw)
  To: Lazar, Lijo, dri-devel, amd-gfx
  Cc: Monk.Liu, horace.chen, jingwech, christian.koenig


On 2022-02-08 01:33, Lazar, Lijo wrote:
>
>
> On 1/26/2022 4:07 AM, Andrey Grodzovsky wrote:
>> Since we serialize all resets, there is no need to protect against
>> concurrent resets.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 +------------------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   |  1 -
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  1 -
>>   3 files changed, 1 insertion(+), 20 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index 258ec3c0b2af..107a393ebbfd 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -5013,25 +5013,9 @@ int amdgpu_device_gpu_recover_imp(struct 
>> amdgpu_device *adev,
>>       dev_info(adev->dev, "GPU %s begin!\n",
>>           need_emergency_restart ? "jobs stop":"reset");
>>   -    /*
>> -     * Here we trylock to avoid chain of resets executing from
>> -     * either trigger by jobs on different adevs in XGMI hive or 
>> jobs on
>> -     * different schedulers for same device while this TO handler is 
>> running.
>> -     * We always reset all schedulers for device and all devices for 
>> XGMI
>> -     * hive so that should take care of them too.
>> -     */
>>       hive = amdgpu_get_xgmi_hive(adev);
>> -    if (hive) {
>> -        if (atomic_cmpxchg(&hive->in_reset, 0, 1) != 0) {
>> -            DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as 
>> another already in progress",
>> -                job ? job->base.id : -1, hive->hive_id);
>> -            amdgpu_put_xgmi_hive(hive);
>> -            if (job && job->vm)
>> -                drm_sched_increase_karma(&job->base);
>> -            return 0;
>> -        }
>
> This function in general will reset all devices in a hive.
>
> In a situation where GPU0 in hive0 gets to this function first and GPU1 
> in hive0 also hangs shortly after (before GPU0's recovery process starts 
> resetting the other devices in the hive), we don't want to execute the 
> work queued as part of GPU1's recovery as well. Both GPU0's and GPU1's 
> recovery processes will try to reset all the devices in the hive.
>
> In short - if a reset domain is already active, we probably don't need 
> to queue another work item to the domain, since all the devices in the 
> domain are expected to be reset shortly.
>
> Thanks,
> Lijo


No worries on this one - check this part of drm_sched_stop: 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/scheduler/sched_main.c#L452
It will be called for each scheduler participating in the reset domain 
(including all the schedulers of each GPU) and will cancel any such pending
resets that we want to avoid executing.
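
Roughly, the relevant part looks like this (simplified sketch; the exact
line may have moved since):

	/*
	 * From drm_sched_stop(): stop the timeout timer in flight; it is
	 * rearmed later in drm_sched_start(). Since the in-flight reset
	 * handler calls drm_sched_stop() for every scheduler in the reset
	 * domain, each scheduler's queued timeout work is canceled here
	 * before it gets a chance to trigger another reset.
	 */
	cancel_delayed_work(&sched->work_tdr);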

Andrey


>
>> +    if (hive)
>>           mutex_lock(&hive->hive_lock);
>> -    }
>>         reset_context.method = AMD_RESET_METHOD_NONE;
>>       reset_context.reset_req_dev = adev;
>> @@ -5227,7 +5211,6 @@ int amdgpu_device_gpu_recover_imp(struct 
>> amdgpu_device *adev,
>>     skip_recovery:
>>       if (hive) {
>> -        atomic_set(&hive->in_reset, 0);
>>           mutex_unlock(&hive->hive_lock);
>>           amdgpu_put_xgmi_hive(hive);
>>       }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> index a858e3457c5c..9ad742039ac9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> @@ -404,7 +404,6 @@ struct amdgpu_hive_info 
>> *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
>>       INIT_LIST_HEAD(&hive->device_list);
>>       INIT_LIST_HEAD(&hive->node);
>>       mutex_init(&hive->hive_lock);
>> -    atomic_set(&hive->in_reset, 0);
>>       atomic_set(&hive->number_devices, 0);
>>       task_barrier_init(&hive->tb);
>>       hive->pstate = AMDGPU_XGMI_PSTATE_UNKNOWN;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> index 6121aaa292cb..2f2ce53645a5 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> @@ -33,7 +33,6 @@ struct amdgpu_hive_info {
>>       struct list_head node;
>>       atomic_t number_devices;
>>       struct mutex hive_lock;
>> -    atomic_t in_reset;
>>       int hi_req_count;
>>       struct amdgpu_device *hi_req_gpu;
>>       struct task_barrier tb;
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC v4] drm/amdgpu: Rework reset domain to be refcounted.
  2022-02-08 11:25     ` Lazar, Lijo
@ 2022-02-08 16:19       ` Andrey Grodzovsky
  2022-02-09  7:51         ` Christian König
  0 siblings, 1 reply; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-02-08 16:19 UTC (permalink / raw)
  To: Lazar, Lijo, dri-devel, amd-gfx
  Cc: Monk.Liu, horace.chen, jingwech, christian.koenig


On 2022-02-08 06:25, Lazar, Lijo wrote:
>
>
> On 2/2/2022 10:56 PM, Andrey Grodzovsky wrote:
>> The reset domain now contains the register access semaphore
>> and so needs to be present as long as each device
>> in a hive needs it; it therefore cannot be bound to the XGMI
>> hive life cycle.
>> Address this by making the reset domain refcounted and pointed
>> to by each member of the hive and by the hive itself.
>>
>> v4:
>> Fix crash on boot with XGMI hive by adding a type to reset_domain.
>> XGMI will only create a new reset_domain if the previous one was of
>> single device type, meaning it's the first boot. Otherwise it will
>> take a refcount to the existing reset_domain from the amdgpu device.
>>
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  6 +--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 44 +++++++++++++---------
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c  | 38 +++++++++++++++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h  | 18 +++++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   | 29 +++++++++++---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  2 +-
>>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c      |  4 +-
>>   drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c      |  4 +-
>>   drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c      |  4 +-
>>   9 files changed, 118 insertions(+), 31 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 8e96b9a14452..f2ba460bfd59 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -813,9 +813,7 @@ struct amd_powerplay {
>>   #define AMDGPU_RESET_MAGIC_NUM 64
>>   #define AMDGPU_MAX_DF_PERFMONS 4
>>   -struct amdgpu_reset_domain {
>> -    struct workqueue_struct *wq;
>> -};
>> +struct amdgpu_reset_domain;
>>     struct amdgpu_device {
>>       struct device            *dev;
>> @@ -1102,7 +1100,7 @@ struct amdgpu_device {
>>       struct amdgpu_reset_control     *reset_cntl;
>>       uint32_t ip_versions[HW_ID_MAX][HWIP_MAX_INSTANCE];
>>   -    struct amdgpu_reset_domain    reset_domain;
>> +    struct amdgpu_reset_domain    *reset_domain;
>>   };
>>     static inline struct amdgpu_device *drm_to_adev(struct drm_device 
>> *ddev)
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index fef952ca8db5..cd1b7af69c35 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -2313,7 +2313,7 @@ static int amdgpu_device_init_schedulers(struct 
>> amdgpu_device *adev)
>>             r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
>>                      ring->num_hw_submission, amdgpu_job_hang_limit,
>> -                   timeout, adev->reset_domain.wq, 
>> ring->sched_score, ring->name);
>> +                   timeout, adev->reset_domain->wq, 
>> ring->sched_score, ring->name);
>>           if (r) {
>>               DRM_ERROR("Failed to create scheduler on ring %s.\n",
>>                     ring->name);
>> @@ -2432,24 +2432,22 @@ static int amdgpu_device_ip_init(struct 
>> amdgpu_device *adev)
>>       if (r)
>>           goto init_failed;
>>   +    /**
>> +     * In case of XGMI grab extra reference for reset domain for 
>> this device
>> +     */
>>       if (adev->gmc.xgmi.num_physical_nodes > 1) {
>> -        struct amdgpu_hive_info *hive;
>> -
>> -        amdgpu_xgmi_add_device(adev);
>> +        if (amdgpu_xgmi_add_device(adev) == 0) {
>> +            struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev);
>>   -        hive = amdgpu_get_xgmi_hive(adev);
>> -        if (!hive || !hive->reset_domain.wq) {
>> -            DRM_ERROR("Failed to obtain reset domain info for XGMI 
>> hive:%llx", hive->hive_id);
>> -            r = -EINVAL;
>> -            goto init_failed;
>> -        }
>> +            if (!hive->reset_domain ||
>> + !kref_get_unless_zero(&hive->reset_domain->refcount)) {
>> +                r = -ENOENT;
>> +                goto init_failed;
>> +            }
>>   -        adev->reset_domain.wq = hive->reset_domain.wq;
>> -    } else {
>> -        adev->reset_domain.wq = 
>> alloc_ordered_workqueue("amdgpu-reset-dev", 0);
>> -        if (!adev->reset_domain.wq) {
>> -            r = -ENOMEM;
>> -            goto init_failed;
>> +            /* Drop the early temporary reset domain we created for 
>> device */
>> +            kref_put(&adev->reset_domain->refcount, 
>> amdgpu_reset_destroy_reset_domain);
>> +            adev->reset_domain = hive->reset_domain;
>>           }
>>       }
>>   @@ -3599,6 +3597,15 @@ int amdgpu_device_init(struct amdgpu_device 
>> *adev,
>>           return r;
>>       }
>>   +    /*
>> +     * Reset domain needs to be present early, before XGMI hive 
>> discovered
>> +     * (if any) and intitialized to use reset sem and in_gpu reset flag
>> +     * early on during init.
>> +     */
>> +    adev->reset_domain = 
>> amdgpu_reset_create_reset_domain(SINGLE_DEVICE ,"amdgpu-reset-dev");
>> +    if (!adev->reset_domain)
>> +        return -ENOMEM;
>> +
>>       /* early init functions */
>>       r = amdgpu_device_ip_early_init(adev);
>>       if (r)
>> @@ -3949,6 +3956,9 @@ void amdgpu_device_fini_sw(struct amdgpu_device 
>> *adev)
>>       if (adev->mman.discovery_bin)
>>           amdgpu_discovery_fini(adev);
>>   +    kref_put(&adev->reset_domain->refcount, 
>> amdgpu_reset_destroy_reset_domain);
>> +    adev->reset_domain = NULL;
>> +
>>       kfree(adev->pci_state);
>>     }
>> @@ -5186,7 +5196,7 @@ int amdgpu_device_gpu_recover(struct 
>> amdgpu_device *adev,
>>         INIT_WORK(&work.base, amdgpu_device_queue_gpu_recover_work);
>>   -    if (!queue_work(adev->reset_domain.wq, &work.base))
>> +    if (!queue_work(adev->reset_domain->wq, &work.base))
>>           return -EAGAIN;
>>         flush_work(&work.base);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
>> index 02afd4115675..14f003d5ebe8 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
>> @@ -96,3 +96,41 @@ int amdgpu_reset_perform_reset(struct 
>> amdgpu_device *adev,
>>       return reset_handler->restore_hwcontext(adev->reset_cntl,
>>                           reset_context);
>>   }
>> +
>> +
>> +void amdgpu_reset_destroy_reset_domain(struct kref *ref)
>> +{
>> +    struct amdgpu_reset_domain *reset_domain = container_of(ref,
>> +                                struct amdgpu_reset_domain,
>> +                                refcount);
>> +    destroy_workqueue(reset_domain->wq);
>> +    kvfree(reset_domain);
>> +}
>> +
>> +struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(enum 
>> amdgpu_reset_domain_type type,
>> +                                 char *wq_name)
>> +{
>> +    struct amdgpu_reset_domain *reset_domain;
>> +
>> +    reset_domain = kvzalloc(sizeof(struct amdgpu_reset_domain), 
>> GFP_KERNEL);
>> +    if (!reset_domain) {
>> +        DRM_ERROR("Failed to allocate amdgpu_reset_domain!");
>> +        return NULL;
>> +    }
>> +
>> +    reset_domain->type = type;
>> +    kref_init(&reset_domain->refcount);
>> +
>> +    reset_domain->wq = create_singlethread_workqueue(wq_name);
>> +    if (!reset_domain->wq) {
>> +        DRM_ERROR("Failed to allocate wq for amdgpu_reset_domain!");
>> +        kref_put(&reset_domain->refcount, 
>> amdgpu_reset_destroy_reset_domain);
>> +        return NULL;
>> +
>> +    }
>> +
>> +    return reset_domain;
>> +}
>> +
>> +
>> +
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
>> index e00d38d9160a..0b310cd963d9 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
>> @@ -70,6 +70,19 @@ struct amdgpu_reset_control {
>>       void (*async_reset)(struct work_struct *work);
>>   };
>>   +
>> +enum amdgpu_reset_domain_type {
>> +    SINGLE_DEVICE,
>> +    XGMI_HIVE
>> +};
>> +
>> +struct amdgpu_reset_domain {
>> +    struct kref refcount;
>> +    struct workqueue_struct *wq;
>> +    enum amdgpu_reset_domain_type type;
>> +};
>> +
>> +
>>   int amdgpu_reset_init(struct amdgpu_device *adev);
>>   int amdgpu_reset_fini(struct amdgpu_device *adev);
>>   @@ -82,4 +95,9 @@ int amdgpu_reset_perform_reset(struct 
>> amdgpu_device *adev,
>>   int amdgpu_reset_add_handler(struct amdgpu_reset_control *reset_ctl,
>>                    struct amdgpu_reset_handler *handler);
>>   +struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(enum 
>> amdgpu_reset_domain_type type,
>> +                                 char *wq_name);
>> +
>> +void amdgpu_reset_destroy_reset_domain(struct kref *ref);
>> +
>>   #endif
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> index 9ad742039ac9..a216e88a7b54 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>> @@ -32,6 +32,8 @@
>>   #include "wafl/wafl2_4_0_0_smn.h"
>>   #include "wafl/wafl2_4_0_0_sh_mask.h"
>>   +#include "amdgpu_reset.h"
>> +
>>   #define smnPCS_XGMI23_PCS_ERROR_STATUS   0x11a01210
>>   #define smnPCS_XGMI3X16_PCS_ERROR_STATUS 0x11a0020c
>>   #define smnPCS_GOPX1_PCS_ERROR_STATUS    0x12200210
>> @@ -226,6 +228,9 @@ static void amdgpu_xgmi_hive_release(struct 
>> kobject *kobj)
>>       struct amdgpu_hive_info *hive = container_of(
>>           kobj, struct amdgpu_hive_info, kobj);
>>   +    kref_put(&hive->reset_domain->refcount, 
>> amdgpu_reset_destroy_reset_domain);
>> +    hive->reset_domain = NULL;
>> +
>>       mutex_destroy(&hive->hive_lock);
>>       kfree(hive);
>>   }
>> @@ -392,12 +397,24 @@ struct amdgpu_hive_info 
>> *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
>>           goto pro_end;
>>       }
>>   -    hive->reset_domain.wq = 
>> alloc_ordered_workqueue("amdgpu-reset-hive", 0);
>> -    if (!hive->reset_domain.wq) {
>> -        dev_err(adev->dev, "XGMI: failed allocating wq for reset 
>> domain!\n");
>> -        kfree(hive);
>> -        hive = NULL;
>> -        goto pro_end;
>> +    /**
>> +     * Avoid recreating reset domain when hive is reconstructed for 
>> the case
>> +     * of reset the devices in the XGMI hive during probe for SRIOV
>> +     * See https://www.spinics.net/lists/amd-gfx/msg58836.html
>> +     */
>> +    if (adev->reset_domain->type != XGMI_HIVE) {
>> +        hive->reset_domain = 
>> amdgpu_reset_create_reset_domain(XGMI_HIVE, "amdgpu-reset-hive");
>> +            if (!hive->reset_domain) {
>> +                dev_err(adev->dev, "XGMI: failed initializing reset 
>> domain for xgmi hive\n");
>> +                ret = -ENOMEM;
>> +                kobject_put(&hive->kobj);
>> +                kfree(hive);
>> +                hive = NULL;
>> +                goto pro_end;
>> +            }
>> +    } else {
>> + kref_get_unless_zero(&adev->reset_domain->refcount);
>> +        hive->reset_domain = adev->reset_domain;
>
> Suggest wrapping this like -
>     amdgpu_reset_domain_get(reset_domain)
>
> and another like
>     amdgpu_reset_domain_put(reset_domain)



I already use kref_put/kref_get directly on the reset domain, so I think 
reusing the same naming for reset domain creation would be a bit confusing.
I can add those wrappers around the direct usage of kref_put/kref_get though.


>
> Even smaller wrappers like
>     amdgpu_reset_domain_schedule(reset_domain, reset_work)


This really would be just a one-line syntactic wrapper, but sure.


>
> Other than that, looks good to me (need to combine with earlier series 
> as this misses a few other members in reset domain).


That's because I added this and the later patches on top of the existing 
old patches; I had a lot of merge conflicts when I tried to fold these 
changes into the original patches.
See patches 9 and 10 for moving the other members into reset domain.

Andrey


>
> Thanks,
> Lijo
>
>>       }
>>         hive->hive_id = adev->gmc.xgmi.hive_id;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> index 2f2ce53645a5..dcaad22be492 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>> @@ -42,7 +42,7 @@ struct amdgpu_hive_info {
>>           AMDGPU_XGMI_PSTATE_UNKNOWN
>>       } pstate;
>>   -    struct amdgpu_reset_domain reset_domain;
>> +    struct amdgpu_reset_domain *reset_domain;
>>   };
>>     struct amdgpu_pcs_ras_field {
>> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c 
>> b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
>> index b2b40e169342..05e98af30b48 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
>> @@ -32,6 +32,8 @@
>>   #include "soc15_common.h"
>>   #include "mxgpu_ai.h"
>>   +#include "amdgpu_reset.h"
>> +
>>   static void xgpu_ai_mailbox_send_ack(struct amdgpu_device *adev)
>>   {
>>       WREG8(AI_MAIBOX_CONTROL_RCV_OFFSET_BYTE, 2);
>> @@ -302,7 +304,7 @@ static int xgpu_ai_mailbox_rcv_irq(struct 
>> amdgpu_device *adev,
>>       switch (event) {
>>           case IDH_FLR_NOTIFICATION:
>>           if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
>> -            WARN_ONCE(!queue_work(adev->reset_domain.wq,
>> +            WARN_ONCE(!queue_work(adev->reset_domain->wq,
>>                             &adev->virt.flr_work),
>>                     "Failed to queue work! at %s",
>>                     __func__);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c 
>> b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
>> index 08411924150d..6e12055f3f4a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
>> @@ -31,6 +31,8 @@
>>   #include "soc15_common.h"
>>   #include "mxgpu_nv.h"
>>   +#include "amdgpu_reset.h"
>> +
>>   static void xgpu_nv_mailbox_send_ack(struct amdgpu_device *adev)
>>   {
>>       WREG8(NV_MAIBOX_CONTROL_RCV_OFFSET_BYTE, 2);
>> @@ -337,7 +339,7 @@ static int xgpu_nv_mailbox_rcv_irq(struct 
>> amdgpu_device *adev,
>>       switch (event) {
>>       case IDH_FLR_NOTIFICATION:
>>           if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
>> -            WARN_ONCE(!queue_work(adev->reset_domain.wq,
>> +            WARN_ONCE(!queue_work(adev->reset_domain->wq,
>>                             &adev->virt.flr_work),
>>                     "Failed to queue work! at %s",
>>                     __func__);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c 
>> b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
>> index 02290febfcf4..fe1570c2146e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
>> @@ -42,6 +42,8 @@
>>   #include "smu/smu_7_1_3_d.h"
>>   #include "mxgpu_vi.h"
>>   +#include "amdgpu_reset.h"
>> +
>>   /* VI golden setting */
>>   static const u32 xgpu_fiji_mgcg_cgcg_init[] = {
>>       mmRLC_CGTT_MGCG_OVERRIDE, 0xffffffff, 0xffffffff,
>> @@ -551,7 +553,7 @@ static int xgpu_vi_mailbox_rcv_irq(struct 
>> amdgpu_device *adev,
>>             /* only handle FLR_NOTIFY now */
>>           if (!r && !amdgpu_in_reset(adev))
>> -            WARN_ONCE(!queue_work(adev->reset_domain.wq,
>> +            WARN_ONCE(!queue_work(adev->reset_domain->wq,
>>                             &adev->virt.flr_work),
>>                     "Failed to queue work! at %s",
>>                     __func__);
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu
  2022-02-02 18:57   ` Andrey Grodzovsky
@ 2022-02-09  6:06     ` JingWen Chen
  2022-02-09 16:08       ` Andrey Grodzovsky
  0 siblings, 1 reply; 27+ messages in thread
From: JingWen Chen @ 2022-02-09  6:06 UTC (permalink / raw)
  To: Andrey Grodzovsky, dri-devel, amd-gfx
  Cc: Monk.Liu, horace.chen, lijo.lazar, christian.koenig

Hi Andrey,

I have been testing your patch and it seems fine so far.

Best Regards,

Jingwen Chen

On 2022/2/3 2:57 AM, Andrey Grodzovsky wrote:
> Just another ping. With Shyun's help I was able to do some smoke testing on an XGMI SRIOV system (booting and triggering hive reset),
> and so far it looks good.
>
> Andrey
>
> On 2022-01-28 14:36, Andrey Grodzovsky wrote:
>> Just a gentle ping in case people have more comments on this patch set, especially the last 5 patches,
>> as the first 7 are exactly the same as in V2 and we already went over them.
>>
>> Andrey
>>
>> On 2022-01-25 17:37, Andrey Grodzovsky wrote:
>>> [snip cover letter]


* Re: [RFC v4] drm/amdgpu: Rework reset domain to be refcounted.
  2022-02-08 16:19       ` Andrey Grodzovsky
@ 2022-02-09  7:51         ` Christian König
  0 siblings, 0 replies; 27+ messages in thread
From: Christian König @ 2022-02-09  7:51 UTC (permalink / raw)
  To: Andrey Grodzovsky, Lazar, Lijo, dri-devel, amd-gfx
  Cc: horace.chen, jingwech, christian.koenig, Monk.Liu



Am 08.02.22 um 17:19 schrieb Andrey Grodzovsky:
>
> On 2022-02-08 06:25, Lazar, Lijo wrote:
>>
>>
>> On 2/2/2022 10:56 PM, Andrey Grodzovsky wrote:
>>> The reset domain now contains the register access semaphore
>>> and so needs to be present as long as each device in a hive
>>> needs it; it therefore cannot be bound to the XGMI hive life
>>> cycle.
>>> Address this by making the reset domain refcounted and pointed
>>> to by each member of the hive and by the hive itself.
>>>
>>> v4:
>>> Fix crash on boot with XGMI hive by adding a type to reset_domain.
>>> XGMI will only create a new reset_domain if the previous one was of
>>> single-device type, meaning it's the first boot. Otherwise it will
>>> take a refcount to the existing reset_domain from the amdgpu device.
>>>
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  6 +--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 44 +++++++++++++---------
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c  | 38 +++++++++++++++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h  | 18 +++++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c   | 29 +++++++++++---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h   |  2 +-
>>>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c      |  4 +-
>>>   drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c      |  4 +-
>>>   drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c      |  4 +-
>>>   9 files changed, 118 insertions(+), 31 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 8e96b9a14452..f2ba460bfd59 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -813,9 +813,7 @@ struct amd_powerplay {
>>>   #define AMDGPU_RESET_MAGIC_NUM 64
>>>   #define AMDGPU_MAX_DF_PERFMONS 4
>>>
>>> -struct amdgpu_reset_domain {
>>> -    struct workqueue_struct *wq;
>>> -};
>>> +struct amdgpu_reset_domain;
>>>
>>>   struct amdgpu_device {
>>>       struct device            *dev;
>>> @@ -1102,7 +1100,7 @@ struct amdgpu_device {
>>>       struct amdgpu_reset_control     *reset_cntl;
>>>       uint32_t ip_versions[HW_ID_MAX][HWIP_MAX_INSTANCE];
>>>
>>> -    struct amdgpu_reset_domain    reset_domain;
>>> +    struct amdgpu_reset_domain    *reset_domain;
>>>   };
>>>
>>>   static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index fef952ca8db5..cd1b7af69c35 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -2313,7 +2313,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev)
>>>
>>>           r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,
>>>                      ring->num_hw_submission, amdgpu_job_hang_limit,
>>> -                   timeout, adev->reset_domain.wq, ring->sched_score, ring->name);
>>> +                   timeout, adev->reset_domain->wq, ring->sched_score, ring->name);
>>>           if (r) {
>>>               DRM_ERROR("Failed to create scheduler on ring %s.\n",
>>>                     ring->name);
>>> @@ -2432,24 +2432,22 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>>>       if (r)
>>>           goto init_failed;
>>>
>>> +    /**
>>> +     * In case of XGMI grab extra reference for reset domain for this device
>>> +     */
>>>       if (adev->gmc.xgmi.num_physical_nodes > 1) {
>>> -        struct amdgpu_hive_info *hive;
>>> -
>>> -        amdgpu_xgmi_add_device(adev);
>>> +        if (amdgpu_xgmi_add_device(adev) == 0) {
>>> +            struct amdgpu_hive_info *hive = amdgpu_get_xgmi_hive(adev);
>>>
>>> -        hive = amdgpu_get_xgmi_hive(adev);
>>> -        if (!hive || !hive->reset_domain.wq) {
>>> -            DRM_ERROR("Failed to obtain reset domain info for XGMI hive:%llx", hive->hive_id);
>>> -            r = -EINVAL;
>>> -            goto init_failed;
>>> -        }
>>> +            if (!hive->reset_domain ||
>>> +                !kref_get_unless_zero(&hive->reset_domain->refcount)) {
>>> +                r = -ENOENT;
>>> +                goto init_failed;
>>> +            }
>>>
>>> -        adev->reset_domain.wq = hive->reset_domain.wq;
>>> -    } else {
>>> -        adev->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-dev", 0);
>>> -        if (!adev->reset_domain.wq) {
>>> -            r = -ENOMEM;
>>> -            goto init_failed;
>>> +            /* Drop the early temporary reset domain we created for device */
>>> +            kref_put(&adev->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
>>> +            adev->reset_domain = hive->reset_domain;
>>>           }
>>>       }
>>> @@ -3599,6 +3597,15 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>           return r;
>>>       }
>>>
>>> +    /*
>>> +     * Reset domain needs to be present early, before the XGMI hive is
>>> +     * discovered (if any) and initialized, so that the reset sem and
>>> +     * in_gpu_reset flag can be used early on during init.
>>> +     */
>>> +    adev->reset_domain = amdgpu_reset_create_reset_domain(SINGLE_DEVICE, "amdgpu-reset-dev");
>>> +    if (!adev->reset_domain)
>>> +        return -ENOMEM;
>>> +
>>>       /* early init functions */
>>>       r = amdgpu_device_ip_early_init(adev);
>>>       if (r)
>>> @@ -3949,6 +3956,9 @@ void amdgpu_device_fini_sw(struct amdgpu_device *adev)
>>>       if (adev->mman.discovery_bin)
>>>           amdgpu_discovery_fini(adev);
>>>
>>> +    kref_put(&adev->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
>>> +    adev->reset_domain = NULL;
>>> +
>>>       kfree(adev->pci_state);
>>>
>>>   }
>>> @@ -5186,7 +5196,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
>>>
>>>       INIT_WORK(&work.base, amdgpu_device_queue_gpu_recover_work);
>>>
>>> -    if (!queue_work(adev->reset_domain.wq, &work.base))
>>> +    if (!queue_work(adev->reset_domain->wq, &work.base))
>>>           return -EAGAIN;
>>>
>>>       flush_work(&work.base);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
>>> index 02afd4115675..14f003d5ebe8 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
>>> @@ -96,3 +96,41 @@ int amdgpu_reset_perform_reset(struct amdgpu_device *adev,
>>>       return reset_handler->restore_hwcontext(adev->reset_cntl,
>>>                           reset_context);
>>>   }
>>> +
>>> +
>>> +void amdgpu_reset_destroy_reset_domain(struct kref *ref)
>>> +{
>>> +    struct amdgpu_reset_domain *reset_domain = container_of(ref,
>>> +                                struct amdgpu_reset_domain,
>>> +                                refcount);
>>> +    destroy_workqueue(reset_domain->wq);
>>> +    kvfree(reset_domain);
>>> +}
>>> +
>>> +struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(enum amdgpu_reset_domain_type type,
>>> +                                 char *wq_name)
>>> +{
>>> +    struct amdgpu_reset_domain *reset_domain;
>>> +
>>> +    reset_domain = kvzalloc(sizeof(struct amdgpu_reset_domain), GFP_KERNEL);
>>> +    if (!reset_domain) {
>>> +        DRM_ERROR("Failed to allocate amdgpu_reset_domain!");
>>> +        return NULL;
>>> +    }
>>> +
>>> +    reset_domain->type = type;
>>> +    kref_init(&reset_domain->refcount);
>>> +
>>> +    reset_domain->wq = create_singlethread_workqueue(wq_name);
>>> +    if (!reset_domain->wq) {
>>> +        DRM_ERROR("Failed to allocate wq for amdgpu_reset_domain!");
>>> +        kref_put(&reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
>>> +        return NULL;
>>> +    }
>>> +
>>> +    return reset_domain;
>>> +}
>>> +
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
>>> index e00d38d9160a..0b310cd963d9 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h
>>> @@ -70,6 +70,19 @@ struct amdgpu_reset_control {
>>>       void (*async_reset)(struct work_struct *work);
>>>   };
>>>
>>> +
>>> +enum amdgpu_reset_domain_type {
>>> +    SINGLE_DEVICE,
>>> +    XGMI_HIVE
>>> +};
>>> +
>>> +struct amdgpu_reset_domain {
>>> +    struct kref refcount;
>>> +    struct workqueue_struct *wq;
>>> +    enum amdgpu_reset_domain_type type;
>>> +};
>>> +
>>> +
>>>   int amdgpu_reset_init(struct amdgpu_device *adev);
>>>   int amdgpu_reset_fini(struct amdgpu_device *adev);
>>> @@ -82,4 +95,9 @@ int amdgpu_reset_perform_reset(struct amdgpu_device *adev,
>>>   int amdgpu_reset_add_handler(struct amdgpu_reset_control *reset_ctl,
>>>                    struct amdgpu_reset_handler *handler);
>>>
>>> +struct amdgpu_reset_domain *amdgpu_reset_create_reset_domain(enum amdgpu_reset_domain_type type,
>>> +                                 char *wq_name);
>>> +
>>> +void amdgpu_reset_destroy_reset_domain(struct kref *ref);
>>> +
>>>   #endif
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>>> index 9ad742039ac9..a216e88a7b54 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
>>> @@ -32,6 +32,8 @@
>>>   #include "wafl/wafl2_4_0_0_smn.h"
>>>   #include "wafl/wafl2_4_0_0_sh_mask.h"
>>>
>>> +#include "amdgpu_reset.h"
>>> +
>>>   #define smnPCS_XGMI23_PCS_ERROR_STATUS   0x11a01210
>>>   #define smnPCS_XGMI3X16_PCS_ERROR_STATUS 0x11a0020c
>>>   #define smnPCS_GOPX1_PCS_ERROR_STATUS    0x12200210
>>> @@ -226,6 +228,9 @@ static void amdgpu_xgmi_hive_release(struct kobject *kobj)
>>>       struct amdgpu_hive_info *hive = container_of(
>>>           kobj, struct amdgpu_hive_info, kobj);
>>>
>>> +    kref_put(&hive->reset_domain->refcount, amdgpu_reset_destroy_reset_domain);
>>> +    hive->reset_domain = NULL;
>>> +
>>>       mutex_destroy(&hive->hive_lock);
>>>       kfree(hive);
>>>   }
>>> @@ -392,12 +397,24 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
>>>           goto pro_end;
>>>       }
>>>
>>> -    hive->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-hive", 0);
>>> -    if (!hive->reset_domain.wq) {
>>> -        dev_err(adev->dev, "XGMI: failed allocating wq for reset domain!\n");
>>> -        kfree(hive);
>>> -        hive = NULL;
>>> -        goto pro_end;
>>> +    /**
>>> +     * Avoid recreating the reset domain when the hive is reconstructed
>>> +     * for the case of resetting the devices in the XGMI hive during
>>> +     * probe for SRIOV.
>>> +     * See https://www.spinics.net/lists/amd-gfx/msg58836.html
>>> +     */
>>> +    if (adev->reset_domain->type != XGMI_HIVE) {
>>> +        hive->reset_domain = amdgpu_reset_create_reset_domain(XGMI_HIVE, "amdgpu-reset-hive");
>>> +        if (!hive->reset_domain) {
>>> +            dev_err(adev->dev, "XGMI: failed initializing reset domain for xgmi hive\n");
>>> +            ret = -ENOMEM;
>>> +            kobject_put(&hive->kobj);
>>> +            kfree(hive);
>>> +            hive = NULL;
>>> +            goto pro_end;
>>> +        }
>>> +    } else {
>>> +        kref_get_unless_zero(&adev->reset_domain->refcount);
>>> +        hive->reset_domain = adev->reset_domain;
>>
>> Suggest to wrap this like -
>>     amdgpu_reset_domain_get(reset_domain)
>>
>> and another like
>>     amdgpu_reset_domain_put(reset_domain)
>
>
>
> I already use kref_put/kref_get on the reset domain directly, so reusing
> the same naming for reset domain creation would be a bit confusing I think.
> I can add those wrappers around the direct usage of kref_put/kref_get.

Small inline wrappers in the header around kref_put/kref_get are pretty 
common, but not necessarily mandatory.
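
For reference, a minimal sketch of what those wrappers could look like in
amdgpu_reset.h, using the names Lijo suggested and the refcount member from
the v4 patch above (illustrative only, not code from the posted series):

static inline bool amdgpu_reset_domain_get(struct amdgpu_reset_domain *domain)
{
	/* mirrors the open-coded kref_get_unless_zero() in the XGMI path */
	return kref_get_unless_zero(&domain->refcount);
}

static inline void amdgpu_reset_domain_put(struct amdgpu_reset_domain *domain)
{
	/* the last reference destroys the wq and frees the domain */
	kref_put(&domain->refcount, amdgpu_reset_destroy_reset_domain);
}

The hive takeover in amdgpu_device_ip_init() would then read
"if (!hive->reset_domain || !amdgpu_reset_domain_get(hive->reset_domain))",
and the teardown paths would call amdgpu_reset_domain_put() instead of the
open-coded kref_put().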

>
>
>>
>> Even smaller wrappers like
>>     amdgpu_reset_domain_schedule(reset_domain, reset_work)
>
>
> This really would be a one-line syntactic wrapper, but sure
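
As a sketch, that one-liner could look like the following, assuming the wq
member from the patch (queue_work() returns false if the work was already
queued):

static inline bool amdgpu_reset_domain_schedule(struct amdgpu_reset_domain *domain,
						struct work_struct *work)
{
	/* all resets for this domain funnel through its ordered wq */
	return queue_work(domain->wq, work);
}

The SRIOV mailbox handlers would then become
WARN_ONCE(!amdgpu_reset_domain_schedule(adev->reset_domain, &adev->virt.flr_work), ...).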
>
>
>>
>> Other than that, looks good to me (need to combine with earlier 
>> series as this misses a few other members in reset domain).
>
>
> That's because I added this and the later patches on top of the existing
> old patches; I had a lot of merge conflicts when I tried to fold these
> changes into the original patches.
> See patches 9 and 10 for moving the other members into the reset domain.

I only skimmed over the set. It's a bit confusing, but if it makes 
things easier I would say go for it.

Regards,
Christian.

>
> Andrey
>
>
>>
>> Thanks,
>> Lijo
>>
>>>       }
>>>
>>>       hive->hive_id = adev->gmc.xgmi.hive_id;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>>> index 2f2ce53645a5..dcaad22be492 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
>>> @@ -42,7 +42,7 @@ struct amdgpu_hive_info {
>>>           AMDGPU_XGMI_PSTATE_UNKNOWN
>>>       } pstate;
>>>
>>> -    struct amdgpu_reset_domain reset_domain;
>>> +    struct amdgpu_reset_domain *reset_domain;
>>>   };
>>>     struct amdgpu_pcs_ras_field {
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
>>> index b2b40e169342..05e98af30b48 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
>>> @@ -32,6 +32,8 @@
>>>   #include "soc15_common.h"
>>>   #include "mxgpu_ai.h"
>>>
>>> +#include "amdgpu_reset.h"
>>> +
>>>   static void xgpu_ai_mailbox_send_ack(struct amdgpu_device *adev)
>>>   {
>>>       WREG8(AI_MAIBOX_CONTROL_RCV_OFFSET_BYTE, 2);
>>> @@ -302,7 +304,7 @@ static int xgpu_ai_mailbox_rcv_irq(struct amdgpu_device *adev,
>>>       switch (event) {
>>>           case IDH_FLR_NOTIFICATION:
>>>           if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
>>> -            WARN_ONCE(!queue_work(adev->reset_domain.wq,
>>> +            WARN_ONCE(!queue_work(adev->reset_domain->wq,
>>>                             &adev->virt.flr_work),
>>>                     "Failed to queue work! at %s",
>>>                     __func__);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
>>> index 08411924150d..6e12055f3f4a 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
>>> @@ -31,6 +31,8 @@
>>>   #include "soc15_common.h"
>>>   #include "mxgpu_nv.h"
>>>
>>> +#include "amdgpu_reset.h"
>>> +
>>>   static void xgpu_nv_mailbox_send_ack(struct amdgpu_device *adev)
>>>   {
>>>       WREG8(NV_MAIBOX_CONTROL_RCV_OFFSET_BYTE, 2);
>>> @@ -337,7 +339,7 @@ static int xgpu_nv_mailbox_rcv_irq(struct amdgpu_device *adev,
>>>       switch (event) {
>>>       case IDH_FLR_NOTIFICATION:
>>>           if (amdgpu_sriov_runtime(adev) && !amdgpu_in_reset(adev))
>>> -            WARN_ONCE(!queue_work(adev->reset_domain.wq,
>>> +            WARN_ONCE(!queue_work(adev->reset_domain->wq,
>>>                             &adev->virt.flr_work),
>>>                     "Failed to queue work! at %s",
>>>                     __func__);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
>>> index 02290febfcf4..fe1570c2146e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
>>> @@ -42,6 +42,8 @@
>>>   #include "smu/smu_7_1_3_d.h"
>>>   #include "mxgpu_vi.h"
>>>
>>> +#include "amdgpu_reset.h"
>>> +
>>>   /* VI golden setting */
>>>   static const u32 xgpu_fiji_mgcg_cgcg_init[] = {
>>>       mmRLC_CGTT_MGCG_OVERRIDE, 0xffffffff, 0xffffffff,
>>> @@ -551,7 +553,7 @@ static int xgpu_vi_mailbox_rcv_irq(struct amdgpu_device *adev,
>>>
>>>           /* only handle FLR_NOTIFY now */
>>>           if (!r && !amdgpu_in_reset(adev))
>>> -            WARN_ONCE(!queue_work(adev->reset_domain.wq,
>>> +            WARN_ONCE(!queue_work(adev->reset_domain->wq,
>>>                             &adev->virt.flr_work),
>>>                     "Failed to queue work! at %s",
>>>                     __func__);
>>>



* Re: [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu
  2022-02-09  6:06     ` JingWen Chen
@ 2022-02-09 16:08       ` Andrey Grodzovsky
  0 siblings, 0 replies; 27+ messages in thread
From: Andrey Grodzovsky @ 2022-02-09 16:08 UTC (permalink / raw)
  To: JingWen Chen, dri-devel, amd-gfx
  Cc: Monk.Liu, horace.chen, lijo.lazar, christian.koenig

Thanks a lot!

Andrey

On 2022-02-09 01:06, JingWen Chen wrote:
> Hi Andrey,
>
> I have been testing your patch and it seems fine so far.
>
> Best Regards,
>
> Jingwen Chen
>
> On 2022/2/3 2:57 AM, Andrey Grodzovsky wrote:
>> Just another ping. With Shyun's help I was able to do some smoke testing on an XGMI SRIOV system (booting and triggering hive reset)
>> and so far it looks good.
>>
>> Andrey
>>
>> On 2022-01-28 14:36, Andrey Grodzovsky wrote:
>>> Just a gentle ping in case people have more comments on this patch set? Especially the last 5 patches,
>>> as the first 7 are exactly the same as V2 and we already mostly went over them.
>>>
>>> Andrey
>>>
>>> On 2022-01-25 17:37, Andrey Grodzovsky wrote:
>>>> [snip cover letter]


end of thread (newest: 2022-02-09 16:08 UTC)

Thread overview: 27+ messages
2022-01-25 22:37 [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
2022-01-25 22:37 ` [RFC v3 01/12] drm/amdgpu: Introduce reset domain Andrey Grodzovsky
2022-01-26 12:07   ` Christian König
2022-01-26 15:47     ` Andrey Grodzovsky
2022-01-25 22:37 ` [RFC v3 02/12] drm/amdgpu: Move scheduler init to after XGMI is ready Andrey Grodzovsky
2022-01-25 22:37 ` [RFC v3 03/12] drm/amdgpu: Fix crash on modprobe Andrey Grodzovsky
2022-01-25 22:37 ` [RFC v3 04/12] drm/amdgpu: Serialize non TDR gpu recovery with TDRs Andrey Grodzovsky
2022-01-25 22:37 ` [RFC v3 05/12] drm/amd/virt: For SRIOV send GPU reset directly to TDR queue Andrey Grodzovsky
2022-01-25 22:37 ` [RFC v3 06/12] drm/amdgpu: Drop hive->in_reset Andrey Grodzovsky
2022-02-08  6:33   ` Lazar, Lijo
2022-02-08 15:39     ` Andrey Grodzovsky
2022-01-25 22:37 ` [RFC v3 07/12] drm/amdgpu: Drop concurrent GPU reset protection for device Andrey Grodzovsky
2022-01-25 22:37 ` [RFC v3 08/12] drm/amdgpu: Rework reset domain to be refcounted Andrey Grodzovsky
2022-01-26 12:12   ` Christian König
2022-02-02 17:26   ` [RFC v4] " Andrey Grodzovsky
2022-02-08 11:25     ` Lazar, Lijo
2022-02-08 16:19       ` Andrey Grodzovsky
2022-02-09  7:51         ` Christian König
2022-01-25 22:37 ` [RFC v3 09/12] drm/amdgpu: Move reset sem into reset_domain Andrey Grodzovsky
2022-01-25 22:37 ` [RFC v3 10/12] drm/amdgpu: Move in_gpu_reset " Andrey Grodzovsky
2022-02-08 10:49   ` Lazar, Lijo
2022-01-25 22:37 ` [RFC v3 11/12] drm/amdgpu: Rework amdgpu_device_lock_adev Andrey Grodzovsky
2022-01-25 22:37 ` [RFC v3 12/12] Revert 'drm/amdgpu: annotate a false positive recursive locking' Andrey Grodzovsky
2022-01-28 19:36 ` [RFC v3 00/12] Define and use reset domain for GPU recovery in amdgpu Andrey Grodzovsky
2022-02-02 18:57   ` Andrey Grodzovsky
2022-02-09  6:06     ` JingWen Chen
2022-02-09 16:08       ` Andrey Grodzovsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).