* [PATCH] drm/amdgpu: annotate a false positive recursive locking
@ 2020-08-06  7:02 Dennis Li
  2020-08-06  7:08 ` Christian König
  0 siblings, 1 reply; 24+ messages in thread
From: Dennis Li @ 2020-08-06  7:02 UTC (permalink / raw)
  To: amd-gfx, Alexander.Deucher, felix.kuehling, Hawking.Zhang; +Cc: Dennis Li

[  584.110304] ============================================
[  584.110590] WARNING: possible recursive locking detected
[  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
[  584.111164] --------------------------------------------
[  584.111456] kworker/38:1/553 is trying to acquire lock:
[  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.112112]
               but task is already holding lock:
[  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.113068]
               other info that might help us debug this:
[  584.113689]  Possible unsafe locking scenario:

[  584.114350]        CPU0
[  584.114685]        ----
[  584.115014]   lock(&adev->reset_sem);
[  584.115349]   lock(&adev->reset_sem);
[  584.115678]
                *** DEADLOCK ***

[  584.116624]  May be due to missing lock nesting notation

[  584.117284] 4 locks held by kworker/38:1/553:
[  584.117616]  #0: ffff9ad635c1d348 ((wq_completion)events){+.+.}, at: process_one_work+0x21f/0x630
[  584.117967]  #1: ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at: process_one_work+0x21f/0x630
[  584.118358]  #2: ffffffffc1c2a5d0 (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu]
[  584.118786]  #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.119222]
               stack backtrace:
[  584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G           OE     5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
[  584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[  584.121223] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[  584.121638] Call Trace:
[  584.122050]  dump_stack+0x98/0xd5
[  584.122499]  __lock_acquire+0x1139/0x16e0
[  584.122931]  ? trace_hardirqs_on+0x3b/0xf0
[  584.123358]  ? cancel_delayed_work+0xa6/0xc0
[  584.123771]  lock_acquire+0xb8/0x1c0
[  584.124197]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.124599]  down_write+0x49/0x120
[  584.125032]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.125472]  amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.125910]  ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu]
[  584.126367]  amdgpu_ras_do_recovery+0x159/0x190 [amdgpu]
[  584.126789]  process_one_work+0x29e/0x630
[  584.127208]  worker_thread+0x3c/0x3f0
[  584.127621]  ? __kthread_parkme+0x61/0x90
[  584.128014]  kthread+0x12f/0x150
[  584.128402]  ? process_one_work+0x630/0x630
[  584.128790]  ? kthread_park+0x90/0x90
[  584.129174]  ret_from_fork+0x3a/0x50

Give each adev its own lock_class_key to avoid a false positive
recursive locking report.

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Change-Id: I7571efeccbf15483982031d00504a353031a854a

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index e97c088d03b3..766dc8f8c8a0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -967,6 +967,7 @@ struct amdgpu_device {
 	atomic_t                        in_gpu_reset;
 	enum pp_mp1_state               mp1_state;
 	struct rw_semaphore	reset_sem;
+	struct lock_class_key lock_key;
 	struct amdgpu_doorbell_index doorbell_index;
 
 	struct mutex			notifier_lock;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 6c572db42d92..d78df9312d34 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3037,6 +3037,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	mutex_init(&adev->virt.vf_errors.lock);
 	hash_init(adev->mn_hash);
 	init_rwsem(&adev->reset_sem);
+	lockdep_set_class(&adev->reset_sem, &adev->lock_key);
 	atomic_set(&adev->in_gpu_reset, 0);
 	mutex_init(&adev->psp.mutex);
 	mutex_init(&adev->notifier_lock);
-- 
2.17.1


* Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-06  7:02 [PATCH] drm/amdgpu: annotate a false positive recursive locking Dennis Li
@ 2020-08-06  7:08 ` Christian König
  2020-08-06  7:44   ` Li, Dennis
  0 siblings, 1 reply; 24+ messages in thread
From: Christian König @ 2020-08-06  7:08 UTC (permalink / raw)
  To: Dennis Li, amd-gfx, Alexander.Deucher, felix.kuehling, Hawking.Zhang

Am 06.08.20 um 09:02 schrieb Dennis Li:
> [SNIP]
> Give each adev its own lock_class_key to avoid a false positive
> recursive locking report.

NAK, that is not a false positive but a real problem.

The issue here is that we have multiple reset semaphores, one for each 
device in the hive. If those are not acquired in the correct order we 
deadlock.

The real solution would be to move the reset_sem into the hive structure 
and make sure that we lock it only once.
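
A rough sketch of that direction, assuming reset_sem moves from the device
into the hive structure (the helper names below are only illustrative, not
existing driver functions):

static void amdgpu_hive_recover(struct amdgpu_hive_info *hive)
{
	struct amdgpu_device *adev;

	/* One semaphore, taken exactly once, so there is no inter-device
	 * lock ordering to get wrong.
	 */
	down_write(&hive->reset_sem);

	list_for_each_entry(adev, &hive->device_list, gmc.xgmi.head)
		amdgpu_device_pre_reset(adev);	/* hypothetical per-device suspend */

	amdgpu_do_hive_reset(hive);		/* hypothetical hive-wide reset step */

	list_for_each_entry(adev, &hive->device_list, gmc.xgmi.head)
		amdgpu_device_post_reset(adev);	/* hypothetical per-device resume */

	up_write(&hive->reset_sem);
}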

Christian.


* RE: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-06  7:08 ` Christian König
@ 2020-08-06  7:44   ` Li, Dennis
  2020-08-06  8:13     ` Christian König
  0 siblings, 1 reply; 24+ messages in thread
From: Li, Dennis @ 2020-08-06  7:44 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx, Deucher, Alexander, Kuehling, Felix,
	Zhang, Hawking

[AMD Official Use Only - Internal Distribution Only]

Hi, Christian,
      I agree with your concern. However, we shouldn't rely on the system to detect deadlocks; we should take care to avoid them when implementing the code. In fact, deadlock detection isn't enabled by default.
      As for your proposal to move reset_sem into the hive structure, we can open a new topic to discuss it. At the moment we can't be sure which solution is best. For example, with your proposal we must wait for all devices to resume successfully before resubmitting an old task on any one device, which will affect performance.

Best Regards
Dennis Li
-----Original Message-----
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Christian König
Sent: Thursday, August 6, 2020 3:08 PM
To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking

Am 06.08.20 um 09:02 schrieb Dennis Li:
> [SNIP]

NAK, that is not a false positive but a real problem.

The issue here is that we have multiple reset semaphores, one for each device in the hive. If those are not acquired in the correct order we deadlock.

The real solution would be to move the reset_sem into the hive structure and make sure that we lock it only once.

Christian.


* Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-06  7:44   ` Li, Dennis
@ 2020-08-06  8:13     ` Christian König
  2020-08-06  9:15       ` Li, Dennis
  0 siblings, 1 reply; 24+ messages in thread
From: Christian König @ 2020-08-06  8:13 UTC (permalink / raw)
  To: Li, Dennis, amd-gfx, Deucher, Alexander, Kuehling, Felix, Zhang, Hawking

Preventing locking problems during implementation is obviously a good 
approach, but lockdep has proven to be massively useful for finding and 
fixing problems.

Disabling a lockdep splat by annotating locks with separate classes is 
usually a no-go and only allowed if there is no other potential approach.

In this case here we should really clean things up instead.

Christian.

Am 06.08.20 um 09:44 schrieb Li, Dennis:
> [SNIP]

* RE: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-06  8:13     ` Christian König
@ 2020-08-06  9:15       ` Li, Dennis
  2020-08-06  9:19         ` Christian König
  0 siblings, 1 reply; 24+ messages in thread
From: Li, Dennis @ 2020-08-06  9:15 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx, Deucher, Alexander, Kuehling, Felix,
	Zhang, Hawking

[AMD Official Use Only - Internal Distribution Only]

Hi, Christian,
     For this case, it is safe to use a separate lock key. Please see the definition of init_rwsem below. Every init_rwsem call site uses a new static key, but the devices of the hive share the same initialization code, so their lock_class_key is the same. I think it is a limitation of init_rwsem. In our case, it is correct for the reset_sem of each adev to have a different lock_class_key. BTW, this change doesn't affect deadlock detection; it just corrects the annotation.

#define init_rwsem(sem)					\
do {								\
	static struct lock_class_key __key;			\
								\
	__init_rwsem((sem), #sem, &__key);			\
} while (0)
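
In other words, every rw_semaphore initialized through the same init_rwsem()
call site shares that one static __key, so all of them end up in the same
lockdep class. A minimal illustration of the workaround (sketch only; my_dev
is a made-up structure, not driver code):

struct my_dev {
	struct rw_semaphore sem;
	struct lock_class_key key;	/* one key per instance */
};

static void my_dev_init(struct my_dev *d)
{
	/* Single call site: every instance shares the same static __key,
	 * so lockdep places all of these semaphores in one class.
	 */
	init_rwsem(&d->sem);

	/* Re-assign a per-instance key, as the patch does for reset_sem,
	 * so each semaphore gets its own lockdep class.
	 */
	lockdep_set_class(&d->sem, &d->key);
}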

Best Regards
Dennis Li
-----Original Message-----
From: Koenig, Christian <Christian.Koenig@amd.com> 
Sent: Thursday, August 6, 2020 4:13 PM
To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking

Preventing locking problems during implementation is obviously a good approach, but lockdep has proven to be massively useful for finding and fixing problems.

Disabling a lockdep splat by annotating locks with separate classes is usually a no-go and only allowed if there is no other potential approach.

In this case here we should really clean things up instead.

Christian.

Am 06.08.20 um 09:44 schrieb Li, Dennis:
> [SNIP]

* Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-06  9:15       ` Li, Dennis
@ 2020-08-06  9:19         ` Christian König
  2020-08-06 10:10           ` Li, Dennis
  0 siblings, 1 reply; 24+ messages in thread
From: Christian König @ 2020-08-06  9:19 UTC (permalink / raw)
  To: Li, Dennis, amd-gfx, Deucher, Alexander, Kuehling, Felix, Zhang, Hawking

> I think it is a limitation of init_rwsem.

And exactly that's wrong, this is intentional and perfectly correct.

As far as I know it is perfectly possible that the locks in the hive are 
not always grabbed in the same order. And that's why lockdep is 
complaining about this.
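
For example, if two recovery paths ever take the per-device semaphores in
opposite order, the result is a classic AB-BA deadlock, which is the kind of
problem lockdep's class-based checking is meant to catch (illustrative
sketch only):

static void recovery_path_one(struct amdgpu_device *a, struct amdgpu_device *b)
{
	down_write(&a->reset_sem);
	down_write(&b->reset_sem);	/* blocks if path two already holds b */
	/* ... recover ... */
	up_write(&b->reset_sem);
	up_write(&a->reset_sem);
}

static void recovery_path_two(struct amdgpu_device *a, struct amdgpu_device *b)
{
	down_write(&b->reset_sem);
	down_write(&a->reset_sem);	/* blocks if path one already holds a: deadlock */
	/* ... recover ... */
	up_write(&a->reset_sem);
	up_write(&b->reset_sem);
}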

What we should do instead is to make sure we have only a single lock for 
the complete hive.

Regards,
Christian.

Am 06.08.20 um 11:15 schrieb Li, Dennis:
> [SNIP]

* RE: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-06  9:19         ` Christian König
@ 2020-08-06 10:10           ` Li, Dennis
  2020-08-06 10:47             ` Christian König
  0 siblings, 1 reply; 24+ messages in thread
From: Li, Dennis @ 2020-08-06 10:10 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx, Deucher, Alexander, Kuehling, Felix,
	Zhang, Hawking

[AMD Official Use Only - Internal Distribution Only]

Hi, Christian,
       Please see my comments below.

Best Regards
Dennis Li
-----Original Message-----
From: Koenig, Christian <Christian.Koenig@amd.com> 
Sent: Thursday, August 6, 2020 5:19 PM
To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking

> I think it is a limitation of init_rwsem.

And exactly that's wrong, this is intentional and perfectly correct.

[Dennis Li] I don't understand. Why is it perfectly correct?
For example, suppose we define two rw_sems, a and b. If we don't look at the definition of init_rwsem, we may think case #1 and case #2 behave the same, but in fact they are different.

Case 1:
init_rwsem(&a);
init_rwsem(&b);

Case 2:
void init_rwsem_ext(struct rw_semaphore *sem)
{
	init_rwsem(sem);
}
init_rwsem_ext(&a);
init_rwsem_ext(&b);

As far as I know it is perfectly possible that the locks in the hive are not always grabbed in the same order. And that's why lockdep is complaining about this.
[Dennis Li] No. Let me take a hive with two devices (a and b) to explain why lockdep complains.

// First, the driver locks the reset_sem of all devices
down_write(&a->reset_sem);
do_suspend(a);
down_write(&b->reset_sem);   // Because b shares the same lock_class_key as a, lockdep treats a and b as the same rw_sem and complains here.
do_suspend(b);

// do recovery
do_hive_recovery()

// unlock the reset_sem of all devices
do_resume(a);
up_write(&a->reset_sem);
do_resume(b);
up_write(&b->reset_sem);

What we should do instead is to make sure we have only a single lock for the complete hive.
[Dennis Li] If we use a single lock, users must wait for all devices to resume successfully, even though their tasks are only running on device a. That is not friendly to users.
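
If the per-device semaphores are kept, one alternative sketch, assuming the
devices are always locked in a fixed hive order, would be to declare the
nesting to lockdep with down_write_nested() instead of separate lock classes
(helper and field names below are only illustrative):

static void amdgpu_hive_lock_all(struct amdgpu_hive_info *hive)
{
	struct amdgpu_device *adev;
	int i = 0;

	/* Fixed order plus an explicit subclass per device; the subclass
	 * must stay below MAX_LOCKDEP_SUBCLASSES.
	 */
	list_for_each_entry(adev, &hive->device_list, gmc.xgmi.head)
		down_write_nested(&adev->reset_sem, i++);
}

static void amdgpu_hive_unlock_all(struct amdgpu_hive_info *hive)
{
	struct amdgpu_device *adev;

	list_for_each_entry(adev, &hive->device_list, gmc.xgmi.head)
		up_write(&adev->reset_sem);
}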

Regards,
Christian.

Am 06.08.20 um 11:15 schrieb Li, Dennis:
> [SNIP]

* Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-06 10:10           ` Li, Dennis
@ 2020-08-06 10:47             ` Christian König
  2020-08-06 11:38               ` Li, Dennis
  0 siblings, 1 reply; 24+ messages in thread
From: Christian König @ 2020-08-06 10:47 UTC (permalink / raw)
  To: Li, Dennis, amd-gfx, Deucher, Alexander, Kuehling, Felix, Zhang, Hawking


[SNIP]
>> I think it is a limitation of init_rwsem.
> And exactly that's wrong, this is intentional and perfectly correct.
>
> [Dennis Li] I don't understand. Why is it perfectly correct?
> For example, suppose we define two rw_sems, a and b. If we don't look at the definition of init_rwsem, we may think case #1 and case #2 behave the same, but in fact they are different.
>
> Case 1:
> init_rwsem(&a);
> init_rwsem(&b);
>
> Case 2:
> void init_rwsem_ext(struct rw_semaphore *sem)
> {
> 	init_rwsem(sem);
> }
> init_rwsem_ext(&a);
> init_rwsem_ext(&b);
>
> As far as I know it is perfectly possible that the locks in the hive are not always grabbed in the same order. And that's why lockdep is complaining about this.
> [Dennis Li] No. Let me take a hive with two devices (a and b) to explain why lockdep complains.
>
> // First, the driver locks the reset_sem of all devices
> down_write(&a->reset_sem);
> do_suspend(a);
> down_write(&b->reset_sem);   // Because b shares the same lock_class_key as a, lockdep treats a and b as the same rw_sem and complains here.
> do_suspend(b);
>
> // do recovery
> do_hive_recovery()
>
> // unlock the reset_sem of all devices
> do_resume(a);
> up_write(&a->reset_sem);
> do_resume(b);
> up_write(&b->reset_sem);

Ah! Now I understand what you are working around. So the problem is the 
static lock_class_key in the macro?

> What we should do instead is to make sure we have only a single lock for the complete hive.
> [Dennis Li] If we use a single lock, users must wait for all devices to resume successfully, even though their tasks are only running on device a. That is not friendly to users.

Well that is intentional I would say. We can only restart submissions 
after all devices are resumed successfully, because otherwise it can 
happen that a job on device A depends on memory on device B which isn't 
accessible.

Regards,
Christian.

> [SNIP]
>>>> @@ -3037,6 +3037,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>>      	mutex_init(&adev->virt.vf_errors.lock);
>>>>      	hash_init(adev->mn_hash);
>>>>      	init_rwsem(&adev->reset_sem);
>>>> +	lockdep_set_class(&adev->reset_sem, &adev->lock_key);
>>>>      	atomic_set(&adev->in_gpu_reset, 0);
>>>>      	mutex_init(&adev->psp.mutex);
>>>>      	mutex_init(&adev->notifier_lock);

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-06 10:47             ` Christian König
@ 2020-08-06 11:38               ` Li, Dennis
  2020-08-06 13:16                 ` Christian König
  0 siblings, 1 reply; 24+ messages in thread
From: Li, Dennis @ 2020-08-06 11:38 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx, Deucher, Alexander, Kuehling, Felix,
	Zhang, Hawking

[AMD Official Use Only - Internal Distribution Only]

[SNIP]
>> I think it is a limitation of init_rwsem.
> And exactly that's wrong, this is intentional and perfectly correct.
>
> [Dennis Li] I couldn't understand. Why is it a perfectly correct?
> For example, we define two rw_sem: a and b. If we don't check init_rwsem definition, we may think case#1 and case#2 have the same behavior, but in fact they are different.
>
> Case 1:
> init_rwsem(&a);
> init_rwsem(&b);
>
> Case2:
> void init_rwsem_ext(rw_sem* sem)
> {
>       init_rwsem(sem);
> }
> init_rwsem_ext(&a);
> init_rwsem_ext(&b);
>
> As far as I know it is perfectly possible that the locks in the hive are not always grabbed in the same order. And that's why lockdep is complaining about this.
> [Dennis Li] No. I takes a hive with two devices(a and b) to explain why lockdep complain.
>
> // Firstly driver lock the reset_sem of all devices 
> down_write(&a->reset_sem); do_suspend(a);
> down_write(&b->reset_sem);   // Because  b shared the same lock_class_key with a, lockdep will take a and b as the same rw_sem and complain here.
> do_suspend(b);
>
> // do recovery
> do_hive_recovery()
>
> // unlock the reset_sem of all devices do_resume(a); 
> up_write(&a->reset_sem); do_resume(b); up_write(&b->reset_sem);

Ah! Now I understand what you are working around. So the problem is the static lock_class_key in the macro?
[Dennis Li] Yes. The author of init_rwsem probably did not have a use case like ours in mind.
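
To make that concrete, here is a minimal sketch of the difference (the helper names are made up for illustration and assume the amdgpu_device fields from the patch above; the per-device key is the one the patch adds):

#include <linux/lockdep.h>
#include <linux/rwsem.h>

/*
 * init_rwsem() declares one static lock_class_key per call site.  All
 * devices of a hive are initialized by the same call site, so their
 * reset_sem instances end up in one lockdep class:
 */
static void reset_sem_init_shared_class(struct amdgpu_device *adev)
{
        init_rwsem(&adev->reset_sem);   /* same static __key for every adev */
}

/*
 * With the patch, every adev gets its own class, so holding device A's
 * reset_sem while taking device B's no longer looks recursive to lockdep:
 */
static void reset_sem_init_per_device_class(struct amdgpu_device *adev)
{
        init_rwsem(&adev->reset_sem);
        lockdep_set_class(&adev->reset_sem, &adev->lock_key);
}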

> What we should do instead is to make sure we have only a single lock for the complete hive instead.
> [Dennis Li] If we use a single lock, users will must wait for all devices resuming successfully, but in fact their tasks are only running in device a. It is not friendly to users.

Well that is intentional I would say. We can only restart submissions after all devices are resumed successfully, cause otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
[Dennis Li] Yes, you are right. The driver has to make sure that the shared resources (such as the shared memory) are ready before it unlocks the adevs one by one. But each device still has private hardware resources such as video and display.
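
Roughly what that staged release looks like -- restore_shared_resources()/restore_private_resources() are placeholders and the list fields are assumptions, this only illustrates the ordering:

static void hive_resume_devices(struct amdgpu_hive_info *hive)
{
        struct amdgpu_device *adev;

        /* every adev->reset_sem is still write-held at this point */
        restore_shared_resources(hive);         /* e.g. XGMI-visible memory */

        list_for_each_entry(adev, &hive->device_list, gmc.xgmi.head) {
                restore_private_resources(adev);        /* video, display, ... */
                up_write(&adev->reset_sem);     /* this device is usable again */
        }
}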

Regards,
Christian.

>
> Regards,
> Christian.
>
> Am 06.08.20 um 11:15 schrieb Li, Dennis:
>> [AMD Official Use Only - Internal Distribution Only]
>>
>> Hi, Christian,
>>        For this case, it is safe to use separated lock key. Please see the definition of init_rwsem as the below. Every init_rwsem calling will use a new static key, but devices of  the hive share the same code to do initialization, so their lock_class_key are the same. I think it is a limitation of init_rwsem.  In our case, it should be correct that reset_sem of each adev has different  lock_class_key. BTW, this change doesn't effect dead-lock detection and just correct it.
>>
>> #define init_rwsem(sem)					\
>> do {								\
>> 	static struct lock_class_key __key;			\
>> 								\
>> 	__init_rwsem((sem), #sem, &__key);			\
>> } while (0)
>>
>> Best Regards
>> Dennis Li
>> -----Original Message-----
>> From: Koenig, Christian <Christian.Koenig@amd.com>
>> Sent: Thursday, August 6, 2020 4:13 PM
>> To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org; 
>> Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix 
>> <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive 
>> locking
>>
>> Preventing locking problems during implementation is obviously a good approach, but lockdep has proven to be massively useful for finding and fixing problems.
>>
>> Disabling lockdep splat by annotating lock with separate classes is usually a no-go and only allowed if there is no other potential approach.
>>
>> In this case here we should really clean things up instead.
>>
>> Christian.
>>
>> Am 06.08.20 um 09:44 schrieb Li, Dennis:
>>> [AMD Official Use Only - Internal Distribution Only]
>>>
>>> Hi, Christian,
>>>          I agree with your concern. However we shouldn't rely on system to detect dead-lock, and should consider this when doing code implementation. In fact, dead-lock detection isn't enabled by default.
>>>          For your proposal to remove reset_sem into the hive structure, we can open a new topic to discuss it. Currently we couldn't make sure which is the best solution. For example, with your proposal, we must wait for all devices resuming successfully before resubmit an old task in one device, which will effect performance.
>>>
>>> Best Regards
>>> Dennis Li
>>> -----Original Message-----
>>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of 
>>> Christian König
>>> Sent: Thursday, August 6, 2020 3:08 PM
>>> To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org; 
>>> Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix 
>>> <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive 
>>> locking
>>>
>>> Am 06.08.20 um 09:02 schrieb Dennis Li:
>>>> [  584.110304] ============================================
>>>> [  584.110590] WARNING: possible recursive locking detected
>>>> [  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
>>>> [  584.111164] --------------------------------------------
>>>> [  584.111456] kworker/38:1/553 is trying to acquire lock:
>>>> [  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at:
>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.112112]
>>>>                    but task is already holding lock:
>>>> [  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at:
>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.113068]
>>>>                    other info that might help us debug this:
>>>> [  584.113689]  Possible unsafe locking scenario:
>>>>
>>>> [  584.114350]        CPU0
>>>> [  584.114685]        ----
>>>> [  584.115014]   lock(&adev->reset_sem);
>>>> [  584.115349]   lock(&adev->reset_sem);
>>>> [  584.115678]
>>>>                     *** DEADLOCK ***
>>>>
>>>> [  584.116624]  May be due to missing lock nesting notation
>>>>
>>>> [  584.117284] 4 locks held by kworker/38:1/553:
>>>> [  584.117616]  #0: ffff9ad635c1d348 ((wq_completion)events){+.+.},
>>>> at: process_one_work+0x21f/0x630 [  584.117967]  #1:
>>>> ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at:
>>>> process_one_work+0x21f/0x630 [  584.118358]  #2: ffffffffc1c2a5d0 
>>>> (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu] [  584.118786]  #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.119222]
>>>>                    stack backtrace:
>>>> [  584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G           OE     5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
>>>> [  584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, 
>>>> BIOS 3.1 05/23/2019 [  584.121223] Workqueue: events 
>>>> amdgpu_ras_do_recovery [amdgpu] [  584.121638] Call Trace:
>>>> [  584.122050]  dump_stack+0x98/0xd5 [  584.122499]
>>>> __lock_acquire+0x1139/0x16e0 [  584.122931]  ?
>>>> trace_hardirqs_on+0x3b/0xf0 [  584.123358]  ?
>>>> cancel_delayed_work+0xa6/0xc0 [  584.123771]
>>>> lock_acquire+0xb8/0x1c0 [  584.124197]  ?
>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.124599]  down_write+0x49/0x120 [  584.125032]  ?
>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.125472]
>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.125910]  ?
>>>> amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu] [  584.126367]
>>>> amdgpu_ras_do_recovery+0x159/0x190 [amdgpu] [  584.126789]
>>>> process_one_work+0x29e/0x630 [  584.127208]
>>>> worker_thread+0x3c/0x3f0 [  584.127621]  ?
>>>> __kthread_parkme+0x61/0x90 [  584.128014]
>>>> kthread+0x12f/0x150 [  584.128402]  ? process_one_work+0x630/0x630 
>>>> [
>>>> 584.128790]  ? kthread_park+0x90/0x90 [  584.129174]
>>>> ret_from_fork+0x3a/0x50
>>>>
>>>> Each adev has owned lock_class_key to avoid false positive 
>>>> recursive locking.
>>> NAK, that is not a false positive but a real problem.
>>>
>>> The issue here is that we have multiple reset semaphores, one for each device in the hive. If those are not acquired in the correct order we deadlock.
>>>
>>> The real solution would be to move the reset_sem into the hive structure and make sure that we lock it only once.
>>>
>>> Christian.
>>>
>>>> Signed-off-by: Dennis Li <Dennis.Li@amd.com>
>>>> Change-Id: I7571efeccbf15483982031d00504a353031a854a
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> index e97c088d03b3..766dc8f8c8a0 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>> @@ -967,6 +967,7 @@ struct amdgpu_device {
>>>>      	atomic_t                        in_gpu_reset;
>>>>      	enum pp_mp1_state               mp1_state;
>>>>      	struct rw_semaphore	reset_sem;
>>>> +	struct lock_class_key lock_key;
>>>>      	struct amdgpu_doorbell_index doorbell_index;
>>>>      
>>>>      	struct mutex			notifier_lock;
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> index 6c572db42d92..d78df9312d34 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>> @@ -3037,6 +3037,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>>      	mutex_init(&adev->virt.vf_errors.lock);
>>>>      	hash_init(adev->mn_hash);
>>>>      	init_rwsem(&adev->reset_sem);
>>>> +	lockdep_set_class(&adev->reset_sem, &adev->lock_key);
>>>>      	atomic_set(&adev->in_gpu_reset, 0);
>>>>      	mutex_init(&adev->psp.mutex);
>>>>      	mutex_init(&adev->notifier_lock);
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-06 11:38               ` Li, Dennis
@ 2020-08-06 13:16                 ` Christian König
  2020-08-07  2:22                   ` Li, Dennis
  0 siblings, 1 reply; 24+ messages in thread
From: Christian König @ 2020-08-06 13:16 UTC (permalink / raw)
  To: Li, Dennis, amd-gfx, Deucher, Alexander, Kuehling, Felix, Zhang, Hawking

Am 06.08.20 um 13:38 schrieb Li, Dennis:
> [AMD Official Use Only - Internal Distribution Only]
>
> [SNIP]
>>> I think it is a limitation of init_rwsem.
>> And exactly that's wrong, this is intentional and perfectly correct.
>>
>> [Dennis Li] I couldn't understand. Why is it a perfectly correct?
>> For example, we define two rw_sem: a and b. If we don't check init_rwsem definition, we may think case#1 and case#2 have the same behavior, but in fact they are different.
>>
>> Case 1:
>> init_rwsem(&a);
>> init_rwsem(&b);
>>
>> Case2:
>> void init_rwsem_ext(rw_sem* sem)
>> {
>>        init_rwsem(sem);
>> }
>> init_rwsem_ext(&a);
>> init_rwsem_ext(&b);
>>
>> As far as I know it is perfectly possible that the locks in the hive are not always grabbed in the same order. And that's why lockdep is complaining about this.
>> [Dennis Li] No. I takes a hive with two devices(a and b) to explain why lockdep complain.
>>
>> // Firstly driver lock the reset_sem of all devices
>> down_write(&a->reset_sem); do_suspend(a);
>> down_write(&b->reset_sem);   // Because  b shared the same lock_class_key with a, lockdep will take a and b as the same rw_sem and complain here.
>> do_suspend(b);
>>
>> // do recovery
>> do_hive_recovery()
>>
>> // unlock the reset_sem of all devices do_resume(a);
>> up_write(&a->reset_sem); do_resume(b); up_write(&b->reset_sem);
> Ah! Now I understand what you are working around. So the problem is the static lock_class_key in the macro?
> [Dennis Li] Yes. The author of init_rwsem might not consider our similar use case.
>
>> What we should do instead is to make sure we have only a single lock for the complete hive instead.
>> [Dennis Li] If we use a single lock, users will must wait for all devices resuming successfully, but in fact their tasks are only running in device a. It is not friendly to users.
> Well that is intentional I would say. We can only restart submissions after all devices are resumed successfully, cause otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
> [Dennis Li] Yes, you are right. Driver have make sure that the shared resources(such as the shard memory) are ready before unlock the lock of adev one by one. But each device still has private hardware resources such as video  and display.

Yeah, but those are negligible and we have a team working on display 
support for XGMI. So this will sooner or later become a problem as well.

I still think that a single rwsem for the whole hive is still the best 
option here.
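
For reference, a rough sketch of that direction (not an actual patch; do_suspend()/do_resume()/do_hive_recovery() are the placeholders from the pseudo code earlier in this thread, and the struct and field names are assumptions):

struct amdgpu_hive_info {
        struct rw_semaphore     reset_sem;      /* one lock for the whole hive */
        struct list_head        device_list;
        /* ... */
};

static void hive_gpu_recover(struct amdgpu_hive_info *hive)
{
        struct amdgpu_device *adev;

        down_write(&hive->reset_sem);           /* taken exactly once */

        list_for_each_entry(adev, &hive->device_list, gmc.xgmi.head)
                do_suspend(adev);

        do_hive_recovery();

        list_for_each_entry(adev, &hive->device_list, gmc.xgmi.head)
                do_resume(adev);

        up_write(&hive->reset_sem);
}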

Regards,
Christian.

>
> Regards,
> Christian.
>
>> Regards,
>> Christian.
>>
>> Am 06.08.20 um 11:15 schrieb Li, Dennis:
>>> [AMD Official Use Only - Internal Distribution Only]
>>>
>>> Hi, Christian,
>>>         For this case, it is safe to use separated lock key. Please see the definition of init_rwsem as the below. Every init_rwsem calling will use a new static key, but devices of  the hive share the same code to do initialization, so their lock_class_key are the same. I think it is a limitation of init_rwsem.  In our case, it should be correct that reset_sem of each adev has different  lock_class_key. BTW, this change doesn't effect dead-lock detection and just correct it.
>>>
>>> #define init_rwsem(sem)					\
>>> do {								\
>>> 	static struct lock_class_key __key;			\
>>> 								\
>>> 	__init_rwsem((sem), #sem, &__key);			\
>>> } while (0)
>>>
>>> Best Regards
>>> Dennis Li
>>> -----Original Message-----
>>> From: Koenig, Christian <Christian.Koenig@amd.com>
>>> Sent: Thursday, August 6, 2020 4:13 PM
>>> To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org;
>>> Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix
>>> <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive
>>> locking
>>>
>>> Preventing locking problems during implementation is obviously a good approach, but lockdep has proven to be massively useful for finding and fixing problems.
>>>
>>> Disabling lockdep splat by annotating lock with separate classes is usually a no-go and only allowed if there is no other potential approach.
>>>
>>> In this case here we should really clean things up instead.
>>>
>>> Christian.
>>>
>>> Am 06.08.20 um 09:44 schrieb Li, Dennis:
>>>> [AMD Official Use Only - Internal Distribution Only]
>>>>
>>>> Hi, Christian,
>>>>           I agree with your concern. However we shouldn't rely on system to detect dead-lock, and should consider this when doing code implementation. In fact, dead-lock detection isn't enabled by default.
>>>>           For your proposal to remove reset_sem into the hive structure, we can open a new topic to discuss it. Currently we couldn't make sure which is the best solution. For example, with your proposal, we must wait for all devices resuming successfully before resubmit an old task in one device, which will effect performance.
>>>>
>>>> Best Regards
>>>> Dennis Li
>>>> -----Original Message-----
>>>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of
>>>> Christian König
>>>> Sent: Thursday, August 6, 2020 3:08 PM
>>>> To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org;
>>>> Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix
>>>> <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
>>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive
>>>> locking
>>>>
>>>> Am 06.08.20 um 09:02 schrieb Dennis Li:
>>>>> [  584.110304] ============================================
>>>>> [  584.110590] WARNING: possible recursive locking detected
>>>>> [  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
>>>>> [  584.111164] --------------------------------------------
>>>>> [  584.111456] kworker/38:1/553 is trying to acquire lock:
>>>>> [  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at:
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.112112]
>>>>>                     but task is already holding lock:
>>>>> [  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at:
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.113068]
>>>>>                     other info that might help us debug this:
>>>>> [  584.113689]  Possible unsafe locking scenario:
>>>>>
>>>>> [  584.114350]        CPU0
>>>>> [  584.114685]        ----
>>>>> [  584.115014]   lock(&adev->reset_sem);
>>>>> [  584.115349]   lock(&adev->reset_sem);
>>>>> [  584.115678]
>>>>>                      *** DEADLOCK ***
>>>>>
>>>>> [  584.116624]  May be due to missing lock nesting notation
>>>>>
>>>>> [  584.117284] 4 locks held by kworker/38:1/553:
>>>>> [  584.117616]  #0: ffff9ad635c1d348 ((wq_completion)events){+.+.},
>>>>> at: process_one_work+0x21f/0x630 [  584.117967]  #1:
>>>>> ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at:
>>>>> process_one_work+0x21f/0x630 [  584.118358]  #2: ffffffffc1c2a5d0
>>>>> (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu] [  584.118786]  #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.119222]
>>>>>                     stack backtrace:
>>>>> [  584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G           OE     5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
>>>>> [  584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT,
>>>>> BIOS 3.1 05/23/2019 [  584.121223] Workqueue: events
>>>>> amdgpu_ras_do_recovery [amdgpu] [  584.121638] Call Trace:
>>>>> [  584.122050]  dump_stack+0x98/0xd5 [  584.122499]
>>>>> __lock_acquire+0x1139/0x16e0 [  584.122931]  ?
>>>>> trace_hardirqs_on+0x3b/0xf0 [  584.123358]  ?
>>>>> cancel_delayed_work+0xa6/0xc0 [  584.123771]
>>>>> lock_acquire+0xb8/0x1c0 [  584.124197]  ?
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.124599]  down_write+0x49/0x120 [  584.125032]  ?
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.125472]
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.125910]  ?
>>>>> amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu] [  584.126367]
>>>>> amdgpu_ras_do_recovery+0x159/0x190 [amdgpu] [  584.126789]
>>>>> process_one_work+0x29e/0x630 [  584.127208]
>>>>> worker_thread+0x3c/0x3f0 [  584.127621]  ?
>>>>> __kthread_parkme+0x61/0x90 [  584.128014]
>>>>> kthread+0x12f/0x150 [  584.128402]  ? process_one_work+0x630/0x630
>>>>> [
>>>>> 584.128790]  ? kthread_park+0x90/0x90 [  584.129174]
>>>>> ret_from_fork+0x3a/0x50
>>>>>
>>>>> Each adev has owned lock_class_key to avoid false positive
>>>>> recursive locking.
>>>> NAK, that is not a false positive but a real problem.
>>>>
>>>> The issue here is that we have multiple reset semaphores, one for each device in the hive. If those are not acquired in the correct order we deadlock.
>>>>
>>>> The real solution would be to move the reset_sem into the hive structure and make sure that we lock it only once.
>>>>
>>>> Christian.
>>>>
>>>>> Signed-off-by: Dennis Li <Dennis.Li@amd.com>
>>>>> Change-Id: I7571efeccbf15483982031d00504a353031a854a
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> index e97c088d03b3..766dc8f8c8a0 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> @@ -967,6 +967,7 @@ struct amdgpu_device {
>>>>>       	atomic_t                        in_gpu_reset;
>>>>>       	enum pp_mp1_state               mp1_state;
>>>>>       	struct rw_semaphore	reset_sem;
>>>>> +	struct lock_class_key lock_key;
>>>>>       	struct amdgpu_doorbell_index doorbell_index;
>>>>>       
>>>>>       	struct mutex			notifier_lock;
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index 6c572db42d92..d78df9312d34 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -3037,6 +3037,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>>>       	mutex_init(&adev->virt.vf_errors.lock);
>>>>>       	hash_init(adev->mn_hash);
>>>>>       	init_rwsem(&adev->reset_sem);
>>>>> +	lockdep_set_class(&adev->reset_sem, &adev->lock_key);
>>>>>       	atomic_set(&adev->in_gpu_reset, 0);
>>>>>       	mutex_init(&adev->psp.mutex);
>>>>>       	mutex_init(&adev->notifier_lock);

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-06 13:16                 ` Christian König
@ 2020-08-07  2:22                   ` Li, Dennis
  2020-08-07  6:57                     ` Christian König
  0 siblings, 1 reply; 24+ messages in thread
From: Li, Dennis @ 2020-08-07  2:22 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx, Deucher, Alexander, Kuehling, Felix,
	Zhang, Hawking

[AMD Public Use]

> [SNIP]
>>> I think it is a limitation of init_rwsem.
>> And exactly that's wrong, this is intentional and perfectly correct.
>>
>> [Dennis Li] I couldn't understand. Why is it a perfectly correct?
>> For example, we define two rw_sem: a and b. If we don't check init_rwsem definition, we may think case#1 and case#2 have the same behavior, but in fact they are different.
>>
>> Case 1:
>> init_rwsem(&a);
>> init_rwsem(&b);
>>
>> Case2:
>> void init_rwsem_ext(rw_sem* sem)
>> {
>>        init_rwsem(sem);
>> }
>> init_rwsem_ext(&a);
>> init_rwsem_ext(&b);
>>
>> As far as I know it is perfectly possible that the locks in the hive are not always grabbed in the same order. And that's why lockdep is complaining about this.
>> [Dennis Li] No. I takes a hive with two devices(a and b) to explain why lockdep complain.
>>
>> // Firstly driver lock the reset_sem of all devices 
>> down_write(&a->reset_sem); do_suspend(a);
>> down_write(&b->reset_sem);   // Because  b shared the same lock_class_key with a, lockdep will take a and b as the same rw_sem and complain here.
>> do_suspend(b);
>>
>> // do recovery
>> do_hive_recovery()
>>
>> // unlock the reset_sem of all devices do_resume(a); 
>> up_write(&a->reset_sem); do_resume(b); up_write(&b->reset_sem);
> Ah! Now I understand what you are working around. So the problem is the static lock_class_key in the macro?
> [Dennis Li] Yes. The author of init_rwsem might not consider our similar use case.
>
>> What we should do instead is to make sure we have only a single lock for the complete hive instead.
>> [Dennis Li] If we use a single lock, users will must wait for all devices resuming successfully, but in fact their tasks are only running in device a. It is not friendly to users.
> Well that is intentional I would say. We can only restart submissions after all devices are resumed successfully, cause otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
> [Dennis Li] Yes, you are right. Driver have make sure that the shared resources(such as the shard memory) are ready before unlock the lock of adev one by one. But each device still has private hardware resources such as video  and display.

Yeah, but those are negligible and we have a team working on display support for XGMI. So this will sooner or later become a problem as well.

I still think that a single rwsem for the whole hive is still the best option here.

[Dennis Li] For your above concern, we can open a new thread to discuss it. If we decide to go with a single hive-wide lock after that discussion, we can create another patch for it.

Best Regards
Dennis Li
>
> Regards,
> Christian.
>
>> Regards,
>> Christian.
>>
>> Am 06.08.20 um 11:15 schrieb Li, Dennis:
>>> [AMD Official Use Only - Internal Distribution Only]
>>>
>>> Hi, Christian,
>>>         For this case, it is safe to use separated lock key. Please see the definition of init_rwsem as the below. Every init_rwsem calling will use a new static key, but devices of  the hive share the same code to do initialization, so their lock_class_key are the same. I think it is a limitation of init_rwsem.  In our case, it should be correct that reset_sem of each adev has different  lock_class_key. BTW, this change doesn't effect dead-lock detection and just correct it.
>>>
>>> #define init_rwsem(sem)					\
>>> do {								\
>>> 	static struct lock_class_key __key;			\
>>> 								\
>>> 	__init_rwsem((sem), #sem, &__key);			\
>>> } while (0)
>>>
>>> Best Regards
>>> Dennis Li
>>> -----Original Message-----
>>> From: Koenig, Christian <Christian.Koenig@amd.com>
>>> Sent: Thursday, August 6, 2020 4:13 PM
>>> To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org; 
>>> Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix 
>>> <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive 
>>> locking
>>>
>>> Preventing locking problems during implementation is obviously a good approach, but lockdep has proven to be massively useful for finding and fixing problems.
>>>
>>> Disabling lockdep splat by annotating lock with separate classes is usually a no-go and only allowed if there is no other potential approach.
>>>
>>> In this case here we should really clean things up instead.
>>>
>>> Christian.
>>>
>>> Am 06.08.20 um 09:44 schrieb Li, Dennis:
>>>> [AMD Official Use Only - Internal Distribution Only]
>>>>
>>>> Hi, Christian,
>>>>           I agree with your concern. However we shouldn't rely on system to detect dead-lock, and should consider this when doing code implementation. In fact, dead-lock detection isn't enabled by default.
>>>>           For your proposal to remove reset_sem into the hive structure, we can open a new topic to discuss it. Currently we couldn't make sure which is the best solution. For example, with your proposal, we must wait for all devices resuming successfully before resubmit an old task in one device, which will effect performance.
>>>>
>>>> Best Regards
>>>> Dennis Li
>>>> -----Original Message-----
>>>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of 
>>>> Christian König
>>>> Sent: Thursday, August 6, 2020 3:08 PM
>>>> To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org; 
>>>> Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix 
>>>> <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
>>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive 
>>>> recursive locking
>>>>
>>>> Am 06.08.20 um 09:02 schrieb Dennis Li:
>>>>> [  584.110304] ============================================
>>>>> [  584.110590] WARNING: possible recursive locking detected
>>>>> [  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
>>>>> [  584.111164] --------------------------------------------
>>>>> [  584.111456] kworker/38:1/553 is trying to acquire lock:
>>>>> [  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at:
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.112112]
>>>>>                     but task is already holding lock:
>>>>> [  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at:
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.113068]
>>>>>                     other info that might help us debug this:
>>>>> [  584.113689]  Possible unsafe locking scenario:
>>>>>
>>>>> [  584.114350]        CPU0
>>>>> [  584.114685]        ----
>>>>> [  584.115014]   lock(&adev->reset_sem);
>>>>> [  584.115349]   lock(&adev->reset_sem);
>>>>> [  584.115678]
>>>>>                      *** DEADLOCK ***
>>>>>
>>>>> [  584.116624]  May be due to missing lock nesting notation
>>>>>
>>>>> [  584.117284] 4 locks held by kworker/38:1/553:
>>>>> [  584.117616]  #0: ffff9ad635c1d348 
>>>>> ((wq_completion)events){+.+.},
>>>>> at: process_one_work+0x21f/0x630 [  584.117967]  #1:
>>>>> ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at:
>>>>> process_one_work+0x21f/0x630 [  584.118358]  #2: ffffffffc1c2a5d0 
>>>>> (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu] [  584.118786]  #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.119222]
>>>>>                     stack backtrace:
>>>>> [  584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G           OE     5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
>>>>> [  584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, 
>>>>> BIOS 3.1 05/23/2019 [  584.121223] Workqueue: events 
>>>>> amdgpu_ras_do_recovery [amdgpu] [  584.121638] Call Trace:
>>>>> [  584.122050]  dump_stack+0x98/0xd5 [  584.122499]
>>>>> __lock_acquire+0x1139/0x16e0 [  584.122931]  ?
>>>>> trace_hardirqs_on+0x3b/0xf0 [  584.123358]  ?
>>>>> cancel_delayed_work+0xa6/0xc0 [  584.123771]
>>>>> lock_acquire+0xb8/0x1c0 [  584.124197]  ?
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.124599]  down_write+0x49/0x120 [  584.125032]  ?
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.125472]
>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.125910]  ?
>>>>> amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu] [  584.126367]
>>>>> amdgpu_ras_do_recovery+0x159/0x190 [amdgpu] [  584.126789]
>>>>> process_one_work+0x29e/0x630 [  584.127208]
>>>>> worker_thread+0x3c/0x3f0 [  584.127621]  ?
>>>>> __kthread_parkme+0x61/0x90 [  584.128014]
>>>>> kthread+0x12f/0x150 [  584.128402]  ? process_one_work+0x630/0x630 
>>>>> [
>>>>> 584.128790]  ? kthread_park+0x90/0x90 [  584.129174]
>>>>> ret_from_fork+0x3a/0x50
>>>>>
>>>>> Each adev has owned lock_class_key to avoid false positive 
>>>>> recursive locking.
>>>> NAK, that is not a false positive but a real problem.
>>>>
>>>> The issue here is that we have multiple reset semaphores, one for each device in the hive. If those are not acquired in the correct order we deadlock.
>>>>
>>>> The real solution would be to move the reset_sem into the hive structure and make sure that we lock it only once.
>>>>
>>>> Christian.
>>>>
>>>>> Signed-off-by: Dennis Li <Dennis.Li@amd.com>
>>>>> Change-Id: I7571efeccbf15483982031d00504a353031a854a
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> index e97c088d03b3..766dc8f8c8a0 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>> @@ -967,6 +967,7 @@ struct amdgpu_device {
>>>>>       	atomic_t                        in_gpu_reset;
>>>>>       	enum pp_mp1_state               mp1_state;
>>>>>       	struct rw_semaphore	reset_sem;
>>>>> +	struct lock_class_key lock_key;
>>>>>       	struct amdgpu_doorbell_index doorbell_index;
>>>>>       
>>>>>       	struct mutex			notifier_lock;
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index 6c572db42d92..d78df9312d34 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> @@ -3037,6 +3037,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>>>       	mutex_init(&adev->virt.vf_errors.lock);
>>>>>       	hash_init(adev->mn_hash);
>>>>>       	init_rwsem(&adev->reset_sem);
>>>>> +	lockdep_set_class(&adev->reset_sem, &adev->lock_key);
>>>>>       	atomic_set(&adev->in_gpu_reset, 0);
>>>>>       	mutex_init(&adev->psp.mutex);
>>>>>       	mutex_init(&adev->notifier_lock);
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-07  2:22                   ` Li, Dennis
@ 2020-08-07  6:57                     ` Christian König
  2020-08-07  7:08                       ` Felix Kuehling
  2020-08-07  7:38                       ` Li, Dennis
  0 siblings, 2 replies; 24+ messages in thread
From: Christian König @ 2020-08-07  6:57 UTC (permalink / raw)
  To: Li, Dennis, amd-gfx, Deucher, Alexander, Kuehling, Felix, Zhang, Hawking

Am 07.08.20 um 04:22 schrieb Li, Dennis:
> [AMD Public Use]
>
>> [SNIP]
>>>> I think it is a limitation of init_rwsem.
>>> And exactly that's wrong, this is intentional and perfectly correct.
>>>
>>> [Dennis Li] I couldn't understand. Why is it a perfectly correct?
>>> For example, we define two rw_sem: a and b. If we don't check init_rwsem definition, we may think case#1 and case#2 have the same behavior, but in fact they are different.
>>>
>>> Case 1:
>>> init_rwsem(&a);
>>> init_rwsem(&b);
>>>
>>> Case2:
>>> void init_rwsem_ext(rw_sem* sem)
>>> {
>>>         init_rwsem(sem);
>>> }
>>> init_rwsem_ext(&a);
>>> init_rwsem_ext(&b);
>>>
>>> As far as I know it is perfectly possible that the locks in the hive are not always grabbed in the same order. And that's why lockdep is complaining about this.
>>> [Dennis Li] No. I takes a hive with two devices(a and b) to explain why lockdep complain.
>>>
>>> // Firstly driver lock the reset_sem of all devices
>>> down_write(&a->reset_sem); do_suspend(a);
>>> down_write(&b->reset_sem);   // Because  b shared the same lock_class_key with a, lockdep will take a and b as the same rw_sem and complain here.
>>> do_suspend(b);
>>>
>>> // do recovery
>>> do_hive_recovery()
>>>
>>> // unlock the reset_sem of all devices do_resume(a);
>>> up_write(&a->reset_sem); do_resume(b); up_write(&b->reset_sem);
>> Ah! Now I understand what you are working around. So the problem is the static lock_class_key in the macro?
>> [Dennis Li] Yes. The author of init_rwsem might not consider our similar use case.

Well this is also really not the intended use case.

When you lock the same rwsem class recursively you can easily run into 
deadlocks if you don't keep the order the same all the time.

And abstracting init functions behind your own layer is a no-go in the 
Linux kernel as well.
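
For reference, the ordering hazard that the shared class catches looks like this (illustrative functions; a and b stand for two devices of one hive):

/* thread 1 */
static void recover_order_ab(struct amdgpu_device *a, struct amdgpu_device *b)
{
        down_write(&a->reset_sem);
        down_write(&b->reset_sem);      /* blocks if thread 2 already holds b */
        /* ... recovery ... */
        up_write(&b->reset_sem);
        up_write(&a->reset_sem);
}

/* thread 2, same two locks taken in the opposite order */
static void recover_order_ba(struct amdgpu_device *a, struct amdgpu_device *b)
{
        down_write(&b->reset_sem);
        down_write(&a->reset_sem);      /* blocks if thread 1 already holds a: ABBA deadlock */
        /* ... recovery ... */
        up_write(&a->reset_sem);
        up_write(&b->reset_sem);
}

With one class per device lockdep only catches this once both acquisition orders have actually been observed at runtime, while the shared class flags any nesting immediately.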

>>> What we should do instead is to make sure we have only a single lock for the complete hive instead.
>>> [Dennis Li] If we use a single lock, users will must wait for all devices resuming successfully, but in fact their tasks are only running in device a. It is not friendly to users.
>> Well that is intentional I would say. We can only restart submissions after all devices are resumed successfully, cause otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
>> [Dennis Li] Yes, you are right. Driver have make sure that the shared resources(such as the shard memory) are ready before unlock the lock of adev one by one. But each device still has private hardware resources such as video  and display.
> Yeah, but those are negligible and we have a team working on display support for XGMI. So this will sooner or later become a problem as well.
>
> I still think that a single rwsem for the whole hive is still the best option here.
>
> [Dennis Li] For your above concern, we can open a new thread to discuss it. If we make a decision to use a single after discussing, we can create another patch for it.

That's a really good argument, but I still hesitate to merge this patch. 
How severe is the lockdep splat?

At bare minimum we need a "/* TODO: ... */" comment explaining why we do this and how to remove the workaround again.
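
Something along these lines, if the workaround goes in at all (the code matches the patch hunk above; the TODO wording is only a suggestion):

        init_rwsem(&adev->reset_sem);
        /*
         * TODO: per-device lockdep classes are only a workaround for the
         * shared init_rwsem() call-site key.  Remove this (and lock_key)
         * once reset handling takes a single per-hive lock or the
         * cross-device locking order is annotated properly.
         */
        lockdep_set_class(&adev->reset_sem, &adev->lock_key);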

Regards,
Christian.

>
> Best Regards
> Dennis lI
>> Regards,
>> Christian.
>>
>>> Regards,
>>> Christian.
>>>
>>> Am 06.08.20 um 11:15 schrieb Li, Dennis:
>>>> [AMD Official Use Only - Internal Distribution Only]
>>>>
>>>> Hi, Christian,
>>>>          For this case, it is safe to use separated lock key. Please see the definition of init_rwsem as the below. Every init_rwsem calling will use a new static key, but devices of  the hive share the same code to do initialization, so their lock_class_key are the same. I think it is a limitation of init_rwsem.  In our case, it should be correct that reset_sem of each adev has different  lock_class_key. BTW, this change doesn't effect dead-lock detection and just correct it.
>>>>
>>>> #define init_rwsem(sem)					\
>>>> do {								\
>>>> 	static struct lock_class_key __key;			\
>>>> 								\
>>>> 	__init_rwsem((sem), #sem, &__key);			\
>>>> } while (0)
>>>>
>>>> Best Regards
>>>> Dennis Li
>>>> -----Original Message-----
>>>> From: Koenig, Christian <Christian.Koenig@amd.com>
>>>> Sent: Thursday, August 6, 2020 4:13 PM
>>>> To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org;
>>>> Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix
>>>> <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
>>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive
>>>> locking
>>>>
>>>> Preventing locking problems during implementation is obviously a good approach, but lockdep has proven to be massively useful for finding and fixing problems.
>>>>
>>>> Disabling lockdep splat by annotating lock with separate classes is usually a no-go and only allowed if there is no other potential approach.
>>>>
>>>> In this case here we should really clean things up instead.
>>>>
>>>> Christian.
>>>>
>>>> Am 06.08.20 um 09:44 schrieb Li, Dennis:
>>>>> [AMD Official Use Only - Internal Distribution Only]
>>>>>
>>>>> Hi, Christian,
>>>>>            I agree with your concern. However we shouldn't rely on system to detect dead-lock, and should consider this when doing code implementation. In fact, dead-lock detection isn't enabled by default.
>>>>>            For your proposal to remove reset_sem into the hive structure, we can open a new topic to discuss it. Currently we couldn't make sure which is the best solution. For example, with your proposal, we must wait for all devices resuming successfully before resubmit an old task in one device, which will effect performance.
>>>>>
>>>>> Best Regards
>>>>> Dennis Li
>>>>> -----Original Message-----
>>>>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of
>>>>> Christian König
>>>>> Sent: Thursday, August 6, 2020 3:08 PM
>>>>> To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org;
>>>>> Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix
>>>>> <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
>>>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive
>>>>> recursive locking
>>>>>
>>>>> Am 06.08.20 um 09:02 schrieb Dennis Li:
>>>>>> [  584.110304] ============================================
>>>>>> [  584.110590] WARNING: possible recursive locking detected
>>>>>> [  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
>>>>>> [  584.111164] --------------------------------------------
>>>>>> [  584.111456] kworker/38:1/553 is trying to acquire lock:
>>>>>> [  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at:
>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.112112]
>>>>>>                      but task is already holding lock:
>>>>>> [  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at:
>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.113068]
>>>>>>                      other info that might help us debug this:
>>>>>> [  584.113689]  Possible unsafe locking scenario:
>>>>>>
>>>>>> [  584.114350]        CPU0
>>>>>> [  584.114685]        ----
>>>>>> [  584.115014]   lock(&adev->reset_sem);
>>>>>> [  584.115349]   lock(&adev->reset_sem);
>>>>>> [  584.115678]
>>>>>>                       *** DEADLOCK ***
>>>>>>
>>>>>> [  584.116624]  May be due to missing lock nesting notation
>>>>>>
>>>>>> [  584.117284] 4 locks held by kworker/38:1/553:
>>>>>> [  584.117616]  #0: ffff9ad635c1d348
>>>>>> ((wq_completion)events){+.+.},
>>>>>> at: process_one_work+0x21f/0x630 [  584.117967]  #1:
>>>>>> ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at:
>>>>>> process_one_work+0x21f/0x630 [  584.118358]  #2: ffffffffc1c2a5d0
>>>>>> (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu] [  584.118786]  #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.119222]
>>>>>>                      stack backtrace:
>>>>>> [  584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G           OE     5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
>>>>>> [  584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT,
>>>>>> BIOS 3.1 05/23/2019 [  584.121223] Workqueue: events
>>>>>> amdgpu_ras_do_recovery [amdgpu] [  584.121638] Call Trace:
>>>>>> [  584.122050]  dump_stack+0x98/0xd5 [  584.122499]
>>>>>> __lock_acquire+0x1139/0x16e0 [  584.122931]  ?
>>>>>> trace_hardirqs_on+0x3b/0xf0 [  584.123358]  ?
>>>>>> cancel_delayed_work+0xa6/0xc0 [  584.123771]
>>>>>> lock_acquire+0xb8/0x1c0 [  584.124197]  ?
>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.124599]  down_write+0x49/0x120 [  584.125032]  ?
>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.125472]
>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.125910]  ?
>>>>>> amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu] [  584.126367]
>>>>>> amdgpu_ras_do_recovery+0x159/0x190 [amdgpu] [  584.126789]
>>>>>> process_one_work+0x29e/0x630 [  584.127208]
>>>>>> worker_thread+0x3c/0x3f0 [  584.127621]  ?
>>>>>> __kthread_parkme+0x61/0x90 [  584.128014]
>>>>>> kthread+0x12f/0x150 [  584.128402]  ? process_one_work+0x630/0x630
>>>>>> [
>>>>>> 584.128790]  ? kthread_park+0x90/0x90 [  584.129174]
>>>>>> ret_from_fork+0x3a/0x50
>>>>>>
>>>>>> Each adev has owned lock_class_key to avoid false positive
>>>>>> recursive locking.
>>>>> NAK, that is not a false positive but a real problem.
>>>>>
>>>>> The issue here is that we have multiple reset semaphores, one for each device in the hive. If those are not acquired in the correct order we deadlock.
>>>>>
>>>>> The real solution would be to move the reset_sem into the hive structure and make sure that we lock it only once.
>>>>>
>>>>> Christian.
>>>>>
>>>>>> Signed-off-by: Dennis Li <Dennis.Li@amd.com>
>>>>>> Change-Id: I7571efeccbf15483982031d00504a353031a854a
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> index e97c088d03b3..766dc8f8c8a0 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> @@ -967,6 +967,7 @@ struct amdgpu_device {
>>>>>>        	atomic_t                        in_gpu_reset;
>>>>>>        	enum pp_mp1_state               mp1_state;
>>>>>>        	struct rw_semaphore	reset_sem;
>>>>>> +	struct lock_class_key lock_key;
>>>>>>        	struct amdgpu_doorbell_index doorbell_index;
>>>>>>        
>>>>>>        	struct mutex			notifier_lock;
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> index 6c572db42d92..d78df9312d34 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> @@ -3037,6 +3037,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>>>>        	mutex_init(&adev->virt.vf_errors.lock);
>>>>>>        	hash_init(adev->mn_hash);
>>>>>>        	init_rwsem(&adev->reset_sem);
>>>>>> +	lockdep_set_class(&adev->reset_sem, &adev->lock_key);
>>>>>>        	atomic_set(&adev->in_gpu_reset, 0);
>>>>>>        	mutex_init(&adev->psp.mutex);
>>>>>>        	mutex_init(&adev->notifier_lock);

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-07  6:57                     ` Christian König
@ 2020-08-07  7:08                       ` Felix Kuehling
  2020-08-07  7:38                       ` Li, Dennis
  1 sibling, 0 replies; 24+ messages in thread
From: Felix Kuehling @ 2020-08-07  7:08 UTC (permalink / raw)
  To: Christian König, Li, Dennis, amd-gfx, Deucher, Alexander,
	Zhang, Hawking


Am 2020-08-07 um 2:57 a.m. schrieb Christian König:
[snip]
> That's a really good argument, but I still hesitate to merge this
> patch. How severe is the lockdep splat?

I argued before that any lockdep splat is bad, because it disables
further lockdep checking and can hide other lockdep problems down the
road. The longer you live with it, the more potential locking problems
you can introduce unknowingly, and you'll then have to work through them
one at a time as you uncover them later.

Regards,
  Felix


>
> At bare minimum we need a "/* TODO: ...." comment why we do this and
> how to remove the workaround again.
>
> Regards,
> Christian.
>
>>
>> Best Regards
>> Dennis lI
>>> Regards,
>>> Christian.
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 06.08.20 um 11:15 schrieb Li, Dennis:
>>>>> [AMD Official Use Only - Internal Distribution Only]
>>>>>
>>>>> Hi, Christian,
>>>>>          For this case, it is safe to use separated lock key.
>>>>> Please see the definition of init_rwsem as the below. Every
>>>>> init_rwsem calling will use a new static key, but devices of  the
>>>>> hive share the same code to do initialization, so their
>>>>> lock_class_key are the same. I think it is a limitation of
>>>>> init_rwsem.  In our case, it should be correct that reset_sem of
>>>>> each adev has different  lock_class_key. BTW, this change doesn't
>>>>> effect dead-lock detection and just correct it.
>>>>>
>>>>> #define init_rwsem(sem)                    \
>>>>> do {                                \
>>>>>     static struct lock_class_key __key;            \
>>>>>                                 \
>>>>>     __init_rwsem((sem), #sem, &__key);            \
>>>>> } while (0)
>>>>>
>>>>> Best Regards
>>>>> Dennis Li
>>>>> -----Original Message-----
>>>>> From: Koenig, Christian <Christian.Koenig@amd.com>
>>>>> Sent: Thursday, August 6, 2020 4:13 PM
>>>>> To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org;
>>>>> Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix
>>>>> <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
>>>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive
>>>>> locking
>>>>>
>>>>> Preventing locking problems during implementation is obviously a
>>>>> good approach, but lockdep has proven to be massively useful for
>>>>> finding and fixing problems.
>>>>>
>>>>> Disabling lockdep splat by annotating lock with separate classes
>>>>> is usually a no-go and only allowed if there is no other potential
>>>>> approach.
>>>>>
>>>>> In this case here we should really clean things up instead.
>>>>>
>>>>> Christian.
>>>>>
>>>>> Am 06.08.20 um 09:44 schrieb Li, Dennis:
>>>>>> [AMD Official Use Only - Internal Distribution Only]
>>>>>>
>>>>>> Hi, Christian,
>>>>>>            I agree with your concern. However we shouldn't rely
>>>>>> on system to detect dead-lock, and should consider this when
>>>>>> doing code implementation. In fact, dead-lock detection isn't
>>>>>> enabled by default.
>>>>>>            For your proposal to remove reset_sem into the hive
>>>>>> structure, we can open a new topic to discuss it. Currently we
>>>>>> couldn't make sure which is the best solution. For example, with
>>>>>> your proposal, we must wait for all devices resuming successfully
>>>>>> before resubmit an old task in one device, which will effect
>>>>>> performance.
>>>>>>
>>>>>> Best Regards
>>>>>> Dennis Li
>>>>>> -----Original Message-----
>>>>>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of
>>>>>> Christian König
>>>>>> Sent: Thursday, August 6, 2020 3:08 PM
>>>>>> To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org;
>>>>>> Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix
>>>>>> <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
>>>>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive
>>>>>> recursive locking
>>>>>>
>>>>>> Am 06.08.20 um 09:02 schrieb Dennis Li:
>>>>>>> [  584.110304] ============================================
>>>>>>> [  584.110590] WARNING: possible recursive locking detected
>>>>>>> [  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted:
>>>>>>> G           OE
>>>>>>> [  584.111164] --------------------------------------------
>>>>>>> [  584.111456] kworker/38:1/553 is trying to acquire lock:
>>>>>>> [  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at:
>>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.112112]
>>>>>>>                      but task is already holding lock:
>>>>>>> [  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at:
>>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.113068]
>>>>>>>                      other info that might help us debug this:
>>>>>>> [  584.113689]  Possible unsafe locking scenario:
>>>>>>>
>>>>>>> [  584.114350]        CPU0
>>>>>>> [  584.114685]        ----
>>>>>>> [  584.115014]   lock(&adev->reset_sem);
>>>>>>> [  584.115349]   lock(&adev->reset_sem);
>>>>>>> [  584.115678]
>>>>>>>                       *** DEADLOCK ***
>>>>>>>
>>>>>>> [  584.116624]  May be due to missing lock nesting notation
>>>>>>>
>>>>>>> [  584.117284] 4 locks held by kworker/38:1/553:
>>>>>>> [  584.117616]  #0: ffff9ad635c1d348
>>>>>>> ((wq_completion)events){+.+.},
>>>>>>> at: process_one_work+0x21f/0x630 [  584.117967]  #1:
>>>>>>> ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.},
>>>>>>> at:
>>>>>>> process_one_work+0x21f/0x630 [  584.118358]  #2: ffffffffc1c2a5d0
>>>>>>> (&tmp->hive_lock){+.+.}, at:
>>>>>>> amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu] [  584.118786] 
>>>>>>> #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at:
>>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.119222]
>>>>>>>                      stack backtrace:
>>>>>>> [  584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded
>>>>>>> Tainted: G           OE     5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
>>>>>>> [  584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT,
>>>>>>> BIOS 3.1 05/23/2019 [  584.121223] Workqueue: events
>>>>>>> amdgpu_ras_do_recovery [amdgpu] [  584.121638] Call Trace:
>>>>>>> [  584.122050]  dump_stack+0x98/0xd5 [  584.122499]
>>>>>>> __lock_acquire+0x1139/0x16e0 [  584.122931]  ?
>>>>>>> trace_hardirqs_on+0x3b/0xf0 [  584.123358]  ?
>>>>>>> cancel_delayed_work+0xa6/0xc0 [  584.123771]
>>>>>>> lock_acquire+0xb8/0x1c0 [  584.124197]  ?
>>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.124599] 
>>>>>>> down_write+0x49/0x120 [  584.125032]  ?
>>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.125472]
>>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.125910]  ?
>>>>>>> amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu] [  584.126367]
>>>>>>> amdgpu_ras_do_recovery+0x159/0x190 [amdgpu] [  584.126789]
>>>>>>> process_one_work+0x29e/0x630 [  584.127208]
>>>>>>> worker_thread+0x3c/0x3f0 [  584.127621]  ?
>>>>>>> __kthread_parkme+0x61/0x90 [  584.128014]
>>>>>>> kthread+0x12f/0x150 [  584.128402]  ? process_one_work+0x630/0x630
>>>>>>> [
>>>>>>> 584.128790]  ? kthread_park+0x90/0x90 [  584.129174]
>>>>>>> ret_from_fork+0x3a/0x50
>>>>>>>
>>>>>>> Each adev has owned lock_class_key to avoid false positive
>>>>>>> recursive locking.
>>>>>> NAK, that is not a false positive but a real problem.
>>>>>>
>>>>>> The issue here is that we have multiple reset semaphores, one for
>>>>>> each device in the hive. If those are not acquired in the correct
>>>>>> order we deadlock.
>>>>>>
>>>>>> The real solution would be to move the reset_sem into the hive
>>>>>> structure and make sure that we lock it only once.
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>>>> Signed-off-by: Dennis Li <Dennis.Li@amd.com>
>>>>>>> Change-Id: I7571efeccbf15483982031d00504a353031a854a
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>> index e97c088d03b3..766dc8f8c8a0 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>>> @@ -967,6 +967,7 @@ struct amdgpu_device {
>>>>>>>            atomic_t                        in_gpu_reset;
>>>>>>>            enum pp_mp1_state               mp1_state;
>>>>>>>            struct rw_semaphore    reset_sem;
>>>>>>> +    struct lock_class_key lock_key;
>>>>>>>            struct amdgpu_doorbell_index doorbell_index;
>>>>>>>                   struct mutex            notifier_lock;
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> index 6c572db42d92..d78df9312d34 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>>> @@ -3037,6 +3037,7 @@ int amdgpu_device_init(struct
>>>>>>> amdgpu_device *adev,
>>>>>>>            mutex_init(&adev->virt.vf_errors.lock);
>>>>>>>            hash_init(adev->mn_hash);
>>>>>>>            init_rwsem(&adev->reset_sem);
>>>>>>> +    lockdep_set_class(&adev->reset_sem, &adev->lock_key);
>>>>>>>            atomic_set(&adev->in_gpu_reset, 0);
>>>>>>>            mutex_init(&adev->psp.mutex);
>>>>>>>            mutex_init(&adev->notifier_lock);
>>>>>> _______________________________________________
>>>>>> amd-gfx mailing list
>>>>>> amd-gfx@lists.freedesktop.org
>>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-07  6:57                     ` Christian König
  2020-08-07  7:08                       ` Felix Kuehling
@ 2020-08-07  7:38                       ` Li, Dennis
  2020-08-07  9:08                         ` Christian König
  1 sibling, 1 reply; 24+ messages in thread
From: Li, Dennis @ 2020-08-07  7:38 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx, Deucher, Alexander, Kuehling, Felix,
	Zhang, Hawking

[AMD Public Use]

> [AMD Public Use]
>
>> [SNIP]
>>>> I think it is a limitation of init_rwsem.
>>> And exactly that's wrong, this is intentional and perfectly correct.
>>>
>>> [Dennis Li] I couldn't understand. Why is it a perfectly correct?
>>> For example, we define two rw_sem: a and b. If we don't check init_rwsem definition, we may think case#1 and case#2 have the same behavior, but in fact they are different.
>>>
>>> Case 1:
>>> init_rwsem(&a);
>>> init_rwsem(&b);
>>>
>>> Case2:
>>> void init_rwsem_ext(rw_sem* sem)
>>> {
>>>         init_rwsem(sem);
>>> }
>>> init_rwsem_ext(&a);
>>> init_rwsem_ext(&b);
>>>
>>> As far as I know it is perfectly possible that the locks in the hive are not always grabbed in the same order. And that's why lockdep is complaining about this.
>>> [Dennis Li] No. I takes a hive with two devices(a and b) to explain why lockdep complain.
>>>
>>> // Firstly driver lock the reset_sem of all devices 
>>> down_write(&a->reset_sem); do_suspend(a);
>>> down_write(&b->reset_sem);   // Because  b shared the same lock_class_key with a, lockdep will take a and b as the same rw_sem and complain here.
>>> do_suspend(b);
>>>
>>> // do recovery
>>> do_hive_recovery()
>>>
>>> // unlock the reset_sem of all devices do_resume(a); 
>>> up_write(&a->reset_sem); do_resume(b); up_write(&b->reset_sem);
>> Ah! Now I understand what you are working around. So the problem is the static lock_class_key in the macro?
>> [Dennis Li] Yes. The author of init_rwsem might not consider our similar use case.

Well this is also really not the intended use case.

When you lock the same rwsem class recursively you can easily run into deadlocks if you don't keep the order the same all the time.
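
To make the ordering hazard concrete, here is a minimal sketch (made-up names, not the amdgpu code) of two rwsems that share one class because they are initialized through a single init_rwsem() call site, taken in opposite order by two threads; this is exactly the ABBA pattern the recursive-locking check guards against:

    #include <linux/rwsem.h>

    struct dev_ctx {
            struct rw_semaphore reset_sem;
    };

    static struct dev_ctx dev_a, dev_b;

    /* One init call site => one static lockdep key shared by both rwsems. */
    static void dev_ctx_init(struct dev_ctx *c)
    {
            init_rwsem(&c->reset_sem);
    }

    /* Thread 1 locks A then B ... */
    static void thread1(void)
    {
            down_write(&dev_a.reset_sem);
            down_write(&dev_b.reset_sem);   /* lockdep: a lock of this class is already held */
            /* ... */
            up_write(&dev_b.reset_sem);
            up_write(&dev_a.reset_sem);
    }

    /* ... while thread 2 locks B then A: classic ABBA deadlock. */
    static void thread2(void)
    {
            down_write(&dev_b.reset_sem);
            down_write(&dev_a.reset_sem);
            /* ... */
            up_write(&dev_a.reset_sem);
            up_write(&dev_b.reset_sem);
    }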

And abstracting init functions behind your own layer is a no-go in the Linux kernel as well.

>>> What we should do instead is to make sure we have only a single lock for the complete hive instead.
>>> [Dennis Li] If we use a single lock, users will must wait for all devices resuming successfully, but in fact their tasks are only running in device a. It is not friendly to users.
>> Well that is intentional I would say. We can only restart submissions after all devices are resumed successfully, cause otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
>> [Dennis Li] Yes, you are right. Driver have make sure that the shared resources(such as the shard memory) are ready before unlock the lock of adev one by one. But each device still has private hardware resources such as video  and display.
> Yeah, but those are negligible and we have a team working on display support for XGMI. So this will sooner or later become a problem as well.
>
> I still think that a single rwsem for the whole hive is still the best option here.
>
> [Dennis Li] For your above concern, we can open a new thread to discuss it. If we make a decision to use a single after discussing, we can create another patch for it.

That's a really good argument, but I still hesitate to merge this patch. 
How severe is the lockdep splat?

At bare minimum we need a "/* TODO: ...." comment why we do this and how to remove the workaround again.
[Dennis Li] It is not a workaround. According to the design of lockdep, each rw_semaphore should have a separate lock_class_key. I have explained that init_rwsem has the above limitation, so we must correct it. The core network driver (net/core/dev.c) has a similar use case to ours.
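
For illustration, a minimal sketch of the per-instance key pattern being argued for here (generic names, not the actual amdgpu or net/core code; the keys are kept in static storage, which lockdep accepts without further registration):

    #include <linux/lockdep.h>
    #include <linux/rwsem.h>

    #define MY_MAX_DEVS 8

    struct my_dev {
            struct rw_semaphore reset_sem;
    };

    /* One lockdep key per device instance. */
    static struct lock_class_key my_reset_sem_keys[MY_MAX_DEVS];

    static void my_dev_init(struct my_dev *d, int idx)
    {
            /* init_rwsem() registers the single static key of its call site... */
            init_rwsem(&d->reset_sem);
            /* ...so re-register this instance with its own key, giving every
             * device's reset_sem a distinct lockdep class. */
            lockdep_set_class(&d->reset_sem, &my_reset_sem_keys[idx]);
    }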

Regards,
Christian.

>
> Best Regards
> Dennis lI
>> Regards,
>> Christian.
>>
>>> Regards,
>>> Christian.
>>>
>>> Am 06.08.20 um 11:15 schrieb Li, Dennis:
>>>> [AMD Official Use Only - Internal Distribution Only]
>>>>
>>>> Hi, Christian,
>>>>          For this case, it is safe to use separated lock key. Please see the definition of init_rwsem as the below. Every init_rwsem calling will use a new static key, but devices of  the hive share the same code to do initialization, so their lock_class_key are the same. I think it is a limitation of init_rwsem.  In our case, it should be correct that reset_sem of each adev has different  lock_class_key. BTW, this change doesn't effect dead-lock detection and just correct it.
>>>>
>>>> #define init_rwsem(sem)					\
>>>> do {								\
>>>> 	static struct lock_class_key __key;			\
>>>> 								\
>>>> 	__init_rwsem((sem), #sem, &__key);			\
>>>> } while (0)
>>>>
>>>> Best Regards
>>>> Dennis Li
>>>> -----Original Message-----
>>>> From: Koenig, Christian <Christian.Koenig@amd.com>
>>>> Sent: Thursday, August 6, 2020 4:13 PM
>>>> To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org; 
>>>> Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix 
>>>> <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
>>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive 
>>>> recursive locking
>>>>
>>>> Preventing locking problems during implementation is obviously a good approach, but lockdep has proven to be massively useful for finding and fixing problems.
>>>>
>>>> Disabling lockdep splat by annotating lock with separate classes is usually a no-go and only allowed if there is no other potential approach.
>>>>
>>>> In this case here we should really clean things up instead.
>>>>
>>>> Christian.
>>>>
>>>> Am 06.08.20 um 09:44 schrieb Li, Dennis:
>>>>> [AMD Official Use Only - Internal Distribution Only]
>>>>>
>>>>> Hi, Christian,
>>>>>            I agree with your concern. However we shouldn't rely on system to detect dead-lock, and should consider this when doing code implementation. In fact, dead-lock detection isn't enabled by default.
>>>>>            For your proposal to remove reset_sem into the hive structure, we can open a new topic to discuss it. Currently we couldn't make sure which is the best solution. For example, with your proposal, we must wait for all devices resuming successfully before resubmit an old task in one device, which will effect performance.
>>>>>
>>>>> Best Regards
>>>>> Dennis Li
>>>>> -----Original Message-----
>>>>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of 
>>>>> Christian König
>>>>> Sent: Thursday, August 6, 2020 3:08 PM
>>>>> To: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org; 
>>>>> Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix 
>>>>> <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
>>>>> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive 
>>>>> recursive locking
>>>>>
>>>>> Am 06.08.20 um 09:02 schrieb Dennis Li:
>>>>>> [  584.110304] ============================================
>>>>>> [  584.110590] WARNING: possible recursive locking detected
>>>>>> [  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
>>>>>> [  584.111164] --------------------------------------------
>>>>>> [  584.111456] kworker/38:1/553 is trying to acquire lock:
>>>>>> [  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at:
>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.112112]
>>>>>>                      but task is already holding lock:
>>>>>> [  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at:
>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.113068]
>>>>>>                      other info that might help us debug this:
>>>>>> [  584.113689]  Possible unsafe locking scenario:
>>>>>>
>>>>>> [  584.114350]        CPU0
>>>>>> [  584.114685]        ----
>>>>>> [  584.115014]   lock(&adev->reset_sem);
>>>>>> [  584.115349]   lock(&adev->reset_sem);
>>>>>> [  584.115678]
>>>>>>                       *** DEADLOCK ***
>>>>>>
>>>>>> [  584.116624]  May be due to missing lock nesting notation
>>>>>>
>>>>>> [  584.117284] 4 locks held by kworker/38:1/553:
>>>>>> [  584.117616]  #0: ffff9ad635c1d348 
>>>>>> ((wq_completion)events){+.+.},
>>>>>> at: process_one_work+0x21f/0x630 [  584.117967]  #1:
>>>>>> ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at:
>>>>>> process_one_work+0x21f/0x630 [  584.118358]  #2: ffffffffc1c2a5d0 
>>>>>> (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu] [  584.118786]  #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.119222]
>>>>>>                      stack backtrace:
>>>>>> [  584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G           OE     5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
>>>>>> [  584.120782] Hardware name: Supermicro 
>>>>>> SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [  584.121223] 
>>>>>> Workqueue: events amdgpu_ras_do_recovery [amdgpu] [  584.121638] Call Trace:
>>>>>> [  584.122050]  dump_stack+0x98/0xd5 [  584.122499]
>>>>>> __lock_acquire+0x1139/0x16e0 [  584.122931]  ?
>>>>>> trace_hardirqs_on+0x3b/0xf0 [  584.123358]  ?
>>>>>> cancel_delayed_work+0xa6/0xc0 [  584.123771]
>>>>>> lock_acquire+0xb8/0x1c0 [  584.124197]  ?
>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.124599]  down_write+0x49/0x120 [  584.125032]  ?
>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.125472]
>>>>>> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.125910]  ?
>>>>>> amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu] [  584.126367]
>>>>>> amdgpu_ras_do_recovery+0x159/0x190 [amdgpu] [  584.126789]
>>>>>> process_one_work+0x29e/0x630 [  584.127208]
>>>>>> worker_thread+0x3c/0x3f0 [  584.127621]  ?
>>>>>> __kthread_parkme+0x61/0x90 [  584.128014]
>>>>>> kthread+0x12f/0x150 [  584.128402]  ? 
>>>>>> process_one_work+0x630/0x630 [
>>>>>> 584.128790]  ? kthread_park+0x90/0x90 [  584.129174]
>>>>>> ret_from_fork+0x3a/0x50
>>>>>>
>>>>>> Each adev has owned lock_class_key to avoid false positive 
>>>>>> recursive locking.
>>>>> NAK, that is not a false positive but a real problem.
>>>>>
>>>>> The issue here is that we have multiple reset semaphores, one for each device in the hive. If those are not acquired in the correct order we deadlock.
>>>>>
>>>>> The real solution would be to move the reset_sem into the hive structure and make sure that we lock it only once.
>>>>>
>>>>> Christian.
>>>>>
>>>>>> Signed-off-by: Dennis Li <Dennis.Li@amd.com>
>>>>>> Change-Id: I7571efeccbf15483982031d00504a353031a854a
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> index e97c088d03b3..766dc8f8c8a0 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>>>>> @@ -967,6 +967,7 @@ struct amdgpu_device {
>>>>>>        	atomic_t                        in_gpu_reset;
>>>>>>        	enum pp_mp1_state               mp1_state;
>>>>>>        	struct rw_semaphore	reset_sem;
>>>>>> +	struct lock_class_key lock_key;
>>>>>>        	struct amdgpu_doorbell_index doorbell_index;
>>>>>>        
>>>>>>        	struct mutex			notifier_lock;
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> index 6c572db42d92..d78df9312d34 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>>> @@ -3037,6 +3037,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>>>>>>        	mutex_init(&adev->virt.vf_errors.lock);
>>>>>>        	hash_init(adev->mn_hash);
>>>>>>        	init_rwsem(&adev->reset_sem);
>>>>>> +	lockdep_set_class(&adev->reset_sem, &adev->lock_key);
>>>>>>        	atomic_set(&adev->in_gpu_reset, 0);
>>>>>>        	mutex_init(&adev->psp.mutex);
>>>>>>        	mutex_init(&adev->notifier_lock);
>>>>> _______________________________________________
>>>>> amd-gfx mailing list
>>>>> amd-gfx@lists.freedesktop.org
>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-07  7:38                       ` Li, Dennis
@ 2020-08-07  9:08                         ` Christian König
  2020-08-07  9:20                           ` Daniel Vetter
  0 siblings, 1 reply; 24+ messages in thread
From: Christian König @ 2020-08-07  9:08 UTC (permalink / raw)
  To: Li, Dennis, amd-gfx, Deucher, Alexander, Kuehling, Felix, Zhang,
	Hawking, Daniel Vetter

[SNIP]
>>>> What we should do instead is to make sure we have only a single lock for the complete hive instead.
>>>> [Dennis Li] If we use a single lock, users will must wait for all devices resuming successfully, but in fact their tasks are only running in device a. It is not friendly to users.
>>> Well that is intentional I would say. We can only restart submissions after all devices are resumed successfully, cause otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
>>> [Dennis Li] Yes, you are right. Driver have make sure that the shared resources(such as the shard memory) are ready before unlock the lock of adev one by one. But each device still has private hardware resources such as video  and display.
>> Yeah, but those are negligible and we have a team working on display support for XGMI. So this will sooner or later become a problem as well.
>>
>> I still think that a single rwsem for the whole hive is still the best option here.
>>
>> [Dennis Li] For your above concern, we can open a new thread to discuss it. If we make a decision to use a single after discussing, we can create another patch for it.
> That's a really good argument, but I still hesitate to merge this patch.
> How severe is the lockdep splat?
>
> At bare minimum we need a "/* TODO: ...." comment why we do this and how to remove the workaround again.
> [Dennis Li] It is not a workaround. According to design of lockdep, each rw_semaphore should has a separated lock_class_key. I have explained that init_rwsem has the above limitation, so we must correct it.

Yeah, but this explanation is not really sufficient. Again this is not 
a limitation of init_rwsem, but intentional behavior.

In other words in most cases lockdep should splat if you use rw 
semaphores like this.

That we avoid this by always locking multiple amdgpu devices in the same 
order is rather questionable driver design.

>   Core network driver (net/core/dev.c) has the similar use case with us.

Just took a look at that, but this is a completely different use case. The 
lockdep classes are grouped by driver type to note the difference in the 
network stack.

A quick search showed that no other device driver is doing this in the 
kernel, so I'm not sure if such a behavior wouldn't be rejected. Adding 
Daniel to comment on this as well.

Felix had some really good arguments to make that an urgent fix, so 
either we come up with some real rework of this or at least add a big 
"/* TODO....*/".

Regards,
Christian.

>
> Regards,
> Christian.
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-07  9:08                         ` Christian König
@ 2020-08-07  9:20                           ` Daniel Vetter
  2020-08-07  9:23                             ` Daniel Vetter
  0 siblings, 1 reply; 24+ messages in thread
From: Daniel Vetter @ 2020-08-07  9:20 UTC (permalink / raw)
  To: Christian König
  Cc: Deucher, Alexander, Kuehling, Felix, amd-gfx, Li, Dennis, Zhang, Hawking

On Fri, Aug 7, 2020 at 11:08 AM Christian König
<christian.koenig@amd.com> wrote:
>
> [SNIP]
> >>>> What we should do instead is to make sure we have only a single lock for the complete hive instead.
> >>>> [Dennis Li] If we use a single lock, users will must wait for all devices resuming successfully, but in fact their tasks are only running in device a. It is not friendly to users.
> >>> Well that is intentional I would say. We can only restart submissions after all devices are resumed successfully, cause otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
> >>> [Dennis Li] Yes, you are right. Driver have make sure that the shared resources(such as the shard memory) are ready before unlock the lock of adev one by one. But each device still has private hardware resources such as video  and display.
> >> Yeah, but those are negligible and we have a team working on display support for XGMI. So this will sooner or later become a problem as well.
> >>
> >> I still think that a single rwsem for the whole hive is still the best option here.
> >>
> >> [Dennis Li] For your above concern, we can open a new thread to discuss it. If we make a decision to use a single after discussing, we can create another patch for it.
> > That's a really good argument, but I still hesitate to merge this patch.
> > How severe is the lockdep splat?
> >
> > At bare minimum we need a "/* TODO: ...." comment why we do this and how to remove the workaround again.
> > [Dennis Li] It is not a workaround. According to design of lockdep, each rw_semaphore should has a separated lock_class_key. I have explained that init_rwsem has the above limitation, so we must correct it.
>
> Yeah, but this explanation is not really sufficient. Again this is not
> an limitation of init_rwsem, but intentional behavior.
>
> In other words in most cases lockdep should splat if you use rw
> semaphores like this.
>
> That we avoid this by locking multiple amdgpu device always in the same
> order is rather questionable driver design.

Yeah, don't give multiple locking classes like that, that's fairly
terrible. The right fix is mutex_lock_nest_lock, which we're using in
ww_mutex and which is also used in mm_take_all_locks.

If you solve the locking ordering by sorting all the locks you need
(ugh), then you can also just use a fake lockdep_map for your special
function only.

Also in general the idea of an rwsem in conjunction with xgmi cross
device memory handling just plain freaks me out :-) Please make sure
you prime this lock against dma_resv and dma_fence (and maybe also
drm_modeset_lock if this lock is outside of dma_resv), and ideally
this should only ever be used for setup stuff when loading drivers. I
guess that's why you went with a per-device rwsem. If it's really only
for setup then just do the simple thing of having an
xgmi_hive_reconfigure lock, which all the rwsems nest within, and no
fancy lock ordering or other tricks.

> >   Core network driver (net/core/dev.c) has the similar use case with us.
>
> Just took a look at that, but this is completely different use case. The
> lockdep classes are grouped by driver type to note the difference in the
> network stack.
>
> A quick search showed that no other device driver is doing this in the
> kernel, so I'm not sure if such a behavior wouldn't be rejected. Adding
> Daniel to comment on this as well.
>
> Felix had some really good arguments to make that an urgent fix, so
> either we come up with some real rework of this or at least add a big
> "/* TODO....*/".

Aside from "I have questions" I don't think there's any reason to not
fix this correctly with mutex_lock_nest_lock. Really shouldn't be a
semantic change from what I understand here.

Cheers, Daniel

>
> Regards,
> Christian.
>
> >
> > Regards,
> > Christian.
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-07  9:20                           ` Daniel Vetter
@ 2020-08-07  9:23                             ` Daniel Vetter
  2020-08-07  9:34                               ` Christian König
  0 siblings, 1 reply; 24+ messages in thread
From: Daniel Vetter @ 2020-08-07  9:23 UTC (permalink / raw)
  To: Christian König
  Cc: Deucher, Alexander, Kuehling, Felix, amd-gfx, Li, Dennis, Zhang, Hawking

On Fri, Aug 7, 2020 at 11:20 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Fri, Aug 7, 2020 at 11:08 AM Christian König
> <christian.koenig@amd.com> wrote:
> >
> > [SNIP]
> > >>>> What we should do instead is to make sure we have only a single lock for the complete hive instead.
> > >>>> [Dennis Li] If we use a single lock, users will must wait for all devices resuming successfully, but in fact their tasks are only running in device a. It is not friendly to users.
> > >>> Well that is intentional I would say. We can only restart submissions after all devices are resumed successfully, cause otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
> > >>> [Dennis Li] Yes, you are right. Driver have make sure that the shared resources(such as the shard memory) are ready before unlock the lock of adev one by one. But each device still has private hardware resources such as video  and display.
> > >> Yeah, but those are negligible and we have a team working on display support for XGMI. So this will sooner or later become a problem as well.
> > >>
> > >> I still think that a single rwsem for the whole hive is still the best option here.
> > >>
> > >> [Dennis Li] For your above concern, we can open a new thread to discuss it. If we make a decision to use a single after discussing, we can create another patch for it.
> > > That's a really good argument, but I still hesitate to merge this patch.
> > > How severe is the lockdep splat?
> > >
> > > At bare minimum we need a "/* TODO: ...." comment why we do this and how to remove the workaround again.
> > > [Dennis Li] It is not a workaround. According to design of lockdep, each rw_semaphore should has a separated lock_class_key. I have explained that init_rwsem has the above limitation, so we must correct it.
> >
> > Yeah, but this explanation is not really sufficient. Again this is not
> > an limitation of init_rwsem, but intentional behavior.
> >
> > In other words in most cases lockdep should splat if you use rw
> > semaphores like this.
> >
> > That we avoid this by locking multiple amdgpu device always in the same
> > order is rather questionable driver design.
>
> Yeah don't give multiple locking classes like that, that's fairly
> terrible. The right fix is mutex_lock_nest_lock, which we're using in
> ww_mutex or in which is also used in mm_take_all_locks.
>
> If you solve the locking ordering by sorting all the locks you need
> (ugh), then you can also just use a fake lockdep_map for your special
> function only.
>
> Also in general the idea of an rwsem in conjunction with xgmi cross
> device memory handling just plain freaks me out :-) Please make sure
> you prime this lock against dma_resv and dma_fence (and maybe also
> drm_modeset_lock if this lock is outside of dma_resv), and ideally
> this should only ever be used for setup stuff when loading drivers. I
> guess that's why you went with a per-device rwsem. If it's really only
> for setup then just do the simple thing of having an
> xgmi_hive_reconfigure lock, which all the rwsems nest within, and no
> fancy lock ordering or other tricks.
>
> > >   Core network driver (net/core/dev.c) has the similar use case with us.
> >
> > Just took a look at that, but this is completely different use case. The
> > lockdep classes are grouped by driver type to note the difference in the
> > network stack.
> >
> > A quick search showed that no other device driver is doing this in the
> > kernel, so I'm not sure if such a behavior wouldn't be rejected. Adding
> > Daniel to comment on this as well.
> >
> > Felix had some really good arguments to make that an urgent fix, so
> > either we come up with some real rework of this or at least add a big
> > "/* TODO....*/".
>
> Aside from "I have questions" I don't think there's any reason to no
> fix this correctly with mutex_lock_nest_lock. Really shouldn't be a
> semantic change from what I'm understand here.

Also one more with my drm maintainer hat on, as a heads-up: Dave&me
looked at i915 gem code a bit too much, and we're appalled at the
massive over-use of lockdep class hacks and dubious nesting
annotations. Expect crack down on anyone who tries to play clever
tricks here, we have a few too many already :-)

So no mutex_lock_nested (totally different beast from
mutex_lock_nest_lock, and really unsafe since it's a runtime change of
the lockdep class) and also not any lockdep_set_class trickery or
anything like that. We need locking hierarchies which mere humans can
comprehend and review.

Cheers, Daniel

>
> Cheers, Daniel
>
> >
> > Regards,
> > Christian.
> >
> > >
> > > Regards,
> > > Christian.
> > >
> >
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-07  9:23                             ` Daniel Vetter
@ 2020-08-07  9:34                               ` Christian König
  2020-08-07  9:44                                 ` Daniel Vetter
  0 siblings, 1 reply; 24+ messages in thread
From: Christian König @ 2020-08-07  9:34 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Deucher, Alexander, Kuehling, Felix, amd-gfx, Li, Dennis, Zhang, Hawking

Am 07.08.20 um 11:23 schrieb Daniel Vetter:
> On Fri, Aug 7, 2020 at 11:20 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Fri, Aug 7, 2020 at 11:08 AM Christian König
>> <christian.koenig@amd.com> wrote:
>>> [SNIP]
>>>>>>> What we should do instead is to make sure we have only a single lock for the complete hive instead.
>>>>>>> [Dennis Li] If we use a single lock, users will must wait for all devices resuming successfully, but in fact their tasks are only running in device a. It is not friendly to users.
>>>>>> Well that is intentional I would say. We can only restart submissions after all devices are resumed successfully, cause otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
>>>>>> [Dennis Li] Yes, you are right. Driver have make sure that the shared resources(such as the shard memory) are ready before unlock the lock of adev one by one. But each device still has private hardware resources such as video  and display.
>>>>> Yeah, but those are negligible and we have a team working on display support for XGMI. So this will sooner or later become a problem as well.
>>>>>
>>>>> I still think that a single rwsem for the whole hive is still the best option here.
>>>>>
>>>>> [Dennis Li] For your above concern, we can open a new thread to discuss it. If we make a decision to use a single after discussing, we can create another patch for it.
>>>> That's a really good argument, but I still hesitate to merge this patch.
>>>> How severe is the lockdep splat?
>>>>
>>>> At bare minimum we need a "/* TODO: ...." comment why we do this and how to remove the workaround again.
>>>> [Dennis Li] It is not a workaround. According to design of lockdep, each rw_semaphore should has a separated lock_class_key. I have explained that init_rwsem has the above limitation, so we must correct it.
>>> Yeah, but this explanation is not really sufficient. Again this is not
>>> an limitation of init_rwsem, but intentional behavior.
>>>
>>> In other words in most cases lockdep should splat if you use rw
>>> semaphores like this.
>>>
>>> That we avoid this by locking multiple amdgpu device always in the same
>>> order is rather questionable driver design.
>> Yeah don't give multiple locking classes like that, that's fairly
>> terrible.

Ok, that is exactly the answer I feared we would get.

>> The right fix is mutex_lock_nest_lock, which we're using in
>> ww_mutex or in which is also used in mm_take_all_locks.

Ah! Enlightenment! So in this case we should use down_write_nested() in 
this special space and all is well?

>>
>> If you solve the locking ordering by sorting all the locks you need
>> (ugh), then you can also just use a fake lockdep_map for your special
>> function only.
>>
>> Also in general the idea of an rwsem in conjunction with xgmi cross
>> device memory handling just plain freaks me out :-) Please make sure
>> you prime this lock against dma_resv and dma_fence (and maybe also
>> drm_modeset_lock if this lock is outside of dma_resv), and ideally
>> this should only ever be used for setup stuff when loading drivers. I
>> guess that's why you went with a per-device rwsem. If it's really only
>> for setup then just do the simple thing of having an
>> xgmi_hive_reconfigure lock, which all the rwsems nest within, and no
>> fancy lock ordering or other tricks.

Well this is for the group reset of all devices in a hive, and you need 
to reboot to change the order the devices are in the list.

Regards,
Christian.

>>
>>>>    Core network driver (net/core/dev.c) has the similar use case with us.
>>> Just took a look at that, but this is completely different use case. The
>>> lockdep classes are grouped by driver type to note the difference in the
>>> network stack.
>>>
>>> A quick search showed that no other device driver is doing this in the
>>> kernel, so I'm not sure if such a behavior wouldn't be rejected. Adding
>>> Daniel to comment on this as well.
>>>
>>> Felix had some really good arguments to make that an urgent fix, so
>>> either we come up with some real rework of this or at least add a big
>>> "/* TODO....*/".
>> Aside from "I have questions" I don't think there's any reason to no
>> fix this correctly with mutex_lock_nest_lock. Really shouldn't be a
>> semantic change from what I'm understand here.
> Also one more with my drm maintainer hat on, as a heads-up: Dave&me
> looked at i915 gem code a bit too much, and we're appalled at the
> massive over-use of lockdep class hacks and dubious nesting
> annotations. Expect crack down on anyone who tries to play clever
> tricks here, we have a few too many already :-)
>
> So no mutex_lock_nested (totally different beast from
> mutex_lock_nest_lock, and really unsafe since it's a runtime change of
> the lockdep class) and also not any lockdep_set_class trickery or
> anything like that. We need locking hierarchies which mere humans can
> comprehend and review.
>
> Cheers, Daniel
>
>> Cheers, Daniel
>>
>>> Regards,
>>> Christian.
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>
>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> http://blog.ffwll.ch
>
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-07  9:34                               ` Christian König
@ 2020-08-07  9:44                                 ` Daniel Vetter
  2020-08-07 11:59                                   ` Li, Dennis
  0 siblings, 1 reply; 24+ messages in thread
From: Daniel Vetter @ 2020-08-07  9:44 UTC (permalink / raw)
  To: Christian König
  Cc: Deucher, Alexander, Kuehling, Felix, amd-gfx, Li, Dennis, Zhang, Hawking

On Fri, Aug 7, 2020 at 11:34 AM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 07.08.20 um 11:23 schrieb Daniel Vetter:
> > On Fri, Aug 7, 2020 at 11:20 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >> On Fri, Aug 7, 2020 at 11:08 AM Christian König
> >> <christian.koenig@amd.com> wrote:
> >>> [SNIP]
> >>>>>>> What we should do instead is to make sure we have only a single lock for the complete hive instead.
> >>>>>>> [Dennis Li] If we use a single lock, users will must wait for all devices resuming successfully, but in fact their tasks are only running in device a. It is not friendly to users.
> >>>>>> Well that is intentional I would say. We can only restart submissions after all devices are resumed successfully, cause otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
> >>>>>> [Dennis Li] Yes, you are right. Driver have make sure that the shared resources(such as the shard memory) are ready before unlock the lock of adev one by one. But each device still has private hardware resources such as video  and display.
> >>>>> Yeah, but those are negligible and we have a team working on display support for XGMI. So this will sooner or later become a problem as well.
> >>>>>
> >>>>> I still think that a single rwsem for the whole hive is still the best option here.
> >>>>>
> >>>>> [Dennis Li] For your above concern, we can open a new thread to discuss it. If we make a decision to use a single after discussing, we can create another patch for it.
> >>>> That's a really good argument, but I still hesitate to merge this patch.
> >>>> How severe is the lockdep splat?
> >>>>
> >>>> At bare minimum we need a "/* TODO: ...." comment why we do this and how to remove the workaround again.
> >>>> [Dennis Li] It is not a workaround. According to design of lockdep, each rw_semaphore should has a separated lock_class_key. I have explained that init_rwsem has the above limitation, so we must correct it.
> >>> Yeah, but this explanation is not really sufficient. Again this is not
> >>> an limitation of init_rwsem, but intentional behavior.
> >>>
> >>> In other words in most cases lockdep should splat if you use rw
> >>> semaphores like this.
> >>>
> >>> That we avoid this by locking multiple amdgpu device always in the same
> >>> order is rather questionable driver design.
> >> Yeah don't give multiple locking classes like that, that's fairly
> >> terrible.
>
> Ok, that is exactly the answer I feared we would get.
>
> >> The right fix is mutex_lock_nest_lock, which we're using in
> >> ww_mutex or in which is also used in mm_take_all_locks.
>
> Ah! Enlightenment! So in this case we should use down_write_nested() in
> this special space and all is well?

Nope. There's two kinds of nesting that lockdep supports:
1. _nested() variants, where you supply an integer parameter
specifying the lockdep subclass. That's like setting a locking
subclass at lock creation time, except much worse because you change
the lockdep class at runtime. This can lead to stuff like:

struct mutex A;
mutex_init(&A);
mutex_lock(&A);
mutex_lock_nested(&A, SINGLE_DEPTH_NESTING);

which is a very clear deadlock, but lockdep will think it's all fine.
Don't use that.

2. _nest_lock() variants, where you specify a super-lock, which makes
sure that all sub-locks cannot be acquired concurrently (or you
promise there's some other trick to avoid deadlocks). Lockdep checks
that you've acquired the super-lock, and then allows arbitrary nesting.
This is what ww_mutex uses, and what mm_take_all_locks uses. If the
super-lock is an actual mutex, then this is actually deadlock free.
With ww_mutex it's a fake lockdep_map in the ww_acquire_ctx, but we
guarantee deadlock-freeness through the ww algorithm. The fake super
lock for ww_mutex is acquired when you get the ticket.

No 2 is the thing you want, not no 1.
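
For reference, a minimal sketch of option 2 applied to a group of rwsems (hypothetical names; the point is that the nesting annotation is backed by a real outer mutex that serializes the whole take-all-locks operation):

    #include <linux/list.h>
    #include <linux/mutex.h>
    #include <linux/rwsem.h>

    /* Super-lock: only one "lock the whole group" operation runs at a time. */
    static DEFINE_MUTEX(group_lock);

    struct node {
            struct rw_semaphore sem;
            struct list_head entry;
    };

    static void group_lock_all(struct list_head *nodes)
    {
            struct node *n;

            mutex_lock(&group_lock);
            list_for_each_entry(n, nodes, entry)
                    /* allowed by lockdep because group_lock is held */
                    down_write_nest_lock(&n->sem, &group_lock);
    }

    static void group_unlock_all(struct list_head *nodes)
    {
            struct node *n;

            list_for_each_entry(n, nodes, entry)
                    up_write(&n->sem);
            mutex_unlock(&group_lock);
    }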

> >>
> >> If you solve the locking ordering by sorting all the locks you need
> >> (ugh), then you can also just use a fake lockdep_map for your special
> >> function only.
> >>
> >> Also in general the idea of an rwsem in conjunction with xgmi cross
> >> device memory handling just plain freaks me out :-) Please make sure
> >> you prime this lock against dma_resv and dma_fence (and maybe also
> >> drm_modeset_lock if this lock is outside of dma_resv), and ideally
> >> this should only ever be used for setup stuff when loading drivers. I
> >> guess that's why you went with a per-device rwsem. If it's really only
> >> for setup then just do the simple thing of having an
> >> xgmi_hive_reconfigure lock, which all the rwsems nest within, and no
> >> fancy lock ordering or other tricks.
>
> Well this is for the group reset of all devices in a hive and you need
> to reboot to change the order the device are in the list.

Ok, that sounds safe enough. I'd go with an xgmi_hive_lock then, and
down_write_nest_lock(&adev->xgmi_rwsem, &xgmi_hive_lock); Taking the
global lock for device reset shouldn't be a problem, we're not going
to reset multiple devices in parallel anyway. Or at least I hope that
use case isn't a perf-critical benchmark for you guys :-)
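
Spelled out (xgmi_hive_lock and the helper below are placeholders, not existing amdgpu symbols; struct amdgpu_device comes from amdgpu.h), the per-device acquisition in the reset path would then look roughly like:

    static DEFINE_MUTEX(xgmi_hive_lock);    /* hypothetical global hive lock */

    static void lock_adev_for_hive_reset(struct amdgpu_device *adev)
    {
            /* caller holds xgmi_hive_lock for the duration of the hive reset */
            down_write_nest_lock(&adev->reset_sem, &xgmi_hive_lock);
    }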

Cheers, Daniel

> Regards,
> Christian.
>
> >>
> >>>>    Core network driver (net/core/dev.c) has the similar use case with us.
> >>> Just took a look at that, but this is completely different use case. The
> >>> lockdep classes are grouped by driver type to note the difference in the
> >>> network stack.
> >>>
> >>> A quick search showed that no other device driver is doing this in the
> >>> kernel, so I'm not sure if such a behavior wouldn't be rejected. Adding
> >>> Daniel to comment on this as well.
> >>>
> >>> Felix had some really good arguments to make that an urgent fix, so
> >>> either we come up with some real rework of this or at least add a big
> >>> "/* TODO....*/".
> >> Aside from "I have questions" I don't think there's any reason to no
> >> fix this correctly with mutex_lock_nest_lock. Really shouldn't be a
> >> semantic change from what I'm understand here.
> > Also one more with my drm maintainer hat on, as a heads-up: Dave&me
> > looked at i915 gem code a bit too much, and we're appalled at the
> > massive over-use of lockdep class hacks and dubious nesting
> > annotations. Expect crack down on anyone who tries to play clever
> > tricks here, we have a few too many already :-)
> >
> > So no mutex_lock_nested (totally different beast from
> > mutex_lock_nest_lock, and really unsafe since it's a runtime change of
> > the lockdep class) and also not any lockdep_set_class trickery or
> > anything like that. We need locking hierarchies which mere humans can
> > comprehend and review.
> >
> > Cheers, Daniel
> >
> >> Cheers, Daniel
> >>
> >>> Regards,
> >>> Christian.
> >>>
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>
> >> --
> >> Daniel Vetter
> >> Software Engineer, Intel Corporation
> >> http://blog.ffwll.ch
> >
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-07  9:44                                 ` Daniel Vetter
@ 2020-08-07 11:59                                   ` Li, Dennis
  2020-08-07 12:28                                     ` Daniel Vetter
  0 siblings, 1 reply; 24+ messages in thread
From: Li, Dennis @ 2020-08-07 11:59 UTC (permalink / raw)
  To: Daniel Vetter, Koenig, Christian
  Cc: Deucher, Alexander, Kuehling, Felix, amd-gfx, Zhang, Hawking

[AMD Public Use]

Hi, Daniel,
      Thanks for your review. I also understand your concern. I guess you missed the description of this issue, so I paste it again below and explain why it happens.

      For example, consider an XGMI system with 2 GPU devices, where each device entity is named adev and each adev has a separate reset_sem.
      // device init function
      void device_init(struct amdgpu_device *adev) {
              ...
              init_rwsem(&adev->reset_sem);
              ...
      }

     The XGMI system has two GPU devices, so the system will call device_init twice. However, the init_rwsem macro has a limitation: it uses a local static lock_class_key to initialize the rw_semaphore, which causes every adev->reset_sem to share the same lock_class_key.

     #define init_rwsem(sem)			                             \ 
     do {								\
	 static struct lock_class_key __key;			\
								\
	__init_rwsem((sem), #sem, &__key);			\
     } while (0)
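
     Because the static __key above lives at the macro's call site, every device that goes through device_init() ends up with the same key; roughly, the expansion inside device_init() looks like this (sketch):

    void device_init(struct amdgpu_device *adev)
    {
            /* expansion of init_rwsem(&adev->reset_sem): the static key below
             * is instantiated once for this call site, so GPU0 and GPU1 both
             * pass the address of the same key and land in the same lockdep
             * class */
            do {
                    static struct lock_class_key __key;

                    __init_rwsem(&adev->reset_sem, "&adev->reset_sem", &__key);
            } while (0);
    }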

     When a GPU hangs, we should do GPU recovery for all devices in the hive. Therefore we should lock each adev->reset_sem to protect the GPU from being accessed by other threads during recovery. The detailed recovery sequence is as follows:
     // Step 1: lock all GPUs first
     for each adev in { GPU0, GPU1 } {
           down_write(&adev->reset_sem);
           do_suspend(adev);
     }

     // Step 2: recover the whole hive
     do_hive_recovery(hive);

     // Step 3: resume and unlock all GPUs
     for each adev in { GPU0, GPU1 } {
           do_resume(adev);
           up_write(&adev->reset_sem);
     }

     Each adev->reset_sem has the same lock_class_key, so lockdep treats them as locks of the same class. Therefore in step 1, when GPU1 is locked, lockdep prints the warning below.

     I had considered your proposal (using _nest_lock()) before. Just as you said, _nest_lock() will hide true-positive recursive locking, so I gave it up in the end.

     After reviewing the lockdep code, I found that lockdep supports dynamic keys, so using a separate lock_class_key has no side effect. In fact, init_rwsem also uses a dynamic key. Please see the call sequences of init_rwsem and lockdep_set_class below:
   1) init_rwsem -> __init_rwsem -> lockdep_init_map;
   2) lockdep_set_class -> lockdep_init_map;

     Finally, coming back to your concern: you may worry that this change will prevent the deadlock below from being detected. In fact, lockdep is still able to detect the issue as a circular locking dependency; there is just no "recursive locking" warning in this case.
    Thread A: down_write(adev->reset_sem) for GPU 0 -> down_write(adev->reset_sem) for GPU 1 -> ... -> up_write(adev->reset_sem) for GPU 1 -> up_write(adev->reset_sem) for GPU 0
    Thread B: down_write(adev->reset_sem) for GPU 1 -> down_write(adev->reset_sem) for GPU 0 -> ... -> up_write(adev->reset_sem) for GPU 0 -> up_write(adev->reset_sem) for GPU 1

    But lockdep still can detect recursive locking for this case:
    Thread A: down_write(adev->reset_sem) for GPU 0 -> ...-> ...-> down_write(adev->reset_sem) for GPU 0 

[  584.110304] ============================================
[  584.110590] WARNING: possible recursive locking detected
[  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
[  584.111164] --------------------------------------------
[  584.111456] kworker/38:1/553 is trying to acquire lock:
[  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.112112]
               but task is already holding lock:
[  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.113068]
               other info that might help us debug this:
[  584.113689]  Possible unsafe locking scenario:

[  584.114350]        CPU0
[  584.114685]        ----
[  584.115014]   lock(&adev->reset_sem);
[  584.115349]   lock(&adev->reset_sem);
[  584.115678]
                *** DEADLOCK ***

[  584.116624]  May be due to missing lock nesting notation

[  584.117284] 4 locks held by kworker/38:1/553:
[  584.117616]  #0: ffff9ad635c1d348 ((wq_completion)events){+.+.}, at: process_one_work+0x21f/0x630
[  584.117967]  #1: ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at: process_one_work+0x21f/0x630
[  584.118358]  #2: ffffffffc1c2a5d0 (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu]
[  584.118786]  #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]

Best Regards
Dennis Li
-----Original Message-----
From: Daniel Vetter <daniel@ffwll.ch> 
Sent: Friday, August 7, 2020 5:45 PM
To: Koenig, Christian <Christian.Koenig@amd.com>
Cc: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking

On Fri, Aug 7, 2020 at 11:34 AM Christian König <christian.koenig@amd.com> wrote:
>
> Am 07.08.20 um 11:23 schrieb Daniel Vetter:
> > On Fri, Aug 7, 2020 at 11:20 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >> On Fri, Aug 7, 2020 at 11:08 AM Christian König 
> >> <christian.koenig@amd.com> wrote:
> >>> [SNIP]
> >>>>>>> What we should do instead is to make sure we have only a single lock for the complete hive instead.
> >>>>>>> [Dennis Li] If we use a single lock, users will must wait for all devices resuming successfully, but in fact their tasks are only running in device a. It is not friendly to users.
> >>>>>> Well that is intentional I would say. We can only restart submissions after all devices are resumed successfully, cause otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
> >>>>>> [Dennis Li] Yes, you are right. Driver have make sure that the shared resources(such as the shard memory) are ready before unlock the lock of adev one by one. But each device still has private hardware resources such as video  and display.
> >>>>> Yeah, but those are negligible and we have a team working on display support for XGMI. So this will sooner or later become a problem as well.
> >>>>>
> >>>>> I still think that a single rwsem for the whole hive is still the best option here.
> >>>>>
> >>>>> [Dennis Li] For your above concern, we can open a new thread to discuss it. If we make a decision to use a single after discussing, we can create another patch for it.
> >>>> That's a really good argument, but I still hesitate to merge this patch.
> >>>> How severe is the lockdep splat?
> >>>>
> >>>> At bare minimum we need a "/* TODO: ...." comment why we do this and how to remove the workaround again.
> >>>> [Dennis Li] It is not a workaround. According to design of lockdep, each rw_semaphore should has a separated lock_class_key. I have explained that init_rwsem has the above limitation, so we must correct it.
> >>> Yeah, but this explanation is not really sufficient. Again this is 
> >>> not an limitation of init_rwsem, but intentional behavior.
> >>>
> >>> In other words in most cases lockdep should splat if you use rw 
> >>> semaphores like this.
> >>>
> >>> That we avoid this by locking multiple amdgpu device always in the 
> >>> same order is rather questionable driver design.
> >> Yeah don't give multiple locking classes like that, that's fairly 
> >> terrible.
>
> Ok, that is exactly the answer I feared we would get.
>
> >> The right fix is mutex_lock_nest_lock, which we're using in 
> >> ww_mutex or in which is also used in mm_take_all_locks.
>
> Ah! Enlightenment! So in this case we should use down_write_nested() 
> in this special space and all is well?

Nope. There's two kinds of nesting that lockdep supports:
1. _nested() variants, where you supply an integer parameter specifying the lockdep subclass. That's like setting a locking subclass at lock creating time, except much worse because you change the lockdep class at runtime. This can lead to stuff like:

struct mutex A;
mutex_init(&A);
mutex_lock(&A);
mutex_lock_nested(&A, SINGLE_DEPTH_NESTING);

which is a very clear deadlock, but lockdep will think it's all fine.
Don't use that.

2. _nest_lock() variants, where you specify a super-lock, which makes sure that all sub-locks cannot be acquired concurrently (or you promise there's some other trick to avoid deadlocks). Lockdep checks that you've acquired the super-lock, and then allows arbitary nesting.
This is what ww_mutex uses, and what mm_take_all_locks uses. If the super-lock is an actual mutex, then this is actually deadlock free.
With ww_mutex it's a fake lockdep_map in the ww_acquire_ctx, but we guarantee deadlock-freeness through the ww algorithm. The fake super lock for ww_mutex is acquired when you get the ticket.

No 2 is the thing you want, not no 1.

> >>
> >> If you solve the locking ordering by sorting all the locks you need 
> >> (ugh), then you can also just use a fake lockdep_map for your 
> >> special function only.
> >>
> >> Also in general the idea of an rwsem in conjunction with xgmi cross 
> >> device memory handling just plain freaks me out :-) Please make 
> >> sure you prime this lock against dma_resv and dma_fence (and maybe 
> >> also drm_modeset_lock if this lock is outside of dma_resv), and 
> >> ideally this should only ever be used for setup stuff when loading 
> >> drivers. I guess that's why you went with a per-device rwsem. If 
> >> it's really only for setup then just do the simple thing of having 
> >> an xgmi_hive_reconfigure lock, which all the rwsems nest within, 
> >> and no fancy lock ordering or other tricks.
>
> Well this is for the group reset of all devices in a hive and you need 
> to reboot to change the order the device are in the list.

Ok, that sounds save enough. I'd go with a xgmi_hive_lock then, and down_write_nest_lock(&adev->xgmi_rwsem, &xgmi_hive_lock); Taking the global lock for device reset shouldn't be a problem, we're not going to reset multiple devices in parallel anyway. Or at least I don't hope that use-case is a perf critical benchmark for you guys :-)

Cheers, Daniel

> Regards,
> Christian.
>
> >>
> >>>>    Core network driver (net/core/dev.c) has the similar use case with us.
> >>> Just took a look at that, but this is completely different use 
> >>> case. The lockdep classes are grouped by driver type to note the 
> >>> difference in the network stack.
> >>>
> >>> A quick search showed that no other device driver is doing this in 
> >>> the kernel, so I'm not sure if such a behavior wouldn't be 
> >>> rejected. Adding Daniel to comment on this as well.
> >>>
> >>> Felix had some really good arguments to make that an urgent fix, 
> >>> so either we come up with some real rework of this or at least add 
> >>> a big
> >>> "/* TODO....*/".
> >> Aside from "I have questions" I don't think there's any reason to 
> >> no fix this correctly with mutex_lock_nest_lock. Really shouldn't 
> >> be a semantic change from what I'm understand here.
> > Also one more with my drm maintainer hat on, as a heads-up: Dave&me 
> > looked at i915 gem code a bit too much, and we're appalled at the 
> > massive over-use of lockdep class hacks and dubious nesting 
> > annotations. Expect crack down on anyone who tries to play clever 
> > tricks here, we have a few too many already :-)
> >
> > So no mutex_lock_nested (totally different beast from 
> > mutex_lock_nest_lock, and really unsafe since it's a runtime change 
> > of the lockdep class) and also not any lockdep_set_class trickery or 
> > anything like that. We need locking hierarchies which mere humans 
> > can comprehend and review.
> >
> > Cheers, Daniel
> >
> >> Cheers, Daniel
> >>
> >>> Regards,
> >>> Christian.
> >>>
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>
> >> --
> >> Daniel Vetter
> >> Software Engineer, Intel Corporation
> >> http://blog.ffwll.ch
> >
> >
>


--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-07 11:59                                   ` Li, Dennis
@ 2020-08-07 12:28                                     ` Daniel Vetter
  2020-08-07 15:32                                       ` Li, Dennis
  0 siblings, 1 reply; 24+ messages in thread
From: Daniel Vetter @ 2020-08-07 12:28 UTC (permalink / raw)
  To: Li, Dennis
  Cc: Deucher, Alexander, Kuehling, Felix, Koenig, Christian, amd-gfx,
	Zhang, Hawking

On Fri, Aug 7, 2020 at 1:59 PM Li, Dennis <Dennis.Li@amd.com> wrote:
>
> [AMD Public Use]
>
> Hi, Daniel,
>       Thanks your review. And I also understand your concern. I guess you missed the description of this issue, so I paste it again in the below, and explain why this issue happens.
>
>       For example, in a XGMI system with 2 GPU devices whose device entity is named adev. And each adev has a separated reset_sem.
>       // device init function
>       void device_init(adev) {
>         ...
>         init_rwsem(&adev->reset_sem);
>               ...
>       }
>
>      XGMI system have two GPU devices, so system will call device_init twice. However the definition of init_rwsem macro has a limitation which use a local static lock_class_key to initialize rw_sem, which cause each adev->reset_sem share the same lock_class_key.
>
>      #define init_rwsem(sem)                                                 \
>      do {                                                               \
>          static struct lock_class_key __key;                    \
>                                                                 \
>         __init_rwsem((sem), #sem, &__key);                      \
>      } while (0)
>
>      And when GPU hang, we should do gpu recovery for all devices in the hive. Therefore we should lock each adev->reset_sem to protect GPU from be accessed by other threads during recovery. The detailed recovery sequence as the following:
>      // Step1: lock all GPU firstly:
>      for each adev of GPU0 or GPU1 {
>            down_write(adev->reset_sem);
>            do_suspend(adev);
>     }
>
>     // Step2:
>     do_hive_recovery(hive);
>
>     // Step3: resume and unlock GPU
>     for each adev of GPU0 or GPU1 {
>           do_resume(adev)
>           up_write(adev->reset_sem);
>     }
>
>     Each adev->reset_sem has the same lock_class_key, and lockdep will take them as the same rw_semaphore object. Therefore in step1, when lock GPU1, lockdep will pop the below warning.
>
>     I have considered your proposal (using  _nest_lock() ) before. Just as you said, _nest_lock() will hide true positive recursive locking. So I gave up it in the end.
>
>     After reviewing codes of lockdep, I found the lockdep support dynamic_key, so using separated lock_class_key has no any side effect. In fact, init_rwsem also use dynamic_key. Please see the call sequence of init_rwsem and lockdep_set_class as the below:
>    1) init_rwsem -> __init_rwsem -> lockdep_init_map;
>    2) lockdep_set_class -> lockdep_init_map;
>
>     Finally we go back to your concern, you maybe worry this change will cause the below dead-lock can't be detected. In fact, lockdep still is able to detect the issue as circular locking dependency, but there is no warning "recursive locking " in this case.
>     Thread A: down_write(adev->reset_sem) for GPU 0 -> down_write(adev->reset_sem) for GPU 1 -> ... -> up_write(adev->reset_sem) for GPU 1 -> up_write(adev->reset_sem) for GPU 0
>     Thread B: down_write(adev->reset_sem) for GPU 1 -> down_write(adev->reset_sem) for GPU 0 -> ... -> up_write(adev->reset_sem) for GPU 0 -> up_write(adev->reset_sem) for GPU 1
>
>     But lockdep still can detect recursive locking for this case:
>     Thread A: down_write(adev->reset_sem) for GPU 0 -> ...-> ...-> down_write(adev->reset_sem) for GPU 0

Yeah, I guessed correctly what you're doing. My recommendation to use
down_write_nest_lock still stands. This is for reset only, so a
kernel-wide reset lock really shouldn't hurt. Or make it a lock per
xgmi hive; I'm assuming that information is known somewhere.

The problem with manual lockdep annotations is that they increase
complexity. You have to keep all the annotations in mind, including
their justifications and which parts they still catch and which parts
they don't catch. And there's zero performance justification for a gpu
reset path to create some fancy lockdep special cases.

Locking needs to be
1. maintainable, i.e. if you need to write a multi-paragraph
explanation like the above, it probably isn't. This obviously includes
correctness, but it's even more important that people can easily
understand what you're doing.
2. fast enough, where it matters. gpu reset just doesn't.

> [  584.110304] ============================================
> [  584.110590] WARNING: possible recursive locking detected
> [  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
> [  584.111164] --------------------------------------------
> [  584.111456] kworker/38:1/553 is trying to acquire lock:
> [  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
> [  584.112112]
>                but task is already holding lock:
> [  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
> [  584.113068]
>                other info that might help us debug this:
> [  584.113689]  Possible unsafe locking scenario:
>
> [  584.114350]        CPU0
> [  584.114685]        ----
> [  584.115014]   lock(&adev->reset_sem);
> [  584.115349]   lock(&adev->reset_sem);
> [  584.115678]
>                 *** DEADLOCK ***
>
> [  584.116624]  May be due to missing lock nesting notation
>
> [  584.117284] 4 locks held by kworker/38:1/553:
> [  584.117616]  #0: ffff9ad635c1d348 ((wq_completion)events){+.+.}, at: process_one_work+0x21f/0x630
> [  584.117967]  #1: ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at: process_one_work+0x21f/0x630
> [  584.118358]  #2: ffffffffc1c2a5d0 (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu]
> [  584.118786]  #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]

Uh, so this means this rwsem is in the critical path for dma-fence
signalling. Please make sure this is all correctly primed at driver
load (like we do already for dma_resv/fence and drm_modeset_lock) when
lockdep is enabled. Otherwise it's pretty much guaranteed you'll get
this wrong somewhere.
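As a sketch of what such priming could look like at init time (assumptions: the dma_fence_begin_signalling()/dma_fence_end_signalling() annotations are available, and the reset path holds reset_sem around work that eventually signals fences):

        /* once at driver load: teach lockdep the intended nesting */
        if (IS_ENABLED(CONFIG_LOCKDEP)) {
                bool cookie;

                down_write(&adev->reset_sem);
                cookie = dma_fence_begin_signalling();
                dma_fence_end_signalling(cookie);
                up_write(&adev->reset_sem);
        }

With that recorded once, any later path that tries to take reset_sem from inside a fence-signalling critical section produces a lockdep splat immediately, instead of only when the bad ordering is actually hit at runtime.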
-Daniel

>
> Best Regards
> Dennis Li
> -----Original Message-----
> From: Daniel Vetter <daniel@ffwll.ch>
> Sent: Friday, August 7, 2020 5:45 PM
> To: Koenig, Christian <Christian.Koenig@amd.com>
> Cc: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
>
> On Fri, Aug 7, 2020 at 11:34 AM Christian König <christian.koenig@amd.com> wrote:
> >
> > Am 07.08.20 um 11:23 schrieb Daniel Vetter:
> > > On Fri, Aug 7, 2020 at 11:20 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >> On Fri, Aug 7, 2020 at 11:08 AM Christian König
> > >> <christian.koenig@amd.com> wrote:
> > >>> [SNIP]
> > >>>>>>> What we should do instead is to make sure we have only a single lock for the complete hive instead.
> > >>>>>>> [Dennis Li] If we use a single lock, users will must wait for all devices resuming successfully, but in fact their tasks are only running in device a. It is not friendly to users.
> > >>>>>> Well that is intentional I would say. We can only restart submissions after all devices are resumed successfully, cause otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
> > >>>>>> [Dennis Li] Yes, you are right. Driver have make sure that the shared resources(such as the shard memory) are ready before unlock the lock of adev one by one. But each device still has private hardware resources such as video  and display.
> > >>>>> Yeah, but those are negligible and we have a team working on display support for XGMI. So this will sooner or later become a problem as well.
> > >>>>>
> > >>>>> I still think that a single rwsem for the whole hive is still the best option here.
> > >>>>>
> > >>>>> [Dennis Li] For your above concern, we can open a new thread to discuss it. If we make a decision to use a single after discussing, we can create another patch for it.
> > >>>> That's a really good argument, but I still hesitate to merge this patch.
> > >>>> How severe is the lockdep splat?
> > >>>>
> > >>>> At bare minimum we need a "/* TODO: ...." comment why we do this and how to remove the workaround again.
> > >>>> [Dennis Li] It is not a workaround. According to design of lockdep, each rw_semaphore should has a separated lock_class_key. I have explained that init_rwsem has the above limitation, so we must correct it.
> > >>> Yeah, but this explanation is not really sufficient. Again this is
> > >>> not an limitation of init_rwsem, but intentional behavior.
> > >>>
> > >>> In other words in most cases lockdep should splat if you use rw
> > >>> semaphores like this.
> > >>>
> > >>> That we avoid this by locking multiple amdgpu device always in the
> > >>> same order is rather questionable driver design.
> > >> Yeah don't give multiple locking classes like that, that's fairly
> > >> terrible.
> >
> > Ok, that is exactly the answer I feared we would get.
> >
> > >> The right fix is mutex_lock_nest_lock, which we're using in
> > >> ww_mutex or in which is also used in mm_take_all_locks.
> >
> > Ah! Enlightenment! So in this case we should use down_write_nested()
> > in this special space and all is well?
>
> Nope. There's two kinds of nesting that lockdep supports:
> 1. _nested() variants, where you supply an integer parameter specifying the lockdep subclass. That's like setting a locking subclass at lock creating time, except much worse because you change the lockdep class at runtime. This can lead to stuff like:
>
> struct mutex A;
> mutex_init(&A);
> mutex_lock(&A);
> mutex_lock_nested(&A, SINGLE_DEPTH_NESTING);
>
> which is a very clear deadlock, but lockdep will think it's all fine.
> Don't use that.
>
> 2. _nest_lock() variants, where you specify a super-lock, which makes sure that all sub-locks cannot be acquired concurrently (or you promise there's some other trick to avoid deadlocks). Lockdep checks that you've acquired the super-lock, and then allows arbitary nesting.
> This is what ww_mutex uses, and what mm_take_all_locks uses. If the super-lock is an actual mutex, then this is actually deadlock free.
> With ww_mutex it's a fake lockdep_map in the ww_acquire_ctx, but we guarantee deadlock-freeness through the ww algorithm. The fake super lock for ww_mutex is acquired when you get the ticket.
>
> No 2 is the thing you want, not no 1.
>
> > >>
> > >> If you solve the locking ordering by sorting all the locks you need
> > >> (ugh), then you can also just use a fake lockdep_map for your
> > >> special function only.
> > >>
> > >> Also in general the idea of an rwsem in conjunction with xgmi cross
> > >> device memory handling just plain freaks me out :-) Please make
> > >> sure you prime this lock against dma_resv and dma_fence (and maybe
> > >> also drm_modeset_lock if this lock is outside of dma_resv), and
> > >> ideally this should only ever be used for setup stuff when loading
> > >> drivers. I guess that's why you went with a per-device rwsem. If
> > >> it's really only for setup then just do the simple thing of having
> > >> an xgmi_hive_reconfigure lock, which all the rwsems nest within,
> > >> and no fancy lock ordering or other tricks.
> >
> > Well this is for the group reset of all devices in a hive and you need
> > to reboot to change the order the device are in the list.
>
> Ok, that sounds save enough. I'd go with a xgmi_hive_lock then, and down_write_nest_lock(&adev->xgmi_rwsem, &xgmi_hive_lock); Taking the global lock for device reset shouldn't be a problem, we're not going to reset multiple devices in parallel anyway. Or at least I don't hope that use-case is a perf critical benchmark for you guys :-)
>
> Cheers, Daniel
>
> > Regards,
> > Christian.
> >
> > >>
> > >>>>    Core network driver (net/core/dev.c) has the similar use case with us.
> > >>> Just took a look at that, but this is completely different use
> > >>> case. The lockdep classes are grouped by driver type to note the
> > >>> difference in the network stack.
> > >>>
> > >>> A quick search showed that no other device driver is doing this in
> > >>> the kernel, so I'm not sure if such a behavior wouldn't be
> > >>> rejected. Adding Daniel to comment on this as well.
> > >>>
> > >>> Felix had some really good arguments to make that an urgent fix,
> > >>> so either we come up with some real rework of this or at least add
> > >>> a big
> > >>> "/* TODO....*/".
> > >> Aside from "I have questions" I don't think there's any reason to
> > >> no fix this correctly with mutex_lock_nest_lock. Really shouldn't
> > >> be a semantic change from what I'm understand here.
> > > Also one more with my drm maintainer hat on, as a heads-up: Dave&me
> > > looked at i915 gem code a bit too much, and we're appalled at the
> > > massive over-use of lockdep class hacks and dubious nesting
> > > annotations. Expect crack down on anyone who tries to play clever
> > > tricks here, we have a few too many already :-)
> > >
> > > So no mutex_lock_nested (totally different beast from
> > > mutex_lock_nest_lock, and really unsafe since it's a runtime change
> > > of the lockdep class) and also not any lockdep_set_class trickery or
> > > anything like that. We need locking hierarchies which mere humans
> > > can comprehend and review.
> > >
> > > Cheers, Daniel
> > >
> > >> Cheers, Daniel
> > >>
> > >>> Regards,
> > >>> Christian.
> > >>>
> > >>>> Regards,
> > >>>> Christian.
> > >>>>
> > >>
> > >> --
> > >> Daniel Vetter
> > >> Software Engineer, Intel Corporation
> > >> http://blog.ffwll.ch
> > >
> > >
> >
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-07 12:28                                     ` Daniel Vetter
@ 2020-08-07 15:32                                       ` Li, Dennis
  2020-08-07 16:18                                         ` Daniel Vetter
  0 siblings, 1 reply; 24+ messages in thread
From: Li, Dennis @ 2020-08-07 15:32 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Deucher, Alexander, Kuehling, Felix, Koenig, Christian, amd-gfx,
	Zhang, Hawking

[AMD Public Use]

On Fri, Aug 7, 2020 at 1:59 PM Li, Dennis <Dennis.Li@amd.com> wrote:
>
> [AMD Public Use]
>
> Hi, Daniel,
>       Thanks your review. And I also understand your concern. I guess you missed the description of this issue, so I paste it again in the below, and explain why this issue happens.
>
>       For example, in a XGMI system with 2 GPU devices whose device entity is named adev. And each adev has a separated reset_sem.
>       // device init function
>       void device_init(adev) {
>         ...
>         init_rwsem(&adev->reset_sem);
>               ...
>       }
>
>      XGMI system have two GPU devices, so system will call device_init twice. However the definition of init_rwsem macro has a limitation which use a local static lock_class_key to initialize rw_sem, which cause each adev->reset_sem share the same lock_class_key.
>
>      #define init_rwsem(sem)                                                 \
>      do {                                                               \
>          static struct lock_class_key __key;                    \
>                                                                 \
>         __init_rwsem((sem), #sem, &__key);                      \
>      } while (0)
>
>      And when GPU hang, we should do gpu recovery for all devices in the hive. Therefore we should lock each adev->reset_sem to protect GPU from be accessed by other threads during recovery. The detailed recovery sequence as the following:
>      // Step1: lock all GPU firstly:
>      for each adev of GPU0 or GPU1 {
>            down_write(adev->reset_sem);
>            do_suspend(adev);
>     }
>
>     // Step2:
>     do_hive_recovery(hive);
>
>     // Step3: resume and unlock GPU
>     for each adev of GPU0 or GPU1 {
>           do_resume(adev)
>           up_write(adev->reset_sem);
>     }
>
>     Each adev->reset_sem has the same lock_class_key, and lockdep will take them as the same rw_semaphore object. Therefore in step1, when lock GPU1, lockdep will pop the below warning.
>
>     I have considered your proposal (using  _nest_lock() ) before. Just as you said, _nest_lock() will hide true positive recursive locking. So I gave up it in the end.
>
>     After reviewing codes of lockdep, I found the lockdep support dynamic_key, so using separated lock_class_key has no any side effect. In fact, init_rwsem also use dynamic_key. Please see the call sequence of init_rwsem and lockdep_set_class as the below:
>    1) init_rwsem -> __init_rwsem -> lockdep_init_map;
>    2) lockdep_set_class -> lockdep_init_map;
>
>     Finally we go back to your concern, you maybe worry this change will cause the below dead-lock can't be detected. In fact, lockdep still is able to detect the issue as circular locking dependency, but there is no warning "recursive locking " in this case.
>     Thread A: down_write(adev->reset_sem) for GPU 0 -> down_write(adev->reset_sem) for GPU 1 -> ... -> up_write(adev->reset_sem) for GPU 1 -> up_write(adev->reset_sem) for GPU 0
>     Thread B: down_write(adev->reset_sem) for GPU 1 -> 
> down_write(adev->reset_sem) for GPU 0 -> ... -> 
> up_write(adev->reset_sem) for GPU 0 -> up_write(adev->reset_sem) for 
> GPU 1
>
>     But lockdep still can detect recursive locking for this case:
>     Thread A: down_write(adev->reset_sem) for GPU 0 -> ...-> ...-> 
> down_write(adev->reset_sem) for GPU 0

Yeah, I guessed correctly what you're doing. My recommendation to use down_write_nest_lock still stands. This is for reset only, so a kernel-wide reset lock really shouldn't hurt. Or make it a lock per xgmi hive; I'm assuming that information is known somewhere.

The problem with manual lockdep annotations is that they increase complexity. You have to keep all the annotations in mind, including their justifications and which parts they still catch and which parts they don't catch. And there's zero performance justification for a gpu reset path to create some fancy lockdep special cases.

Locking needs to be
1. maintainable, i.e. if you need to write a multi-paragraph explanation like the above, it probably isn't. This obviously includes correctness, but it's even more important that people can easily understand what you're doing.
2. fast enough, where it matters. gpu reset just doesn't.

[Dennis Li] Yeah. I strongly agree that manual lockdep annotation increases complexity. However my patch isn't a lockdep annotation; in fact it fixes a bug. According to the design of lockdep, every lock object should have a separate lock_class_key, which acts as the unique ID of the lock object in lockdep.
Let me take the previous sample code as an example again: we define two rwsems, a and b, and use Case 1 and Case 2 to initialize them. In my opinion, Case 1 and Case 2 should behave the same, but in fact they differ, because Case 2 uses the init_rwsem macro in a way that makes both rwsems share one static key. Therefore we should use lockdep_set_class to fix this issue, as in Case 3.

Case 1:
init_rwsem(&a);
init_rwsem(&b);

Case 2:
/* every rwsem initialized through this wrapper shares one static key */
void init_rwsem_ext(struct rw_semaphore *sem)
{
     init_rwsem(sem);
}
init_rwsem_ext(&a);
init_rwsem_ext(&b);

Case 3:
/* pass a caller-provided key so each rwsem gets its own lockdep class */
void init_rwsem_ext(struct rw_semaphore *sem, struct lock_class_key *key)
{
     init_rwsem(sem);
     lockdep_set_class(sem, key);
}
init_rwsem_ext(&a, a_key);
init_rwsem_ext(&b, b_key);
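Applied to our case, the fix boils down to something like this (a sketch only; reset_sem_key is an assumed name for the per-device key the patch adds to struct amdgpu_device):

/* new per-device member, e.g. in struct amdgpu_device */
struct lock_class_key reset_sem_key;

/* at device init, instead of plain init_rwsem() alone */
init_rwsem(&adev->reset_sem);
lockdep_set_class(&adev->reset_sem, &adev->reset_sem_key);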

> [  584.110304] ============================================
> [  584.110590] WARNING: possible recursive locking detected
> [  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
> [  584.111164] --------------------------------------------
> [  584.111456] kworker/38:1/553 is trying to acquire lock:
> [  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at: 
> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.112112]
>                but task is already holding lock:
> [  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at: 
> amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.113068]
>                other info that might help us debug this:
> [  584.113689]  Possible unsafe locking scenario:
>
> [  584.114350]        CPU0
> [  584.114685]        ----
> [  584.115014]   lock(&adev->reset_sem);
> [  584.115349]   lock(&adev->reset_sem);
> [  584.115678]
>                 *** DEADLOCK ***
>
> [  584.116624]  May be due to missing lock nesting notation
>
> [  584.117284] 4 locks held by kworker/38:1/553:
> [  584.117616]  #0: ffff9ad635c1d348 ((wq_completion)events){+.+.}, 
> at: process_one_work+0x21f/0x630 [  584.117967]  #1: ffffac708e1c3e58 
> ((work_completion)(&con->recovery_work)){+.+.}, at: 
> process_one_work+0x21f/0x630 [  584.118358]  #2: ffffffffc1c2a5d0 
> (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 
> [amdgpu] [  584.118786]  #3: ffff9b1603d247a0 
> (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 
> [amdgpu]

Uh, so this means this rwsem is in the critical path for dma-fence signalling. Please make sure this is all correctly primed at driver load (like we do already for dma_resv/fence and drm_modeset_lock) when lockdep is enabled. Otherwise it's pretty much guaranteed you'll get this wrong somewhere.
-Daniel

>
> Best Regards
> Dennis Li
> -----Original Message-----
> From: Daniel Vetter <daniel@ffwll.ch>
> Sent: Friday, August 7, 2020 5:45 PM
> To: Koenig, Christian <Christian.Koenig@amd.com>
> Cc: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org; 
> Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix 
> <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
> Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive 
> locking
>
> On Fri, Aug 7, 2020 at 11:34 AM Christian König <christian.koenig@amd.com> wrote:
> >
> > Am 07.08.20 um 11:23 schrieb Daniel Vetter:
> > > On Fri, Aug 7, 2020 at 11:20 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > >> On Fri, Aug 7, 2020 at 11:08 AM Christian König 
> > >> <christian.koenig@amd.com> wrote:
> > >>> [SNIP]
> > >>>>>>> What we should do instead is to make sure we have only a single lock for the complete hive instead.
> > >>>>>>> [Dennis Li] If we use a single lock, users will must wait for all devices resuming successfully, but in fact their tasks are only running in device a. It is not friendly to users.
> > >>>>>> Well that is intentional I would say. We can only restart submissions after all devices are resumed successfully, cause otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
> > >>>>>> [Dennis Li] Yes, you are right. Driver have make sure that the shared resources(such as the shard memory) are ready before unlock the lock of adev one by one. But each device still has private hardware resources such as video  and display.
> > >>>>> Yeah, but those are negligible and we have a team working on display support for XGMI. So this will sooner or later become a problem as well.
> > >>>>>
> > >>>>> I still think that a single rwsem for the whole hive is still the best option here.
> > >>>>>
> > >>>>> [Dennis Li] For your above concern, we can open a new thread to discuss it. If we make a decision to use a single after discussing, we can create another patch for it.
> > >>>> That's a really good argument, but I still hesitate to merge this patch.
> > >>>> How severe is the lockdep splat?
> > >>>>
> > >>>> At bare minimum we need a "/* TODO: ...." comment why we do this and how to remove the workaround again.
> > >>>> [Dennis Li] It is not a workaround. According to design of lockdep, each rw_semaphore should has a separated lock_class_key. I have explained that init_rwsem has the above limitation, so we must correct it.
> > >>> Yeah, but this explanation is not really sufficient. Again this 
> > >>> is not an limitation of init_rwsem, but intentional behavior.
> > >>>
> > >>> In other words in most cases lockdep should splat if you use rw 
> > >>> semaphores like this.
> > >>>
> > >>> That we avoid this by locking multiple amdgpu device always in 
> > >>> the same order is rather questionable driver design.
> > >> Yeah don't give multiple locking classes like that, that's fairly 
> > >> terrible.
> >
> > Ok, that is exactly the answer I feared we would get.
> >
> > >> The right fix is mutex_lock_nest_lock, which we're using in 
> > >> ww_mutex or in which is also used in mm_take_all_locks.
> >
> > Ah! Enlightenment! So in this case we should use down_write_nested() 
> > in this special space and all is well?
>
> Nope. There's two kinds of nesting that lockdep supports:
> 1. _nested() variants, where you supply an integer parameter specifying the lockdep subclass. That's like setting a locking subclass at lock creating time, except much worse because you change the lockdep class at runtime. This can lead to stuff like:
>
> struct mutex A;
> mutex_init(&A);
> mutex_lock(&A);
> mutex_lock_nested(&A, SINGLE_DEPTH_NESTING);
>
> which is a very clear deadlock, but lockdep will think it's all fine.
> Don't use that.
>
> 2. _nest_lock() variants, where you specify a super-lock, which makes sure that all sub-locks cannot be acquired concurrently (or you promise there's some other trick to avoid deadlocks). Lockdep checks that you've acquired the super-lock, and then allows arbitary nesting.
> This is what ww_mutex uses, and what mm_take_all_locks uses. If the super-lock is an actual mutex, then this is actually deadlock free.
> With ww_mutex it's a fake lockdep_map in the ww_acquire_ctx, but we guarantee deadlock-freeness through the ww algorithm. The fake super lock for ww_mutex is acquired when you get the ticket.
>
> No 2 is the thing you want, not no 1.
>
> > >>
> > >> If you solve the locking ordering by sorting all the locks you 
> > >> need (ugh), then you can also just use a fake lockdep_map for 
> > >> your special function only.
> > >>
> > >> Also in general the idea of an rwsem in conjunction with xgmi 
> > >> cross device memory handling just plain freaks me out :-) Please 
> > >> make sure you prime this lock against dma_resv and dma_fence (and 
> > >> maybe also drm_modeset_lock if this lock is outside of dma_resv), 
> > >> and ideally this should only ever be used for setup stuff when 
> > >> loading drivers. I guess that's why you went with a per-device 
> > >> rwsem. If it's really only for setup then just do the simple 
> > >> thing of having an xgmi_hive_reconfigure lock, which all the 
> > >> rwsems nest within, and no fancy lock ordering or other tricks.
> >
> > Well this is for the group reset of all devices in a hive and you 
> > need to reboot to change the order the device are in the list.
>
> Ok, that sounds save enough. I'd go with a xgmi_hive_lock then, and 
> down_write_nest_lock(&adev->xgmi_rwsem, &xgmi_hive_lock); Taking the 
> global lock for device reset shouldn't be a problem, we're not going 
> to reset multiple devices in parallel anyway. Or at least I don't hope 
> that use-case is a perf critical benchmark for you guys :-)
>
> Cheers, Daniel
>
> > Regards,
> > Christian.
> >
> > >>
> > >>>>    Core network driver (net/core/dev.c) has the similar use case with us.
> > >>> Just took a look at that, but this is completely different use 
> > >>> case. The lockdep classes are grouped by driver type to note the 
> > >>> difference in the network stack.
> > >>>
> > >>> A quick search showed that no other device driver is doing this 
> > >>> in the kernel, so I'm not sure if such a behavior wouldn't be 
> > >>> rejected. Adding Daniel to comment on this as well.
> > >>>
> > >>> Felix had some really good arguments to make that an urgent fix, 
> > >>> so either we come up with some real rework of this or at least 
> > >>> add a big
> > >>> "/* TODO....*/".
> > >> Aside from "I have questions" I don't think there's any reason to 
> > >> no fix this correctly with mutex_lock_nest_lock. Really shouldn't 
> > >> be a semantic change from what I'm understand here.
> > > Also one more with my drm maintainer hat on, as a heads-up: 
> > > Dave&me looked at i915 gem code a bit too much, and we're appalled 
> > > at the massive over-use of lockdep class hacks and dubious nesting 
> > > annotations. Expect crack down on anyone who tries to play clever 
> > > tricks here, we have a few too many already :-)
> > >
> > > So no mutex_lock_nested (totally different beast from 
> > > mutex_lock_nest_lock, and really unsafe since it's a runtime 
> > > change of the lockdep class) and also not any lockdep_set_class 
> > > trickery or anything like that. We need locking hierarchies which 
> > > mere humans can comprehend and review.
> > >
> > > Cheers, Daniel
> > >
> > >> Cheers, Daniel
> > >>
> > >>> Regards,
> > >>> Christian.
> > >>>
> > >>>> Regards,
> > >>>> Christian.
> > >>>>
> > >>
> > >> --
> > >> Daniel Vetter
> > >> Software Engineer, Intel Corporation 
> > >> http://blog.ffwll.ch
> > >
> > >
> >
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-07 15:32                                       ` Li, Dennis
@ 2020-08-07 16:18                                         ` Daniel Vetter
  2020-08-07 16:50                                           ` Li, Dennis
  0 siblings, 1 reply; 24+ messages in thread
From: Daniel Vetter @ 2020-08-07 16:18 UTC (permalink / raw)
  To: Li, Dennis
  Cc: Deucher, Alexander, Kuehling, Felix, Koenig, Christian, amd-gfx,
	Zhang, Hawking

On Fri, Aug 7, 2020 at 5:32 PM Li, Dennis <Dennis.Li@amd.com> wrote:
>
> [AMD Public Use]
>
> On Fri, Aug 7, 2020 at 1:59 PM Li, Dennis <Dennis.Li@amd.com> wrote:
> >
> > [AMD Public Use]
> >
> > Hi, Daniel,
> >       Thanks your review. And I also understand your concern. I guess you missed the description of this issue, so I paste it again in the below, and explain why this issue happens.
> >
> >       For example, in a XGMI system with 2 GPU devices whose device entity is named adev. And each adev has a separated reset_sem.
> >       // device init function
> >       void device_init(adev) {
> >         ...
> >         init_rwsem(&adev->reset_sem);
> >               ...
> >       }
> >
> >      XGMI system have two GPU devices, so system will call device_init twice. However the definition of init_rwsem macro has a limitation which use a local static lock_class_key to initialize rw_sem, which cause each adev->reset_sem share the same lock_class_key.
> >
> >      #define init_rwsem(sem)                                                 \
> >      do {                                                               \
> >          static struct lock_class_key __key;                    \
> >                                                                 \
> >         __init_rwsem((sem), #sem, &__key);                      \
> >      } while (0)
> >
> >      And when GPU hang, we should do gpu recovery for all devices in the hive. Therefore we should lock each adev->reset_sem to protect GPU from be accessed by other threads during recovery. The detailed recovery sequence as the following:
> >      // Step1: lock all GPU firstly:
> >      for each adev of GPU0 or GPU1 {
> >            down_write(adev->reset_sem);
> >            do_suspend(adev);
> >     }
> >
> >     // Step2:
> >     do_hive_recovery(hive);
> >
> >     // Step3: resume and unlock GPU
> >     for each adev of GPU0 or GPU1 {
> >           do_resume(adev)
> >           up_write(adev->reset_sem);
> >     }
> >
> >     Each adev->reset_sem has the same lock_class_key, and lockdep will take them as the same rw_semaphore object. Therefore in step1, when lock GPU1, lockdep will pop the below warning.
> >
> >     I have considered your proposal (using  _nest_lock() ) before. Just as you said, _nest_lock() will hide true positive recursive locking. So I gave up it in the end.
> >
> >     After reviewing codes of lockdep, I found the lockdep support dynamic_key, so using separated lock_class_key has no any side effect. In fact, init_rwsem also use dynamic_key. Please see the call sequence of init_rwsem and lockdep_set_class as the below:
> >    1) init_rwsem -> __init_rwsem -> lockdep_init_map;
> >    2) lockdep_set_class -> lockdep_init_map;
> >
> >     Finally we go back to your concern, you maybe worry this change will cause the below dead-lock can't be detected. In fact, lockdep still is able to detect the issue as circular locking dependency, but there is no warning "recursive locking " in this case.
> >     Thread A: down_write(adev->reset_sem) for GPU 0 -> down_write(adev->reset_sem) for GPU 1 -> ... -> up_write(adev->reset_sem) for GPU 1 -> up_write(adev->reset_sem) for GPU 0
> >     Thread B: down_write(adev->reset_sem) for GPU 1 ->
> > down_write(adev->reset_sem) for GPU 0 -> ... ->
> > up_write(adev->reset_sem) for GPU 0 -> up_write(adev->reset_sem) for
> > GPU 1
> >
> >     But lockdep still can detect recursive locking for this case:
> >     Thread A: down_write(adev->reset_sem) for GPU 0 -> ...-> ...->
> > down_write(adev->reset_sem) for GPU 0
>
> Yeah, I guessed correctly what you're doing. My recommendation to use down_write_nest_lock still stands. This is for reset only, kernel-wide reset lock really shouldn't hurt. Or make it a lock per xgmi hive, I'm assuming that information is known somewhere.
>
> The problem with manual lockdep annotations is that they increase complexity. You have to keep all the annotations in mind, including justifcations and which parts they still catch and which parts they don't catch. And there's zero performance justification for a gpu reset path to create some fancy lockdep special cases.
>
> Locking needs to be
> 1. maintainable, i.e. every time you need to write a multi-paragraph explanation like the above it's probably not. This obviously includes correctness, but it's even more important that people can easily understand what you're doing.
> 2. fast enough, where it matters. gpu reset just doesn't.
>
> [Dennis Li] Yeah. I strongly agree that manual lockdep annotation will increase complexity. However my patch isn't for lockdep annotation, and in fact it is to fix a bug. According to design of lockdep, every lock object should has a separated  lock_class_key which is an unified ID of lock object in lockdep.

Nope, that's not how lockdep works. It intentionally shares the
lockdep class. If every lock had its own key, all lockdep would
achieve is producing a bit more heat and slowing down your cpu.

Yes, it uses the macro trick to make sure you share the lockdep key
the right way in 99% of all cases, but there are also many other
examples with manually shared lockdep keys. Shared lockdep keys are
the entire point of lockdep; it doesn't do much useful without them.
-Daniel

> I still take the previous sample codes for example: we define two rw_sem: a and b, and use case1 and case2 to init them.  In my opinion, case1 and case2 should have same behavior, but in fact, they are different, because case2 use init_rwsem macro wrongly. Therefore we should lockdep_set_class to fix this issue, such as case3.
>
> Case 1:
> init_rwsem(&a);
> init_rwsem(&b);
>
> Case2:
> void init_rwsem_ext(rw_sem* sem)
> {
>      init_rwsem(sem);
> }
> init_rwsem_ext(&a);
> init_rwsem_ext(&b);
>
> Case3:
> void init_rwsem_ext(rw_sem* sem, lock_class_key *key)
> {
>      init_rwsem(sem);
>      lockdep_set_class(sem, key);
> }
> init_rwsem_ext(&a, a_key);
> init_rwsem_ext(&b, b_key);
>
> > [  584.110304] ============================================
> > [  584.110590] WARNING: possible recursive locking detected
> > [  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
> > [  584.111164] --------------------------------------------
> > [  584.111456] kworker/38:1/553 is trying to acquire lock:
> > [  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at:
> > amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.112112]
> >                but task is already holding lock:
> > [  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at:
> > amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.113068]
> >                other info that might help us debug this:
> > [  584.113689]  Possible unsafe locking scenario:
> >
> > [  584.114350]        CPU0
> > [  584.114685]        ----
> > [  584.115014]   lock(&adev->reset_sem);
> > [  584.115349]   lock(&adev->reset_sem);
> > [  584.115678]
> >                 *** DEADLOCK ***
> >
> > [  584.116624]  May be due to missing lock nesting notation
> >
> > [  584.117284] 4 locks held by kworker/38:1/553:
> > [  584.117616]  #0: ffff9ad635c1d348 ((wq_completion)events){+.+.},
> > at: process_one_work+0x21f/0x630 [  584.117967]  #1: ffffac708e1c3e58
> > ((work_completion)(&con->recovery_work)){+.+.}, at:
> > process_one_work+0x21f/0x630 [  584.118358]  #2: ffffffffc1c2a5d0
> > (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030
> > [amdgpu] [  584.118786]  #3: ffff9b1603d247a0
> > (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030
> > [amdgpu]
>
> Uh, so this means this rwsem is in the critical path for dma-fence signalling. Please make sure this is all correctly primed at driver load (like we do alredy for dma_resv/fence and drm_modeset_lock) when lockdep is enabled. Otherwise pretty much guaranteed you'll get this wrong somewhere.
> -Daniel
>
> >
> > Best Regards
> > Dennis Li
> > -----Original Message-----
> > From: Daniel Vetter <daniel@ffwll.ch>
> > Sent: Friday, August 7, 2020 5:45 PM
> > To: Koenig, Christian <Christian.Koenig@amd.com>
> > Cc: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org;
> > Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix
> > <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
> > Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive
> > locking
> >
> > On Fri, Aug 7, 2020 at 11:34 AM Christian König <christian.koenig@amd.com> wrote:
> > >
> > > Am 07.08.20 um 11:23 schrieb Daniel Vetter:
> > > > On Fri, Aug 7, 2020 at 11:20 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >> On Fri, Aug 7, 2020 at 11:08 AM Christian König
> > > >> <christian.koenig@amd.com> wrote:
> > > >>> [SNIP]
> > > >>>>>>> What we should do instead is to make sure we have only a single lock for the complete hive instead.
> > > >>>>>>> [Dennis Li] If we use a single lock, users will must wait for all devices resuming successfully, but in fact their tasks are only running in device a. It is not friendly to users.
> > > >>>>>> Well that is intentional I would say. We can only restart submissions after all devices are resumed successfully, cause otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
> > > >>>>>> [Dennis Li] Yes, you are right. Driver have make sure that the shared resources(such as the shard memory) are ready before unlock the lock of adev one by one. But each device still has private hardware resources such as video  and display.
> > > >>>>> Yeah, but those are negligible and we have a team working on display support for XGMI. So this will sooner or later become a problem as well.
> > > >>>>>
> > > >>>>> I still think that a single rwsem for the whole hive is still the best option here.
> > > >>>>>
> > > >>>>> [Dennis Li] For your above concern, we can open a new thread to discuss it. If we make a decision to use a single after discussing, we can create another patch for it.
> > > >>>> That's a really good argument, but I still hesitate to merge this patch.
> > > >>>> How severe is the lockdep splat?
> > > >>>>
> > > >>>> At bare minimum we need a "/* TODO: ...." comment why we do this and how to remove the workaround again.
> > > >>>> [Dennis Li] It is not a workaround. According to design of lockdep, each rw_semaphore should has a separated lock_class_key. I have explained that init_rwsem has the above limitation, so we must correct it.
> > > >>> Yeah, but this explanation is not really sufficient. Again this
> > > >>> is not an limitation of init_rwsem, but intentional behavior.
> > > >>>
> > > >>> In other words in most cases lockdep should splat if you use rw
> > > >>> semaphores like this.
> > > >>>
> > > >>> That we avoid this by locking multiple amdgpu device always in
> > > >>> the same order is rather questionable driver design.
> > > >> Yeah don't give multiple locking classes like that, that's fairly
> > > >> terrible.
> > >
> > > Ok, that is exactly the answer I feared we would get.
> > >
> > > >> The right fix is mutex_lock_nest_lock, which we're using in
> > > >> ww_mutex or in which is also used in mm_take_all_locks.
> > >
> > > Ah! Enlightenment! So in this case we should use down_write_nested()
> > > in this special space and all is well?
> >
> > Nope. There's two kinds of nesting that lockdep supports:
> > 1. _nested() variants, where you supply an integer parameter specifying the lockdep subclass. That's like setting a locking subclass at lock creating time, except much worse because you change the lockdep class at runtime. This can lead to stuff like:
> >
> > struct mutex A;
> > mutex_init(&A);
> > mutex_lock(&A);
> > mutex_lock_nested(&A, SINGLE_DEPTH_NESTING);
> >
> > which is a very clear deadlock, but lockdep will think it's all fine.
> > Don't use that.
> >
> > 2. _nest_lock() variants, where you specify a super-lock, which makes sure that all sub-locks cannot be acquired concurrently (or you promise there's some other trick to avoid deadlocks). Lockdep checks that you've acquired the super-lock, and then allows arbitary nesting.
> > This is what ww_mutex uses, and what mm_take_all_locks uses. If the super-lock is an actual mutex, then this is actually deadlock free.
> > With ww_mutex it's a fake lockdep_map in the ww_acquire_ctx, but we guarantee deadlock-freeness through the ww algorithm. The fake super lock for ww_mutex is acquired when you get the ticket.
> >
> > No 2 is the thing you want, not no 1.
> >
> > > >>
> > > >> If you solve the locking ordering by sorting all the locks you
> > > >> need (ugh), then you can also just use a fake lockdep_map for
> > > >> your special function only.
> > > >>
> > > >> Also in general the idea of an rwsem in conjunction with xgmi
> > > >> cross device memory handling just plain freaks me out :-) Please
> > > >> make sure you prime this lock against dma_resv and dma_fence (and
> > > >> maybe also drm_modeset_lock if this lock is outside of dma_resv),
> > > >> and ideally this should only ever be used for setup stuff when
> > > >> loading drivers. I guess that's why you went with a per-device
> > > >> rwsem. If it's really only for setup then just do the simple
> > > >> thing of having an xgmi_hive_reconfigure lock, which all the
> > > >> rwsems nest within, and no fancy lock ordering or other tricks.
> > >
> > > Well this is for the group reset of all devices in a hive and you
> > > need to reboot to change the order the device are in the list.
> >
> > Ok, that sounds save enough. I'd go with a xgmi_hive_lock then, and
> > down_write_nest_lock(&adev->xgmi_rwsem, &xgmi_hive_lock); Taking the
> > global lock for device reset shouldn't be a problem, we're not going
> > to reset multiple devices in parallel anyway. Or at least I don't hope
> > that use-case is a perf critical benchmark for you guys :-)
> >
> > Cheers, Daniel
> >
> > > Regards,
> > > Christian.
> > >
> > > >>
> > > >>>>    Core network driver (net/core/dev.c) has the similar use case with us.
> > > >>> Just took a look at that, but this is completely different use
> > > >>> case. The lockdep classes are grouped by driver type to note the
> > > >>> difference in the network stack.
> > > >>>
> > > >>> A quick search showed that no other device driver is doing this
> > > >>> in the kernel, so I'm not sure if such a behavior wouldn't be
> > > >>> rejected. Adding Daniel to comment on this as well.
> > > >>>
> > > >>> Felix had some really good arguments to make that an urgent fix,
> > > >>> so either we come up with some real rework of this or at least
> > > >>> add a big
> > > >>> "/* TODO....*/".
> > > >> Aside from "I have questions" I don't think there's any reason to
> > > >> no fix this correctly with mutex_lock_nest_lock. Really shouldn't
> > > >> be a semantic change from what I'm understand here.
> > > > Also one more with my drm maintainer hat on, as a heads-up:
> > > > Dave&me looked at i915 gem code a bit too much, and we're appalled
> > > > at the massive over-use of lockdep class hacks and dubious nesting
> > > > annotations. Expect crack down on anyone who tries to play clever
> > > > tricks here, we have a few too many already :-)
> > > >
> > > > So no mutex_lock_nested (totally different beast from
> > > > mutex_lock_nest_lock, and really unsafe since it's a runtime
> > > > change of the lockdep class) and also not any lockdep_set_class
> > > > trickery or anything like that. We need locking hierarchies which
> > > > mere humans can comprehend and review.
> > > >
> > > > Cheers, Daniel
> > > >
> > > >> Cheers, Daniel
> > > >>
> > > >>> Regards,
> > > >>> Christian.
> > > >>>
> > > >>>> Regards,
> > > >>>> Christian.
> > > >>>>
> > > >>
> > > >> --
> > > >> Daniel Vetter
> > > >> Software Engineer, Intel Corporation
> > > >> http://blog.ffwll.ch
> > > >
> > > >
> > >
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH] drm/amdgpu: annotate a false positive recursive locking
  2020-08-07 16:18                                         ` Daniel Vetter
@ 2020-08-07 16:50                                           ` Li, Dennis
  0 siblings, 0 replies; 24+ messages in thread
From: Li, Dennis @ 2020-08-07 16:50 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Deucher, Alexander, Kuehling, Felix, Koenig, Christian, amd-gfx,
	Zhang, Hawking

[AMD Official Use Only - Internal Distribution Only]

On Fri, Aug 7, 2020 at 5:32 PM Li, Dennis <Dennis.Li@amd.com> wrote:
>
> [AMD Public Use]
>
> On Fri, Aug 7, 2020 at 1:59 PM Li, Dennis <Dennis.Li@amd.com> wrote:
> >
> > [AMD Public Use]
> >
> > Hi, Daniel,
> >       Thanks your review. And I also understand your concern. I guess you missed the description of this issue, so I paste it again in the below, and explain why this issue happens.
> >
> >       For example, in a XGMI system with 2 GPU devices whose device entity is named adev. And each adev has a separated reset_sem.
> >       // device init function
> >       void device_init(adev) {
> >         ...
> >         init_rwsem(&adev->reset_sem);
> >               ...
> >       }
> >
> >      XGMI system have two GPU devices, so system will call device_init twice. However the definition of init_rwsem macro has a limitation which use a local static lock_class_key to initialize rw_sem, which cause each adev->reset_sem share the same lock_class_key.
> >
> >      #define init_rwsem(sem)                                                 \
> >      do {                                                               \
> >          static struct lock_class_key __key;                    \
> >                                                                 \
> >         __init_rwsem((sem), #sem, &__key);                      \
> >      } while (0)
> >
> >      And when GPU hang, we should do gpu recovery for all devices in the hive. Therefore we should lock each adev->reset_sem to protect GPU from be accessed by other threads during recovery. The detailed recovery sequence as the following:
> >      // Step1: lock all GPU firstly:
> >      for each adev of GPU0 or GPU1 {
> >            down_write(adev->reset_sem);
> >            do_suspend(adev);
> >     }
> >
> >     // Step2:
> >     do_hive_recovery(hive);
> >
> >     // Step3: resume and unlock GPU
> >     for each adev of GPU0 or GPU1 {
> >           do_resume(adev)
> >           up_write(adev->reset_sem);
> >     }
> >
> >     Each adev->reset_sem has the same lock_class_key, and lockdep will take them as the same rw_semaphore object. Therefore in step1, when lock GPU1, lockdep will pop the below warning.
> >
> >     I have considered your proposal (using  _nest_lock() ) before. Just as you said, _nest_lock() will hide true positive recursive locking. So I gave up it in the end.
> >
> >     After reviewing codes of lockdep, I found the lockdep support dynamic_key, so using separated lock_class_key has no any side effect. In fact, init_rwsem also use dynamic_key. Please see the call sequence of init_rwsem and lockdep_set_class as the below:
> >    1) init_rwsem -> __init_rwsem -> lockdep_init_map;
> >    2) lockdep_set_class -> lockdep_init_map;
> >
> >     Finally we go back to your concern, you maybe worry this change will cause the below dead-lock can't be detected. In fact, lockdep still is able to detect the issue as circular locking dependency, but there is no warning "recursive locking " in this case.
> >     Thread A: down_write(adev->reset_sem) for GPU 0 -> down_write(adev->reset_sem) for GPU 1 -> ... -> up_write(adev->reset_sem) for GPU 1 -> up_write(adev->reset_sem) for GPU 0
> >     Thread B: down_write(adev->reset_sem) for GPU 1 ->
> > down_write(adev->reset_sem) for GPU 0 -> ... ->
> > up_write(adev->reset_sem) for GPU 0 -> up_write(adev->reset_sem) for 
> > GPU 1
> >
> >     But lockdep still can detect recursive locking for this case:
> >     Thread A: down_write(adev->reset_sem) for GPU 0 -> ...-> ...->
> > down_write(adev->reset_sem) for GPU 0
>
> Yeah, I guessed correctly what you're doing. My recommendation to use down_write_nest_lock still stands. This is for reset only, kernel-wide reset lock really shouldn't hurt. Or make it a lock per xgmi hive, I'm assuming that information is known somewhere.
>
> The problem with manual lockdep annotations is that they increase complexity. You have to keep all the annotations in mind, including justifcations and which parts they still catch and which parts they don't catch. And there's zero performance justification for a gpu reset path to create some fancy lockdep special cases.
>
> Locking needs to be
> 1. maintainable, i.e. every time you need to write a multi-paragraph explanation like the above it's probably not. This obviously includes correctness, but it's even more important that people can easily understand what you're doing.
> 2. fast enough, where it matters. gpu reset just doesn't.
>
> [Dennis Li] Yeah. I strongly agree that manual lockdep annotation will increase complexity. However my patch isn't for lockdep annotation, and in fact it is to fix a bug. According to design of lockdep, every lock object should has a separated  lock_class_key which is an unified ID of lock object in lockdep.

Nope, that's not how lockdep works. It intentionally shares the lockdep class. If every lock had its own key, all lockdep would achieve is producing a bit more heat and slowing down your cpu.

Yes, it uses the macro trick to make sure you share the lockdep key the right way in 99% of all cases, but there are also many other examples with manually shared lockdep keys. Shared lockdep keys are the entire point of lockdep; it doesn't do much useful without them.
-Daniel
-Daniel

[Dennis Li] Yes. I agree that sharing a lockdep class gives better CPU performance when lockdep is enabled, and that 99% of cases work well that way, but our case belongs to the remaining 1%. You also agreed that down_write_nest_lock has side effects and will increase complexity. Comparing the two methods, I still think using a separate lock_class_key per device is the best choice, because it has no side effect and the performance drop is negligible.

> I still take the previous sample codes for example: we define two rw_sem: a and b, and use case1 and case2 to init them.  In my opinion, case1 and case2 should have same behavior, but in fact, they are different, because case2 use init_rwsem macro wrongly. Therefore we should lockdep_set_class to fix this issue, such as case3.
>
> Case 1:
> init_rwsem(&a);
> init_rwsem(&b);
>
> Case 2:
> void init_rwsem_ext(struct rw_semaphore *sem)
> {
>      init_rwsem(sem);
> }
> init_rwsem_ext(&a);
> init_rwsem_ext(&b);
>
> Case 3:
> static struct lock_class_key a_key, b_key;
> void init_rwsem_ext(struct rw_semaphore *sem, struct lock_class_key *key)
> {
>      init_rwsem(sem);
>      lockdep_set_class(sem, key);
> }
> init_rwsem_ext(&a, &a_key);
> init_rwsem_ext(&b, &b_key);
>
> > [  584.110304] ============================================
> > [  584.110590] WARNING: possible recursive locking detected
> > [  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
> > [  584.111164] --------------------------------------------
> > [  584.111456] kworker/38:1/553 is trying to acquire lock:
> > [  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at:
> > amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.112112]
> >                but task is already holding lock:
> > [  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at:
> > amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [  584.113068]
> >                other info that might help us debug this:
> > [  584.113689]  Possible unsafe locking scenario:
> >
> > [  584.114350]        CPU0
> > [  584.114685]        ----
> > [  584.115014]   lock(&adev->reset_sem);
> > [  584.115349]   lock(&adev->reset_sem);
> > [  584.115678]
> >                 *** DEADLOCK ***
> >
> > [  584.116624]  May be due to missing lock nesting notation
> >
> > [  584.117284] 4 locks held by kworker/38:1/553:
> > [  584.117616]  #0: ffff9ad635c1d348 ((wq_completion)events){+.+.},
> > at: process_one_work+0x21f/0x630 [  584.117967]  #1: 
> > ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at:
> > process_one_work+0x21f/0x630 [  584.118358]  #2: ffffffffc1c2a5d0 
> > (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030
> > [amdgpu] [  584.118786]  #3: ffff9b1603d247a0 
> > (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030
> > [amdgpu]
>
> Uh, so this means this rwsem is in the critical path for dma-fence signalling. Please make sure this is all correctly primed at driver load (like we already do for dma_resv/fence and drm_modeset_lock) when lockdep is enabled. Otherwise it's pretty much guaranteed you'll get this wrong somewhere.
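>
> Hand-written sketch (not taken from any existing driver) of what priming could look like at load time, so lockdep records the reset_sem -> dma_resv ordering once up front:
>
> 	if (IS_ENABLED(CONFIG_LOCKDEP)) {
> 		struct dma_resv resv;
>
> 		dma_resv_init(&resv);
> 		/* record reset_sem -> dma_resv, so any later inversion
> 		 * is flagged immediately */
> 		down_write(&adev->reset_sem);
> 		dma_resv_lock(&resv, NULL);
> 		dma_resv_unlock(&resv);
> 		up_write(&adev->reset_sem);
> 		dma_resv_fini(&resv);
> 	}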
> -Daniel
>
> >
> > Best Regards
> > Dennis Li
> > -----Original Message-----
> > From: Daniel Vetter <daniel@ffwll.ch>
> > Sent: Friday, August 7, 2020 5:45 PM
> > To: Koenig, Christian <Christian.Koenig@amd.com>
> > Cc: Li, Dennis <Dennis.Li@amd.com>; amd-gfx@lists.freedesktop.org; 
> > Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix 
> > <Felix.Kuehling@amd.com>; Zhang, Hawking <Hawking.Zhang@amd.com>
> > Subject: Re: [PATCH] drm/amdgpu: annotate a false positive recursive 
> > locking
> >
> > On Fri, Aug 7, 2020 at 11:34 AM Christian König <christian.koenig@amd.com> wrote:
> > >
> > > Am 07.08.20 um 11:23 schrieb Daniel Vetter:
> > > > On Fri, Aug 7, 2020 at 11:20 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >> On Fri, Aug 7, 2020 at 11:08 AM Christian König 
> > > >> <christian.koenig@amd.com> wrote:
> > > >>> [SNIP]
> > > >>>>>>> What we should do instead is to make sure we have only a single lock for the complete hive.
> > > >>>>>>> [Dennis Li] If we use a single lock, users must wait for all devices to resume successfully, even though their tasks are only running on device A. That is not friendly to users.
> > > >>>>>> Well, that is intentional, I would say. We can only restart submissions after all devices have resumed successfully, because otherwise it can happen that a job on device A depends on memory on device B which isn't accessible.
> > > >>>>>> [Dennis Li] Yes, you are right. The driver must make sure that the shared resources (such as the shared memory) are ready before unlocking the adev locks one by one. But each device still has private hardware resources such as video and display.
> > > >>>>> Yeah, but those are negligible and we have a team working on display support for XGMI. So this will sooner or later become a problem as well.
> > > >>>>>
> > > >>>>> I still think that a single rwsem for the whole hive is still the best option here.
> > > >>>>>
> > > >>>>> [Dennis Li] For your above concern, we can open a new thread to discuss it. If we decide to use a single lock after that discussion, we can create another patch for it.
> > > >>>> That's a really good argument, but I still hesitate to merge this patch.
> > > >>>> How severe is the lockdep splat?
> > > >>>>
> > > >>>> At bare minimum we need a "/* TODO: ...." comment why we do this and how to remove the workaround again.
> > > >>>> [Dennis Li] It is not a workaround. According to the design of lockdep, each rw_semaphore should have a separate lock_class_key. I have explained that init_rwsem has the above limitation, so we must correct it.
> > > >>> Yeah, but this explanation is not really sufficient. Again,
> > > >>> this is not a limitation of init_rwsem, but intentional behavior.
> > > >>>
> > > >>> In other words in most cases lockdep should splat if you use 
> > > >>> rw semaphores like this.
> > > >>>
> > > >>> That we avoid this by always locking multiple amdgpu devices in
> > > >>> the same order is rather questionable driver design.
> > > >> Yeah don't give multiple locking classes like that, that's 
> > > >> fairly terrible.
> > >
> > > Ok, that is exactly the answer I feared we would get.
> > >
> > > >> The right fix is mutex_lock_nest_lock, which we're using in
> > > >> ww_mutex and which is also used in mm_take_all_locks.
> > >
> > > Ah! Enlightenment! So in this case we should use 
> > > down_write_nested() in this special space and all is well?
> >
> > Nope. There's two kinds of nesting that lockdep supports:
> > 1. _nested() variants, where you supply an integer parameter specifying the lockdep subclass. That's like setting a locking subclass at lock creation time, except much worse because you change the lockdep class at runtime. This can lead to stuff like:
> >
> > struct mutex A;
> > mutex_init(&A);
> > mutex_lock(&A);
> > mutex_lock_nested(&A, SINGLE_DEPTH_NESTING);
> >
> > which is a very clear deadlock, but lockdep will think it's all fine.
> > Don't use that.
> >
> > 2. _nest_lock() variants, where you specify a super-lock, which makes sure that all sub-locks cannot be acquired concurrently (or you promise there's some other trick to avoid deadlocks). Lockdep checks that you've acquired the super-lock, and then allows arbitrary nesting.
> > This is what ww_mutex uses, and what mm_take_all_locks uses. If the super-lock is an actual mutex, then this is actually deadlock free.
> > With ww_mutex it's a fake lockdep_map in the ww_acquire_ctx, but we guarantee deadlock-freeness through the ww algorithm. The fake super lock for ww_mutex is acquired when you get the ticket.
> >
> > No. 2 is the thing you want, not no. 1.
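> >
> > A rough shape of no. 2, with made-up lock names:
> >
> > static DEFINE_MUTEX(super_lock);	/* must be held to take any sub-lock */
> >
> > 	mutex_lock(&super_lock);
> > 	/* sub_a and sub_b are per-object mutexes, initialized elsewhere */
> > 	mutex_lock_nest_lock(&sub_a, &super_lock);
> > 	mutex_lock_nest_lock(&sub_b, &super_lock);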
> >
> > > >>
> > > >> If you solve the locking ordering by sorting all the locks you 
> > > >> need (ugh), then you can also just use a fake lockdep_map for 
> > > >> your special function only.
> > > >>
> > > >> Also in general the idea of an rwsem in conjunction with xgmi 
> > > >> cross device memory handling just plain freaks me out :-) 
> > > >> Please make sure you prime this lock against dma_resv and 
> > > >> dma_fence (and maybe also drm_modeset_lock if this lock is 
> > > >> outside of dma_resv), and ideally this should only ever be used 
> > > >> for setup stuff when loading drivers. I guess that's why you 
> > > >> went with a per-device rwsem. If it's really only for setup 
> > > >> then just do the simple thing of having an 
> > > >> xgmi_hive_reconfigure lock, which all the rwsems nest within, and no fancy lock ordering or other tricks.
> > >
> > > Well, this is for the group reset of all devices in a hive, and you
> > > need to reboot to change the order the devices are in the list.
> >
> > Ok, that sounds safe enough. I'd go with an xgmi_hive_lock then, and
> > down_write_nest_lock(&adev->xgmi_rwsem, &xgmi_hive_lock); Taking the
> > global lock for device reset shouldn't be a problem, we're not going
> > to reset multiple devices in parallel anyway. Or at least I hope that
> > use-case isn't a perf-critical benchmark for you guys :-)
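> >
> > Untested sketch of what that reset entry point could look like (the
> > hive list iteration below is made up, real member names will differ):
> >
> > static DEFINE_MUTEX(xgmi_hive_lock);	/* global, reset is not a fast path */
> >
> > 	mutex_lock(&xgmi_hive_lock);
> > 	list_for_each_entry(adev, &hive->device_list, xgmi_head)
> > 		down_write_nest_lock(&adev->xgmi_rwsem, &xgmi_hive_lock);
> > 	/* ... do the actual hive reset ... */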
> >
> > Cheers, Daniel
> >
> > > Regards,
> > > Christian.
> > >
> > > >>
> > > >>>>    The core network driver (net/core/dev.c) has a similar use case to ours.
> > > >>> Just took a look at that, but it is a completely different use
> > > >>> case. The lockdep classes are grouped by driver type to note
> > > >>> the difference in the network stack.
> > > >>>
> > > >>> A quick search showed that no other device driver is doing
> > > >>> this in the kernel, so I'm not sure whether such behavior
> > > >>> would be rejected. Adding Daniel to comment on this as well.
> > > >>>
> > > >>> Felix had some really good arguments to make that an urgent 
> > > >>> fix, so either we come up with some real rework of this or at 
> > > >>> least add a big
> > > >>> "/* TODO....*/".
> > > >> Aside from "I have questions" I don't think there's any reason
> > > >> not to fix this correctly with mutex_lock_nest_lock. It really
> > > >> shouldn't be a semantic change from what I understand here.
> > > > Also one more with my drm maintainer hat on, as a heads-up:
> > > > Dave and I have looked at the i915 gem code a bit too much, and
> > > > we're appalled at the massive over-use of lockdep class hacks and
> > > > dubious nesting annotations. Expect a crackdown on anyone who
> > > > tries to play clever tricks here, we have a few too many already
> > > > :-)
> > > >
> > > > So no mutex_lock_nested (totally different beast from 
> > > > mutex_lock_nest_lock, and really unsafe since it's a runtime 
> > > > change of the lockdep class) and also not any lockdep_set_class 
> > > > trickery or anything like that. We need locking hierarchies 
> > > > which mere humans can comprehend and review.
> > > >
> > > > Cheers, Daniel
> > > >
> > > >> Cheers, Daniel
> > > >>
> > > >>> Regards,
> > > >>> Christian.
> > > >>>
> > > >>>> Regards,
> > > >>>> Christian.
> > > >>>>
> > > >>
> > > >> --
> > > >> Daniel Vetter
> > > >> Software Engineer, Intel Corporation 
> > > >> http://blog.ffwll.ch/
> > > >
> > > >
> > >
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation 
> > http://blog.ffwll.ch/
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch/



--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch/
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Thread overview: 24+ messages (newest: 2020-08-07 16:50 UTC)
2020-08-06  7:02 [PATCH] drm/amdgpu: annotate a false positive recursive locking Dennis Li
2020-08-06  7:08 ` Christian König
2020-08-06  7:44   ` Li, Dennis
2020-08-06  8:13     ` Christian König
2020-08-06  9:15       ` Li, Dennis
2020-08-06  9:19         ` Christian König
2020-08-06 10:10           ` Li, Dennis
2020-08-06 10:47             ` Christian König
2020-08-06 11:38               ` Li, Dennis
2020-08-06 13:16                 ` Christian König
2020-08-07  2:22                   ` Li, Dennis
2020-08-07  6:57                     ` Christian König
2020-08-07  7:08                       ` Felix Kuehling
2020-08-07  7:38                       ` Li, Dennis
2020-08-07  9:08                         ` Christian König
2020-08-07  9:20                           ` Daniel Vetter
2020-08-07  9:23                             ` Daniel Vetter
2020-08-07  9:34                               ` Christian König
2020-08-07  9:44                                 ` Daniel Vetter
2020-08-07 11:59                                   ` Li, Dennis
2020-08-07 12:28                                     ` Daniel Vetter
2020-08-07 15:32                                       ` Li, Dennis
2020-08-07 16:18                                         ` Daniel Vetter
2020-08-07 16:50                                           ` Li, Dennis
