* [PATCH] drm/amdgpu: SRIOV flr_work should use down_write
@ 2021-12-09 17:02 Victor Skvortsov
From: Victor Skvortsov @ 2021-12-09 17:02 UTC
  To: amd-gfx; +Cc: Victor Skvortsov

Host initiated VF FLR may fail if someone else is already
holding a read_lock. Change from down_write_trylock to
down_write to guarantee the reset goes through.

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 5 +++--
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 5 +++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index cd2719bc0139..e4365c97adaa 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -252,11 +252,12 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work)
 	 * otherwise the mailbox msg will be ruined/reseted by
 	 * the VF FLR.
 	 */
-	if (!down_write_trylock(&adev->reset_sem))
+	if (atomic_cmpxchg(&adev->in_gpu_reset, 0, 1) != 0)
 		return;
 
+	down_write(&adev->reset_sem);
+
 	amdgpu_virt_fini_vf2pf_work_item(adev);
-	atomic_set(&adev->in_gpu_reset, 1);
 
 	xgpu_ai_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
index 2bc93808469a..1cde70c72e54 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c
@@ -281,11 +281,12 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work)
 	 * otherwise the mailbox msg will be ruined/reseted by
 	 * the VF FLR.
 	 */
-	if (!down_write_trylock(&adev->reset_sem))
+	if (atomic_cmpxchg(&adev->in_gpu_reset, 0, 1) != 0)
 		return;
 
+	down_write(&adev->reset_sem);
+
 	amdgpu_virt_fini_vf2pf_work_item(adev);
-	atomic_set(&adev->in_gpu_reset, 1);
 
 	xgpu_nv_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0);
 
-- 
2.25.1
* RE: [PATCH] drm/amdgpu: SRIOV flr_work should use down_write
@ 2021-12-09 18:25 ` Liu, Shaoyun
From: Liu, Shaoyun @ 2021-12-09 18:25 UTC
  To: Skvortsov, Victor, amd-gfx; +Cc: Skvortsov, Victor

[AMD Official Use Only]

I think it's a good catch for reset_sem. Is there any reason to change adev->in_gpu_reset?

Regards
Shaoyun.liu
* RE: [PATCH] drm/amdgpu: SRIOV flr_work should use down_write
@ 2021-12-09 18:33 ` Skvortsov, Victor
From: Skvortsov, Victor @ 2021-12-09 18:33 UTC
  To: Liu, Shaoyun, amd-gfx

I wanted to keep the order the same as in amdgpu_device_lock_adev() (set the flag, then acquire the lock) to prevent any weird race conditions.

Thanks,
Victor
* RE: [PATCH] drm/amdgpu: SRIOV flr_work should use down_write
@ 2021-12-09 18:43 ` Liu, Shaoyun
From: Liu, Shaoyun @ 2021-12-09 18:43 UTC
  To: Skvortsov, Victor, amd-gfx

Sounds reasonable. This patch is
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>

Regards
Shaoyun.liu
end of thread, other threads:[~2021-12-09 18:43 UTC | newest]

Thread overview: 4+ messages
2021-12-09 17:02 [PATCH] drm/amdgpu: SRIOV flr_work should use down_write Victor Skvortsov
2021-12-09 18:25 ` Liu, Shaoyun
2021-12-09 18:33   ` Skvortsov, Victor
2021-12-09 18:43     ` Liu, Shaoyun