* [PATCH v2] amd/amdgpu: Fix resv shared fence overflow
From: xinhui pan @ 2020-09-29 5:57 UTC
To: amd-gfx; +Cc: alexander.deucher, xinhui pan, christian.koenig
[ 179.556745] kernel BUG at drivers/dma-buf/dma-resv.c:282!
[snip]
[ 179.702910] Call Trace:
[ 179.705696] amdgpu_bo_fence+0x21/0x50 [amdgpu]
[ 179.710707] amdgpu_vm_sdma_commit+0x299/0x430 [amdgpu]
[ 179.716497] amdgpu_vm_bo_update_mapping.constprop.0+0x29f/0x390 [amdgpu]
[ 179.723927] ? find_held_lock+0x38/0x90
[ 179.728183] amdgpu_vm_handle_fault+0x1af/0x420 [amdgpu]
[ 179.734063] gmc_v9_0_process_interrupt+0x245/0x2e0 [amdgpu]
[ 179.740347] ? kgd2kfd_interrupt+0xb8/0x1e0 [amdgpu]
[ 179.745808] amdgpu_irq_dispatch+0x10a/0x3c0 [amdgpu]
[ 179.751380] ? amdgpu_irq_dispatch+0x10a/0x3c0 [amdgpu]
[ 179.757159] amdgpu_ih_process+0xbb/0x1a0 [amdgpu]
[ 179.762466] amdgpu_irq_handle_ih1+0x27/0x40 [amdgpu]
[ 179.767997] process_one_work+0x23c/0x580
[ 179.772371] worker_thread+0x50/0x3b0
[ 179.776356] ? process_one_work+0x580/0x580
[ 179.780939] kthread+0x128/0x160
[ 179.784462] ? kthread_park+0x90/0x90
[ 179.788466] ret_from_fork+0x1f/0x30
We have two scheduler entities, immediate and delayed, so there are
two kinds of scheduler finished fences.
Both fences may be added to the root BO's reservation object at the
same time, while we only reserve one shared slot.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 37221b99ca96..9e0116c7f8d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2869,7 +2869,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	if (r)
 		goto error_free_root;
 
-	r = dma_resv_reserve_shared(root->tbo.base.resv, 1);
+	r = dma_resv_reserve_shared(root->tbo.base.resv, 2);
 	if (r)
 		goto error_unreserve;
 
--
2.25.1
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
* Re: [PATCH v2] amd/amdgpu: Fix resv shared fence overflow
From: Christian König @ 2020-09-29 7:00 UTC
To: xinhui pan, amd-gfx, Philip Yang; +Cc: alexander.deucher
Philip already stumbled over this issue as well, but this is the wrong
place to fix it.
dma_resv_reserve_shared() needs to be called after we have reserved the
page tables and before we do the update in amdgpu_vm_handle_fault().
Reserved slots are freed (in a debug build) as soon as we release the
reservation, so slots reserved in amdgpu_vm_init() are no longer
guaranteed by the time the fault handler adds its fence.
Christian.
On 29.09.20 at 07:57, xinhui pan wrote:
> [ 179.556745] kernel BUG at drivers/dma-buf/dma-resv.c:282!
> [snip]
> [ 179.702910] Call Trace:
> [ 179.705696] amdgpu_bo_fence+0x21/0x50 [amdgpu]
> [ 179.710707] amdgpu_vm_sdma_commit+0x299/0x430 [amdgpu]
> [ 179.716497] amdgpu_vm_bo_update_mapping.constprop.0+0x29f/0x390 [amdgpu]
> [ 179.723927] ? find_held_lock+0x38/0x90
> [ 179.728183] amdgpu_vm_handle_fault+0x1af/0x420 [amdgpu]
> [ 179.734063] gmc_v9_0_process_interrupt+0x245/0x2e0 [amdgpu]
> [ 179.740347] ? kgd2kfd_interrupt+0xb8/0x1e0 [amdgpu]
> [ 179.745808] amdgpu_irq_dispatch+0x10a/0x3c0 [amdgpu]
> [ 179.751380] ? amdgpu_irq_dispatch+0x10a/0x3c0 [amdgpu]
> [ 179.757159] amdgpu_ih_process+0xbb/0x1a0 [amdgpu]
> [ 179.762466] amdgpu_irq_handle_ih1+0x27/0x40 [amdgpu]
> [ 179.767997] process_one_work+0x23c/0x580
> [ 179.772371] worker_thread+0x50/0x3b0
> [ 179.776356] ? process_one_work+0x580/0x580
> [ 179.780939] kthread+0x128/0x160
> [ 179.784462] ? kthread_park+0x90/0x90
> [ 179.788466] ret_from_fork+0x1f/0x30
>
> We have two scheduler entities, immediate and delayed.
> So there are two kinds of scheduler finished fences.
> We might add these two fences in root bo resv at same time while we
> only reserve one slot.
>
> Signed-off-by: xinhui pan <xinhui.pan@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 37221b99ca96..9e0116c7f8d1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2869,7 +2869,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
> if (r)
> goto error_free_root;
>
> - r = dma_resv_reserve_shared(root->tbo.base.resv, 1);
> + r = dma_resv_reserve_shared(root->tbo.base.resv, 2);
> if (r)
> goto error_unreserve;
>
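The ordering point made in the reply above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct toy_resv` and every `toy_*` function are hypothetical stand-ins invented for illustration, not the kernel's `dma_resv` API, and the model ignores locking, RCU, and the replacement of fences from the same context. It assumes, as the reply describes for debug builds, that unused reserved slots are dropped when the reservation is released.

```c
#include <assert.h>

/* Hypothetical toy model of shared-fence slot bookkeeping (not kernel code). */
struct toy_resv {
	unsigned int shared_count; /* fences currently stored */
	unsigned int shared_max;   /* slots currently guaranteed */
};

/* Stand-in for reserving slots: guarantee room for num_fences more fences. */
static int toy_reserve_shared(struct toy_resv *obj, unsigned int num_fences)
{
	if (obj->shared_count + num_fences > obj->shared_max)
		obj->shared_max = obj->shared_count + num_fences;
	return 0;
}

/* Stand-in for adding a shared fence; returns -1 where the kernel would BUG. */
static int toy_add_shared_fence(struct toy_resv *obj)
{
	if (obj->shared_count >= obj->shared_max)
		return -1;
	obj->shared_count++;
	assert(obj->shared_count <= obj->shared_max);
	return 0;
}

/* Stand-in for releasing the reservation in a debug build:
 * any reserved but unused slots are dropped. */
static void toy_unlock(struct toy_resv *obj)
{
	obj->shared_max = obj->shared_count;
}

/* Reserve at init time, release, and add fences only later. */
static int toy_reserve_at_init(unsigned int slots, unsigned int fences)
{
	struct toy_resv obj = { 0, 0 };
	unsigned int i;

	toy_reserve_shared(&obj, slots);
	toy_unlock(&obj);            /* init done, reserved slots are dropped */

	for (i = 0; i < fences; i++) /* later update path, no new reservation */
		if (toy_add_shared_fence(&obj))
			return -1;
	toy_unlock(&obj);
	return 0;
}

/* Reserve while holding the reservation, directly before the update. */
static int toy_reserve_before_update(unsigned int fences)
{
	struct toy_resv obj = { 0, 0 };
	unsigned int i;

	if (toy_reserve_shared(&obj, fences))
		return -1;
	for (i = 0; i < fences; i++)
		if (toy_add_shared_fence(&obj))
			return -1;
	toy_unlock(&obj);
	return 0;
}
```

In this model, `toy_reserve_at_init()` fails no matter how many slots were requested at init, while `toy_reserve_before_update()` succeeds, which matches the argument that raising the count in amdgpu_vm_init() does not address the problem.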