From: Lyude Paul <lyude@redhat.com>
To: stable@vger.kernel.org
Cc: "Alex Deucher" <alexander.deucher@amd.com>,
	"Sinclair Yeh" <syeh@vmware.com>,
	"Christian König" <christian.koenig@amd.com>,
	"David Airlie" <airlied@linux.ie>,
	linux-kernel@vger.kernel.org,
	"Nicolai Hähnle" <nicolai.haehnle@amd.com>,
	dri-devel@lists.freedesktop.org,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Chunming Zhou" <david1.zhou@amd.com>,
	"Michel Dänzer" <michel.daenzer@amd.com>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org,
	"Harish Kasiviswanathan" <harish.kasiviswanathan@amd.com>,
	"Alex Xie" <alexbin.xie@amd.com>,
	"Zhang, Jerry" <jerry.zhang@amd.com>,
	"Felix Kuehling" <felix.kuehling@amd.com>,
	amd-gfx@lists.freedesktop.org
Subject: [PATCH 0/4] Backported amdgpu ttm deadlock fixes for 4.14
Date: Thu, 30 Nov 2017 19:23:02 -0500
Message-ID: <20171201002311.28098-1-lyude@redhat.com>

I haven't gone to see where it started, but as of late a good number of
pretty nasty deadlock issues have appeared with the kernel. Easy
reproduction recipe on a laptop with i915/amdgpu prime with lockdep enabled:

DRI_PRIME=1 glxinfo

Additionally, some more race conditions exist that I've managed to trigger
with piglit and lockdep enabled after applying these patches:

=============================
WARNING: suspicious RCU usage
4.14.3Lyude-Test+ #2 Not tainted
-----------------------------
./include/linux/reservation.h:216 suspicious rcu_dereference_protected() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by ext_image_dma_b/27451:
 #0: (reservation_ww_class_mutex){+.+.}, at: [<ffffffffa034f2ff>] ttm_bo_unref+0x9f/0x3c0 [ttm]

stack backtrace:
CPU: 0 PID: 27451 Comm: ext_image_dma_b Not tainted 4.14.3Lyude-Test+ #2
Hardware name: HP HP ZBook 15 G4/8275, BIOS P70 Ver. 01.02 06/09/2017
Call Trace:
 dump_stack+0x8e/0xce
 lockdep_rcu_suspicious+0xc5/0x100
 reservation_object_copy_fences+0x292/0x2b0
 ? ttm_bo_unref+0x9f/0x3c0 [ttm]
 ttm_bo_unref+0xbd/0x3c0 [ttm]
 amdgpu_bo_unref+0x2a/0x50 [amdgpu]
 amdgpu_gem_object_free+0x4b/0x50 [amdgpu]
 drm_gem_object_free+0x1f/0x40 [drm]
 drm_gem_object_put_unlocked+0x40/0xb0 [drm]
 drm_gem_object_handle_put_unlocked+0x6c/0xb0 [drm]
 drm_gem_object_release_handle+0x51/0x90 [drm]
 drm_gem_handle_delete+0x5e/0x90 [drm]
 ? drm_gem_handle_create+0x40/0x40 [drm]
 drm_gem_close_ioctl+0x20/0x30 [drm]
 drm_ioctl_kernel+0x5d/0xb0 [drm]
 drm_ioctl+0x2f7/0x3b0 [drm]
 ? drm_gem_handle_create+0x40/0x40 [drm]
 ? trace_hardirqs_on_caller+0xf4/0x190
 ? trace_hardirqs_on+0xd/0x10
 amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
 do_vfs_ioctl+0x93/0x670
 ? __fget+0x108/0x1f0
 SyS_ioctl+0x79/0x90
 entry_SYSCALL_64_fastpath+0x23/0xc2

I've also added the relevant fixes for the issue mentioned above.

Christian König (3):
  drm/ttm: fix ttm_bo_cleanup_refs_or_queue once more
  dma-buf: make reservation_object_copy_fences rcu save
  drm/amdgpu: reserve root PD while releasing it

Michel Dänzer (1):
  drm/ttm: Always and only destroy bo->ttm_resv in ttm_bo_release_list

 drivers/dma-buf/reservation.c          | 56 +++++++++++++++++++++++++---------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 ++++++--
 drivers/gpu/drm/ttm/ttm_bo.c           | 43 +++++++++++++-------------
 3 files changed, 74 insertions(+), 38 deletions(-)

-- 
2.14.3
Thread overview: 13+ messages
2017-12-01  0:23 Lyude Paul [this message]
2017-12-01  0:23 ` [PATCH 0/4] Backported amdgpu ttm deadlock fixes for 4.14 Lyude Paul
2017-12-01  0:23 ` [PATCH 1/4] drm/ttm: fix ttm_bo_cleanup_refs_or_queue once more Lyude Paul
2017-12-01  0:23 ` Lyude Paul
2017-12-01  0:23 ` [PATCH 2/4] dma-buf: make reservation_object_copy_fences rcu save Lyude Paul
2017-12-01  0:23 ` Lyude Paul
2017-12-01  0:23 ` [PATCH 3/4] drm/amdgpu: reserve root PD while releasing it Lyude Paul
2017-12-01  0:23 ` Lyude Paul
2017-12-01  0:23 ` [PATCH 4/4] drm/ttm: Always and only destroy bo->ttm_resv in ttm_bo_release_list Lyude Paul
2017-12-01  0:23 ` Lyude Paul
2017-12-01  8:27 ` [PATCH 0/4] Backported amdgpu ttm deadlock fixes for 4.14 Christian König
2017-12-04 11:45 ` Greg KH
2017-12-04 11:45 ` Greg KH