From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: "Yintian Tao" <yttao@amd.com>, "Christian König" <christian.koenig@amd.com>, "Alex Deucher" <alexander.deucher@amd.com>, "Sasha Levin" <sashal@kernel.org>, dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org
Subject: [PATCH AUTOSEL 5.5 31/35] drm/scheduler: fix rare NULL ptr race
Date: Mon, 6 Apr 2020 20:00:53 -0400
Message-ID: <20200407000058.16423-31-sashal@kernel.org>
In-Reply-To: <20200407000058.16423-1-sashal@kernel.org>

From: Yintian Tao <yttao@amd.com>

[ Upstream commit 3c0fdf3302cb4f186c871684eac5c407a107e480 ]

There is one corner case in dma_fence_signal_locked which raises a
NULL pointer dereference, as shown below.

->dma_fence_signal
    ->dma_fence_signal_locked
        ->test_and_set_bit
At this point dma_fence_release is triggered because the fence
refcount has dropped to zero.

->dma_fence_put
    ->dma_fence_release
        ->drm_sched_fence_release_scheduled
            ->call_rcu
This sets the union field “cb_list” of the finished fence to NULL,
because struct rcu_head contains two pointers and shares its storage
with struct list_head cb_list.

Therefore, hold a reference to the finished fence in
drm_sched_process_job to prevent the NULL pointer dereference while
dma_fence_signal runs on the finished fence.

[  732.912867] BUG: kernel NULL pointer dereference, address: 0000000000000008
[  732.914815] #PF: supervisor write access in kernel mode
[  732.915731] #PF: error_code(0x0002) - not-present page
[  732.916621] PGD 0 P4D 0
[  732.917072] Oops: 0002 [#1] SMP PTI
[  732.917682] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G OE 5.4.0-rc7 #1
[  732.918980] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[  732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
[  732.938569] Call Trace:
[  732.939003]  <IRQ>
[  732.939364]  dma_fence_signal+0x29/0x50
[  732.940036]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[  732.940996]  drm_sched_process_job+0x34/0xa0 [gpu_sched]
[  732.941910]  dma_fence_signal_locked+0x85/0x100
[  732.942692]  dma_fence_signal+0x29/0x50
[  732.943457]  amdgpu_fence_process+0x99/0x120 [amdgpu]
[  732.944393]  sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]

v2: hold the finished fence at drm_sched_process_job instead of
    amdgpu_fence_process
v3: resume the blank line

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/scheduler/sched_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 3c57e84222ca9..5bb9feddbfd6b 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -632,7 +632,9 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
 
 	trace_drm_sched_process_job(s_fence);
 
+	dma_fence_get(&s_fence->finished);
 	drm_sched_fence_finished(s_fence);
+	dma_fence_put(&s_fence->finished);
 	wake_up_interruptible(&sched->wake_up_worker);
 }
 
-- 
2.20.1
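
The race described in the commit message can be illustrated with a small single-threaded userspace model. This is only a sketch under stated assumptions, not kernel code: struct fence, fence_get(), fence_put(), fence_signal(), the on_signal hook, and the two process_job_* helpers are all hypothetical stand-ins, and the concurrent dma_fence_put() from another CPU is simulated by a hook that runs right after the "signaled" flag is set. The only property taken from the patch is that cb_list and the rcu_head share storage in a union, so queuing the RCU callback overwrites the list pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model-only types: the layouts mimic the kernel structs, nothing more. */
struct list_head { struct list_head *next, *prev; };
struct rcu_head  { struct rcu_head *next; void (*func)(struct rcu_head *); };

struct fence {
    int  refcount;
    bool signaled;
    bool released;                       /* model-only bookkeeping */
    void (*on_signal)(struct fence *);   /* simulates the racing CPU */
    union {                              /* same storage, as in the patch text */
        struct list_head cb_list;
        struct rcu_head  rcu;
    };
};

void fence_init(struct fence *f, void (*hook)(struct fence *))
{
    f->refcount = 1;
    f->signaled = false;
    f->released = false;
    f->on_signal = hook;
    f->cb_list.next = f->cb_list.prev = &f->cb_list;   /* empty list */
}

void fence_release(struct fence *f)
{
    /* Stand-in for call_rcu(): queuing the head clears rcu.next,
     * which aliases cb_list.next. */
    f->rcu.next = NULL;
    f->rcu.func = NULL;
    f->released = true;
}

void fence_get(struct fence *f) { f->refcount++; }

void fence_put(struct fence *f)
{
    assert(f->refcount > 0);
    if (--f->refcount == 0)
        fence_release(f);
}

/* Returns true if cb_list was still intact when it would be walked. */
bool fence_signal(struct fence *f)
{
    f->signaled = true;              /* test_and_set_bit() in the kernel */
    if (f->on_signal)
        f->on_signal(f);             /* simulated interleaving: last put */
    return f->cb_list.next != NULL;  /* the kernel would walk the list here */
}

/* Hook that drops the reference the fence was created with. */
void drop_last_ref(struct fence *f) { fence_put(f); }

/* Shape before the patch: nothing pins the fence across the signal. */
bool process_job_unguarded(struct fence *f) { return fence_signal(f); }

/* Shape after the patch: take a reference, signal, then drop it. */
bool process_job_guarded(struct fence *f)
{
    bool ok;

    fence_get(f);
    ok = fence_signal(f);
    fence_put(f);
    return ok;
}
```

In this model, calling process_job_unguarded() on a fence initialised with drop_last_ref reports a clobbered list, while process_job_guarded() defers the release until after the signal has finished. The real fix relies on the same ordering, but the kernel functions involved do considerably more than this sketch.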