All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: "Yintian Tao" <yttao@amd.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Alex Deucher" <alexander.deucher@amd.com>,
	"Sasha Levin" <sashal@kernel.org>,
	dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org
Subject: [PATCH AUTOSEL 5.5 31/35] drm/scheduler: fix rare NULL ptr race
Date: Mon,  6 Apr 2020 20:00:53 -0400	[thread overview]
Message-ID: <20200407000058.16423-31-sashal@kernel.org> (raw)
In-Reply-To: <20200407000058.16423-1-sashal@kernel.org>

From: Yintian Tao <yttao@amd.com>

[ Upstream commit 3c0fdf3302cb4f186c871684eac5c407a107e480 ]

There is one one corner case at dma_fence_signal_locked
which will raise the NULL pointer problem just like below.
->dma_fence_signal
    ->dma_fence_signal_locked
	->test_and_set_bit
here trigger dma_fence_release happen due to the zero of fence refcount.

->dma_fence_put
    ->dma_fence_release
	->drm_sched_fence_release_scheduled
	    ->call_rcu
here make the union fled “cb_list” at finished fence
to NULL because struct rcu_head contains two pointer
which is same as struct list_head cb_list

Therefore, to hold the reference of finished fence at drm_sched_process_job
to prevent the null pointer during finished fence dma_fence_signal

[  732.912867] BUG: kernel NULL pointer dereference, address: 0000000000000008
[  732.914815] #PF: supervisor write access in kernel mode
[  732.915731] #PF: error_code(0x0002) - not-present page
[  732.916621] PGD 0 P4D 0
[  732.917072] Oops: 0002 [#1] SMP PTI
[  732.917682] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G           OE     5.4.0-rc7 #1
[  732.918980] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[  732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
[  732.938569] Call Trace:
[  732.939003]  <IRQ>
[  732.939364]  dma_fence_signal+0x29/0x50
[  732.940036]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[  732.940996]  drm_sched_process_job+0x34/0xa0 [gpu_sched]
[  732.941910]  dma_fence_signal_locked+0x85/0x100
[  732.942692]  dma_fence_signal+0x29/0x50
[  732.943457]  amdgpu_fence_process+0x99/0x120 [amdgpu]
[  732.944393]  sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]

v2: hold the finished fence at drm_sched_process_job instead of
    amdgpu_fence_process
v3: resume the blank line

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/scheduler/sched_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 3c57e84222ca9..5bb9feddbfd6b 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -632,7 +632,9 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
 
 	trace_drm_sched_process_job(s_fence);
 
+	dma_fence_get(&s_fence->finished);
 	drm_sched_fence_finished(s_fence);
+	dma_fence_put(&s_fence->finished);
 	wake_up_interruptible(&sched->wake_up_worker);
 }
 
-- 
2.20.1


WARNING: multiple messages have this Message-ID (diff)
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: "Sasha Levin" <sashal@kernel.org>,
	dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org,
	"Alex Deucher" <alexander.deucher@amd.com>,
	"Yintian Tao" <yttao@amd.com>,
	"Christian König" <christian.koenig@amd.com>,
	linux-media@vger.kernel.org
Subject: [PATCH AUTOSEL 5.5 31/35] drm/scheduler: fix rare NULL ptr race
Date: Mon,  6 Apr 2020 20:00:53 -0400	[thread overview]
Message-ID: <20200407000058.16423-31-sashal@kernel.org> (raw)
In-Reply-To: <20200407000058.16423-1-sashal@kernel.org>

From: Yintian Tao <yttao@amd.com>

[ Upstream commit 3c0fdf3302cb4f186c871684eac5c407a107e480 ]

There is one one corner case at dma_fence_signal_locked
which will raise the NULL pointer problem just like below.
->dma_fence_signal
    ->dma_fence_signal_locked
	->test_and_set_bit
here trigger dma_fence_release happen due to the zero of fence refcount.

->dma_fence_put
    ->dma_fence_release
	->drm_sched_fence_release_scheduled
	    ->call_rcu
here make the union fled “cb_list” at finished fence
to NULL because struct rcu_head contains two pointer
which is same as struct list_head cb_list

Therefore, to hold the reference of finished fence at drm_sched_process_job
to prevent the null pointer during finished fence dma_fence_signal

[  732.912867] BUG: kernel NULL pointer dereference, address: 0000000000000008
[  732.914815] #PF: supervisor write access in kernel mode
[  732.915731] #PF: error_code(0x0002) - not-present page
[  732.916621] PGD 0 P4D 0
[  732.917072] Oops: 0002 [#1] SMP PTI
[  732.917682] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G           OE     5.4.0-rc7 #1
[  732.918980] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[  732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
[  732.938569] Call Trace:
[  732.939003]  <IRQ>
[  732.939364]  dma_fence_signal+0x29/0x50
[  732.940036]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[  732.940996]  drm_sched_process_job+0x34/0xa0 [gpu_sched]
[  732.941910]  dma_fence_signal_locked+0x85/0x100
[  732.942692]  dma_fence_signal+0x29/0x50
[  732.943457]  amdgpu_fence_process+0x99/0x120 [amdgpu]
[  732.944393]  sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]

v2: hold the finished fence at drm_sched_process_job instead of
    amdgpu_fence_process
v3: resume the blank line

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/scheduler/sched_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 3c57e84222ca9..5bb9feddbfd6b 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -632,7 +632,9 @@ static void drm_sched_process_job(struct dma_fence *f, struct dma_fence_cb *cb)
 
 	trace_drm_sched_process_job(s_fence);
 
+	dma_fence_get(&s_fence->finished);
 	drm_sched_fence_finished(s_fence);
+	dma_fence_put(&s_fence->finished);
 	wake_up_interruptible(&sched->wake_up_worker);
 }
 
-- 
2.20.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  parent reply	other threads:[~2020-04-07  0:07 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-07  0:00 [PATCH AUTOSEL 5.5 01/35] ARM: dts: sun8i-a83t-tbs-a711: HM5065 doesn't like such a high voltage Sasha Levin
2020-04-07  0:00 ` Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 02/35] bus: sunxi-rsb: Return correct data when mixing 16-bit and 8-bit reads Sasha Levin
2020-04-07  0:00   ` Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 03/35] ARM: dts: Fix dm814x Ethernet by changing to use rgmii-id mode Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 04/35] bpf: Fix deadlock with rq_lock in bpf_send_signal() Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 05/35] net/mlx5e: kTLS, Fix wrong value in record tracker enum Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 06/35] iwlwifi: mvm: take the required lock when clearing time event data Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 07/35] iwlwifi: consider HE capability when setting LDPC Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 08/35] iwlwifi: yoyo: don't add TLV offset when reading FIFOs Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 09/35] iwlwifi: dbg: don't abort if sending DBGC_SUSPEND_RESUME fails Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 10/35] iwlwifi: mvm: Fix rate scale NSS configuration Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 11/35] Input: tm2-touchkey - add support for Coreriver TC360 variant Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 12/35] soc: fsl: dpio: register dpio irq handlers after dpio create Sasha Levin
2020-04-07  0:00   ` Sasha Levin
2020-04-07  0:00   ` Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 13/35] rxrpc: Abstract out the calculation of whether there's Tx space Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 14/35] rxrpc: Fix call interruptibility handling Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 15/35] rxrpc: Fix sendmsg(MSG_WAITALL) handling Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 16/35] net: stmmac: platform: Fix misleading interrupt error msg Sasha Levin
2020-04-07  0:00   ` Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 17/35] net: vxge: fix wrong __VA_ARGS__ usage Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 18/35] ARM: dts: omap4-droid4: Fix lost touchscreen interrupts Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 19/35] riscv: uaccess should be used in nommu mode Sasha Levin
2020-04-07  0:00   ` Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 20/35] hinic: fix a bug of waitting for IO stopped Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 21/35] hinic: fix the bug of clearing event queue Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 22/35] hinic: fix out-of-order excution in arm cpu Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 23/35] hinic: fix wrong para of wait_for_completion_timeout Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 24/35] hinic: fix wrong value of MIN_SKB_LEN Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 25/35] selftests/net: add definition for SOL_DCCP to fix compilation errors for old libc Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 26/35] IB/hfi1: Ensure pq is not left on waitlist Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 27/35] netfilter: nf_tables: Allow set back-ends to report partial overlaps on insertion Sasha Levin
2020-04-07  0:18   ` Stefano Brivio
2020-04-13 16:39     ` Sasha Levin
2020-04-13 20:38       ` Stefano Brivio
2020-04-14 15:08         ` Sasha Levin
2020-04-21 11:32           ` Pablo Neira Ayuso
2020-04-21 13:14             ` Greg KH
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 28/35] netfilter: nft_set_rbtree: Introduce and use nft_rbtree_interval_start() Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 29/35] netfilter: nft_set_rbtree: Detect partial overlaps on insertion Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 30/35] cxgb4/ptp: pass the sign of offset delta in FW CMD Sasha Levin
2020-04-07  0:00 ` Sasha Levin [this message]
2020-04-07  0:00   ` [PATCH AUTOSEL 5.5 31/35] drm/scheduler: fix rare NULL ptr race Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 32/35] cfg80211: Do not warn on same channel at the end of CSA Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 33/35] qlcnic: Fix bad kzalloc null test Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 34/35] i2c: st: fix missing struct parameter description Sasha Levin
2020-04-07  0:00   ` Sasha Levin
2020-04-07  0:00 ` [PATCH AUTOSEL 5.5 35/35] i2c: pca-platform: Use platform_irq_get_optional Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200407000058.16423-31-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=alexander.deucher@amd.com \
    --cc=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=linaro-mm-sig@lists.linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=yttao@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.