From: Sasha Levin <sashal@kernel.org>
To: stable@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Leo Li <sunpeng.li@amd.com>,
	Harry Wentland <harry.wentland@amd.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH AUTOSEL 4.18 29/33] drm: Get ref on CRTC commit object when waiting for flip_done
Date: Tue, 30 Oct 2018 09:26:53 -0400
Message-ID: <20181030132657.217970-29-sashal@kernel.org>
In-Reply-To: <20181030132657.217970-1-sashal@kernel.org>

From: Leo Li <sunpeng.li@amd.com>

[ Upstream commit 4364bcb2cd21d042bde4776448417ddffbc54045 ]

This fixes a general protection fault caused by accessing the contents
of a flip_done completion object that has already been freed. It occurs
when a non-blocking commit worker thread W is preempted by another
commit thread X. X runs to completion and clears its atomic state,
destroying the CRTC commit object that W still needs. When W resumes
and accesses that commit object, it touches freed memory.

Worker W becomes preemptible while waiting for flip_done to complete. At
this point, another commit thread X that runs frequently can take over. Here's
an example where W is a worker thread that flips on both CRTCs, and X
does a legacy cursor update on both CRTCs:

        ...
     1. W does flip work
     2. W runs commit_hw_done()
     3. W waits for flip_done on CRTC 1
     4. > flip_done for CRTC 1 completes
     5. W finishes waiting for CRTC 1
     6. W waits for flip_done on CRTC 2

     7. > Preempted by X
     8. > flip_done for CRTC 2 completes
     9. X atomic_check: hw_done and flip_done are complete on all CRTCs
    10. X updates cursor on both CRTCs
    11. X destroys atomic state
    12. X done

    13. > Switch back to W
    14. W waits for flip_done on CRTC 2
    15. W raises general protection fault

The error looks like this:

    general protection fault: 0000 [#1] PREEMPT SMP PTI
    **snip**
    Call Trace:
     lock_acquire+0xa2/0x1b0
     _raw_spin_lock_irq+0x39/0x70
     wait_for_completion_timeout+0x31/0x130
     drm_atomic_helper_wait_for_flip_done+0x64/0x90 [drm_kms_helper]
     amdgpu_dm_atomic_commit_tail+0xcae/0xdd0 [amdgpu]
     commit_tail+0x3d/0x70 [drm_kms_helper]
     process_one_work+0x212/0x650
     worker_thread+0x49/0x420
     kthread+0xfb/0x130
     ret_from_fork+0x3a/0x50
    Modules linked in: x86_pkg_temp_thermal amdgpu(O) chash(O)
    gpu_sched(O) drm_kms_helper(O) syscopyarea sysfillrect sysimgblt
    fb_sys_fops ttm(O) drm(O)

Note that i915 masks this issue because it signals hw_done only after
waiting for flip_done. That ordering blocks the cursor update until
hw_done is signaled, so the cursor commit cannot destroy the state
while the worker is still waiting.

v2: The reference on the commit object needs to be obtained before
    drm_atomic_helper_commit_hw_done() signals hw_done, since that's the
    point where another commit is allowed to modify the state. Assuming
    that the new_crtc_state->commit object still exists within
    drm_atomic_helper_wait_for_flip_done() is incorrect.

    Fix by getting a reference in drm_atomic_helper_setup_commit(), and
    releasing it during drm_atomic_state_default_clear().

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1539611200-6184-1-git-send-email-sunpeng.li@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
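Note: the lifetime rule described in the v2 paragraph can be sketched
in plain userspace C. This is only an illustration under stated
assumptions, not kernel code: struct commit, struct crtc_slot,
commit_get(), commit_put(), slot_setup(), slot_clear() and demo() are
hypothetical stand-ins for struct drm_crtc_commit, struct
__drm_crtcs_state, drm_crtc_commit_get(), drm_crtc_commit_put() and the
setup_commit()/default_clear() hunks below. A plain int replaces the
kernel's kref, and the preemption by X is simulated sequentially.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct drm_crtc_commit. */
struct commit {
	int refcount;
	bool flip_done;
};

/* Number of commit objects currently allocated (for checking leaks). */
static int live_commits;

static struct commit *commit_alloc(void)
{
	struct commit *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	c->refcount = 1;	/* initial reference, owned by the CRTC state */
	live_commits++;
	return c;
}

static void commit_get(struct commit *c)
{
	c->refcount++;
}

static void commit_put(struct commit *c)
{
	assert(c->refcount > 0);
	if (--c->refcount == 0) {
		live_commits--;
		free(c);
	}
}

/* Hypothetical stand-in for one struct __drm_crtcs_state slot. */
struct crtc_slot {
	struct commit *commit;
};

/* Mirrors the setup_commit() hunk: pin the commit before hw_done. */
static void slot_setup(struct crtc_slot *slot, struct commit *c)
{
	slot->commit = c;
	commit_get(c);
}

/* Mirrors the default_clear() hunk: drop the pinned reference. */
static void slot_clear(struct crtc_slot *slot)
{
	if (slot->commit) {
		commit_put(slot->commit);
		slot->commit = NULL;
	}
}

/*
 * Returns the flip_done value W observes after X has dropped the CRTC
 * state's reference, or -1 if allocation fails.
 */
static int demo(void)
{
	struct crtc_slot slot = { 0 };
	struct commit *c = commit_alloc();
	int seen;

	if (!c)
		return -1;

	slot_setup(&slot, c);	/* W: refcount is now 2 */
	c->flip_done = true;	/* flip completes while W is preempted */
	commit_put(c);		/* X: destroys the CRTC state, refcount 1 */

	seen = slot.commit->flip_done;	/* W: object is still valid */
	slot_clear(&slot);	/* W: last reference dropped, object freed */
	return seen;
}
```

Without the slot_setup() call, the commit_put() issued on behalf of X
would free the object and the later read by W would be a
use-after-free, which corresponds to the fault shown above.
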
 drivers/gpu/drm/drm_atomic.c        |  5 +++++
 drivers/gpu/drm/drm_atomic_helper.c | 12 ++++++++----
 include/drm/drm_atomic.h            | 11 +++++++++++
 3 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index 895741e9cd7d..52ccf1c31855 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -173,6 +173,11 @@ void drm_atomic_state_default_clear(struct drm_atomic_state *state)
 		state->crtcs[i].state = NULL;
 		state->crtcs[i].old_state = NULL;
 		state->crtcs[i].new_state = NULL;
+
+		if (state->crtcs[i].commit) {
+			drm_crtc_commit_put(state->crtcs[i].commit);
+			state->crtcs[i].commit = NULL;
+		}
 	}
 
 	for (i = 0; i < config->num_total_plane; i++) {
diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 81e32199d3ef..abca95b970ea 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1384,15 +1384,16 @@ EXPORT_SYMBOL(drm_atomic_helper_wait_for_vblanks);
 void drm_atomic_helper_wait_for_flip_done(struct drm_device *dev,
 					  struct drm_atomic_state *old_state)
 {
-	struct drm_crtc_state *new_crtc_state;
 	struct drm_crtc *crtc;
 	int i;
 
-	for_each_new_crtc_in_state(old_state, crtc, new_crtc_state, i) {
-		struct drm_crtc_commit *commit = new_crtc_state->commit;
+	for (i = 0; i < dev->mode_config.num_crtc; i++) {
+		struct drm_crtc_commit *commit = old_state->crtcs[i].commit;
 		int ret;
 
-		if (!commit)
+		crtc = old_state->crtcs[i].ptr;
+
+		if (!crtc || !commit)
 			continue;
 
 		ret = wait_for_completion_timeout(&commit->flip_done, 10 * HZ);
@@ -1906,6 +1907,9 @@ int drm_atomic_helper_setup_commit(struct drm_atomic_state *state,
 		drm_crtc_commit_get(commit);
 
 		commit->abort_completion = true;
+
+		state->crtcs[i].commit = commit;
+		drm_crtc_commit_get(commit);
 	}
 
 	for_each_oldnew_connector_in_state(state, conn, old_conn_state, new_conn_state, i) {
diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
index a57a8aa90ffb..2b0d02458a18 100644
--- a/include/drm/drm_atomic.h
+++ b/include/drm/drm_atomic.h
@@ -153,6 +153,17 @@ struct __drm_planes_state {
 struct __drm_crtcs_state {
 	struct drm_crtc *ptr;
 	struct drm_crtc_state *state, *old_state, *new_state;
+
+	/**
+	 * @commit:
+	 *
+	 * A reference to the CRTC commit object that is kept for use by
+	 * drm_atomic_helper_wait_for_flip_done() after
+	 * drm_atomic_helper_commit_hw_done() is called. This ensures that a
+	 * concurrent commit won't free a commit object that is still in use.
+	 */
+	struct drm_crtc_commit *commit;
+
 	s32 __user *out_fence_ptr;
 	u64 last_vblank_count;
 };
-- 
2.17.1

