All of lore.kernel.org
 help / color / mirror / Atom feed
* [RESEND] [PATCH 0/4] Important MST fixes for 4.6
@ 2016-05-12 14:56 Lyude
  2016-05-12 14:56 ` [PATCH 1/4] drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config() Lyude
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Lyude @ 2016-05-12 14:56 UTC (permalink / raw)
  To: Daniel Vetter, intel-gfx

Unfortunately we've never really made use of creating/destroying connectors on
the fly like we have with MST, so 4.6 ended up showing a lot of various bugs
with hotplugging MST displays, booting with MST displays, etc. Most of these
bugs are very likely to panic the kernel, and a couple of them end up even
doing out of bounds memory accesses causing all sorts of other issues. This
being said; we currently have a much better solution that makes a few of these
patches unneccessary: dave's connector reference counting patch series[1]. For
the time being however, this is the most minimal patchset I could come up with
for 4.6 to work around all of these issues.

[1]: b164d31f50b2923a7a92c2a40cb46973a6ba8c36 in drm-next

Lyude (4):
  drm/i915/fbdev: Fix num_connector references in
    intel_fb_initial_config()
  drm/fb_helper: Fix references to dev->mode_config.num_connector
  drm/i915: Discard previous atomic state on resume if connectors change
  drm/atomic: Verify connector->funcs != NULL when clearing states

 drivers/gpu/drm/drm_atomic.c         |  3 ++-
 drivers/gpu/drm/drm_fb_helper.c      |  4 ++--
 drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++
 drivers/gpu/drm/i915/intel_fbdev.c   |  6 +++---
 4 files changed, 19 insertions(+), 6 deletions(-)

This hasn't been sent to stable@vger.kernel.org yet, just sending this to
intel-gfx for review beforehand.

-- 
2.5.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config()
  2016-05-12 14:56 [RESEND] [PATCH 0/4] Important MST fixes for 4.6 Lyude
@ 2016-05-12 14:56 ` Lyude
  2016-05-12 14:56 ` [PATCH 2/4] drm/fb_helper: Fix references to dev->mode_config.num_connector Lyude
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Lyude @ 2016-05-12 14:56 UTC (permalink / raw)
  To: Daniel Vetter, intel-gfx

During boot time, MST devices usually send a ton of hotplug events
irregardless of whether or not any physical hotplugs actually occurred.
Hotplugs mean connectors being created/destroyed, and the number of DRM
connectors changing under us. This isn't a problem if we use
fb_helper->connector_count since we only set it once in the code,
however if we use num_connector from struct drm_mode_config we risk it's
value changing under us. On top of that, there's even a chance that
dev->mode_config.num_connector != fb_helper->connector_count. If the
number of connectors happens to increase under us, we'll end up using
the wrong array size for memcpy and start writing beyond the actual
length of the array, occasionally resulting in kernel panics.

Cc: stable@vger.kernel.org
Signed-off-by: Lyude <cpaul@redhat.com>
---
 drivers/gpu/drm/i915/intel_fbdev.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 97a91e6..c607217 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -366,12 +366,12 @@ static bool intel_fb_initial_config(struct drm_fb_helper *fb_helper,
 	uint64_t conn_configured = 0, mask;
 	int pass = 0;
 
-	save_enabled = kcalloc(dev->mode_config.num_connector, sizeof(bool),
+	save_enabled = kcalloc(fb_helper->connector_count, sizeof(bool),
 			       GFP_KERNEL);
 	if (!save_enabled)
 		return false;
 
-	memcpy(save_enabled, enabled, dev->mode_config.num_connector);
+	memcpy(save_enabled, enabled, fb_helper->connector_count);
 	mask = (1 << fb_helper->connector_count) - 1;
 retry:
 	for (i = 0; i < fb_helper->connector_count; i++) {
@@ -510,7 +510,7 @@ retry:
 	if (fallback) {
 bail:
 		DRM_DEBUG_KMS("Not using firmware configuration\n");
-		memcpy(enabled, save_enabled, dev->mode_config.num_connector);
+		memcpy(enabled, save_enabled, fb_helper->connector_count);
 		kfree(save_enabled);
 		return false;
 	}
-- 
2.5.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/4] drm/fb_helper: Fix references to dev->mode_config.num_connector
  2016-05-12 14:56 [RESEND] [PATCH 0/4] Important MST fixes for 4.6 Lyude
  2016-05-12 14:56 ` [PATCH 1/4] drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config() Lyude
@ 2016-05-12 14:56 ` Lyude
  2016-05-12 21:47   ` [PATCH v2 3/4] drm/i915: Discard previous atomic state on resume if connectors change Lyude
  2016-05-12 14:57 ` [PATCH " Lyude
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Lyude @ 2016-05-12 14:56 UTC (permalink / raw)
  To: Daniel Vetter, intel-gfx

During boot, MST hotplugs are generally expected (even if no physical
hotplugging occurs) and result in DRM's connector topology changing.
This means that using num_connector from the current mode configuration
can lead to the number of connectors changing under us. This can lead to
some nasty scenarios in fbcon:

- We allocate an array to the size of dev->mode_config.num_connectors.
- MST hotplug occurs, dev->mode_config.num_connectors gets incremented.
- We try to loop through each element in the array using the new value
  of dev->mode_config.num_connectors, and end up going out of bounds
  since dev->mode_config.num_connectors is now larger then the array we
  allocated.

fb_helper->connector_count however, will always remain consistent while
we do a modeset in fb_helper.

Cc: stable@vger.kernel.org
Signed-off-by: Lyude <cpaul@redhat.com>
---
 drivers/gpu/drm/drm_fb_helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 855108e..15204c0 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -1914,7 +1914,7 @@ static int drm_pick_crtcs(struct drm_fb_helper *fb_helper,
 	if (modes[n] == NULL)
 		return best_score;
 
-	crtcs = kzalloc(dev->mode_config.num_connector *
+	crtcs = kzalloc(fb_helper->connector_count *
 			sizeof(struct drm_fb_helper_crtc *), GFP_KERNEL);
 	if (!crtcs)
 		return best_score;
@@ -1960,7 +1960,7 @@ static int drm_pick_crtcs(struct drm_fb_helper *fb_helper,
 		if (score > best_score) {
 			best_score = score;
 			memcpy(best_crtcs, crtcs,
-			       dev->mode_config.num_connector *
+			       fb_helper->connector_count *
 			       sizeof(struct drm_fb_helper_crtc *));
 		}
 	}
-- 
2.5.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/4] drm/i915: Discard previous atomic state on resume if connectors change
  2016-05-12 14:56 [RESEND] [PATCH 0/4] Important MST fixes for 4.6 Lyude
  2016-05-12 14:56 ` [PATCH 1/4] drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config() Lyude
  2016-05-12 14:56 ` [PATCH 2/4] drm/fb_helper: Fix references to dev->mode_config.num_connector Lyude
@ 2016-05-12 14:57 ` Lyude
  2016-05-12 14:57 ` [PATCH 4/4] drm/atomic: Verify connector->funcs != NULL when clearing states Lyude
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Lyude @ 2016-05-12 14:57 UTC (permalink / raw)
  To: Daniel Vetter, intel-gfx

If an MST device is disconnected while the machine is suspended, the
number of connectors will change as well after we call
intel_dp_mst_resume(). This means that any previous atomic state we had
before suspending is no longer valid, since it'll still be pointing to
missing connectors. We need to check for this before committing the
state, otherwise we'll kernel panic on resume whenever if any MST
display was disconnected before we started resuming:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffffa01588ef>] drm_atomic_helper_check_modeset+0x29f/0xb40 [drm_kms_helper]
Call Trace:
 [<ffffffffa02354f4>] intel_atomic_check+0x34/0x1180 [i915]
 [<ffffffff810e6c3f>] ? mark_held_locks+0x6f/0xa0
 [<ffffffff810e6d99>] ? trace_hardirqs_on_caller+0x129/0x1b0
 [<ffffffffa00ff1d2>] drm_atomic_check_only+0x192/0x620 [drm]
 [<ffffffff813ee001>] ? pci_pm_thaw+0x21/0x90
 [<ffffffffa00ff677>] drm_atomic_commit+0x17/0x60 [drm]
 [<ffffffffa023e0ad>] intel_display_resume+0xbd/0x160 [i915]
 [<ffffffff813ee070>] ? pci_pm_thaw+0x90/0x90
 [<ffffffffa01b60d8>] i915_drm_resume+0xd8/0x160 [i915]
 [<ffffffffa01b6185>] i915_pm_resume+0x25/0x30 [i915]
 [<ffffffff813ee0d4>] pci_pm_resume+0x64/0xa0
 [<ffffffff814d9ea0>] dpm_run_callback+0x90/0x190
 [<ffffffff814da455>] device_resume+0xd5/0x1f0
 [<ffffffff814da58d>] async_resume+0x1d/0x50
 [<ffffffff810b6718>] async_run_entry_fn+0x48/0x150
 [<ffffffff810acc19>] process_one_work+0x1e9/0x5c0
 [<ffffffff810acb96>] ? process_one_work+0x166/0x5c0
 [<ffffffff810ad038>] worker_thread+0x48/0x4e0
 [<ffffffff810acff0>] ? process_one_work+0x5c0/0x5c0
 [<ffffffff810b3794>] kthread+0xe4/0x100
 [<ffffffff81742672>] ret_from_fork+0x22/0x50
 [<ffffffff810b36b0>] ? kthread_create_on_node+0x200/0x200

Changes since v1:
  - Move drm_atomic_state_free() call down so we're holding the
    appropriate locks when destroying the atomic state
Changes since v2:
  - Check that state != NULL before we start accessing it's members

This fix is only required for 4.6 and below. David Airlie's patchseries
for 4.7 to add connector reference counting provides a more proper fix
for this.

Upstream fix: b164d31f50b2923a7a92c2a40cb46973a6ba8c36
Cc: stable@vger.kernel.org
Signed-off-by: Lyude <cpaul@redhat.com>
---
 drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 182f849..e49a313 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -15959,6 +15959,18 @@ void intel_display_resume(struct drm_device *dev)
 retry:
 	ret = drm_modeset_lock_all_ctx(dev, &ctx);
 
+	/*
+	 * With MST, the number of connectors can change between suspend and
+	 * resume, which means that the state we want to restore might now be
+	 * impossible to use since it'll be pointing to non-existant
+	 * connectors.
+	 */
+	if (ret == 0 && state &&
+	    state->num_connector != dev->mode_config.num_connector) {
+		drm_atomic_state_free(state);
+		state = NULL;
+	}
+
 	if (ret == 0 && !setup) {
 		setup = true;
 
-- 
2.5.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/4] drm/atomic: Verify connector->funcs != NULL when clearing states
  2016-05-12 14:56 [RESEND] [PATCH 0/4] Important MST fixes for 4.6 Lyude
                   ` (2 preceding siblings ...)
  2016-05-12 14:57 ` [PATCH " Lyude
@ 2016-05-12 14:57 ` Lyude
  2016-05-12 15:04   ` Lyude
  2016-05-17 12:00   ` Daniel Vetter
  2016-05-12 16:48 ` ✗ Ro.CI.BAT: failure for Important MST fixes for 4.6 (rev2) Patchwork
  2016-05-12 22:25 ` ✗ Ro.CI.BAT: failure for Important MST fixes for 4.6 (rev3) Patchwork
  5 siblings, 2 replies; 11+ messages in thread
From: Lyude @ 2016-05-12 14:57 UTC (permalink / raw)
  To: Daniel Vetter, intel-gfx

Unfortunately since we don't have Dave's connector refcounting patch
here yet, it's very possible that drm_atomic_state_default_clear() could
get called by intel_display_resume() when
intel_dp_mst_destroy_connector() isn't completely finished destroying an
mst connector, but has already finished setting connector->funcs to
NULL. As such, we need to treat the connector like it's already been
destroyed and just skip it, otherwise we'll end up dereferencing a NULL
pointer.

This fix is only required for 4.6 and below. David Airlie's patchseries
for 4.7 to add connector reference counting provides a more proper fix
for this.

Upstream fix: b164d31f50b2923a7a92c2a40cb46973a6ba8c36
Cc: stable@vger.kernel.org
Signed-off-by: Lyude <cpaul@redhat.com>
---
 drivers/gpu/drm/drm_atomic.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index 8ee1db8..d3a5b5c 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -139,7 +139,7 @@ void drm_atomic_state_default_clear(struct drm_atomic_state *state)
 	for (i = 0; i < state->num_connector; i++) {
 		struct drm_connector *connector = state->connectors[i];
 
-		if (!connector)
+		if (!connector || !connector->funcs)
 			continue;
 
 		/*
@@ -150,6 +150,7 @@ void drm_atomic_state_default_clear(struct drm_atomic_state *state)
 		 * case by setting all connector pointers to NULL.
 		 */
 		state->connector_states[i]->connector = NULL;
+
 		connector->funcs->atomic_destroy_state(NULL,
 						       state->connector_states[i]);
 		state->connectors[i] = NULL;
-- 
2.5.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/4] drm/atomic: Verify connector->funcs != NULL when clearing states
  2016-05-12 14:57 ` [PATCH 4/4] drm/atomic: Verify connector->funcs != NULL when clearing states Lyude
@ 2016-05-12 15:04   ` Lyude
  2016-05-17 12:00   ` Daniel Vetter
  1 sibling, 0 replies; 11+ messages in thread
From: Lyude @ 2016-05-12 15:04 UTC (permalink / raw)
  To: intel-gfx, Daniel Vetter

Unfortunately since we don't have Dave's connector refcounting patch
here yet, it's very possible that drm_atomic_state_default_clear() could
get called by intel_display_resume() when
intel_dp_mst_destroy_connector() isn't completely finished destroying an
mst connector, but has already finished setting connector->funcs to
NULL. As such, we need to treat the connector like it's already been
destroyed and just skip it, otherwise we'll end up dereferencing a NULL
pointer.

This fix is only required for 4.6 and below. David Airlie's patchseries
for 4.7 to add connector reference counting provides a more proper fix
for this.

Changes since v1:
 - Fix leftover whitespace

Upstream fix: b164d31f50b2923a7a92c2a40cb46973a6ba8c36
Cc: stable@vger.kernel.org
Signed-off-by: Lyude <cpaul@redhat.com>
---
 drivers/gpu/drm/drm_atomic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index 8ee1db8..d307d96 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -139,7 +139,7 @@ void drm_atomic_state_default_clear(struct drm_atomic_state *state)
 	for (i = 0; i < state->num_connector; i++) {
 		struct drm_connector *connector = state->connectors[i];
 
-		if (!connector)
+		if (!connector || !connector->funcs)
 			continue;
 
 		/*
-- 
2.5.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* ✗ Ro.CI.BAT: failure for Important MST fixes for 4.6 (rev2)
  2016-05-12 14:56 [RESEND] [PATCH 0/4] Important MST fixes for 4.6 Lyude
                   ` (3 preceding siblings ...)
  2016-05-12 14:57 ` [PATCH 4/4] drm/atomic: Verify connector->funcs != NULL when clearing states Lyude
@ 2016-05-12 16:48 ` Patchwork
  2016-05-12 22:25 ` ✗ Ro.CI.BAT: failure for Important MST fixes for 4.6 (rev3) Patchwork
  5 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2016-05-12 16:48 UTC (permalink / raw)
  To: cpaul; +Cc: intel-gfx

== Series Details ==

Series: Important MST fixes for 4.6 (rev2)
URL   : https://patchwork.freedesktop.org/series/7080/
State : failure

== Summary ==

Series 7080v2 Important MST fixes for 4.6
http://patchwork.freedesktop.org/api/1.0/series/7080/revisions/2/mbox

Test drv_hangman:
        Subgroup error-state-basic:
                fail       -> PASS       (ro-ilk1-i5-650)
Test gem_exec_flush:
        Subgroup basic-batch-kernel-default-cmd:
                pass       -> FAIL       (ro-byt-n2820)
Test kms_flip:
        Subgroup basic-flip-vs-wf_vblank:
                fail       -> PASS       (ro-ivb2-i7-3770)
                pass       -> FAIL       (ro-hsw-i3-4010u)
                fail       -> PASS       (ro-hsw-i7-4770r)
Test kms_pipe_crc_basic:
        Subgroup nonblocking-crc-pipe-b:
                pass       -> SKIP       (fi-skl-i5-6260u)
Test kms_sink_crc_basic:
                pass       -> SKIP       (ro-skl-i7-6700hq)

fi-bdw-i7-5557u  total:219  pass:206  dwarn:0   dfail:0   fail:0   skip:13 
fi-bsw-n3050     total:218  pass:174  dwarn:0   dfail:0   fail:2   skip:42 
fi-byt-n2820     total:218  pass:175  dwarn:0   dfail:0   fail:2   skip:41 
fi-hsw-i7-4770r  total:219  pass:193  dwarn:0   dfail:0   fail:0   skip:26 
fi-kbl-y         total:219  pass:191  dwarn:1   dfail:0   fail:2   skip:25 
fi-skl-i5-6260u  total:219  pass:205  dwarn:0   dfail:0   fail:0   skip:14 
fi-skl-i7-6700k  total:219  pass:191  dwarn:0   dfail:0   fail:0   skip:28 
ro-bdw-i5-5250u  total:219  pass:181  dwarn:0   dfail:0   fail:0   skip:38 
ro-bdw-i7-5557U  total:219  pass:206  dwarn:0   dfail:0   fail:0   skip:13 
ro-bdw-i7-5600u  total:219  pass:187  dwarn:0   dfail:0   fail:0   skip:32 
ro-byt-n2820     total:218  pass:174  dwarn:0   dfail:0   fail:3   skip:41 
ro-hsw-i3-4010u  total:218  pass:192  dwarn:0   dfail:0   fail:1   skip:25 
ro-hsw-i7-4770r  total:219  pass:194  dwarn:0   dfail:0   fail:0   skip:25 
ro-ilk-i7-620lm  total:219  pass:151  dwarn:0   dfail:0   fail:1   skip:67 
ro-ilk1-i5-650   total:214  pass:152  dwarn:0   dfail:0   fail:1   skip:61 
ro-ivb-i7-3770   total:219  pass:183  dwarn:0   dfail:0   fail:0   skip:36 
ro-ivb2-i7-3770  total:219  pass:187  dwarn:0   dfail:0   fail:0   skip:32 
ro-skl-i7-6700hq total:214  pass:189  dwarn:0   dfail:0   fail:0   skip:25 
ro-snb-i7-2620M  total:219  pass:177  dwarn:0   dfail:0   fail:1   skip:41 
fi-hsw-i7-4770k failed to connect after reboot
ro-bsw-n3050 failed to connect after reboot

Results at /archive/results/CI_IGT_test/RO_Patchwork_866/

a6ac4ab drm-intel-nightly: 2016y-05m-12d-15h-37m-38s UTC integration manifest
25fca60 drm/atomic: Verify connector->funcs != NULL when clearing states
ad0f473 drm/i915: Discard previous atomic state on resume if connectors change
2488bf3 drm/fb_helper: Fix references to dev->mode_config.num_connector
922232b drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config()

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 3/4] drm/i915: Discard previous atomic state on resume if connectors change
  2016-05-12 14:56 ` [PATCH 2/4] drm/fb_helper: Fix references to dev->mode_config.num_connector Lyude
@ 2016-05-12 21:47   ` Lyude
  2016-05-17 12:01     ` Daniel Vetter
  0 siblings, 1 reply; 11+ messages in thread
From: Lyude @ 2016-05-12 21:47 UTC (permalink / raw)
  To: intel-gfx, Daniel Vetter

If an MST device is disconnected while the machine is suspended, the
number of connectors will change as well after we call
intel_dp_mst_resume(). This means that any previous atomic state we had
before suspending is no longer valid, since it'll still be pointing to
missing connectors. We need to check for this before committing the
state, otherwise we'll kernel panic on resume whenever if any MST
display was disconnected before we started resuming:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffffa01588ef>] drm_atomic_helper_check_modeset+0x29f/0xb40 [drm_kms_helper]
Call Trace:
 [<ffffffffa02354f4>] intel_atomic_check+0x34/0x1180 [i915]
 [<ffffffff810e6c3f>] ? mark_held_locks+0x6f/0xa0
 [<ffffffff810e6d99>] ? trace_hardirqs_on_caller+0x129/0x1b0
 [<ffffffffa00ff1d2>] drm_atomic_check_only+0x192/0x620 [drm]
 [<ffffffff813ee001>] ? pci_pm_thaw+0x21/0x90
 [<ffffffffa00ff677>] drm_atomic_commit+0x17/0x60 [drm]
 [<ffffffffa023e0ad>] intel_display_resume+0xbd/0x160 [i915]
 [<ffffffff813ee070>] ? pci_pm_thaw+0x90/0x90
 [<ffffffffa01b60d8>] i915_drm_resume+0xd8/0x160 [i915]
 [<ffffffffa01b6185>] i915_pm_resume+0x25/0x30 [i915]
 [<ffffffff813ee0d4>] pci_pm_resume+0x64/0xa0
 [<ffffffff814d9ea0>] dpm_run_callback+0x90/0x190
 [<ffffffff814da455>] device_resume+0xd5/0x1f0
 [<ffffffff814da58d>] async_resume+0x1d/0x50
 [<ffffffff810b6718>] async_run_entry_fn+0x48/0x150
 [<ffffffff810acc19>] process_one_work+0x1e9/0x5c0
 [<ffffffff810acb96>] ? process_one_work+0x166/0x5c0
 [<ffffffff810ad038>] worker_thread+0x48/0x4e0
 [<ffffffff810acff0>] ? process_one_work+0x5c0/0x5c0
 [<ffffffff810b3794>] kthread+0xe4/0x100
 [<ffffffff81742672>] ret_from_fork+0x22/0x50
 [<ffffffff810b36b0>] ? kthread_create_on_node+0x200/0x200

Changes since v1:
  - Move drm_atomic_state_free() call down so we're holding the
    appropriate locks when destroying the atomic state
Changes since v2:
  - Check that state != NULL before we start accessing it's members

This fix is only required for 4.6 and below. David Airlie's patchseries
for 4.7 to add connector reference counting provides a more proper fix
for this.

Upstream fix: b164d31f50b2923a7a92c2a40cb46973a6ba8c36
Cc: stable@vger.kernel.org
Signed-off-by: Lyude <cpaul@redhat.com>
---
 drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 182f849..e49a313 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -15959,6 +15959,18 @@ void intel_display_resume(struct drm_device *dev)
 retry:
 	ret = drm_modeset_lock_all_ctx(dev, &ctx);
 
+	/*
+	 * With MST, the number of connectors can change between suspend and
+	 * resume, which means that the state we want to restore might now be
+	 * impossible to use since it'll be pointing to non-existant
+	 * connectors.
+	 */
+	if (ret == 0 && state &&
+	    state->num_connector != dev->mode_config.num_connector) {
+		drm_atomic_state_free(state);
+		state = NULL;
+	}
+
 	if (ret == 0 && !setup) {
 		setup = true;
 
-- 
2.5.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* ✗ Ro.CI.BAT: failure for Important MST fixes for 4.6 (rev3)
  2016-05-12 14:56 [RESEND] [PATCH 0/4] Important MST fixes for 4.6 Lyude
                   ` (4 preceding siblings ...)
  2016-05-12 16:48 ` ✗ Ro.CI.BAT: failure for Important MST fixes for 4.6 (rev2) Patchwork
@ 2016-05-12 22:25 ` Patchwork
  5 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2016-05-12 22:25 UTC (permalink / raw)
  To: cpaul; +Cc: intel-gfx

== Series Details ==

Series: Important MST fixes for 4.6 (rev3)
URL   : https://patchwork.freedesktop.org/series/7080/
State : failure

== Summary ==

Series 7080v3 Important MST fixes for 4.6
http://patchwork.freedesktop.org/api/1.0/series/7080/revisions/3/mbox

Test gem_exec_flush:
        Subgroup basic-batch-kernel-default-cmd:
                pass       -> FAIL       (ro-byt-n2820)
Test kms_flip:
        Subgroup basic-flip-vs-wf_vblank:
                fail       -> PASS       (ro-hsw-i7-4770r)
                pass       -> FAIL       (ro-byt-n2820)
                pass       -> FAIL       (ro-hsw-i3-4010u)

ro-bdw-i5-5250u  total:219  pass:181  dwarn:0   dfail:0   fail:0   skip:38 
ro-bdw-i7-5557U  total:219  pass:206  dwarn:0   dfail:0   fail:0   skip:13 
ro-bdw-i7-5600u  total:219  pass:187  dwarn:0   dfail:0   fail:0   skip:32 
ro-bsw-n3050     total:219  pass:175  dwarn:0   dfail:0   fail:2   skip:42 
ro-byt-n2820     total:218  pass:173  dwarn:0   dfail:0   fail:4   skip:41 
ro-hsw-i3-4010u  total:218  pass:192  dwarn:0   dfail:0   fail:1   skip:25 
ro-hsw-i7-4770r  total:219  pass:194  dwarn:0   dfail:0   fail:0   skip:25 
ro-ilk-i7-620lm  total:219  pass:151  dwarn:0   dfail:0   fail:1   skip:67 
ro-ilk1-i5-650   total:214  pass:151  dwarn:0   dfail:0   fail:2   skip:61 
ro-ivb-i7-3770   total:219  pass:183  dwarn:0   dfail:0   fail:0   skip:36 
ro-ivb2-i7-3770  total:219  pass:186  dwarn:0   dfail:0   fail:1   skip:32 
ro-skl-i7-6700hq total:214  pass:190  dwarn:0   dfail:0   fail:0   skip:24 
ro-snb-i7-2620M  total:219  pass:177  dwarn:0   dfail:0   fail:1   skip:41 

Results at /archive/results/CI_IGT_test/RO_Patchwork_875/

a6ac4ab drm-intel-nightly: 2016y-05m-12d-15h-37m-38s UTC integration manifest
f5d7f80 drm/atomic: Verify connector->funcs != NULL when clearing states
fd49c26 drm/i915: Discard previous atomic state on resume if connectors change
da6cd061 drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config()

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] drm/atomic: Verify connector->funcs != NULL when clearing states
  2016-05-12 14:57 ` [PATCH 4/4] drm/atomic: Verify connector->funcs != NULL when clearing states Lyude
  2016-05-12 15:04   ` Lyude
@ 2016-05-17 12:00   ` Daniel Vetter
  1 sibling, 0 replies; 11+ messages in thread
From: Daniel Vetter @ 2016-05-17 12:00 UTC (permalink / raw)
  To: Lyude; +Cc: intel-gfx

On Thu, May 12, 2016 at 10:57:01AM -0400, Lyude wrote:
> Unfortunately since we don't have Dave's connector refcounting patch
> here yet, it's very possible that drm_atomic_state_default_clear() could
> get called by intel_display_resume() when
> intel_dp_mst_destroy_connector() isn't completely finished destroying an
> mst connector, but has already finished setting connector->funcs to
> NULL. As such, we need to treat the connector like it's already been
> destroyed and just skip it, otherwise we'll end up dereferencing a NULL
> pointer.
> 
> This fix is only required for 4.6 and below. David Airlie's patchseries
> for 4.7 to add connector reference counting provides a more proper fix
> for this.
> 
> Upstream fix: b164d31f50b2923a7a92c2a40cb46973a6ba8c36
> Cc: stable@vger.kernel.org
> Signed-off-by: Lyude <cpaul@redhat.com>

Not fixing the race at all, bug if it helps a few users in real-world
cases while the real bugfix trickles down into shipping kernels (it'll be
in 4.7 but just way too big for backporting) I'm ok with this.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (but for stable
kernels only)
> ---
>  drivers/gpu/drm/drm_atomic.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> index 8ee1db8..d3a5b5c 100644
> --- a/drivers/gpu/drm/drm_atomic.c
> +++ b/drivers/gpu/drm/drm_atomic.c
> @@ -139,7 +139,7 @@ void drm_atomic_state_default_clear(struct drm_atomic_state *state)
>  	for (i = 0; i < state->num_connector; i++) {
>  		struct drm_connector *connector = state->connectors[i];
>  
> -		if (!connector)
> +		if (!connector || !connector->funcs)
>  			continue;
>  
>  		/*
> @@ -150,6 +150,7 @@ void drm_atomic_state_default_clear(struct drm_atomic_state *state)
>  		 * case by setting all connector pointers to NULL.
>  		 */
>  		state->connector_states[i]->connector = NULL;
> +
>  		connector->funcs->atomic_destroy_state(NULL,
>  						       state->connector_states[i]);
>  		state->connectors[i] = NULL;
> -- 
> 2.5.5
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 3/4] drm/i915: Discard previous atomic state on resume if connectors change
  2016-05-12 21:47   ` [PATCH v2 3/4] drm/i915: Discard previous atomic state on resume if connectors change Lyude
@ 2016-05-17 12:01     ` Daniel Vetter
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Vetter @ 2016-05-17 12:01 UTC (permalink / raw)
  To: Lyude; +Cc: intel-gfx

On Thu, May 12, 2016 at 05:47:39PM -0400, Lyude wrote:
> If an MST device is disconnected while the machine is suspended, the
> number of connectors will change as well after we call
> intel_dp_mst_resume(). This means that any previous atomic state we had
> before suspending is no longer valid, since it'll still be pointing to
> missing connectors. We need to check for this before committing the
> state, otherwise we'll kernel panic on resume whenever if any MST
> display was disconnected before we started resuming:
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> IP: [<ffffffffa01588ef>] drm_atomic_helper_check_modeset+0x29f/0xb40 [drm_kms_helper]
> Call Trace:
>  [<ffffffffa02354f4>] intel_atomic_check+0x34/0x1180 [i915]
>  [<ffffffff810e6c3f>] ? mark_held_locks+0x6f/0xa0
>  [<ffffffff810e6d99>] ? trace_hardirqs_on_caller+0x129/0x1b0
>  [<ffffffffa00ff1d2>] drm_atomic_check_only+0x192/0x620 [drm]
>  [<ffffffff813ee001>] ? pci_pm_thaw+0x21/0x90
>  [<ffffffffa00ff677>] drm_atomic_commit+0x17/0x60 [drm]
>  [<ffffffffa023e0ad>] intel_display_resume+0xbd/0x160 [i915]
>  [<ffffffff813ee070>] ? pci_pm_thaw+0x90/0x90
>  [<ffffffffa01b60d8>] i915_drm_resume+0xd8/0x160 [i915]
>  [<ffffffffa01b6185>] i915_pm_resume+0x25/0x30 [i915]
>  [<ffffffff813ee0d4>] pci_pm_resume+0x64/0xa0
>  [<ffffffff814d9ea0>] dpm_run_callback+0x90/0x190
>  [<ffffffff814da455>] device_resume+0xd5/0x1f0
>  [<ffffffff814da58d>] async_resume+0x1d/0x50
>  [<ffffffff810b6718>] async_run_entry_fn+0x48/0x150
>  [<ffffffff810acc19>] process_one_work+0x1e9/0x5c0
>  [<ffffffff810acb96>] ? process_one_work+0x166/0x5c0
>  [<ffffffff810ad038>] worker_thread+0x48/0x4e0
>  [<ffffffff810acff0>] ? process_one_work+0x5c0/0x5c0
>  [<ffffffff810b3794>] kthread+0xe4/0x100
>  [<ffffffff81742672>] ret_from_fork+0x22/0x50
>  [<ffffffff810b36b0>] ? kthread_create_on_node+0x200/0x200
> 
> Changes since v1:
>   - Move drm_atomic_state_free() call down so we're holding the
>     appropriate locks when destroying the atomic state
> Changes since v2:
>   - Check that state != NULL before we start accessing it's members
> 
> This fix is only required for 4.6 and below. David Airlie's patchseries
> for 4.7 to add connector reference counting provides a more proper fix
> for this.
> 
> Upstream fix: b164d31f50b2923a7a92c2a40cb46973a6ba8c36
> Cc: stable@vger.kernel.org
> Signed-off-by: Lyude <cpaul@redhat.com>

Same caveat: This is for stable only as a duct-tape, the real fix is too
big for backporting from 4.7.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (for stable kernels
only)
> ---
>  drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 182f849..e49a313 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -15959,6 +15959,18 @@ void intel_display_resume(struct drm_device *dev)
>  retry:
>  	ret = drm_modeset_lock_all_ctx(dev, &ctx);
>  
> +	/*
> +	 * With MST, the number of connectors can change between suspend and
> +	 * resume, which means that the state we want to restore might now be
> +	 * impossible to use since it'll be pointing to non-existant
> +	 * connectors.
> +	 */
> +	if (ret == 0 && state &&
> +	    state->num_connector != dev->mode_config.num_connector) {
> +		drm_atomic_state_free(state);
> +		state = NULL;
> +	}
> +
>  	if (ret == 0 && !setup) {
>  		setup = true;
>  
> -- 
> 2.5.5
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-05-17 12:01 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-12 14:56 [RESEND] [PATCH 0/4] Important MST fixes for 4.6 Lyude
2016-05-12 14:56 ` [PATCH 1/4] drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config() Lyude
2016-05-12 14:56 ` [PATCH 2/4] drm/fb_helper: Fix references to dev->mode_config.num_connector Lyude
2016-05-12 21:47   ` [PATCH v2 3/4] drm/i915: Discard previous atomic state on resume if connectors change Lyude
2016-05-17 12:01     ` Daniel Vetter
2016-05-12 14:57 ` [PATCH " Lyude
2016-05-12 14:57 ` [PATCH 4/4] drm/atomic: Verify connector->funcs != NULL when clearing states Lyude
2016-05-12 15:04   ` Lyude
2016-05-17 12:00   ` Daniel Vetter
2016-05-12 16:48 ` ✗ Ro.CI.BAT: failure for Important MST fixes for 4.6 (rev2) Patchwork
2016-05-12 22:25 ` ✗ Ro.CI.BAT: failure for Important MST fixes for 4.6 (rev3) Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.