* [Bug 201015] New: [amdgpu] BUG: unable to handle kernel NULL pointer dereference on resume with 2 monitors (vega)
From: bugzilla-daemon @ 2018-09-05  6:06 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=201015

            Bug ID: 201015
           Summary: [amdgpu] BUG: unable to handle kernel NULL pointer
                    dereference on resume with 2 monitors (vega)
           Product: Drivers
           Version: 2.5
    Kernel Version: 4.19-rc2
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@kernel-bugs.osdl.org
          Reporter: mezin.alexander@gmail.com
        Regression: No

Created attachment 278309
  --> https://bugzilla.kernel.org/attachment.cgi?id=278309&action=edit
dmesg, failed resume with 2 monitors

Happens on resume from suspend when 2 monitors are connected (over
DisplayPort).
With 1 monitor suspend/resume works reliably.

Vega 64 (Sapphire Nitro+)
Dell P2415Q and LG 27UD69P, connected over DisplayPort

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

* [Bug 201015] [amdgpu] BUG: unable to handle kernel NULL pointer dereference on resume with 2 monitors (vega)
From: bugzilla-daemon @ 2018-09-05  6:08 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=201015

--- Comment #1 from Aleksandr Mezin (mezin.alexander@gmail.com) ---
Created attachment 278311
  --> https://bugzilla.kernel.org/attachment.cgi?id=278311&action=edit
dmesg, multiple suspend-resume cycles with 1 monitor, then attached 2nd
monitor, resume failed

* [Bug 201015] [amdgpu] BUG: unable to handle kernel NULL pointer dereference on resume with 2 monitors (vega)
From: bugzilla-daemon @ 2018-09-05  7:40 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=201015

Michel Dänzer (michel@daenzer.net) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |harry.wentland@amd.com,
                   |                            |nicholas.kazlauskas@amd.com

--- Comment #2 from Michel Dänzer (michel@daenzer.net) ---
Is this a regression in 4.19-rc compared to 4.18?

* [Bug 201015] [amdgpu] BUG: unable to handle kernel NULL pointer dereference on resume with 2 monitors (vega)
From: bugzilla-daemon @ 2018-09-05  9:37 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=201015

--- Comment #3 from Aleksandr Mezin (mezin.alexander@gmail.com) ---
(In reply to Michel Dänzer from comment #2)
> Is this a regression in 4.19-rc compared to 4.18?

No, happens on 4.18.5 too.

But on 4.18, suspend-resume also triggers
https://bugzilla.kernel.org/show_bug.cgi?id=200531 (one of the monitors turns
off quickly -> resume with one monitor in standby mode -> triggers REG_WAIT
timeout). The null pointer dereference is there too. With only one monitor,
quick suspend-resume on 4.18.5 works fine (exactly as on 4.19-rc2).

* [Bug 201015] [amdgpu] BUG: unable to handle kernel NULL pointer dereference on resume with 2 monitors (vega)
From: bugzilla-daemon @ 2018-09-10 17:55 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=201015

--- Comment #4 from Nicholas Kazlauskas (nicholas.kazlauskas@amd.com) ---
Created attachment 278425
  --> https://bugzilla.kernel.org/attachment.cgi?id=278425&action=edit
0001-drm-amd-display-Add-null-checks-to-surface-update-in.patch

I'm unable to reproduce the issue under Ubuntu 18.04, GNOME, 4.19 rc2 and a
Vega.

What's your userspace setup like?

You can try the attached patch and see if that helps the problem. Try booting
with drm.debug=6 in your bootline and post the results of the suspend with the
patch.
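
For reference, drm.debug is a bitmask, so the value 6 selects two message
categories at once. The following is a minimal sketch of decoding such a
value, assuming the bit assignments documented for the DRM core's debug
parameter (0x01 core, 0x02 driver, 0x04 KMS, 0x08 PRIME, 0x10 atomic,
0x20 vblank). The helper name decode_drm_debug is hypothetical and exists
only for illustration.

```python
# Low bits of the drm.debug bitmask, as assumed from the DRM core's
# parameter description. Higher bits exist but are omitted in this sketch.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by `mask`."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


if __name__ == "__main__":
    # drm.debug=6 is 0x02 | 0x04, i.e. driver and KMS messages.
    print(decode_drm_debug(6))  # ['DRIVER', 'KMS']
```

Under these assumptions, drm.debug=6 enables driver and modesetting (KMS)
messages without the noisier core and vblank categories.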

* [Bug 201015] [amdgpu] BUG: unable to handle kernel NULL pointer dereference on resume with 2 monitors (vega)
From: bugzilla-daemon @ 2018-09-10 18:34 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=201015

--- Comment #5 from Aleksandr Mezin (mezin.alexander@gmail.com) ---
Created attachment 278427
  --> https://bugzilla.kernel.org/attachment.cgi?id=278427&action=edit
Kernel log, 2 suspend-resume attempts, second one failed

(In reply to Nicholas Kazlauskas from comment #4)
> Created attachment 278425 [details]
> 0001-drm-amd-display-Add-null-checks-to-surface-update-in.patch
> 
> I'm unable to reproduce the issue under Ubuntu 18.04, GNOME, 4.19 rc2 and a
> Vega.
> 
> What's your userspace setup like?

Arch Linux, 4.19-rc3, GNOME 3.28.3

> 
> You can try the attached patch and see if that helps the problem. Try
> booting with drm.debug=6 in your bootline and post the results of the
> suspend with the patch.

This time the first suspend-resume worked fine; the second one failed.

* [Bug 201015] [amdgpu] BUG: unable to handle kernel NULL pointer dereference on resume with 2 monitors (vega)
From: bugzilla-daemon @ 2018-09-10 18:46 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=201015

--- Comment #6 from Nicholas Kazlauskas (nicholas.kazlauskas@amd.com) ---
I'd imagine you're probably running GNOME on Wayland from that setup
environment.

The patch seems to fix the null pointer dereference but you're probably getting
a black screen from those failed atomic commits.

Might not be a problem with the driver but with the GNOME Wayland
implementation - I would need to do more investigation to see which atomic
commits are failing and if the failures are valid (but unchecked).

You would probably not see this occur for GNOME over Xorg.

* [Bug 201015] [amdgpu] BUG: unable to handle kernel NULL pointer dereference on resume with 2 monitors (vega)
From: bugzilla-daemon @ 2018-09-10 20:34 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=201015

--- Comment #7 from Aleksandr Mezin (mezin.alexander@gmail.com) ---
(In reply to Nicholas Kazlauskas from comment #6)
> I'd imagine you're probably running GNOME on Wayland from that setup
> environment.
> 
> The patch seems to fix the null pointer dereference but you're probably
> getting a black screen from those failed atomic commits.
> 
> Might not be a problem with the driver but with the GNOME Wayland
> implementation - I would need to do more investigation to see which atomic
> commits are failing and if the failures are valid (but unchecked).
> 
> You would probably not see this occur for GNOME over Xorg.

No, it occurs with GNOME on Xorg with the modesetting driver. GNOME on Wayland
seems to handle suspend and resume fine (even on unpatched 4.19-rc3). I also
tried xf86-video-amdgpu. It works like GNOME on Wayland, but sometimes after
resume one display is limited to 800x600 resolution (it's a 4K display).
That is probably a separate issue.

I expected the modesetting driver to work, though.

* [Bug 201015] [amdgpu] BUG: unable to handle kernel NULL pointer dereference on resume with 2 monitors (vega)
From: bugzilla-daemon @ 2018-09-11 19:16 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=201015

--- Comment #8 from Aleksandr Mezin (mezin.alexander@gmail.com) ---
Created attachment 278457
  --> https://bugzilla.kernel.org/attachment.cgi?id=278457&action=edit
kernel log: patched kernel + xorg modesetting, resume failed

Even with the patched kernel, when resume fails there are errors in the kernel
log (when using the modesetting driver):

[   98.136982] [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]]
*ERROR* [CRTC:43:crtc-0] flip_done timed out
[  103.668322] [drm:dc_remove_plane_from_context [amdgpu]] *ERROR* Existing
plane_state not found; failed to detach it!
[  103.702464] [drm:dc_remove_plane_from_context [amdgpu]] *ERROR* Existing
plane_state not found; failed to detach it!
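
Messages like these can be pulled out of a saved kernel log for comparison
between runs. The following is a rough sketch, assuming each message sits on
a single line (the lines above are wrapped by the mail client) and follows
the "[timestamp] [drm:function [module]] *ERROR* text" layout seen here. The
names DRM_ERROR_RE and drm_errors are hypothetical.

```python
import re

# Matches lines of the form:
#   [   98.136982] [drm:some_function [some_module]] *ERROR* message text
# The "[module]" part is treated as optional in this sketch.
DRM_ERROR_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[drm:(?P<func>\w+)(?:\s+\[(?P<module>\w+)\])?\]\s+"
    r"\*ERROR\*\s+(?P<msg>.*)$"
)


def drm_errors(lines):
    """Yield (timestamp, function, module, message) for each DRM *ERROR* line."""
    for line in lines:
        m = DRM_ERROR_RE.match(line.strip())
        if m:
            yield float(m["ts"]), m["func"], m["module"], m["msg"]


if __name__ == "__main__":
    # Two of the messages quoted above, rejoined onto single lines.
    sample = [
        "[   98.136982] [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] "
        "*ERROR* [CRTC:43:crtc-0] flip_done timed out",
        "[  103.668322] [drm:dc_remove_plane_from_context [amdgpu]] "
        "*ERROR* Existing plane_state not found; failed to detach it!",
    ]
    for ts, func, module, msg in drm_errors(sample):
        print(f"{ts:10.3f} {module}:{func}: {msg}")
```

Grouping the results by function name makes it easier to see whether the
flip_done timeout always precedes the plane_state errors.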

* [Bug 201015] [amdgpu] BUG: unable to handle kernel NULL pointer dereference on resume with 2 monitors (vega)
From: bugzilla-daemon @ 2018-09-11 19:19 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=201015

--- Comment #9 from Aleksandr Mezin (mezin.alexander@gmail.com) ---
Created attachment 278459
  --> https://bugzilla.kernel.org/attachment.cgi?id=278459&action=edit
user log: patched kernel + xorg modesetting, resume failed

Sep 12 01:05:22 X299 /usr/lib/gdm-x-session[1593]: (WW) modeset(0): flip queue
failed: Invalid argument
Sep 12 01:05:22 X299 /usr/lib/gdm-x-session[1593]: (WW) modeset(0): Page flip
failed: Invalid argument
Sep 12 01:05:22 X299 /usr/lib/gdm-x-session[1593]: (EE) modeset(0): present
flip failed
Sep 12 01:05:22 X299 /usr/lib/gdm-x-session[1593]: (WW) modeset(0): flip queue
failed: Invalid argument
Sep 12 01:05:22 X299 /usr/lib/gdm-x-session[1593]: (WW) modeset(0): Page flip
failed: Invalid argument

* [Bug 201015] [amdgpu] BUG: unable to handle kernel NULL pointer dereference on resume with 2 monitors (vega)
From: bugzilla-daemon @ 2018-10-03 16:17 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=201015

--- Comment #10 from Aleksandr Mezin (mezin.alexander@gmail.com) ---
After recent updates, the issue went away, but I'm not sure what exactly
changed. I tried reverting the kernel (to 4.19-rc3) and libdrm, but I can no
longer trigger it.

* [Bug 201015] [amdgpu] BUG: unable to handle kernel NULL pointer dereference on resume with 2 monitors (vega)
From: bugzilla-daemon @ 2018-11-10  2:08 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=201015

Aleksandr Mezin (mezin.alexander@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |OBSOLETE
