dri-devel.lists.freedesktop.org archive mirror
* [Bug 104142] Stack trace in runpm when Tonga card powers down
From: bugzilla-daemon @ 2017-12-06 10:32 UTC
  To: dri-devel

https://bugs.freedesktop.org/show_bug.cgi?id=104142

            Bug ID: 104142
           Summary: Stack trace in runpm when Tonga card powers down
           Product: DRI
           Version: XOrg git
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: mike@fireburn.co.uk

Created attachment 135997
  --> https://bugs.freedesktop.org/attachment.cgi?id=135997&action=edit
Dmesg

[ 9087.801615] WARNING: CPU: 6 PID: 1002 at dm_suspend+0x49/0x50
[ 9087.801617] Modules linked in:
[ 9087.801620] CPU: 6 PID: 1002 Comm: kworker/6:0 Tainted: G        W        4.15.0-rc2-agd5f+ #300
[ 9087.801621] Hardware name: Alienware Alienware 15 R2/0H6J09, BIOS 1.3.12 07/28/2017
[ 9087.801622] Workqueue: pm pm_runtime_work
[ 9087.801623] task: 00000000fc3d3872 task.stack: 00000000a1a771ba
[ 9087.801624] RIP: 0010:dm_suspend+0x49/0x50
[ 9087.801625] RSP: 0018:ffffc900000fbca0 EFLAGS: 00010282
[ 9087.801626] RAX: 0000000000000000 RBX: ffff88089c9e0000 RCX: 0000000000000000
[ 9087.801627] RDX: 0000000000000001 RSI: 0000000000000282 RDI: ffff88089c9ea8f0
[ 9087.801627] RBP: 0000000000000003 R08: 00000000c0000000 R09: ffffffff824e7648
[ 9087.801628] R10: ffffea001e7c8020 R11: ffff8808c1d19480 R12: ffff88089c9e0000
[ 9087.801628] R13: ffffffff82241e98 R14: 0000000000000004 R15: ffffffff816b7670
[ 9087.801629] FS:  0000000000000000(0000) GS:ffff8808c1d80000(0000) knlGS:0000000000000000
[ 9087.801630] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9087.801631] CR2: 00007f43c544dbb8 CR3: 000000000240a004 CR4: 00000000001606e0
[ 9087.801631] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 9087.801632] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 9087.801632] Call Trace:
[ 9087.801636]  amdgpu_suspend+0x61/0x170
[ 9087.801637]  amdgpu_device_suspend+0x195/0x390
[ 9087.801639]  ? vga_switcheroo_runtime_resume+0x50/0x50
[ 9087.801640]  amdgpu_pmops_runtime_suspend+0x4d/0xc0
[ 9087.801642]  pci_pm_runtime_suspend+0x4d/0x120
[ 9087.801644]  vga_switcheroo_runtime_suspend+0x19/0x90
[ 9087.801645]  __rpm_callback+0xb5/0x1e0
[ 9087.801647]  ? vga_switcheroo_runtime_resume+0x50/0x50
[ 9087.801648]  rpm_callback+0x1a/0x70
[ 9087.801649]  ? vga_switcheroo_runtime_resume+0x50/0x50
[ 9087.801650]  rpm_suspend+0x124/0x650
[ 9087.801652]  pm_runtime_work+0x58/0x80
[ 9087.801653]  process_one_work+0x1d5/0x3d0
[ 9087.801655]  worker_thread+0x42/0x3e0
[ 9087.801656]  kthread+0xf0/0x130
[ 9087.801657]  ? cancel_delayed_work+0x10/0x10
[ 9087.801658]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 9087.801660]  ret_from_fork+0x1f/0x30
[ 9087.801661] Code: a9 00 00 00 75 25 48 8b 7b 08 e8 93 4f f0 ff 48 8b bb 00 92 00 00 be 08 00 00 00 48 89 83 68 a9 00 00 e8 bb 4f 04 00 31 c0 5b c3 <0f> ff eb d7 0f 1f 00 53 48 89 fb 48 8b bf 20 92 00 00 e8 20 8d
[ 9087.801675] ---[ end trace 8e3cd942fb9ca189 ]---
[ 9087.996944] amdgpu 0000:01:00.0: GPU pci config reset


I'm seeing these a lot, including on Linus's tree.

I'll try to bisect later, though I have a funny feeling it might be DC related.
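
For readers less used to kernel traces, here is a minimal Python sketch of how the fields in the trace above can be pulled apart. It assumes the usual kernel conventions: `symbol+0xOFFSET/0xSIZE` gives the offset into a function and the function's size, a leading `?` marks a frame the unwinder could not confirm, and the byte in `<..>` on the `Code:` line is the one at RIP. The helper names `parse_symbols` and `faulting_bytes` are made up for this illustration, and the sample lines are copied from the report with wrapped lines rejoined.

```python
import re

# A few lines copied from the dmesg output in this report.
LOG = """\
[ 9087.801615] WARNING: CPU: 6 PID: 1002 at dm_suspend+0x49/0x50
[ 9087.801624] RIP: 0010:dm_suspend+0x49/0x50
[ 9087.801636]  amdgpu_suspend+0x61/0x170
[ 9087.801639]  ? vga_switcheroo_runtime_resume+0x50/0x50
[ 9087.801661] Code: a9 00 00 00 75 25 48 8b 7b 08 e8 93 4f f0 ff 48 8b bb 00 92 00 00 be 08 00 00 00 48 89 83 68 a9 00 00 e8 bb 4f 04 00 31 c0 5b c3 <0f> ff eb d7 0f 1f 00 53 48 89 fb 48 8b bf 20 92 00 00 e8 20 8d
"""

# Optional "? " marker, then symbol+0xOFFSET/0xSIZE.
SYM_RE = re.compile(r"(\?\s+)?([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_symbols(log):
    """Return (symbol, offset, size, reliable) for each symbol+off/size token."""
    return [
        (m.group(2), int(m.group(3), 16), int(m.group(4), 16), m.group(1) is None)
        for m in SYM_RE.finditer(log)
    ]


def faulting_bytes(log, count=2):
    """Return `count` opcode bytes starting at the <xx> marker of the Code: line."""
    for line in log.splitlines():
        if "Code:" in line:
            tokens = line.split("Code:", 1)[1].split()
            for i, tok in enumerate(tokens):
                if tok.startswith("<"):
                    return [tok.strip("<>")] + tokens[i + 1:i + count]
    return []


if __name__ == "__main__":
    for sym, off, size, reliable in parse_symbols(LOG):
        print(f"{sym}: offset {off:#x} of {size:#x}, reliable={reliable}")
    print("bytes at RIP:", " ".join(faulting_bytes(LOG)))
```

The two bytes at RIP are `0f ff`, which is the x86 UD0 opcode. As far as I know, kernels of this era implemented WARN() with that instruction, which would be consistent with the trace coming from a WARN_ON rather than a crash, but treat that as background knowledge rather than something the report itself states.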

-- 
You are receiving this mail because:
You are the assignee for the bug.

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

* [Bug 104142] Stack trace in runpm when Tonga card powers down
From: bugzilla-daemon @ 2017-12-07 14:15 UTC
  To: dri-devel

https://bugs.freedesktop.org/show_bug.cgi?id=104142

--- Comment #1 from Harry Wentland <harry.wentland@amd.com> ---
The warning comes from a WARN_ON(adev->dm.cached_state) in dm_suspend. We
shouldn't have a cached state when doing suspend. I'm not sure right now why
this is happening. Is this on a laptop with an Intel iGPU and an AMD dGPU?
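
The invariant described above can be modelled in a few lines. This is a toy Python sketch, not amdgpu code: `ToyDisplayManager`, its methods, and the `release` flag are all hypothetical names invented for illustration. It only shows the shape of the check: suspend expects the cached state slot to be empty, warns if it is not, and carries on either way, much as a WARN_ON does.

```python
import warnings


class ToyDisplayManager:
    """Toy stand-in for the display manager state. Not the real driver."""

    def __init__(self):
        self.cached_state = None

    def suspend(self, current_state):
        # Analogous in spirit to WARN_ON(adev->dm.cached_state):
        # complain if a state is still cached, but do not abort.
        if self.cached_state is not None:
            warnings.warn("cached_state still set on suspend", RuntimeWarning)
        self.cached_state = current_state

    def resume(self, release=True):
        # A well-behaved resume hands the state back and clears the slot.
        # release=False simulates a hypothetical path that forgets to clear it.
        state = self.cached_state
        if release:
            self.cached_state = None
        return state


if __name__ == "__main__":
    dm = ToyDisplayManager()
    dm.suspend("state-1")
    dm.resume()               # slot cleared, next suspend is quiet
    dm.suspend("state-2")
    dm.resume(release=False)  # slot left populated
    dm.suspend("state-3")     # emits the RuntimeWarning
```

The `release=False` path is only one hypothetical way such a check could fire. It is not a diagnosis of this bug; at this point in the thread the actual reason for the leftover cached state is still unknown.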

* [Bug 104142] Stack trace in runpm when Tonga card powers down
From: bugzilla-daemon @ 2017-12-07 15:36 UTC
  To: dri-devel

https://bugs.freedesktop.org/show_bug.cgi?id=104142

Mike Lothian <mike@fireburn.co.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mike@fireburn.co.uk

--- Comment #2 from Mike Lothian <mike@fireburn.co.uk> ---
Yes, it's Intel Skylake and AMD Tonga.

I just tested it with Alex's 4.16-wip branch:

[ 8476.275162] [drm] PCIE GART of 1024M enabled (table at 0x000000F400040000).
[ 8476.472088] [drm] UVD initialized successfully.
[ 8476.684149] [drm] VCE initialized successfully.
[ 8481.834519] WARNING: CPU: 2 PID: 62 at dm_suspend+0x49/0x50
[ 8481.834521] Modules linked in:
[ 8481.834523] CPU: 2 PID: 62 Comm: kworker/2:1 Not tainted 4.15.0-rc2-agd5f+ #303
[ 8481.834524] Hardware name: Alienware Alienware 15 R2/0H6J09, BIOS 1.3.12 07/28/2017
[ 8481.834525] Workqueue: pm pm_runtime_work
[ 8481.834527] task: 000000002f4790b3 task.stack: 000000003f1bbe2b
[ 8481.834528] RIP: 0010:dm_suspend+0x49/0x50
[ 8481.834529] RSP: 0018:ffffc90000273ca0 EFLAGS: 00010286
[ 8481.834530] RAX: 0000000000000000 RBX: ffff88089c9f0000 RCX: 0000000000000000
[ 8481.834530] RDX: 0000000000000001 RSI: 0000000000000282 RDI: ffff88089c9fa8f0
[ 8481.834531] RBP: 0000000000000003 R08: 00000000c0000000 R09: ffffffff824e80c8
[ 8481.834532] R10: ffffea0021940020 R11: ffff8808c1c99480 R12: ffff88089c9f0000
[ 8481.834532] R13: ffffffff82243fc0 R14: 0000000000000004 R15: ffffffff816b7630
[ 8481.834533] FS:  0000000000000000(0000) GS:ffff8808c1c80000(0000) knlGS:0000000000000000
[ 8481.834534] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8481.834535] CR2: 00007fd9e466f000 CR3: 000000000240a005 CR4: 00000000001606e0
[ 8481.834535] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8481.834536] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8481.834536] Call Trace:
[ 8481.834540]  amdgpu_suspend+0x61/0x170
[ 8481.834541]  amdgpu_device_suspend+0x195/0x390
[ 8481.834543]  ? vga_switcheroo_runtime_resume+0x50/0x50
[ 8481.834544]  amdgpu_pmops_runtime_suspend+0x4d/0xc0
[ 8481.834547]  pci_pm_runtime_suspend+0x4d/0x120
[ 8481.834548]  vga_switcheroo_runtime_suspend+0x19/0x90
[ 8481.834550]  __rpm_callback+0xb5/0x1e0
[ 8481.834551]  ? vga_switcheroo_runtime_resume+0x50/0x50
[ 8481.834552]  rpm_callback+0x1a/0x70
[ 8481.834554]  ? vga_switcheroo_runtime_resume+0x50/0x50
[ 8481.834555]  rpm_suspend+0x124/0x650
[ 8481.834556]  pm_runtime_work+0x58/0x80
[ 8481.834558]  process_one_work+0x1d5/0x3d0
[ 8481.834559]  worker_thread+0x42/0x3e0
[ 8481.834560]  kthread+0xf0/0x130
[ 8481.834562]  ? cancel_delayed_work+0x10/0x10
[ 8481.834562]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 8481.834564]  ret_from_fork+0x1f/0x30
[ 8481.834565] Code: a9 00 00 00 75 25 48 8b 7b 08 e8 d3 4f f0 ff 48 8b bb 00 92 00 00 be 08 00 00 00 48 89 83 68 a9 00 00 e8 bb 4f 04 00 31 c0 5b c3 <0f> ff eb d7 0f 1f 00 53 48 89 fb 48 8b bf 20 92 00 00 e8 60 8d
[ 8481.834579] ---[ end trace 86b596a21b2ff6ee ]---
[ 8482.018625] amdgpu 0000:01:00.0: GPU pci config reset

Is there anything else you need from me, or any debugging I could turn on?

* [Bug 104142] Stack trace in runpm when Tonga card powers down
From: bugzilla-daemon @ 2017-12-14 14:57 UTC
  To: dri-devel

https://bugs.freedesktop.org/show_bug.cgi?id=104142

--- Comment #3 from Mike Lothian <mike@fireburn.co.uk> ---
I bisected this back to:

d21becbe0225de0e2582d17d4fbc73fbd103b1f7 is the first bad commit
commit d21becbe0225de0e2582d17d4fbc73fbd103b1f7
Author: Tony Cheng <tony.cheng@amd.com>
Date:   Wed Jul 12 11:54:10 2017 -0400

    drm/amd/display: avoid disabling opp clk before hubp is blanked.

    Signed-off-by: Tony Cheng <tony.cheng@amd.com>
    Reviewed-by: Eric Yang <eric.yang2@amd.com>
    Acked-by: Harry Wentland <Harry.Wentland@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

:040000 040000 61debba3cf73670d29975bc136d01862c2a54576 3d2315a1843d6276655b1550cb9f18fab47c5ce4 M      drivers

* [Bug 104142] Stack trace in runpm when Tonga card powers down
From: bugzilla-daemon @ 2017-12-15  0:57 UTC
  To: dri-devel

https://bugs.freedesktop.org/show_bug.cgi?id=104142

--- Comment #4 from Mike Lothian <mike@fireburn.co.uk> ---
Tried again and it looks a little more promising:

0a214e2fb6b0a56519b6d5efab4b21475c233ee0 is the first bad commit
commit 0a214e2fb6b0a56519b6d5efab4b21475c233ee0
Author: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Date:   Thu Jul 13 10:56:48 2017 -0400

    drm/amd/display: Release cached atomic state in S3.

    Fixes memory leak.

    Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
    Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
    Acked-by: Harry Wentland <Harry.Wentland@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

:040000 040000 494f25ce4ad407678f88d6c85128905762c9fbfb fb36845ef2ccca7bf823c9fec4d13d0a6e71ea2b M      drivers

That makes sense, as that's the commit that adds the WARN_ON. I guess that
takes us back to why there is a cached state.
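
The reasoning above, that a bisect for "trace appears in dmesg" lands on the commit adding the check rather than on whatever leaves the state behind, can be sketched with a toy history. Everything here is hypothetical: the commit names, the `stale_state` and `has_warn_on` flags, and the assumption that the underlying state handling predates the check are invented for illustration and are not taken from the real tree.

```python
def first_bad(commits, is_bad):
    """Binary search for the first commit where is_bad() is true.

    Assumes the history is ordered oldest to newest, that every commit
    before the first bad one is good, and that the newest commit is bad.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid + 1
    return commits[lo]


# Hypothetical history, oldest first.
HISTORY = [
    {"name": "A", "stale_state": False, "has_warn_on": False},
    {"name": "B", "stale_state": True, "has_warn_on": False},  # cause lands silently
    {"name": "C", "stale_state": True, "has_warn_on": False},
    {"name": "D", "stale_state": True, "has_warn_on": True},   # check added
    {"name": "E", "stale_state": True, "has_warn_on": True},
]


def trace_in_dmesg(commit):
    # The symptom being bisected is only visible once both conditions hold.
    return commit["stale_state"] and commit["has_warn_on"]


if __name__ == "__main__":
    print("first bad commit:", first_bad(HISTORY, trace_in_dmesg)["name"])
```

In this toy history the search reports the commit that adds the check, because that is the first point where the symptom becomes observable. Finding the commit that actually leaves the state behind would need a different test, for example one that inspects the state directly instead of looking for the warning.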

* [Bug 104142] Stack trace in runpm when Tonga card powers down
From: bugzilla-daemon @ 2018-02-09  0:40 UTC
  To: dri-devel

https://bugs.freedesktop.org/show_bug.cgi?id=104142

--- Comment #5 from JohnDoe <frederik.schwan@linux.com> ---
I ran into this bug when upgrading to 4.15.1. Anything I can do to help?

* [Bug 104142] Stack trace in runpm when Tonga card powers down
From: bugzilla-daemon @ 2018-02-27  0:46 UTC
  To: dri-devel

https://bugs.freedesktop.org/show_bug.cgi?id=104142

--- Comment #6 from Harry Wentland <harry.wentland@amd.com> ---
We have a few new patches in our staging trees relating to suspend and driver
unload.

Would you be able to try amd-staging-drm-next or drm-next-4.17-wip from
https://cgit.freedesktop.org/~agd5f/linux/?h=drm-next-4.17-wip and see if the
issue is fixed there?

* [Bug 104142] Stack trace in runpm when Tonga card powers down
From: bugzilla-daemon @ 2018-02-27  1:08 UTC
  To: dri-devel

https://bugs.freedesktop.org/show_bug.cgi?id=104142

Mike Lothian <mike@fireburn.co.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #7 from Mike Lothian <mike@fireburn.co.uk> ---
I'm running agd5f's drm-next-4.17-wip branch with
https://patchwork.freedesktop.org/series/38985/ applied on top, along with
https://github.com/FireBurn/KernelStuff/blob/master/05-remove-warn.patch,
which removes the WARN_ON(adev->dm.cached_state).

I've reverted the removal of the WARN_ON and it seems to be fixed, thanks.

* [Bug 104142] Stack trace in runpm when Tonga card powers down
From: bugzilla-daemon @ 2018-02-27  1:18 UTC
  To: dri-devel

https://bugs.freedesktop.org/show_bug.cgi?id=104142

--- Comment #8 from Harry Wentland <harry.wentland@amd.com> ---
Thanks for the update.
