All of lore.kernel.org
 help / color / mirror / Atom feed
* [Bug 215436] New: admgpu: suspend and resuming from suspend don't work
@ 2021-12-29 12:28 bugzilla-daemon
  2021-12-29 15:08 ` [Bug 215436] " bugzilla-daemon
                   ` (29 more replies)
  0 siblings, 30 replies; 31+ messages in thread
From: bugzilla-daemon @ 2021-12-29 12:28 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

            Bug ID: 215436
           Summary: admgpu: suspend and resuming from suspend don't work
           Product: Drivers
           Version: 2.5
    Kernel Version: 5.16-rc{1-7}
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@kernel-bugs.osdl.org
          Reporter: spasswolf@web.de
        Regression: No

Created attachment 300180
  --> https://bugzilla.kernel.org/attachment.cgi?id=300180&action=edit
full kern.log from boot to reboot

System:
MSI Alpha 15 B5EEK Lapttop with newest BIOS
lspci -nn
00:00.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne
Root Complex [1022:1630]
00:00.2 IOMMU [0806]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne IOMMU
[1022:1631]
00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe
Dummy Host Bridge [1022:1632]
00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe GPP
Bridge [1022:1633]
00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe
Dummy Host Bridge [1022:1632]
00:02.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne
PCIe GPP Bridge [1022:1634]
00:02.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne
PCIe GPP Bridge [1022:1634]
00:02.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne
PCIe GPP Bridge [1022:1634]
00:02.4 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir/Cezanne
PCIe GPP Bridge [1022:1634]
00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Renoir PCIe
Dummy Host Bridge [1022:1632]
00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Renoir Internal
PCIe GPP Bridge to Bus [1022:1635]
00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
[1022:790b] (rev 51)
00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
[1022:790e] (rev 51)
00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data
Fabric; Function 0 [1022:166a]
00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data
Fabric; Function 1 [1022:166b]
00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data
Fabric; Function 2 [1022:166c]
00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data
Fabric; Function 3 [1022:166d]
00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data
Fabric; Function 4 [1022:166e]
00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data
Fabric; Function 5 [1022:166f]
00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data
Fabric; Function 6 [1022:1670]
00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Cezanne Data
Fabric; Function 7 [1022:1671]
01:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL
Upstream Port of PCI Express Switch [1002:1478] (rev c3)
02:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL
Downstream Port of PCI Express Switch [1002:1479]
03:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Navi
23 [Radeon RX 6600/6600 XT/6600M] [1002:73ff] (rev c3)
03:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21
HDMI Audio [Radeon RX 6800/6800 XT / 6900 XT] [1002:ab28]
04:00.0 Network controller [0280]: MEDIATEK Corp. Device [14c3:0608]
05:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
06:00.0 Non-Volatile memory controller [0108]: Micron/Crucial Technology P1
NVMe PCIe SSD [c0a9:2263] (rev 03)
07:00.0 Non-Volatile memory controller [0108]: Kingston Technology Company,
Inc. Device [2646:500c] (rev 01)
08:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc.
[AMD/ATI] Cezanne [1002:1638] (rev c5)
08:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir
Radeon High Definition Audio Controller [1002:1637]
08:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family
17h (Models 10h-1fh) Platform Security Processor [1022:15df]
08:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD]
Renoir/Cezanne USB 3.1 [1022:1639]
08:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD]
Renoir/Cezanne USB 3.1 [1022:1639]
08:00.5 Multimedia controller [0480]: Advanced Micro Devices, Inc. [AMD]
Raven/Raven2/FireFlight/Renoir Audio Processor [1022:15e2] (rev 01)
08:00.6 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h
(Models 10h-1fh) HD Audio Controller [1022:15e3]
08:00.7 Signal processing controller [1180]: Advanced Micro Devices, Inc. [AMD]
Raven/Raven2/Renoir Sensor Fusion Hub [1022:15e4]
When going into suspend on the above System suspend does not work properly
Dec 29 13:12:38 lisa kernel: [   52.085072] [drm] Register(0)
[mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002
Dec 29 13:12:38 lisa kernel: [   52.407841] [drm] Register(0)
[mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002
resuming from suspend freezes the System:
Dec 29 13:12:38 lisa kernel: [   53.663566] [drm] reserve 0xa00000 from
0x81fe000000 for PSP TMR
Dec 29 13:12:38 lisa kernel: [   53.715562] amdgpu 0000:08:00.0:
[drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring sdma0 test failed (-110)
Dec 29 13:12:38 lisa kernel: [   53.715805] [drm:amdgpu_device_ip_resume_phase2
[amdgpu]] *ERROR* resume of IP block <sdma_v4_0> failed -110
Dec 29 13:12:38 lisa kernel: [   53.716007] amdgpu 0000:08:00.0: amdgpu:
amdgpu_device_ip_resume failed (-110).
Dec 29 13:12:38 lisa kernel: [   53.716010] PM: dpm_run_callback():
pci_pm_resume+0x0/0xe0 returns -110
Dec 29 13:12:38 lisa kernel: [   53.716023] amdgpu 0000:08:00.0: PM: failed to
resume async: error -110
Dec 29 13:12:38 lisa kernel: [   53.734885] amdgpu 0000:03:00.0: amdgpu: RAS:
optional ras ta ucode is not available
Dec 29 13:12:38 lisa kernel: [   53.744042] amdgpu 0000:03:00.0: amdgpu:
SECUREDISPLAY: securedisplay ta ucode is not available
Dec 29 13:12:38 lisa kernel: [   53.744048] amdgpu 0000:03:00.0: amdgpu: SMU is
resuming...
Dec 29 13:12:38 lisa kernel: [   53.744053] amdgpu 0000:03:00.0: amdgpu: smu
driver if version = 0x0000000f, smu fw if version = 0x00000012, smu fw version
= 0x003b2200 (59.34.0)
Dec 29 13:12:38 lisa kernel: [   53.744059] amdgpu 0000:03:00.0: amdgpu: SMU
driver if version not matched
Dec 29 13:12:38 lisa kernel: [   53.794983] amdgpu 0000:03:00.0: amdgpu: SMU is
resumed successfully!
Suspend and resume used to work with linux.5.15.11, the first bad commit is
commit 2a50edbf10c8c56e930bfb53d8f9f00a33fd837e
Author: Qingqing Zhuo <qingqing.zhuo@amd.com>
Date:   Tue Aug 17 16:29:07 2021 -0400

    drm/amd/display: Apply w/a for hard hang on HPD

    [Why]
    HPD disable and enable sequences are not mutually exclusive
    on Linux. For HPDs that spans under 1s (i.e. HPD low = 1s),
    part of the disable sequence (specifically, a request to SMU
    to lower refclk) could come right before the call to PHY
    enablement, causing DMUB to access an irresponsive PHY
    and thus a hard hang on the system.

    [How]
    Disable 48mhz refclk off when there is any HPD status in
    connected state.

    Reviewed-by: Hersen Wu <hersenwu@amd.com>
    Acked-by: Mikita Lipski <mikita.lipski@amd.com>
    Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

 .../drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c  | 12 ++++++++++-
 .../amd/display/dc/irq/dcn21/irq_service_dcn21.c   | 25 ++++++++++++++++++++++
 .../amd/display/dc/irq/dcn21/irq_service_dcn21.h   |  2 ++
 drivers/gpu/drm/amd/display/dc/irq/irq_service.c   |  2 +-
 drivers/gpu/drm/amd/display/dc/irq/irq_service.h   |  4 ++++
 5 files changed, 43 insertions(+), 2 deletions(-)

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
@ 2021-12-29 15:08 ` bugzilla-daemon
  2021-12-29 15:10 ` bugzilla-daemon
                   ` (28 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2021-12-29 15:08 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

Alex Deucher (alexdeucher@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexdeucher@gmail.com

--- Comment #1 from Alex Deucher (alexdeucher@gmail.com) ---
See also:
https://gitlab.freedesktop.org/drm/amd/-/issues/1821

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
  2021-12-29 15:08 ` [Bug 215436] " bugzilla-daemon
@ 2021-12-29 15:10 ` bugzilla-daemon
  2021-12-30 13:03 ` bugzilla-daemon
                   ` (27 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2021-12-29 15:10 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #2 from Alex Deucher (alexdeucher@gmail.com) ---
Could you please help add some prints and confirm that the commit doesn't
change the code sequence in rn_update_clocks() especially for the following
part in suspend and wake sequence:
if (display_count == 0 && !hpd_state) {
rn_vbios_smu_set_dcn_low_power_state(clk_mgr, DCN_PWR_STATE_LOW_POWER);
/* update power state */
clk_mgr_base->clks.pwr_state = DCN_PWR_STATE_LOW_POWER;
It would also be appreciated if you could print out the time that was spent in
the function rn_update_clocks() with and w/o the commit.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
  2021-12-29 15:08 ` [Bug 215436] " bugzilla-daemon
  2021-12-29 15:10 ` bugzilla-daemon
@ 2021-12-30 13:03 ` bugzilla-daemon
  2021-12-30 13:16 ` bugzilla-daemon
                   ` (26 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2021-12-30 13:03 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #3 from spasswolf@web.de ---
I added printk at the beginnig and exit(s) of rn_clocks_update:
This is the result when resuming from suspend on commit
d020970959169627d59a711769f8c4b87bf5f90c

Dec 30 13:55:42 lisa kernel: [  140.198849] PM: suspend exit
Dec 30 13:55:42 lisa kernel: [  140.208063] amdgpu: entering rn_update_clocks
...
Dec 30 13:55:42 lisa kernel: [  140.209251] amdgpu: leaving rn_update_clocks
...
Dec 30 13:55:42 lisa kernel: [  140.211147] amdgpu: entering rn_update_clocks
...
Dec 30 13:55:42 lisa kernel: [  140.211148] amdgpu: leaving rn_update_clocks
...
Dec 30 13:55:42 lisa kernel: [  140.226830] Generic FE-GE Realtek PHY
r8169-0-500:00: attached PHY driver (mii_bus:phy_addr=r8169-0-500:00, irq=MAC)
Dec 30 13:55:42 lisa kernel: [  140.277915] amdgpu: entering rn_update_clocks
...
Dec 30 13:55:42 lisa kernel: [  140.277925] amdgpu: leaving rn_update_clocks
...
Dec 30 13:55:42 lisa kernel: [  140.278233] amdgpu: entering rn_update_clocks
...
Dec 30 13:55:42 lisa kernel: [  140.278234] amdgpu: leaving rn_update_clocks
...

In this case the system does not hang, but suspend doesn't seem to work
properly, either (Fan and keyboard do not switch off). I probably missed that
during the first bisection.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (2 preceding siblings ...)
  2021-12-30 13:03 ` bugzilla-daemon
@ 2021-12-30 13:16 ` bugzilla-daemon
  2021-12-30 14:33 ` bugzilla-daemon
                   ` (25 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2021-12-30 13:16 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #4 from spasswolf@web.de ---
Unfortunately I optimized my .config to make bisections faster, but now I
cannot reproduce the Hang with commit 2a50edbf10c8c56e930bfb53d8f9f00a33fd837e.
So now I will start over with the bisection ...

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (3 preceding siblings ...)
  2021-12-30 13:16 ` bugzilla-daemon
@ 2021-12-30 14:33 ` bugzilla-daemon
  2021-12-31  0:17 ` bugzilla-daemon
                   ` (24 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2021-12-30 14:33 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #5 from spasswolf@web.de ---
There might also be different suspend errors we're dealing with:
linux-5.16-rc7 e.g. seems to hang when going into suspend with the last log
message before reboot being:
Dec 30 14:53:19 lisa kernel: [   85.056243] PM: suspend entry (s2idle)

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (4 preceding siblings ...)
  2021-12-30 14:33 ` bugzilla-daemon
@ 2021-12-31  0:17 ` bugzilla-daemon
  2021-12-31  0:22 ` bugzilla-daemon
                   ` (23 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2021-12-31  0:17 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #6 from spasswolf@web.de ---
After sorting out some .config troubles I've reverted commit
2a50edbf10c8c56e930bfb53d8f9f00a33fd837e in linux-5.16-rc7 and this seems to
cure both the not properly suspending issue (fan and keyboard lights stayed on)
and the hang on resuming from suspend issue.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (5 preceding siblings ...)
  2021-12-31  0:17 ` bugzilla-daemon
@ 2021-12-31  0:22 ` bugzilla-daemon
  2021-12-31 15:10 ` bugzilla-daemon
                   ` (22 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2021-12-31  0:22 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #7 from spasswolf@web.de ---
The
 [drm] Register(0) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 !=
0x00000002
are still there, but they don't seem to be harmful, in fact they were there
even
 in linux.5.15.11 where suspend and resume seemed to work.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (6 preceding siblings ...)
  2021-12-31  0:22 ` bugzilla-daemon
@ 2021-12-31 15:10 ` bugzilla-daemon
  2021-12-31 16:14 ` bugzilla-daemon
                   ` (21 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2021-12-31 15:10 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #8 from spasswolf@web.de ---
Just did a bit of experimentation and changed dc_get_hpd_state_dcn21 (in
5.16-rc7) to

uint32_t dc_get_hpd_state_dcn21(struct irq_service *irq_service, enum
dc_irq_source source)
{
        const struct irq_source_info *info;
        uint32_t addr;
        uint32_t value;
        uint32_t current_status;

        info = find_irq_source_info(irq_service, source);
        if (!info)
                return 0;

        addr = info->status_reg;
        if (!addr)
                return 0;

        value = dm_read_reg(irq_service->ctx, addr);
        //current_status =
        //      get_reg_field_value(
        //              value,
        //              HPD0_DC_HPD_INT_STATUS,
        //              DC_HPD_SENSE);

        //return current_status;
        return 0;
}
This also leads to working suspend resume, the get_reg_field_value seems to
trigger the problems.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (7 preceding siblings ...)
  2021-12-31 15:10 ` bugzilla-daemon
@ 2021-12-31 16:14 ` bugzilla-daemon
  2022-01-01  1:42 ` bugzilla-daemon
                   ` (20 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2021-12-31 16:14 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #9 from spasswolf@web.de ---
get_reg_field_value(value, HPD0_DC_HPD_INT_STATUS, DC_HPD_SENSE)
is just a macro expanding to an inline function, i.e.
current_status = (value & HPD0_DC_HPD_INT_STATUS__DC_HPD_SENSE_MASK) >>
HPD0_DC_HPD_IN_STATUS__DC_HPD_SENSE__SHIFT;
with ..._MASK = 0x2 being 0x2 and ...__SHIFT = 0x1, problems seem to occur when
current_status != 0.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (8 preceding siblings ...)
  2021-12-31 16:14 ` bugzilla-daemon
@ 2022-01-01  1:42 ` bugzilla-daemon
  2022-01-03  9:29 ` bugzilla-daemon
                   ` (19 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-01  1:42 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

Mario Limonciello (superm1@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |superm1@gmail.com

--- Comment #10 from Mario Limonciello (superm1@gmail.com) ---
current_status != 0 causes rn_update_clocks to behave differently, basically
rn_vbios_smu_set_dcn_low_power_state can never get called.

I wonder if we want to just not apply this WA as part of the suspend/resume
paths, but only during the runtime paths?

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (9 preceding siblings ...)
  2022-01-01  1:42 ` bugzilla-daemon
@ 2022-01-03  9:29 ` bugzilla-daemon
  2022-01-03  9:35 ` bugzilla-daemon
                   ` (18 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-03  9:29 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #11 from spasswolf@web.de ---
Created attachment 300211
  --> https://bugzilla.kernel.org/attachment.cgi?id=300211&action=edit
dmesg with linux-5.16-rc8

With linux-5.16-rc8, suspend does still not work properly (fan stays on).
Resume seemed to work at first, but on the next suspend/resume test I get the
usual hang.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (10 preceding siblings ...)
  2022-01-03  9:29 ` bugzilla-daemon
@ 2022-01-03  9:35 ` bugzilla-daemon
  2022-01-03  9:36 ` bugzilla-daemon
                   ` (17 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-03  9:35 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #12 from spasswolf@web.de ---
Suspend works again if I ignore hpd_state in rn_update_clocks.
        // if (display_count == 0 && !hpd_state) {
        if (display_count == 0) {

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (11 preceding siblings ...)
  2022-01-03  9:35 ` bugzilla-daemon
@ 2022-01-03  9:36 ` bugzilla-daemon
  2022-01-03 10:20 ` bugzilla-daemon
                   ` (16 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-03  9:36 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

spasswolf@web.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Kernel Version|5.16-rc{1-7}                |5.16-rc{1-8}

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (12 preceding siblings ...)
  2022-01-03  9:36 ` bugzilla-daemon
@ 2022-01-03 10:20 ` bugzilla-daemon
  2022-01-03 10:29 ` bugzilla-daemon
                   ` (15 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-03 10:20 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #13 from spasswolf@web.de ---
I added a some dev_info calls:

static int amdgpu_pmops_suspend(struct device *dev)
{
        struct drm_device *drm_dev = dev_get_drvdata(dev);
        struct amdgpu_device *adev = drm_to_adev(drm_dev);
        int r;

        if (amdgpu_acpi_is_s0ix_active(adev))
                adev->in_s0ix = true;
        adev->in_s3 = true;
        r = amdgpu_device_suspend(drm_dev, true);
        dev_info(adev->dev, "amdgpu_device_suspend(...) = %d\n", r);
        adev->in_s3 = false;
        if (r)
                return r;
        dev_info(adev->dev, "adev->in_s0ix = %d\n", adev->in_s0ix);
        if (!adev->in_s0ix)
                r = amdgpu_asic_reset(adev);
        return r;
}

which gives

Jan  3 11:13:40 lisa kernel: [   38.792335] amdgpu 0000:08:00.0: amdgpu:
amdgpu_device_suspend(...) = 0
Jan  3 11:13:40 lisa kernel: [   38.792341] amdgpu 0000:08:00.0: amdgpu:
adev->in_s0ix = 1

So the gpu is not actually reset ...

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (13 preceding siblings ...)
  2022-01-03 10:20 ` bugzilla-daemon
@ 2022-01-03 10:29 ` bugzilla-daemon
  2022-01-03 15:48 ` bugzilla-daemon
                   ` (14 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-03 10:29 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #14 from spasswolf@web.de ---
Created attachment 300212
  --> https://bugzilla.kernel.org/attachment.cgi?id=300212&action=edit
dmesg with linux-5.16-rc8 while ignoring adev->in_s0ix

Simply ignoring the value of adev->in_s0ix and calling reset anyway does not
work either:
Jan  3 11:23:51 lisa kernel: [   32.649142] amdgpu 0000:08:00.0: amdgpu:
amdgpu_device_suspend(...) = 0
Jan  3 11:23:51 lisa kernel: [   32.649146] amdgpu 0000:08:00.0: amdgpu:
adev->in_s0ix = 1
Jan  3 11:23:51 lisa kernel: [   32.649149] amdgpu 0000:08:00.0: amdgpu: MODE2
reset
Jan  3 11:23:51 lisa kernel: [   37.404614] amdgpu 0000:08:00.0: amdgpu: Mode2
reset failed!
Jan  3 11:23:51 lisa kernel: [   37.404616] PM: pci_pm_suspend():
amdgpu_pmops_suspend+0x0/0x60 [amdgpu] returns -62
Jan  3 11:23:51 lisa kernel: [   37.404733] PM: dpm_run_callback():
pci_pm_suspend+0x0/0x150 returns -62

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (14 preceding siblings ...)
  2022-01-03 10:29 ` bugzilla-daemon
@ 2022-01-03 15:48 ` bugzilla-daemon
  2022-01-03 17:12 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-03 15:48 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

mario.limonciello@amd.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mario.limonciello@amd.com

--- Comment #15 from mario.limonciello@amd.com ---
Created attachment 300214
  --> https://bugzilla.kernel.org/attachment.cgi?id=300214&action=edit
Possible patch that ignores WA in s0ix

Actually this is more like what I was suggesting.  It effectively doesn't use
the WA during s0ix suspend or resume.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (15 preceding siblings ...)
  2022-01-03 15:48 ` bugzilla-daemon
@ 2022-01-03 17:12 ` bugzilla-daemon
  2022-01-03 17:49 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-03 17:12 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #16 from spasswolf@web.de ---
Added a dev_info:

dev_info(adev->dev, "adev->in_s0ix = %d\n", adev->in_s0ix);
if (display_count == 0 && (adev->in_s0ix || !hpd_state)) {

with the result:
It seems that adev->in_s0ix = 0 during suspend. Only later during resume (after
ACPI: EC: interrupt unblocked) do we have adev->in_s0ix = 1. 

Jan  3 18:00:24 lisa kernel: [   40.367310] wlp4s0: deauthenticating from
54:67:51:3d:a2:e0 by local choice (Reason: 3=DEAUTH_LEAVING)
Jan  3 18:00:24 lisa kernel: [   40.437382] amdgpu 0000:08:00.0: amdgpu:
adev->in_s0ix = 0
Jan  3 18:00:27 lisa kernel: [   44.031004] PM: suspend entry (s2idle)
Jan  3 18:00:28 lisa kernel: [   44.252353] Filesystems sync: 0.221 seconds
Jan  3 18:00:58 lisa kernel: [   44.253185] Freezing user space processes ...
(elapsed 0.002 seconds) done.
Jan  3 18:00:58 lisa kernel: [   44.255407] OOM killer disabled.
Jan  3 18:00:58 lisa kernel: [   44.255407] Freezing remaining freezable tasks
... (elapsed 0.000 seconds) done.
Jan  3 18:00:58 lisa kernel: [   44.256406] printk: Suspending console(s) (use
no_console_suspend to debug)
Jan  3 18:00:58 lisa kernel: [   44.277805] queueing ieee80211 work while going
to suspend
Jan  3 18:00:58 lisa kernel: [   44.831434] [drm] PCIE GART of 512M enabled
(table at 0x0000008000000000).
Jan  3 18:00:58 lisa kernel: [   44.831464] [drm] PSP is resuming...
Jan  3 18:00:58 lisa kernel: [   45.019729] [drm] reserve 0xa00000 from
0x81fe000000 for PSP TMR
Jan  3 18:00:58 lisa kernel: [   45.091174] amdgpu 0000:03:00.0: amdgpu: RAS:
optional ras ta ucode is not available
Jan  3 18:00:58 lisa kernel: [   45.100301] amdgpu 0000:03:00.0: amdgpu:
SECUREDISPLAY: securedisplay ta ucode is not available
Jan  3 18:00:58 lisa kernel: [   45.100305] amdgpu 0000:03:00.0: amdgpu: SMU is
resuming...
Jan  3 18:00:58 lisa kernel: [   45.100308] amdgpu 0000:03:00.0: amdgpu: smu
driver if version = 0x0000000f, smu fw if version = 0x00000013, smu fw version
= 0x003b2500 (59.37.0)
Jan  3 18:00:58 lisa kernel: [   45.100311] amdgpu 0000:03:00.0: amdgpu: SMU
driver if version not matched
Jan  3 18:00:58 lisa kernel: [   45.151255] amdgpu 0000:03:00.0: amdgpu: SMU is
resumed successfully!
Jan  3 18:00:58 lisa kernel: [   45.152433] [drm] DMUB hardware initialized:
version=0x02020007
Jan  3 18:00:58 lisa kernel: [   45.194058] [drm] kiq ring mec 2 pipe 1 q 0
Jan  3 18:00:58 lisa kernel: [   45.198331] [drm] VCN decode and encode
initialized successfully(under DPG Mode).
Jan  3 18:00:58 lisa kernel: [   45.198603] [drm] JPEG decode initialized
successfully.
Jan  3 18:00:58 lisa kernel: [   45.198652] amdgpu 0000:03:00.0: amdgpu: ring
gfx_0.0.0 uses VM inv eng 0 on hub 0
Jan  3 18:00:58 lisa kernel: [   45.198654] amdgpu 0000:03:00.0: amdgpu: ring
comp_1.0.0 uses VM inv eng 1 on hub 0
Jan  3 18:00:58 lisa kernel: [   45.198655] amdgpu 0000:03:00.0: amdgpu: ring
comp_1.1.0 uses VM inv eng 4 on hub 0
Jan  3 18:00:58 lisa kernel: [   45.198655] amdgpu 0000:03:00.0: amdgpu: ring
comp_1.2.0 uses VM inv eng 5 on hub 0
Jan  3 18:00:58 lisa kernel: [   45.198656] amdgpu 0000:03:00.0: amdgpu: ring
comp_1.3.0 uses VM inv eng 6 on hub 0
Jan  3 18:00:58 lisa kernel: [   45.198657] amdgpu 0000:03:00.0: amdgpu: ring
comp_1.0.1 uses VM inv eng 7 on hub 0
Jan  3 18:00:58 lisa kernel: [   45.198658] amdgpu 0000:03:00.0: amdgpu: ring
comp_1.1.1 uses VM inv eng 8 on hub 0
Jan  3 18:00:58 lisa kernel: [   45.198659] amdgpu 0000:03:00.0: amdgpu: ring
comp_1.2.1 uses VM inv eng 9 on hub 0
Jan  3 18:00:58 lisa kernel: [   45.198659] amdgpu 0000:03:00.0: amdgpu: ring
comp_1.3.1 uses VM inv eng 10 on hub 0
Jan  3 18:00:58 lisa kernel: [   45.198660] amdgpu 0000:03:00.0: amdgpu: ring
kiq_2.1.0 uses VM inv eng 11 on hub 0
Jan  3 18:00:58 lisa kernel: [   45.198661] amdgpu 0000:03:00.0: amdgpu: ring
sdma0 uses VM inv eng 12 on hub 0
Jan  3 18:00:58 lisa kernel: [   45.198661] amdgpu 0000:03:00.0: amdgpu: ring
sdma1 uses VM inv eng 13 on hub 0
Jan  3 18:00:58 lisa kernel: [   45.198662] amdgpu 0000:03:00.0: amdgpu: ring
vcn_dec_0 uses VM inv eng 0 on hub 1
Jan  3 18:00:58 lisa kernel: [   45.198662] amdgpu 0000:03:00.0: amdgpu: ring
vcn_enc_0.0 uses VM inv eng 1 on hub 1
Jan  3 18:00:58 lisa kernel: [   45.198663] amdgpu 0000:03:00.0: amdgpu: ring
vcn_enc_0.1 uses VM inv eng 4 on hub 1
Jan  3 18:00:58 lisa kernel: [   45.198663] amdgpu 0000:03:00.0: amdgpu: ring
jpeg_dec uses VM inv eng 5 on hub 1
Jan  3 18:00:58 lisa kernel: [   45.202326] amdgpu 0000:03:00.0: [drm] Cannot
find any crtc or sizes
Jan  3 18:00:58 lisa kernel: [   45.527398] [drm] Register(0)
[mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002
Jan  3 18:00:58 lisa kernel: [   45.844705] [drm] Register(0)
[mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002
Jan  3 18:00:58 lisa kernel: [   45.889752] [drm] free PSP TMR buffer
Jan  3 18:00:58 lisa kernel: [   45.939048] amdgpu 0000:03:00.0: amdgpu: MODE1
reset
Jan  3 18:00:58 lisa kernel: [   45.939052] amdgpu 0000:03:00.0: amdgpu: GPU
mode1 reset
Jan  3 18:00:58 lisa kernel: [   45.939132] amdgpu 0000:03:00.0: amdgpu: GPU
smu mode1 reset
Jan  3 18:00:58 lisa kernel: [   46.450444] ACPI: EC: interrupt blocked
Jan  3 18:00:58 lisa kernel: [   46.472489] xhci_hcd 0000:08:00.3: refused to
change power state from D0 to D3hot
Jan  3 18:00:58 lisa kernel: [   55.081262] ACPI: EC: interrupt unblocked
Jan  3 18:00:58 lisa kernel: [   55.632167] pci 0000:00:00.2: can't derive
routing for PCI INT A
Jan  3 18:00:58 lisa kernel: [   55.632172] pci 0000:00:00.2: PCI INT A: no GSI
Jan  3 18:00:58 lisa kernel: [   55.635561] [drm] PCIE GART of 512M enabled
(table at 0x0000008000000000).
Jan  3 18:00:58 lisa kernel: [   55.637155] [drm] PSP is resuming...
Jan  3 18:00:58 lisa kernel: [   55.637252] [drm] PCIE GART of 1024M enabled.
Jan  3 18:00:58 lisa kernel: [   55.637254] [drm] PTB located at
0x000000F400900000
Jan  3 18:00:58 lisa kernel: [   55.637769] amdgpu 0000:08:00.0: amdgpu: SMU is
resuming...
Jan  3 18:00:58 lisa kernel: [   55.640076] amdgpu 0000:08:00.0: amdgpu: dpm
has been disabled
Jan  3 18:00:58 lisa kernel: [   55.643194] amdgpu 0000:08:00.0: amdgpu: SMU is
resumed successfully!
Jan  3 18:00:58 lisa kernel: [   55.649205] [drm] DMUB hardware initialized:
version=0x0101001C
Jan  3 18:00:58 lisa kernel: [   55.661720] nvme nvme0: 15/0/0
default/read/poll queues
Jan  3 18:00:58 lisa kernel: [   55.702951] nvme nvme1: Shutdown timeout set to
10 seconds
Jan  3 18:00:58 lisa kernel: [   55.704396] nvme nvme1: 8/0/0 default/read/poll
queues
Jan  3 18:00:58 lisa kernel: [   55.737425] amdgpu 0000:08:00.0: amdgpu:
adev->in_s0ix = 1
Jan  3 18:00:58 lisa kernel: [   55.828624] [drm] reserve 0xa00000 from
0x81fe000000 for PSP TMR
Jan  3 18:00:58 lisa kernel: [   55.898619] amdgpu 0000:03:00.0: amdgpu: RAS:
optional ras ta ucode is not available
Jan  3 18:00:58 lisa kernel: [   55.907754] amdgpu 0000:03:00.0: amdgpu:
SECUREDISPLAY: securedisplay ta ucode is not available
Jan  3 18:00:58 lisa kernel: [   55.907757] amdgpu 0000:03:00.0: amdgpu: SMU is
resuming...
Jan  3 18:00:58 lisa kernel: [   55.907761] amdgpu 0000:03:00.0: amdgpu: smu
driver if version = 0x0000000f, smu fw if version = 0x00000013, smu fw version
= 0x003b2500 (59.37.0)
Jan  3 18:00:58 lisa kernel: [   55.907767] amdgpu 0000:03:00.0: amdgpu: SMU
driver if version not matched
Jan  3 18:00:58 lisa kernel: [   55.958092] amdgpu 0000:03:00.0: amdgpu: SMU is
resumed successfully!
Jan  3 18:00:58 lisa kernel: [   55.959299] [drm] DMUB hardware initialized:
version=0x02020007
Jan  3 18:00:58 lisa kernel: [   55.968918] amdgpu 0000:08:00.0:
[drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring sdma0 test failed (-110)
Jan  3 18:00:58 lisa kernel: [   55.969120] [drm:amdgpu_device_ip_resume_phase2
[amdgpu]] *ERROR* resume of IP block <sdma_v4_0> failed -110
Jan  3 18:00:58 lisa kernel: [   55.969298] amdgpu 0000:08:00.0: amdgpu:
amdgpu_device_ip_resume failed (-110).
Jan  3 18:00:58 lisa kernel: [   55.969301] PM: dpm_run_callback():
pci_pm_resume+0x0/0xd0 returns -110
Jan  3 18:00:58 lisa kernel: [   55.969313] amdgpu 0000:08:00.0: PM: failed to
resume async: error -110

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (16 preceding siblings ...)
  2022-01-03 17:12 ` bugzilla-daemon
@ 2022-01-03 17:49 ` bugzilla-daemon
  2022-01-03 17:51 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-03 17:49 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

mario.limonciello@amd.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #300214|0                           |1
        is obsolete|                            |

--- Comment #17 from mario.limonciello@amd.com ---
Created attachment 300215
  --> https://bugzilla.kernel.org/attachment.cgi?id=300215&action=edit
Possible patch that ignores WA in s0ix w/ debugging statements (v2)

OK thanks for trying it!  Here's an updated patch, to try to give an extra
explicit call to update the clocks again during the suspend path.  Also
included a debugging statement so we can see the state of the variables that
matter in each call to update_clocks.

If you can please share the output from your kernel log again with it applied.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (17 preceding siblings ...)
  2022-01-03 17:49 ` bugzilla-daemon
@ 2022-01-03 17:51 ` bugzilla-daemon
  2022-01-04 10:17 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-03 17:51 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #18 from spasswolf@web.de ---
The problem seems to be that rn_update_clocks tries to call
rn_vbios_smu_set_dcn_low power before adev->in_s0ix is set to 1 in
amdpgu_pmops_suspend:

Changes to rn_update_clocks:
dev_info(adev->dev, "adev->in_s0ix = %d\n", adev->in_s0ix);
//if (display_count == 0 && (adev->in_s0ix || !hpd_state)) {
if (display_count == 0) {
        dev_info(adev->dev, "calling rn_vbios_smu_set_dcn_low_power\n");
        rn_vbios_smu_set_dcn_low_power_state(clk_mgr, DCN_PWR_STATE_LOW_POWER);
Changes to amdgpu_pmops_suspend:
if (amdgpu_acpi_is_s0ix_active(adev))
        adev->in_s0ix = true;
dev_info(adev->dev, "amdgpu_pmops_suspend: adev->in_s0ix = %d\n",
adev->in_s0ix);


[   36.010259] wlp4s0: deauthenticating from 54:67:51:3d:a2:e0 by local choice
(Reason: 3=DEAUTH_LEAVING)
[   36.043902] amdgpu 0000:08:00.0: amdgpu: adev->in_s0ix = 0
[   36.043904] amdgpu 0000:08:00.0: amdgpu: calling
rn_vbios_smu_set_dcn_low_power
[   40.015570] PM: suspend entry (s2idle)
[   40.239013] Filesystems sync: 0.223 seconds
[   40.239912] Freezing user space processes ... (elapsed 0.002 seconds) done.
[   40.242407] OOM killer disabled.
[   40.242407] Freezing remaining freezable tasks ... (elapsed 0.001 seconds)
done.
[   40.243513] printk: Suspending console(s) (use no_console_suspend to debug)
[   40.246850] amdgpu 0000:08:00.0: amdgpu: amdgpu_pmops_suspend: adev->in_s0ix
= 1
[   40.266376] queueing ieee80211 work while going to suspend

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (18 preceding siblings ...)
  2022-01-03 17:51 ` bugzilla-daemon
@ 2022-01-04 10:17 ` bugzilla-daemon
  2022-01-04 10:25 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-04 10:17 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #19 from spasswolf@web.de ---
Created attachment 300222
  --> https://bugzilla.kernel.org/attachment.cgi?id=300222&action=edit
debugging patch

I added some more dev_info calls to figure out where things are called:
[   34.946911] wlp4s0: deauthenticating from 54:67:51:3d:a2:e0 by local choice
(Reason: 3=DEAUTH_LEAVING)
[   35.079021] amdgpu 0000:08:00.0: amdgpu: Calling update_clocks from
dcn10_set_clock
[   35.079026] amdgpu 0000:08:00.0: amdgpu: adev->in_s0ix = 0
[   35.079027] amdgpu 0000:08:00.0: amdgpu: calling
rn_vbios_smu_set_dcn_low_power
[   39.030889] PM: suspend entry (s2idle)
[   39.270864] Filesystems sync: 0.239 seconds
[   39.271929] Freezing user space processes ... (elapsed 0.002 seconds) done.
[   39.274694] OOM killer disabled.
The WLAN message is the first message one gets after initiating suspend. The
interesting thing is that rn_vbios_set_dc_low_power is called before suspend
entry (and there's also a long 4s gap).
   I also have no idea where dcn10_set_clock is called. The only place I found
is from within dc_set_clock, but dc_set_clock is called nowhere in the entire
amdgpu module.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (19 preceding siblings ...)
  2022-01-04 10:17 ` bugzilla-daemon
@ 2022-01-04 10:25 ` bugzilla-daemon
  2022-01-04 11:15 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-04 10:25 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

spasswolf@web.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #300222|0                           |1
        is obsolete|                            |

--- Comment #20 from spasswolf@web.de ---
Created attachment 300223
  --> https://bugzilla.kernel.org/attachment.cgi?id=300223&action=edit
corrected debugging patch

The patch above contained an error
rn_update_clocks is actually called from dcn20_optimize_bandwidth

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (20 preceding siblings ...)
  2022-01-04 10:25 ` bugzilla-daemon
@ 2022-01-04 11:15 ` bugzilla-daemon
  2022-01-04 11:27 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-04 11:15 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

spasswolf@web.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #300223|0                           |1
        is obsolete|                            |

--- Comment #21 from spasswolf@web.de ---
Created attachment 300224
  --> https://bugzilla.kernel.org/attachment.cgi?id=300224&action=edit
new debuggin patch

Seems the call trace of rn_vbios_smu_set_dcn_low_power is
amdgpu_dm_atomic_commit_tail -> dc_commit_state -> dc_commit_state_nocheck ->
dcn20_optimize_bandwith -> rn_update_clocks -> rn_vbios_smu_set_dcn_low_power

  53.562322] amdgpu 0000:08:00.0: amdgpu: Calling dc_commit_state from
amdgpu_dm_atomic_commit_tail
[   53.562426] amdgpu 0000:08:00.0: amdgpu: Calling dc_commit_state from
amdgpu_dm_atomic_commit_tail
[   53.563183] wlp4s0: deauthenticating from 54:67:51:3d:a2:e0 by local choice
(Reason: 3=DEAUTH_LEAVING)
[   53.673655] amdgpu 0000:08:00.0: amdgpu: Calling optimize_bandwidth from
dc_commit_state_no_check
[   53.673669] amdgpu 0000:08:00.0: amdgpu: Calling update_clocks from
dcn20_optimize_bandwidth
[   53.673672] amdgpu 0000:08:00.0: amdgpu: adev->in_s0ix = 0
[   53.673673] amdgpu 0000:08:00.0: amdgpu: calling
rn_vbios_smu_set_dcn_low_power
[   53.674562] amdgpu 0000:08:00.0: amdgpu: Calling dc_commit_state from
amdgpu_dm_atomic_commit_tail

This seems to be not really connected to suspend.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (21 preceding siblings ...)
  2022-01-04 11:15 ` bugzilla-daemon
@ 2022-01-04 11:27 ` bugzilla-daemon
  2022-01-04 14:14 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-04 11:27 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #22 from spasswolf@web.de ---
I get similar message if I just lock the screen which then switches off:
[ 1732.526162] amdgpu 0000:08:00.0: amdgpu: Calling dc_commit_state from
amdgpu_dm_atomic_commit_tail
[ 1732.640992] amdgpu 0000:08:00.0: amdgpu: Calling optimize_bandwidth from
dc_commit_state_no_check
[ 1732.641005] amdgpu 0000:08:00.0: amdgpu: Calling update_clocks from
dcn20_optimize_bandwidth
[ 1732.641007] amdgpu 0000:08:00.0: amdgpu: adev->in_s0ix = 0
[ 1732.641008] amdgpu 0000:08:00.0: amdgpu: calling
rn_vbios_smu_set_dcn_low_power

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (22 preceding siblings ...)
  2022-01-04 11:27 ` bugzilla-daemon
@ 2022-01-04 14:14 ` bugzilla-daemon
  2022-01-04 14:59 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-04 14:14 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #23 from mario.limonciello@amd.com ---
Did you try out my patch?  It should have given an extra explicit call in the
suspend path.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (23 preceding siblings ...)
  2022-01-04 14:14 ` bugzilla-daemon
@ 2022-01-04 14:59 ` bugzilla-daemon
  2022-01-04 15:03 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-04 14:59 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #24 from spasswolf@web.de ---
That's it! The extra call to clk_mgr_optimize_pwr_state now leads to
rn_update_clock being called with adev->in_s0ix = 1:


[   31.142514] wlp4s0: deauthenticating from 54:67:51:3d:a2:e0 by local choice
(Reason: 3=DEAUTH_LEAVING)
[   31.176805] amdgpu 0000:08:00.0: amdgpu: Calling optimize_bandwidth from
dc_commit_state_no_check
[   31.176815] amdgpu 0000:08:00.0: amdgpu: Calling update_clocks from
dcn20_optimize_bandwidth
[   31.176818] amdgpu 0000:08:00.0: amdgpu: adev->in_s0ix = 0
[   35.056209] PM: suspend entry (s2idle)
[   36.535688] Filesystems sync: 1.479 seconds
[   36.536404] Freezing user space processes ... (elapsed 0.024 seconds) done.
[   36.561024] OOM killer disabled.
[   36.561025] Freezing remaining freezable tasks ... (elapsed 0.001 seconds)
done.
[   36.562212] printk: Suspending console(s) (use no_console_suspend to debug)
[   36.566381] amdgpu 0000:08:00.0: amdgpu: amdgpu_pmops_suspend: adev->in_s0ix
= 1
[   36.566557] amdgpu 0000:08:00.0: amdgpu: calling update_clocks from
dcn21_optimize_pwr_state
[   36.566567] amdgpu 0000:08:00.0: amdgpu: adev->in_s0ix = 1
[   36.566569] amdgpu 0000:08:00.0: amdgpu: calling
rn_vbios_smu_set_dcn_low_power

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (24 preceding siblings ...)
  2022-01-04 14:59 ` bugzilla-daemon
@ 2022-01-04 15:03 ` bugzilla-daemon
  2022-01-04 15:27 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-04 15:03 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #25 from mario.limonciello@amd.com ---
Can you check the state of the other variables though with that extra call? 
That was in my debugging patch but not inyours.
Is the HPD active?

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (25 preceding siblings ...)
  2022-01-04 15:03 ` bugzilla-daemon
@ 2022-01-04 15:27 ` bugzilla-daemon
  2022-01-04 15:28 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-04 15:27 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #26 from spasswolf@web.de ---
For the extra call hpd_state = 0:
[   30.684201] wlp4s0: deauthenticating from 54:67:51:3d:a2:e0 by local choice
(Reason: 3=DEAUTH_LEAVING)
[   30.794571] amdgpu 0000:08:00.0: amdgpu: Calling optimize_bandwidth from
dc_commit_state_no_check
[   30.794609] amdgpu 0000:08:00.0: amdgpu: Calling update_clocks from
dcn20_optimize_bandwidth
[   30.794613] amdgpu 0000:08:00.0: amdgpu: adev->in_s0ix = 0, hpd_state = 1
[   35.026604] PM: suspend entry (s2idle)
[   35.256153] Filesystems sync: 0.229 seconds
[   35.257044] Freezing user space processes ... (elapsed 0.002 seconds) done.
[   35.259636] OOM killer disabled.
[   35.259636] Freezing remaining freezable tasks ... (elapsed 0.001 seconds)
done.
[   35.260642] printk: Suspending console(s) (use no_console_suspend to debug)
[   35.264212] amdgpu 0000:08:00.0: amdgpu: amdgpu_pmops_suspend: adev->in_s0ix
= 1
[   35.264369] amdgpu 0000:08:00.0: amdgpu: calling update_clocks from
dcn21_optimize_pwr_state
[   35.264379] amdgpu 0000:08:00.0: amdgpu: adev->in_s0ix = 1, hpd_state = 0
[   35.264381] amdgpu 0000:08:00.0: amdgpu: calling
rn_vbios_smu_set_dcn_low_power

So one could go back to the old version of rn_update_clocks with               
if (display_count == 0 && !hpd_state) {

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (26 preceding siblings ...)
  2022-01-04 15:27 ` bugzilla-daemon
@ 2022-01-04 15:28 ` bugzilla-daemon
  2022-01-24 22:45 ` bugzilla-daemon
  2022-01-25 20:25 ` bugzilla-daemon
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-04 15:28 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #27 from mario.limonciello@amd.com ---
Great! 
OK, let me send that out for review then and see what folks think.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (27 preceding siblings ...)
  2022-01-04 15:28 ` bugzilla-daemon
@ 2022-01-24 22:45 ` bugzilla-daemon
  2022-01-25 20:25 ` bugzilla-daemon
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-24 22:45 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

--- Comment #28 from mario.limonciello@amd.com ---
As this is now in 5.17-rc1:
https://github.com/torvalds/linux/commit/c4849f88164b13dd141885e28210f599741b304b
this issue can be closed.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [Bug 215436] admgpu: suspend and resuming from suspend don't work
  2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
                   ` (28 preceding siblings ...)
  2022-01-24 22:45 ` bugzilla-daemon
@ 2022-01-25 20:25 ` bugzilla-daemon
  29 siblings, 0 replies; 31+ messages in thread
From: bugzilla-daemon @ 2022-01-25 20:25 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=215436

spasswolf@web.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |CODE_FIX

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2022-01-25 20:25 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-29 12:28 [Bug 215436] New: admgpu: suspend and resuming from suspend don't work bugzilla-daemon
2021-12-29 15:08 ` [Bug 215436] " bugzilla-daemon
2021-12-29 15:10 ` bugzilla-daemon
2021-12-30 13:03 ` bugzilla-daemon
2021-12-30 13:16 ` bugzilla-daemon
2021-12-30 14:33 ` bugzilla-daemon
2021-12-31  0:17 ` bugzilla-daemon
2021-12-31  0:22 ` bugzilla-daemon
2021-12-31 15:10 ` bugzilla-daemon
2021-12-31 16:14 ` bugzilla-daemon
2022-01-01  1:42 ` bugzilla-daemon
2022-01-03  9:29 ` bugzilla-daemon
2022-01-03  9:35 ` bugzilla-daemon
2022-01-03  9:36 ` bugzilla-daemon
2022-01-03 10:20 ` bugzilla-daemon
2022-01-03 10:29 ` bugzilla-daemon
2022-01-03 15:48 ` bugzilla-daemon
2022-01-03 17:12 ` bugzilla-daemon
2022-01-03 17:49 ` bugzilla-daemon
2022-01-03 17:51 ` bugzilla-daemon
2022-01-04 10:17 ` bugzilla-daemon
2022-01-04 10:25 ` bugzilla-daemon
2022-01-04 11:15 ` bugzilla-daemon
2022-01-04 11:27 ` bugzilla-daemon
2022-01-04 14:14 ` bugzilla-daemon
2022-01-04 14:59 ` bugzilla-daemon
2022-01-04 15:03 ` bugzilla-daemon
2022-01-04 15:27 ` bugzilla-daemon
2022-01-04 15:28 ` bugzilla-daemon
2022-01-24 22:45 ` bugzilla-daemon
2022-01-25 20:25 ` bugzilla-daemon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.