On Sat, Nov 11, 2023 at 07:46:41PM -0500, Phillip Susi wrote:
> I had been testing some things on a post 6.6-rc5 kernel for a week or
> two and then when I pulled to a post 6.6 release kernel, I found that
> system suspend was broken. It seems that the radeon driver failed to
> suspend, leaving the display dead, the wayland display server hung, and
> the system still running. I have been trying to bisect it for the last
> few days and have only been able to narrow it down to the following 3
> commits:
>
> There are only 'skip'ped commits left to test.
> The first bad commit could be any of:
> 56e449603f0ac580700621a356d35d5716a62ce5
> c07bf1636f0005f9eb7956404490672286ea59d3
> b70438004a14f4d0f9890b3297cd66248728546c
> We cannot bisect more!

Please show the full bisect log, and also explain why these commits were
skipped.

>
> It appears that there was a late merge in the 6.6 window that originally
> forked from the -rc2, as many of the later commits that I bisected had
> that version number.
>
> I couldn't get it more narrowed down because I had to skip the
> surrounding commits because they wouldn't even boot up to a gui desktop,
> let alone try to suspend.
>
> When system suspend fails, I find the following in my syslog after I
> have to magic-sysrq reboot because the display is dead:
>
> Nov 11 18:44:39 faldara kernel: PM: suspend entry (deep)
> Nov 11 18:44:39 faldara kernel: Filesystems sync: 0.035 seconds
> Nov 11 18:44:40 faldara kernel: Freezing user space processes
> Nov 11 18:44:40 faldara kernel: Freezing user space processes completed (elapsed 0.001 seconds)
> Nov 11 18:44:40 faldara kernel: OOM killer disabled.
> Nov 11 18:44:40 faldara kernel: Freezing remaining freezable tasks
> Nov 11 18:44:40 faldara kernel: Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> Nov 11 18:44:40 faldara kernel: printk: Suspending console(s) (use no_console_suspend to debug)
> Nov 11 18:44:40 faldara kernel: serial 00:01: disabled
> Nov 11 18:44:40 faldara kernel: e1000e: EEE TX LPI TIMER: 00000011
> Nov 11 18:44:40 faldara kernel: sd 4:0:0:0: [sdb] Synchronizing SCSI cache
> Nov 11 18:44:40 faldara kernel: sd 1:0:0:0: [sda] Synchronizing SCSI cache
> Nov 11 18:44:40 faldara kernel: sd 5:0:0:0: [sdc] Synchronizing SCSI cache
> Nov 11 18:44:40 faldara kernel: sd 4:0:0:0: [sdb] Stopping disk
> Nov 11 18:44:40 faldara kernel: sd 1:0:0:0: [sda] Stopping disk
> Nov 11 18:44:40 faldara kernel: sd 5:0:0:0: [sdc] Stopping disk
> Nov 11 18:44:40 faldara kernel: amdgpu: Move buffer fallback to memcpy unavailable
> Nov 11 18:44:40 faldara kernel: [TTM] Buffer eviction failed
> Nov 11 18:44:40 faldara kernel: [drm] evicting device resources failed
> Nov 11 18:44:40 faldara kernel: amdgpu 0000:03:00.0: PM: pci_pm_suspend(): amdgpu_pmops_suspend+0x0/0x80 [amdgpu] returns -19
> Nov 11 18:44:40 faldara kernel: amdgpu 0000:03:00.0: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x170 returns -19
> Nov 11 18:44:40 faldara kernel: amdgpu 0000:03:00.0: PM: failed to suspend async: error -19
> Nov 11 18:44:40 faldara kernel: PM: Some devices failed to suspend, or early wake event detected
> Nov 11 18:44:40 faldara kernel: xhci_hcd 0000:06:00.0: xHC error in resume, USBSTS 0x401, Reinit
> Nov 11 18:44:40 faldara kernel: usb usb3: root hub lost power or was reset
> Nov 11 18:44:40 faldara kernel: usb usb4: root hub lost power or was reset
> Nov 11 18:44:40 faldara kernel: serial 00:01: activated
> Nov 11 18:44:40 faldara kernel: nvme nvme0: 4/0/0 default/read/poll queues
> Nov 11 18:44:40 faldara kernel: ata8: SATA link down (SStatus 0 SControl 300)
> Nov 11 18:44:40 faldara kernel: ata7: SATA link down (SStatus 0 SControl 300)
> Nov 11 18:44:40 faldara kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> Nov 11 18:44:40 faldara kernel: ata1: SATA link down (SStatus 4 SControl 300)
> Nov 11 18:44:40 faldara kernel: ata3: SATA link down (SStatus 4 SControl 300)
> Nov 11 18:44:40 faldara kernel: ata4.00: configured for UDMA/133
> Nov 11 18:44:40 faldara kernel: OOM killer enabled.
> Nov 11 18:44:40 faldara kernel: Restarting tasks ... done.
> Nov 11 18:44:40 faldara kernel: random: crng reseeded on system resumption
> Nov 11 18:44:40 faldara kernel: PM: suspend exit
> Nov 11 18:44:40 faldara kernel: PM: suspend entry (s2idle)
> Nov 11 18:44:40 faldara systemd-networkd[384]: enp0s31f6: Gained IPv6LL
> Nov 11 18:44:40 faldara avahi-daemon[668]: Joining mDNS multicast group on interface enp0s31f6.IPv6 with address fe80::3ad5:47ff:fe0f:488a.
>
> My video card is this:
>
> 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 23 (rev c7) (prog-if 00 [VGA controller])
>         Subsystem: Gigabyte Technology Co., Ltd Navi 23
>         Flags: bus master, fast devsel, latency 0, IRQ 139
>         Memory at e0000000 (64-bit, prefetchable) [size=256M]
>         Memory at f0000000 (64-bit, prefetchable) [size=2M]
>         I/O ports at e000 [size=256]
>         Memory at f7900000 (32-bit, non-prefetchable) [size=1M]
>         Expansion ROM at 000c0000 [disabled] [size=128K]
>         Capabilities: [48] Vendor Specific Information: Len=08
>         Capabilities: [50] Power Management version 3
>         Capabilities: [64] Express Legacy Endpoint, MSI 00
>         Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>         Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010
>         Capabilities: [150] Advanced Error Reporting
>         Capabilities: [200] Physical Resizable BAR
>         Capabilities: [240] Power Budgeting
>         Capabilities: [270] Secondary PCI Express
>         Capabilities: [2a0] Access Control Services
>         Capabilities: [2d0] Process Address Space ID (PASID)
>         Capabilities: [320] Latency Tolerance Reporting
>         Capabilities: [410] Physical Layer 16.0 GT/s
>         Capabilities: [440] Lane Margining at the Receiver
>         Kernel driver in use: amdgpu
>         Kernel modules: amdgpu

Anyway, thanks for the regression report. I'm adding it to regzbot:

#regzbot ^introduced: 56e449603f0ac5..b70438004a14f4

-- 
An old man doll... just what I always wanted! - Clara