* [Bug 217514] New: [amdgpu] system doesn't boot after linux-firmware 2023-05-23 ffe1a41e
From: bugzilla-daemon @ 2023-05-31 16:35 UTC
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=217514
Bug ID: 217514
Summary: [amdgpu] system doesn't boot after linux-firmware 2023-05-23 ffe1a41e
Product: Drivers
Version: 2.5
Hardware: All
OS: Linux
Status: NEW
Severity: normal
Priority: P3
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri@kernel-bugs.osdl.org
Reporter: rly@hotmail.hu
Regression: No
Created attachment 304361
--> https://bugzilla.kernel.org/attachment.cgi?id=304361&action=edit
softlockup
Updating linux-firmware to the latest git version causes my PC to lock up
during boot. I have a 3900X paired with a 7900 XTX, running Arch Linux with the
6.3.4 XanMod kernel (this also happens with the kernel from the core repo) and
Mesa 23.1.1, if that matters.
During boot I see the following error printed and the system locks up
completely; only a hard reset helps:
`May 31 07:20:40 valhalla kernel: watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [swapper/5:0]`
It is accompanied by a lot of amdgpu errors in the journal (both are followed
by a stack trace):
```
May 31 07:20:44 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:9 pasid:32768, for process pid 0 thread pid 0)
May 31 07:20:44 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: in page starting at address 0x0000ffff0021a000 from client 10
May 31 07:20:44 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00900831
May 31 07:20:44 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: Faulty UTCL2 client ID: CPF (0x4)
May 31 07:20:44 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: MORE_FAULTS: 0x1
May 31 07:20:44 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: WALKER_ERROR: 0x0
May 31 07:20:44 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: PERMISSION_FAULTS: 0x3
May 31 07:20:44 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: MAPPING_ERROR: 0x0
May 31 07:20:44 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: RW: 0x0
```
The full journal log is in the "softlockup" attachment.
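For readers who want to cross-check the decoded fields against the raw `GCVM_L2_PROTECTION_FAULT_STATUS` value, here is a minimal Python sketch. The bit offsets and widths in `FIELDS` are an assumption (believed to match the amdgpu register headers for this register, but not taken from this report); they are checked here only against the values printed in the log above. `decode_fault_status` is a hypothetical helper, not part of any tool.

```python
from typing import Dict, Tuple

# ASSUMED layout: field name -> (bit shift, width in bits).
# Verified only against the single log excerpt in this report.
FIELDS: Dict[str, Tuple[int, int]] = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),    # faulty UTCL2 client ID
    "RW": (18, 1),
    "VMID": (20, 4),
}


def decode_fault_status(status: int) -> Dict[str, int]:
    """Split a raw fault status word into its named bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # Raw value copied from the journal excerpt above.
    for name, value in decode_fault_status(0x00900831).items():
        print(f"{name}: {value:#x}")
```

Under the assumed layout, the decoded values agree with the journal lines (client ID 0x4, PERMISSION_FAULTS 0x3, vmid 9), which suggests the log excerpt is internally consistent.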
The problems start with commit
[ffe1a41e](https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=ffe1a41e2ddbc39109b12d95dcac282d90eba8fc).
With that commit I don't get the soft lockup described above; instead, after
the initramfs loads, the BIOS splash screen comes back and the system is stuck
there.
In this case there are different amdgpu errors (followed by a stack trace):
```
May 31 09:18:37 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000006 SMN_C2PMSG_82:0x00000000
May 31 09:18:37 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: Failed to enable requested dpm features!
May 31 09:18:37 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: Failed to setup smc hw!
May 31 09:18:37 valhalla kernel: [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP block <smu> failed -62
May 31 09:18:37 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: amdgpu_device_ip_init failed
May 31 09:18:37 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: Fatal error during GPU init
May 31 09:18:37 valhalla kernel: amdgpu 0000:0c:00.0: amdgpu: amdgpu: finishing device.
```
The logs for this case are in the "amdgpu_error" attachment.
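The `failed -62` in the `hw_init` line is a negative errno value. As a rough aid for reading such lines, the following Python sketch pulls the IP block name and the code out of a journal line and looks up the symbolic errno name. The helper names `parse_hw_init_failure` and `describe_errno` are hypothetical, and errno numbers are platform-specific, so the lookup is only meaningful when run on Linux (where 62 is `ETIME`, "Timer expired").

```python
import errno
import os
import re
from typing import Optional, Tuple

# Matches the driver message format seen in the log excerpt above.
HW_INIT_RE = re.compile(
    r"hw_init of IP block <(?P<block>[^>]+)> failed (?P<code>-?\d+)"
)


def parse_hw_init_failure(line: str) -> Optional[Tuple[str, int]]:
    """Return (ip_block, error_code) if the line reports a hw_init failure."""
    match = HW_INIT_RE.search(line)
    if match is None:
        return None
    return match.group("block"), int(match.group("code"))


def describe_errno(code: int) -> str:
    """Map a (possibly negative) errno value to its name on this platform."""
    number = abs(code)
    name = errno.errorcode.get(number, "unknown")
    return f"{name} ({os.strerror(number)})"


if __name__ == "__main__":
    line = (
        "May 31 09:18:37 valhalla kernel: [drm:amdgpu_device_init [amdgpu]] "
        "*ERROR* hw_init of IP block <smu> failed -62"
    )
    parsed = parse_hw_init_failure(line)
    if parsed is not None:
        block, code = parsed
        print(f"IP block {block} failed with {code}: {describe_errno(code)}")
```

A timeout-style error would be in line with the preceding "SMU: I'm not done with your previous command" message, though that reading is an interpretation, not something the report states.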
Note that by the end the system seems to be running, but since I only saw the
BIOS splash screen, I rebooted via SysRq (REISUB).
The commit after ffe1a41e
([56832557](https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=568325574a3b6148f3296984aa24fcd1fb4b912c),
or possibly the one after that,
[39d6fcc7](https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=39d6fcc73100ae4aeeec0194bbf102c672673edd);
I'm not sure at the moment) gets past the splash screen, but that is where the
soft lockup starts to happen.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.
* [Bug 217514] [amdgpu] system doesn't boot after linux-firmware 2023-05-23 ffe1a41e
From: bugzilla-daemon @ 2023-05-31 16:36 UTC
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=217514
--- Comment #1 from rLy (rly@hotmail.hu) ---
Created attachment 304362
--> https://bugzilla.kernel.org/attachment.cgi?id=304362&action=edit
amdgpu_error
* [Bug 217514] [amdgpu] system doesn't boot after linux-firmware 2023-05-23 ffe1a41e
From: bugzilla-daemon @ 2023-05-31 17:14 UTC
To: dri-devel
https://bugzilla.kernel.org/show_bug.cgi?id=217514
Alex Deucher (alexdeucher@gmail.com) changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexdeucher@gmail.com
--- Comment #2 from Alex Deucher (alexdeucher@gmail.com) ---
Does this kernel change fix the issues?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5ee33d905f89c18d4b33da6e5eefdae6060502df