* Waiting for fences timed out on MacBook Pro 2019
From: Tomasz Moń @ 2021-07-12  9:56 UTC
  To: amd-gfx; +Cc: alexander.deucher

Hello,

I am having trouble getting Linux to run on a MacBook Pro 2019 with a
Radeon Pro Vega 20 4 GB. As soon as the graphical user interface
starts, the whole system freezes. This happens with every Linux kernel
version I have tried over the last few months, including 5.13.

Using SSH I have been able to read dmesg output:
[  207.113144] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[  207.113168] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[  212.029866] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=1022, emitted seq=1024
[  212.030083] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 918 thread Xorg:cs0 pid 919
[  212.030276] amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
[  216.030286] amdgpu 0000:03:00.0: amdgpu: failed to suspend display audio
[  227.396593] applesmc: driver init failed (ret=-5)!
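
For reference, a rough Python sketch that pulls the signaled and emitted
sequence numbers out of the ring timeout message in the log above. The
regular expression and the function name pending_fences are assumptions
of this sketch, not part of any amdgpu tooling, and reading the
difference as "submissions still outstanding" is likewise an assumption.

```python
import re

# Pattern for the amdgpu ring timeout message as it appears in the log above.
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def pending_fences(dmesg_text):
    """Return (ring, signaled, emitted, outstanding) for each ring timeout found."""
    # Mail clients wrap long log lines, so collapse all whitespace before matching.
    flat = " ".join(dmesg_text.split())
    results = []
    for m in RING_TIMEOUT.finditer(flat):
        signaled = int(m.group("signaled"))
        emitted = int(m.group("emitted"))
        results.append((m.group("ring"), signaled, emitted, emitted - signaled))
    return results


sample = """[  212.029866] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx
timeout, signaled seq=1022, emitted seq=1024"""
print(pending_fences(sample))
```

On the sample line this reports the gfx ring with two sequence numbers
emitted but not yet signaled.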

How do I debug this further?

Best regards,
Tomasz Moń
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: Waiting for fences timed out on MacBook Pro 2019
From: Alex Deucher @ 2021-07-12 18:35 UTC
  To: Tomasz Moń; +Cc: Deucher, Alexander, amd-gfx list

On Mon, Jul 12, 2021 at 5:57 AM Tomasz Moń <desowin@gmail.com> wrote:
>
> Hello,
>
> I am having trouble getting Linux to run on MacBook Pro 2019 with
> Radeon Pro Vega 20 4 GB. Basically as soon as graphical user interface
> starts, the whole system freezes. This happens with every Linux kernel
> version I have tried over the last few months, including 5.13.

If this is a regression, can you bisect?

Alex


>
> Using SSH I have been able to read dmesg output:
> [  207.113144] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR*
> Waiting for fences timed out!
> [  207.113168] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR*
> Waiting for fences timed out!
> [  212.029866] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx
> timeout, signaled seq=1022, emitted seq=1024
> [  212.030083] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process
> information: process Xorg pid 918 thread Xorg:cs0 pid 919
> [  212.030276] amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
> [  216.030286] amdgpu 0000:03:00.0: amdgpu: failed to suspend display audio
> [  227.396593] applesmc: driver init failed (ret=-5)!
>
> How do I debug this further?
>
> Best regards,
> Tomasz Moń

* Re: Waiting for fences timed out on MacBook Pro 2019
From: Tomasz Moń @ 2021-07-13  7:00 UTC
  To: Alex Deucher; +Cc: Deucher, Alexander, amd-gfx list

On Mon, Jul 12, 2021 at 8:35 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> On Mon, Jul 12, 2021 at 5:57 AM Tomasz Moń <desowin@gmail.com> wrote:
> > I am having trouble getting Linux to run on MacBook Pro 2019 with
> > Radeon Pro Vega 20 4 GB. Basically as soon as graphical user interface
> > starts, the whole system freezes. This happens with every Linux kernel
> > version I have tried over the last few months, including 5.13.
>
> If this is a regression, can you bisect?

I cannot find any working kernel version. What happens after the
failure differs between versions, but it all seems to start with the
ring gfx timeout.
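
To illustrate why bisecting is not possible here: bisection needs a
known-good endpoint. Below is a minimal Python sketch of the search
idea only, not of git bisect itself. The names first_bad and is_bad are
hypothetical, and the version list simply holds the two kernels named
in this thread.

```python
def first_bad(versions, is_bad):
    """Binary search for the first bad entry in an ordered list.

    Assumes versions[0] is good and versions[-1] is bad; without a good
    starting point there is nothing to search between.
    """
    lo, hi = 0, len(versions) - 1
    if is_bad(versions[lo]):
        raise ValueError("no known-good starting point, nothing to bisect")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid  # first bad entry is at mid or earlier
        else:
            lo = mid  # first bad entry is after mid
    return versions[hi]


# Every kernel tried so far fails, so the search cannot even start.
tested = ["5.6.6", "5.13"]
try:
    first_bad(tested, lambda version: True)
except ValueError as err:
    print(err)
```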

Below is the log with 5.6.6. Is there any point in trying kernels even
older than 5.6.6?

[ 209.276886] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[ 214.193336] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=392, emitted seq=394
[ 214.193372] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 16356 thread Xorg:cs0 pid 16357
[ 214.193375] amdgpu 0000:03:00.0: GPU reset begin!
[ 214.355019] amdgpu 0000:03:00.0: GPU mode1 reset
[ 214.876624] [drm] psp mode1 reset succeed
[ 214.983413] amdgpu 0000:03:00.0: GPU reset succeeded, trying to resume
[ 214.983490] [drm] PCIE GART of 512M enabled (table at 0x000000F401AEB000).
[ 214.983685] [drm] PSP is resuming...
[ 215.170727] [drm] reserve 0x400000 from 0xf4fe800000 for PSP TMR
[ 215.322638] [drm] kiq ring mec 2 pipe 1 q 0
[ 215.558174] [drm] DM_MST: starting TM on aconnector: 000000008b23e629 [id: 64]
[ 216.043545] [drm] UVD and UVD ENC initialized successfully.
[ 216.143888] [drm] VCE initialized successfully.
[ 216.143891] amdgpu 0000:03:00.0: ring gfx uses VM inv eng 0 on hub 0
[ 216.143892] amdgpu 0000:03:00.0: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 216.143892] amdgpu 0000:03:00.0: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 216.143893] amdgpu 0000:03:00.0: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[ 216.143893] amdgpu 0000:03:00.0: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[ 216.143894] amdgpu 0000:03:00.0: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[ 216.143894] amdgpu 0000:03:00.0: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[ 216.143895] amdgpu 0000:03:00.0: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[ 216.143895] amdgpu 0000:03:00.0: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[ 216.143896] amdgpu 0000:03:00.0: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[ 216.143896] amdgpu 0000:03:00.0: ring sdma0 uses VM inv eng 0 on hub 1
[ 216.143897] amdgpu 0000:03:00.0: ring sdma1 uses VM inv eng 1 on hub 1
[ 216.143897] amdgpu 0000:03:00.0: ring uvd_0 uses VM inv eng 4 on hub 1
[ 216.143898] amdgpu 0000:03:00.0: ring uvd_enc_0.0 uses VM inv eng 5 on hub 1
[ 216.143898] amdgpu 0000:03:00.0: ring uvd_enc_0.1 uses VM inv eng 6 on hub 1
[ 216.143899] amdgpu 0000:03:00.0: ring vce0 uses VM inv eng 7 on hub 1
[ 216.143899] amdgpu 0000:03:00.0: ring vce1 uses VM inv eng 8 on hub 1
[ 216.143900] amdgpu 0000:03:00.0: ring vce2 uses VM inv eng 9 on hub 1
[ 216.146937] [drm] recover vram bo from shadow start
[ 216.147238] [drm] recover vram bo from shadow done
[ 216.147239] [drm] Skip scheduling IBs!
[ 216.147239] [drm] Skip scheduling IBs!
[ 216.147245] [drm] Skip scheduling IBs!
[ 216.147247] [drm] Skip scheduling IBs!
[ 216.147248] [drm] Skip scheduling IBs!
[ 216.147249] [drm] Skip scheduling IBs!
[ 216.147250] [drm] Skip scheduling IBs!
[ 216.147291] [drm] Skip scheduling IBs!
[ 216.147293] [drm] Skip scheduling IBs!
[ 216.147299] [drm] Skip scheduling IBs!
[ 216.147300] amdgpu 0000:03:00.0: GPU reset(2) succeeded!
[ 216.147304] [drm] Skip scheduling IBs!
[ 216.147305] [drm] Skip scheduling IBs!
[ 216.147306] [drm] Skip scheduling IBs!
[ 216.147307] [drm] Skip scheduling IBs!
[ 216.147328] [drm] Skip scheduling IBs!
[ 216.147330] [drm] Skip scheduling IBs!
[ 216.147331] [drm] Skip scheduling IBs!
[ 216.147331] [drm] Skip scheduling IBs!
[ 216.147332] [drm] Skip scheduling IBs!
[ 216.147334] [drm] Skip scheduling IBs!
[ 216.147334] [drm] Skip scheduling IBs!
[ 216.147336] [drm] Skip scheduling IBs!
[ 216.147337] [drm] Skip scheduling IBs!
[ 216.147338] [drm] Skip scheduling IBs!
[ 216.147339] [drm] Skip scheduling IBs!
[ 216.147339] [drm] Skip scheduling IBs!
[ 216.147340] [drm] Skip scheduling IBs!
[ 216.147341] [drm] Skip scheduling IBs!
[ 216.147342] [drm] Skip scheduling IBs!
[ 216.147343] [drm] Skip scheduling IBs!
[ 216.147344] [drm] Skip scheduling IBs!
[ 216.147345] [drm] Skip scheduling IBs!
[ 216.147346] [drm] Skip scheduling IBs!
[ 216.147347] [drm] Skip scheduling IBs!
[ 216.147348] [drm] Skip scheduling IBs!
[ 216.147349] [drm] Skip scheduling IBs!
[ 216.147349] [drm] Skip scheduling IBs!
[ 216.147350] [drm] Skip scheduling IBs!
[ 216.147352] [drm] Skip scheduling IBs!

* Re: Waiting for fences timed out on MacBook Pro 2019
From: Tomasz Moń @ 2022-02-06  8:17 UTC
  To: amd-gfx list

On Mon, Jul 12, 2021 at 11:56 AM Tomasz Moń <desowin@gmail.com> wrote:
> I am having trouble getting Linux to run on MacBook Pro 2019 with
> Radeon Pro Vega 20 4 GB. Basically as soon as graphical user interface
> starts, the whole system freezes. This happens with every Linux kernel
> version I have tried over the last few months, including 5.13.

It is significantly better on 5.17-rc2. That is, the whole system no
longer freezes; only the screen keeps blinking and showing visual
artifacts. The graphical desktop is not usable, but switching between
virtual terminals works just fine.

dmesg | grep amdgpu shows:
[    2.310680] [drm] amdgpu kernel modesetting enabled.
[    2.310888] amdgpu: CRAT table not found
[    2.310891] amdgpu: Virtual CRAT table created for CPU
[    2.310902] amdgpu: Topology: Add CPU node
[    2.310966] fb0: switching to amdgpu from EFI VGA
[    2.311069] amdgpu 0000:03:00.0: vgaarb: deactivate vga console
[    2.311161] amdgpu 0000:03:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[    2.322968] amdgpu 0000:03:00.0: amdgpu: Fetched VBIOS from VFCT
[    2.322972] amdgpu: ATOM BIOS: 113-D20601MA0T-016
[    2.325980] amdgpu 0000:03:00.0: BAR 2: releasing [mem 0xc0000000-0xc01fffff 64bit pref]
[    2.325983] amdgpu 0000:03:00.0: BAR 0: releasing [mem 0xb0000000-0xbfffffff 64bit pref]
[    2.326015] amdgpu 0000:03:00.0: BAR 0: assigned [mem 0x4100000000-0x41ffffffff 64bit pref]
[    2.326022] amdgpu 0000:03:00.0: BAR 2: assigned [mem 0x4080000000-0x40801fffff 64bit pref]
[    2.326075] amdgpu 0000:03:00.0: amdgpu: VRAM: 4080M 0x000000F400000000 - 0x000000F4FEFFFFFF (4080M used)
[    2.326078] amdgpu 0000:03:00.0: amdgpu: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
[    2.326080] amdgpu 0000:03:00.0: amdgpu: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
[    2.326144] [drm] amdgpu: 4080M of VRAM memory ready
[    2.326145] [drm] amdgpu: 4080M of GTT memory ready.
[    2.330452] amdgpu 0000:03:00.0: amdgpu: PSP runtime database doesn't exist
[    2.330457] amdgpu: hwmgr_sw_init smu backed is vega12_smu
[    3.169108] snd_hda_intel 0000:03:00.1: bound 0000:03:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
[    4.427470] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[    4.462468] amdgpu: HMM registered 4080MB device memory
[    4.462492] amdgpu: SRAT table not found
[    4.462492] amdgpu: Virtual CRAT table created for GPU
[    4.462567] amdgpu: Topology: Add dGPU node [0x69af:0x1002]
[    4.462572] kfd kfd: amdgpu: added device 1002:69af
[    4.462587] amdgpu 0000:03:00.0: amdgpu: SE 4, SH per SE 1, CU per SH 5, active_cu_number 20
[    4.462674] amdgpu 0000:03:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
[    4.462676] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[    4.462678] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[    4.462679] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[    4.462680] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[    4.462682] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[    4.462683] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[    4.462684] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[    4.462685] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[    4.462686] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[    4.462688] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
[    4.462689] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 1 on hub 1
[    4.462690] amdgpu 0000:03:00.0: amdgpu: ring uvd_0 uses VM inv eng 4 on hub 1
[    4.462691] amdgpu 0000:03:00.0: amdgpu: ring uvd_enc_0.0 uses VM inv eng 5 on hub 1
[    4.462693] amdgpu 0000:03:00.0: amdgpu: ring uvd_enc_0.1 uses VM inv eng 6 on hub 1
[    4.462694] amdgpu 0000:03:00.0: amdgpu: ring vce0 uses VM inv eng 7 on hub 1
[    4.462695] amdgpu 0000:03:00.0: amdgpu: ring vce1 uses VM inv eng 8 on hub 1
[    4.462696] amdgpu 0000:03:00.0: amdgpu: ring vce2 uses VM inv eng 9 on hub 1
[    4.469544] [drm] Initialized amdgpu 3.44.0 20150101 for 0000:03:00.0 on minor 0
[    4.474424] fbcon: amdgpudrmfb (fb0) is primary device
[    5.547836] amdgpu 0000:03:00.0: [drm] fb0: amdgpudrmfb frame buffer device
[    5.636489] audit: type=1130 audit(1644133454.781:45): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-backlight@backlight:amdgpu_bl0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   24.927611] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[   24.927611] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[   30.057616] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=895, emitted seq=897
[   30.057933] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 722 thread Xorg:cs0 pid 723
[   30.058217] amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
[   34.058227] amdgpu 0000:03:00.0: amdgpu: failed to suspend display audio
[   34.314738] amdgpu 0000:03:00.0: amdgpu: MODE1 reset
[   34.314741] amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
[   34.314792] amdgpu 0000:03:00.0: amdgpu: GPU psp mode1 reset
[   34.892364] amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
[   36.910260] amdgpu 0000:03:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
[   36.910263] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[   36.910264] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[   36.910266] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[   36.910267] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[   36.910268] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[   36.910269] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[   36.910271] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[   36.910272] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[   36.910273] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[   36.910274] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
[   36.910276] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 1 on hub 1
[   36.910277] amdgpu 0000:03:00.0: amdgpu: ring uvd_0 uses VM inv eng 4 on hub 1
[   36.910278] amdgpu 0000:03:00.0: amdgpu: ring uvd_enc_0.0 uses VM inv eng 5 on hub 1
[   36.910279] amdgpu 0000:03:00.0: amdgpu: ring uvd_enc_0.1 uses VM inv eng 6 on hub 1
[   36.910281] amdgpu 0000:03:00.0: amdgpu: ring vce0 uses VM inv eng 7 on hub 1
[   36.910282] amdgpu 0000:03:00.0: amdgpu: ring vce1 uses VM inv eng 8 on hub 1
[   36.910283] amdgpu 0000:03:00.0: amdgpu: ring vce2 uses VM inv eng 9 on hub 1
[   36.913056] amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow start
[   36.913063] amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow done
[   36.913147] amdgpu 0000:03:00.0: amdgpu: GPU reset(2) succeeded!
[   36.915587] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[   36.916107] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[   36.916471] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[   36.916975] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[   36.917366] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[   36.917735] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[   38.243514] amdgpu_cs_ioctl: 101 callbacks suppressed
[   38.243517] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[   38.243999] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
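
The -125 in the parser failures above is a negated Linux errno value, as
is the ret=-5 from applesmc earlier in the thread. A small Python lookup
as a sketch: the table is hard-coded with only the two values seen here,
on the assumption that this machine uses the generic Linux errno
numbering, and the helper name errno_name is made up for illustration.

```python
# Generic Linux errno numbering; only the two values seen in this thread
# are listed. Assumes x86-64 follows the generic numbering.
LINUX_ERRNO = {5: "EIO", 125: "ECANCELED"}


def errno_name(ret):
    """Map a negative kernel return code such as -125 to its errno name."""
    return LINUX_ERRNO.get(-ret, "unknown")


for code in (-125, -5):
    print(code, errno_name(code))
```

ECANCELED would be consistent with command submissions being cancelled
after the GPU reset, though that reading is my assumption and not
something the log states.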


* Re: Waiting for fences timed out on MacBook Pro 2019
From: Tomasz Moń @ 2022-03-26 17:37 UTC
  To: amd-gfx list; +Cc: Alex Deucher, Tomasz Moń

On 2/6/22 09:17, Tomasz Moń wrote:
> On Mon, Jul 12, 2021 at 11:56 AM Tomasz Moń <desowin@gmail.com> wrote:
>> I am having trouble getting Linux to run on MacBook Pro 2019 with
>> Radeon Pro Vega 20 4 GB. Basically as soon as graphical user interface
>> starts, the whole system freezes. This happens with every Linux kernel
>> version I have tried over the last few months, including 5.13.
> 
> It is significantly better on 5.17-rc2. That is, the whole system is
> not frozen. just the screen keeps blinking and visual artifacts show.
> Graphical desktop is not usable, but switching between virtual
> terminals works just fine.

I have tried amd-drm-next-5.18-2022-03-25 with the following options:
   amdgpu.vm_debug=1 amdgpu.debug_evictions=1 amdgpu.dcdebugmask=0xffffffff amdgpu.dc=1 amdgpu.dcdebugmask=0xffffffff
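
Note that amdgpu.dcdebugmask appears twice in the option list above. A
quick Python sketch for spotting repeated parameters in such a list; the
function name repeated_params is hypothetical, and the naive whitespace
split assumes no quoted values containing spaces.

```python
from collections import Counter


def repeated_params(cmdline):
    """Return parameter names that occur more than once, with their counts."""
    # Take the part before '=' as the parameter name.
    names = [token.split("=", 1)[0] for token in cmdline.split()]
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}


options = (
    "amdgpu.vm_debug=1 amdgpu.debug_evictions=1 "
    "amdgpu.dcdebugmask=0xffffffff amdgpu.dc=1 amdgpu.dcdebugmask=0xffffffff"
)
print(repeated_params(options))
```

Both occurrences carry the same value here, so the repeat should be
harmless.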

Unfortunately, I am not familiar with this domain, so my understanding
is very limited. However, the UNLOAD_TA command returning 0x117 stands
out.

What does status 0x117 from UNLOAD_TA mean? Is documentation for these
commands publicly available?

[   24.931035] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[   24.931035] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[   29.847661] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=1712, emitted seq=1713
[   29.847970] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 572 thread Xorg:cs0 pid 573
[   29.848244] amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
[   32.746329] audit: type=1131 audit(1648314775.912:67): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   32.847653] audit: type=1334 audit(1648314776.015:68): prog-id=0 op=UNLOAD
[   32.847658] audit: type=1334 audit(1648314776.015:69): prog-id=0 op=UNLOAD
[   32.847660] audit: type=1334 audit(1648314776.015:70): prog-id=0 op=UNLOAD
[   33.848255] amdgpu 0000:03:00.0: amdgpu: failed to suspend display audio
[   33.848260] ------------[ cut here ]------------
[   33.848261] Evicting all processes
[   33.848276] WARNING: CPU: 10 PID: 469 at drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:1888 kfd_suspend_all_processes+0xfa/0x110 [amdgpu]
[   33.848554] Modules linked in: amdgpu drm_ttm_helper gpu_sched
[   33.848558] CPU: 10 PID: 469 Comm: kworker/u32:4 Not tainted 5.17.0-rc6-amd #1 771afb710c57d59790c0f2362731ed3ffe6af1f8
[   33.848561] Hardware name: Apple Inc. MacBookPro15,3/Mac-1E7E29AD0135F9BC, BIOS 1731.100.130.0.0 (iBridge: 19.16.14242.0.0,0) 02/15/2022
[   33.848563] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
[   33.848568] RIP: 0010:kfd_suspend_all_processes+0xfa/0x110 [amdgpu]
[   33.848817] Code: c7 c7 40 9b 85 c0 41 5c 41 5d e9 c1 e8 98 f1 be 03 00 00 00 e8 27 16 e3 f1 e9 5b ff ff ff 48 c7 c7 4a cc 6f c0 e8 6d 2c 8c f2 <0f> 0b e9 26 ff ff ff 0f 0b eb c5 66 66 2e 0f 1f 84 00 00 00 00 00
[   33.848819] RSP: 0018:ffffade780a83d08 EFLAGS: 00010246
[   33.848821] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   33.848823] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[   33.848824] RBP: ffff8d1fc4d68000 R08: 0000000000000000 R09: 0000000000000000
[   33.848825] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8d1fd79e0000
[   33.848826] R13: 0000000000000000 R14: ffff8d1fc2a270d0 R15: ffff8d1fd79e0000
[   33.848828] FS:  0000000000000000(0000) GS:ffff8d271ec80000(0000) knlGS:0000000000000000
[   33.848830] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   33.848831] CR2: 00007fc180bfb200 CR3: 00000004a8c10003 CR4: 00000000003706e0
[   33.848833] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   33.848834] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   33.848835] Call Trace:
[   33.848837]  <TASK>
[   33.848839]  kgd2kfd_suspend.part.0+0x3d/0x40 [amdgpu 9abbe9b6fc2429e6d465345bc384def6ac94e6a9]
[   33.849084]  kgd2kfd_pre_reset+0x43/0x60 [amdgpu 9abbe9b6fc2429e6d465345bc384def6ac94e6a9]
[   33.849326]  amdgpu_device_gpu_recover_imp.cold+0x120/0x8e9 [amdgpu 9abbe9b6fc2429e6d465345bc384def6ac94e6a9]
[   33.849628]  amdgpu_job_timedout+0x18f/0x1c0 [amdgpu 9abbe9b6fc2429e6d465345bc384def6ac94e6a9]
[   33.849887]  ? finish_task_switch.isra.0+0xaa/0x290
[   33.849892]  drm_sched_job_timedout+0x77/0x120 [gpu_sched 721b514943d9cddbec8b63d5dd19fd642806bd31]
[   33.849898]  process_one_work+0x1e2/0x3b0
[   33.849901]  ? rescuer_thread+0x3a0/0x3a0
[   33.849903]  worker_thread+0x50/0x3a0
[   33.849905]  ? rescuer_thread+0x3a0/0x3a0
[   33.849906]  kthread+0xd6/0x100
[   33.849910]  ? kthread_complete_and_exit+0x20/0x20
[   33.849913]  ret_from_fork+0x1f/0x30
[   33.849918]  </TASK>
[   33.849919] ---[ end trace 0000000000000000 ]---
[   34.072819] [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x117)
[   34.072824] [drm] free PSP TMR buffer
[   34.108151] CPU: 12 PID: 469 Comm: kworker/u32:4 Tainted: G        W         5.17.0-rc6-amd #1 771afb710c57d59790c0f2362731ed3ffe6af1f8
[   34.108156] Hardware name: Apple Inc. MacBookPro15,3/Mac-1E7E29AD0135F9BC, BIOS 1731.100.130.0.0 (iBridge: 19.16.14242.0.0,0) 02/15/2022
[   34.108158] Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched]
[   34.108165] Call Trace:
[   34.108167]  <TASK>
[   34.108168]  dump_stack_lvl+0x48/0x66
[   34.108174]  amdgpu_do_asic_reset+0x28/0x45c [amdgpu 9abbe9b6fc2429e6d465345bc384def6ac94e6a9]
[   34.108523]  amdgpu_device_gpu_recover_imp.cold+0x60e/0x8e9 [amdgpu 9abbe9b6fc2429e6d465345bc384def6ac94e6a9]
[   34.108835]  amdgpu_job_timedout+0x18f/0x1c0 [amdgpu 9abbe9b6fc2429e6d465345bc384def6ac94e6a9]
[   34.109109]  ? finish_task_switch.isra.0+0xaa/0x290
[   34.109114]  drm_sched_job_timedout+0x77/0x120 [gpu_sched 721b514943d9cddbec8b63d5dd19fd642806bd31]
[   34.109120]  process_one_work+0x1e2/0x3b0
[   34.109123]  ? rescuer_thread+0x3a0/0x3a0
[   34.109125]  worker_thread+0x50/0x3a0
[   34.109127]  ? rescuer_thread+0x3a0/0x3a0
[   34.109129]  kthread+0xd6/0x100
[   34.109132]  ? kthread_complete_and_exit+0x20/0x20
[   34.109135]  ret_from_fork+0x1f/0x30
[   34.109141]  </TASK>
[   34.109144] amdgpu 0000:03:00.0: amdgpu: MODE1 reset
[   34.109146] amdgpu 0000:03:00.0: amdgpu: GPU mode1 reset
[   34.109196] amdgpu 0000:03:00.0: amdgpu: GPU psp mode1 reset
[   34.637593] [drm] psp mode1 reset succeed
[   34.708779] amdgpu 0000:03:00.0: amdgpu: GPU reset succeeded, trying to resume
[   34.708884] [drm] PCIE GART of 512M enabled.
[   34.708886] [drm] PTB located at 0x000000F400000000
[   34.708906] [drm] VRAM is lost due to GPU reset!
[   34.708907] [drm] PSP is resuming...
[   34.896778] [drm] reserve 0x400000 from 0xf4fec00000 for PSP TMR
[   36.604868] [drm] kiq ring mec 2 pipe 1 q 0
[   36.626895] [drm] UVD and UVD ENC initialized successfully.
[   36.727142] [drm] VCE initialized successfully.
[   36.727148] amdgpu 0000:03:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
[   36.727151] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[   36.727153] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[   36.727154] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[   36.727155] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[   36.727157] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[   36.727158] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[   36.727159] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[   36.727160] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[   36.727162] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[   36.727163] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
[   36.727164] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 1 on hub 1
[   36.727166] amdgpu 0000:03:00.0: amdgpu: ring uvd_0 uses VM inv eng 4 on hub 1
[   36.727167] amdgpu 0000:03:00.0: amdgpu: ring uvd_enc_0.0 uses VM inv eng 5 on hub 1
[   36.727168] amdgpu 0000:03:00.0: amdgpu: ring uvd_enc_0.1 uses VM inv eng 6 on hub 1
[   36.727169] amdgpu 0000:03:00.0: amdgpu: ring vce0 uses VM inv eng 7 on hub 1
[   36.727170] amdgpu 0000:03:00.0: amdgpu: ring vce1 uses VM inv eng 8 on hub 1
[   36.727171] amdgpu 0000:03:00.0: amdgpu: ring vce2 uses VM inv eng 9 on hub 1
[   36.729758] amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow start
[   36.729762] amdgpu 0000:03:00.0: amdgpu: recover vram bo from shadow done
[   36.729765] [drm] Skip scheduling IBs!
[   36.729785] amdgpu 0000:03:00.0: amdgpu: GPU reset(2) succeeded!
[   36.729811] [drm] Skip scheduling IBs!
...
[   36.731540] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
