* [Bug 108644] driver/card crashes with latest polaris11 firmware
@ 2018-11-03 15:32 bugzilla-daemon
  2018-11-05 16:31 ` bugzilla-daemon
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: bugzilla-daemon @ 2018-11-03 15:32 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 26147 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108644

            Bug ID: 108644
           Summary: driver/card crashes with latest polaris11 firmware
           Product: DRI
           Version: unspecified
          Hardware: PowerPC
                OS: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: dan@danny.cz
                CC: bcrocker@redhat.com

Created attachment 142356
  --> https://bugs.freedesktop.org/attachment.cgi?id=142356&action=edit
dmesg-20181102

After updating to the latest polaris11 firmware (from
linux-firmware-20181008-88.gitc6b6265d.fc28.noarch) I'm experiencing
driver/card crashes. This is on a Power9 system running a 4.19.0 kernel. There
were no such crashes with the previous firmware and all 4.19-pre kernels (and
even earlier ones).

...
lis 02 11:54:00 talos.danny.cz kernel: EEH: Frozen PHB#0-PE#0 detected
lis 02 11:54:00 talos.danny.cz kernel: EEH: PE location: N/A, PHB location: N/A
lis 02 11:54:00 talos.danny.cz kernel: CPU: 35 PID: 3250 Comm: InputThread Not
tainted 4.19.0-1.fc30.op.1.ppc64le #1
lis 02 11:54:00 talos.danny.cz kernel: Call Trace:
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bf810] [c000000000be3f9c]
dump_stack+0xb0/0xf4 (unreliable)
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bf850] [c000000000040738]
eeh_dev_check_failure+0x4a8/0x5d0
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bf8f0] [c0000000000408ec]
eeh_check_failure+0x8c/0xd0
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bf930] [c00800000de61998]
amdgpu_mm_rreg+0x240/0x2a0 [amdgpu]
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bf990] [c00800000df22a68]
dce_v11_0_lock_cursor+0x50/0xf0 [amdgpu]
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bf9d0] [c00800000df236a0]
dce_v11_0_crtc_cursor_move+0x38/0x80 [amdgpu]
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bfa10] [c00800000d11f248]
drm_mode_cursor_common+0x1e0/0x2c0 [drm]
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bfae0] [c00800000d11f6c8]
drm_mode_cursor_ioctl+0x50/0x70 [drm]
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bfb30] [c00800000d0f7f14]
drm_ioctl_kernel+0xdc/0x170 [drm]
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bfb90] [c00800000d0f8414]
drm_ioctl+0x20c/0x430 [drm]
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bfcd0] [c00800000de60078]
amdgpu_drm_ioctl+0x70/0xd0 [amdgpu]
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bfd20] [c00000000040f0f4]
do_vfs_ioctl+0xd4/0x8d0
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bfdc0] [c00000000040f9b4]
ksys_ioctl+0xc4/0x110
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bfe10] [c00000000040fa28]
sys_ioctl+0x28/0x80
lis 02 11:54:00 talos.danny.cz kernel: [c0002006af9bfe30] [c00000000000b9e4]
system_call+0x5c/0x70
lis 02 11:54:00 talos.danny.cz kernel: EEH: Detected PCI bus error on
PHB#0-PE#0
lis 02 11:54:00 talos.danny.cz kernel: EEH: This PCI device has failed 1 times
in the last hour and will be permanently disabled after 5 failures.
lis 02 11:54:00 talos.danny.cz kernel: EEH: Notify device drivers to shutdown
lis 02 11:54:00 talos.danny.cz kernel: EEH: Beginning: 'error_detected(IO
frozen)'
lis 02 11:54:00 talos.danny.cz kernel: EEH: PE#0 (PCI 0000:01:00.1): driver not
EEH aware
lis 02 11:54:00 talos.danny.cz kernel: EEH: PE#0 (PCI 0000:01:00.0): driver not
EEH aware
lis 02 11:54:00 talos.danny.cz kernel: EEH: Finished:'error_detected(IO
frozen)' with aggregate recovery state:'none'
lis 02 11:54:00 talos.danny.cz kernel: EEH: Collect temporary log
lis 02 11:54:00 talos.danny.cz kernel: EEH: of node=0000:01:00.1
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI device/vendor: aae01002
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI cmd/status register: 00100546
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E capabilities and status
follow:
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E 00: 0012a010 00008fa1
00002930 00400883 
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E 10: 10810000 00000000
00000000 00000000 
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E 20: 00000000 
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E AER capability register set
follows:
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E AER 00: 32820001 00000000
00000000 00462030 
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E AER 10: 00000000 00002000
000001e0 00000000 
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E AER 20: 00000000 00000000
00000000 00000000 
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E AER 30: 00000000 00000000 
lis 02 11:54:00 talos.danny.cz kernel: EEH: of node=0000:01:00.0
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI device/vendor: 67e31002
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI cmd/status register: 00100546
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E capabilities and status
follow:
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E 00: 0012a010 00008fa1
00002930 00400883 
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E 10: 10810000 00000000
00000000 00000000 
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E 20: 00000000 
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E AER capability register set
follows:
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E AER 00: 20020001 00000000
00000000 00462030 
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E AER 10: 00000000 00002000
000001e0 00000000 
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E AER 20: 00000000 00000000
00000000 00000000 
lis 02 11:54:00 talos.danny.cz kernel: EEH: PCI-E AER 30: 00000000 00000000 
lis 02 11:54:00 talos.danny.cz kernel: PHB4 PHB#0 Diag-data (Version: 1)
lis 02 11:54:00 talos.danny.cz kernel: brdgCtl:    00000002
lis 02 11:54:00 talos.danny.cz kernel: RootSts:    00060020 00402000 a0810008
00100107 00000800
lis 02 11:54:00 talos.danny.cz kernel: PhbSts:     0000001c00000000
0000001c00000000
lis 02 11:54:00 talos.danny.cz kernel: Lem:        0000000100000080
0000000000000000 0000000000000080
lis 02 11:54:00 talos.danny.cz kernel: PhbErr:     0000028000000000
0000020000000000 2148000098000240 a008400000000000
lis 02 11:54:00 talos.danny.cz kernel: RxeTceErr:  2000000000000000
2000000000000000 c000000000000000 0000000000000000
lis 02 11:54:00 talos.danny.cz kernel: PblErr:     0000000000020000
0000000000020000 0000000000000000 0000000000000000
lis 02 11:54:00 talos.danny.cz kernel: RegbErr:    0000004000000000
0000004000000000 8800001c00000000 0000000000000200
lis 02 11:54:00 talos.danny.cz kernel: PE[000] A/B: 8300b03800000000
8000000000000000
lis 02 11:54:00 talos.danny.cz kernel: EEH: Reset with hotplug activity
lis 02 11:54:00 talos.danny.cz kernel: iommu: Removing device 0000:01:00.1 from
group 0
lis 02 11:54:00 talos.danny.cz kernel: pci 0000:01:00.1: Dropping the link to
0000:01:00.0
lis 02 11:54:00 talos.danny.cz kernel: [drm] amdgpu: finishing device.
lis 02 11:54:04 talos.danny.cz kernel: EEH: 2100000 reads ignored for
recovering device at location=unknown driver=amdgpu pci addr=0000:01:00.0
lis 02 11:54:04 talos.danny.cz kernel: EEH: Might be infinite loop in amdgpu
driver
lis 02 11:54:04 talos.danny.cz kernel: CPU: 8 PID: 335 Comm: eehd Not tainted
4.19.0-1.fc30.op.1.ppc64le #1
lis 02 11:54:04 talos.danny.cz kernel: Call Trace:
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8516ed0] [c000000000be3f9c]
dump_stack+0xb0/0xf4 (unreliable)
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8516f10] [c000000000040640]
eeh_dev_check_failure+0x3b0/0x5d0
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8516fb0] [c0000000000408ec]
eeh_check_failure+0x8c/0xd0
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8516ff0] [c00800000de61998]
amdgpu_mm_rreg+0x240/0x2a0 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517050] [c00800000de68904]
cail_reg_read+0x2c/0x50 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517070] [c00800000de7123c]
atom_get_src_int+0x104/0xa00 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517120] [c00800000de72b10]
atom_op_test+0xd8/0x1d0 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f85171b0] [c00800000de74d7c]
amdgpu_atom_execute_table_locked+0x204/0x380 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f85172a0] [c00800000de75020]
atom_op_calltable+0x128/0x1e0 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517320] [c00800000de74d7c]
amdgpu_atom_execute_table_locked+0x204/0x380 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517410] [c00800000de758f8]
amdgpu_atom_execute_table+0x70/0xb0 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517450] [c00800000de96fc0]
amdgpu_atombios_encoder_setup_dig_transmitter+0x1d8/0xc10 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517540] [c00800000de97e58]
amdgpu_atombios_encoder_dpms+0x1a0/0x5a0 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f85175d0] [c00800000df255a4]
dce_v11_0_encoder_disable+0x2c/0x160 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517640] [c00800000d4803e8]
drm_encoder_disable+0x60/0xc0 [drm_kms_helper]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517670] [c00800000d4804c8]
__drm_helper_disable_unused_functions+0x80/0x160 [drm_kms_helper]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f85176b0] [c00800000d481ad0]
drm_crtc_helper_set_config+0x978/0xb70 [drm_kms_helper]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f85177c0] [c00800000de7f958]
amdgpu_display_crtc_set_config+0x70/0x1c0 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517800] [c00800000d0ff274]
__drm_mode_set_config_internal+0xac/0x1a0 [drm]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517850] [c00800000d0ff450]
drm_crtc_force_disable+0x88/0xa0 [drm]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f85178a0] [c00800000d0ff4e4]
drm_crtc_force_disable_all+0x7c/0x100 [drm]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f85178e0] [c00800000e0553f4]
amdgpu_device_fini+0xa0/0x628 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517990] [c00800000de67b04]
amdgpu_driver_unload_kms+0x6c/0x100 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f85179c0] [c00800000d0fa978]
drm_dev_unregister+0x80/0x170 [drm]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517a00] [c00800000de6055c]
amdgpu_pci_remove+0x34/0x80 [amdgpu]
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517a30] [c0000000006d91dc]
pci_device_remove+0x6c/0x120
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517a70] [c000000000790410]
device_release_driver_internal+0x290/0x370
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517ac0] [c0000000006cc718]
pci_stop_bus_device+0xb8/0x110
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517b00] [c0000000006cc918]
pci_stop_and_remove_bus_device+0x28/0x40
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517b30] [c000000000066ac0]
pci_hp_remove_devices+0x90/0x130
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517bc0] [c000000000045f40]
eeh_reset_device+0xa0/0x1f4
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517c50] [c0000000000455c8]
eeh_handle_normal_event+0x2b8/0x650
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517d10] [c000000000046710]
eeh_event_handler+0x1c0/0x1e0
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517dc0] [c00000000014900c]
kthread+0x1ac/0x1c0
lis 02 11:54:04 talos.danny.cz kernel: [c0000007f8517e30] [c00000000000bdd4]
ret_from_kernel_thread+0x5c/0x68
lis 02 11:54:05 talos.danny.cz kernel: [drm:atom_op_jump [amdgpu]] *ERROR*
atombios stuck in loop for more than 5secs aborting
lis 02 11:54:05 talos.danny.cz kernel: [drm:amdgpu_atom_execute_table_locked
[amdgpu]] *ERROR* atombios stuck executing D860 (len 824, WS 0, PS 0) @ 0xD9E0
lis 02 11:54:05 talos.danny.cz kernel: [drm:amdgpu_atom_execute_table_locked
[amdgpu]] *ERROR* atombios stuck executing D71A (len 326, WS 0, PS 0) @ 0xD80A
lis 02 11:54:07 talos.danny.cz kernel: EEH: 4200000 reads ignored for
recovering device at location=unknown driver=amdgpu pci addr=0000:01:00.0
lis 02 11:54:07 talos.danny.cz kernel: EEH: Might be infinite loop in amdgpu
driver
lis 02 11:54:07 talos.danny.cz kernel: CPU: 9 PID: 335 Comm: eehd Not tainted
4.19.0-1.fc30.op.1.ppc64le #1
lis 02 11:54:07 talos.danny.cz kernel: Call Trace:
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517120] [c000000000be3f9c]
dump_stack+0xb0/0xf4 (unreliable)
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517160] [c000000000040640]
eeh_dev_check_failure+0x3b0/0x5d0
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517200] [c0000000000408ec]
eeh_check_failure+0x8c/0xd0
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517240] [c00800000de61998]
amdgpu_mm_rreg+0x240/0x2a0 [amdgpu]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f85172a0] [c00800000de68904]
cail_reg_read+0x2c/0x50 [amdgpu]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f85172c0] [c00800000de7123c]
atom_get_src_int+0x104/0xa00 [amdgpu]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517370] [c00800000de72b10]
atom_op_test+0xd8/0x1d0 [amdgpu]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517400] [c00800000de74d7c]
amdgpu_atom_execute_table_locked+0x204/0x380 [amdgpu]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f85174f0] [c00800000de758f8]
amdgpu_atom_execute_table+0x70/0xb0 [amdgpu]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517530] [c00800000de6c154]
amdgpu_atombios_crtc_blank+0x4c/0x70 [amdgpu]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517560] [c00800000df23ff8]
dce_v11_0_crtc_dpms+0x170/0x1b0 [amdgpu]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f85175a0] [c00800000df28e50]
dce_v11_0_crtc_disable+0x38/0x2e0 [amdgpu]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517670] [c00800000d48050c]
__drm_helper_disable_unused_functions+0xc4/0x160 [drm_kms_helper]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f85176b0] [c00800000d481ad0]
drm_crtc_helper_set_config+0x978/0xb70 [drm_kms_helper]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f85177c0] [c00800000de7f958]
amdgpu_display_crtc_set_config+0x70/0x1c0 [amdgpu]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517800] [c00800000d0ff274]
__drm_mode_set_config_internal+0xac/0x1a0 [drm]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517850] [c00800000d0ff450]
drm_crtc_force_disable+0x88/0xa0 [drm]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f85178a0] [c00800000d0ff4e4]
drm_crtc_force_disable_all+0x7c/0x100 [drm]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f85178e0] [c00800000e0553f4]
amdgpu_device_fini+0xa0/0x628 [amdgpu]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517990] [c00800000de67b04]
amdgpu_driver_unload_kms+0x6c/0x100 [amdgpu]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f85179c0] [c00800000d0fa978]
drm_dev_unregister+0x80/0x170 [drm]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517a00] [c00800000de6055c]
amdgpu_pci_remove+0x34/0x80 [amdgpu]
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517a30] [c0000000006d91dc]
pci_device_remove+0x6c/0x120
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517a70] [c000000000790410]
device_release_driver_internal+0x290/0x370
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517ac0] [c0000000006cc718]
pci_stop_bus_device+0xb8/0x110
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517b00] [c0000000006cc918]
pci_stop_and_remove_bus_device+0x28/0x40
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517b30] [c000000000066ac0]
pci_hp_remove_devices+0x90/0x130
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517bc0] [c000000000045f40]
eeh_reset_device+0xa0/0x1f4
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517c50] [c0000000000455c8]
eeh_handle_normal_event+0x2b8/0x650
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517d10] [c000000000046710]
eeh_event_handler+0x1c0/0x1e0
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517dc0] [c00000000014900c]
kthread+0x1ac/0x1c0
lis 02 11:54:07 talos.danny.cz kernel: [c0000007f8517e30] [c00000000000bdd4]
ret_from_kernel_thread+0x5c/0x68
lis 02 11:54:08 talos.danny.cz kernel: [drm:amdgpu_job_timedout [amdgpu]]
*ERROR* ring gfx timeout, signaled seq=134395, emitted seq=134397
lis 02 11:54:08 talos.danny.cz kernel: [drm:amdgpu_job_timedout [amdgpu]]
*ERROR* ring sdma1 timeout, signaled seq=54428, emitted seq=54430
lis 02 11:54:08 talos.danny.cz kernel: [drm] GPU recovery disabled.
lis 02 11:54:08 talos.danny.cz kernel: [drm] GPU recovery disabled.
lis 02 11:54:10 talos.danny.cz kernel: [drm:atom_op_jump [amdgpu]] *ERROR*
atombios stuck in loop for more than 5secs aborting
lis 02 11:54:10 talos.danny.cz kernel: [drm:amdgpu_atom_execute_table_locked
[amdgpu]] *ERROR* atombios stuck executing C1E0 (len 116, WS 0, PS 0) @ 0xC22D
lis 02 11:54:10 talos.danny.cz kernel: EEH: 6300000 reads ignored for
recovering device at location=unknown driver=amdgpu pci addr=0000:01:00.0
lis 02 11:54:10 talos.danny.cz kernel: EEH: Might be infinite loop in amdgpu
driver
lis 02 11:54:10 talos.danny.cz kernel: CPU: 11 PID: 335 Comm: eehd Not tainted
4.19.0-1.fc30.op.1.ppc64le #1
lis 02 11:54:10 talos.danny.cz kernel: Call Trace:
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517120] [c000000000be3f9c]
dump_stack+0xb0/0xf4 (unreliable)
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517160] [c000000000040640]
eeh_dev_check_failure+0x3b0/0x5d0
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517200] [c0000000000408ec]
eeh_check_failure+0x8c/0xd0
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517240] [c00800000de61998]
amdgpu_mm_rreg+0x240/0x2a0 [amdgpu]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f85172a0] [c00800000de68904]
cail_reg_read+0x2c/0x50 [amdgpu]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f85172c0] [c00800000de7123c]
atom_get_src_int+0x104/0xa00 [amdgpu]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517370] [c00800000de72b10]
atom_op_test+0xd8/0x1d0 [amdgpu]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517400] [c00800000de74d7c]
amdgpu_atom_execute_table_locked+0x204/0x380 [amdgpu]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f85174f0] [c00800000de758f8]
amdgpu_atom_execute_table+0x70/0xb0 [amdgpu]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517530] [c00800000de6c0e0]
amdgpu_atombios_crtc_enable+0x48/0x70 [amdgpu]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517560] [c00800000df24014]
dce_v11_0_crtc_dpms+0x18c/0x1b0 [amdgpu]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f85175a0] [c00800000df28e50]
dce_v11_0_crtc_disable+0x38/0x2e0 [amdgpu]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517670] [c00800000d48050c]
__drm_helper_disable_unused_functions+0xc4/0x160 [drm_kms_helper]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f85176b0] [c00800000d481ad0]
drm_crtc_helper_set_config+0x978/0xb70 [drm_kms_helper]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f85177c0] [c00800000de7f958]
amdgpu_display_crtc_set_config+0x70/0x1c0 [amdgpu]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517800] [c00800000d0ff274]
__drm_mode_set_config_internal+0xac/0x1a0 [drm]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517850] [c00800000d0ff450]
drm_crtc_force_disable+0x88/0xa0 [drm]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f85178a0] [c00800000d0ff4e4]
drm_crtc_force_disable_all+0x7c/0x100 [drm]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f85178e0] [c00800000e0553f4]
amdgpu_device_fini+0xa0/0x628 [amdgpu]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517990] [c00800000de67b04]
amdgpu_driver_unload_kms+0x6c/0x100 [amdgpu]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f85179c0] [c00800000d0fa978]
drm_dev_unregister+0x80/0x170 [drm]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517a00] [c00800000de6055c]
amdgpu_pci_remove+0x34/0x80 [amdgpu]
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517a30] [c0000000006d91dc]
pci_device_remove+0x6c/0x120
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517a70] [c000000000790410]
device_release_driver_internal+0x290/0x370
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517ac0] [c0000000006cc718]
pci_stop_bus_device+0xb8/0x110
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517b00] [c0000000006cc918]
pci_stop_and_remove_bus_device+0x28/0x40
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517b30] [c000000000066ac0]
pci_hp_remove_devices+0x90/0x130
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517bc0] [c000000000045f40]
eeh_reset_device+0xa0/0x1f4
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517c50] [c0000000000455c8]
eeh_handle_normal_event+0x2b8/0x650
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517d10] [c000000000046710]
eeh_event_handler+0x1c0/0x1e0
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517dc0] [c00000000014900c]
kthread+0x1ac/0x1c0
lis 02 11:54:10 talos.danny.cz kernel: [c0000007f8517e30] [c00000000000bdd4]
ret_from_kernel_thread+0x5c/0x68
lis 02 11:54:13 talos.danny.cz kernel: EEH: 8400000 reads ignored for
recovering device at location=unknown driver=amdgpu pci addr=0000:01:00.0
lis 02 11:54:13 talos.danny.cz kernel: EEH: Might be infinite loop in amdgpu
driver
lis 02 11:54:13 talos.danny.cz kernel: CPU: 11 PID: 335 Comm: eehd Not tainted
4.19.0-1.fc30.op.1.ppc64le #1
lis 02 11:54:13 talos.danny.cz kernel: Call Trace:
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517120] [c000000000be3f9c]
dump_stack+0xb0/0xf4 (unreliable)
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517160] [c000000000040640]
eeh_dev_check_failure+0x3b0/0x5d0
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517200] [c0000000000408ec]
eeh_check_failure+0x8c/0xd0
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517240] [c00800000de61998]
amdgpu_mm_rreg+0x240/0x2a0 [amdgpu]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f85172a0] [c00800000de68904]
cail_reg_read+0x2c/0x50 [amdgpu]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f85172c0] [c00800000de7123c]
atom_get_src_int+0x104/0xa00 [amdgpu]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517370] [c00800000de72b10]
atom_op_test+0xd8/0x1d0 [amdgpu]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517400] [c00800000de74d7c]
amdgpu_atom_execute_table_locked+0x204/0x380 [amdgpu]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f85174f0] [c00800000de758f8]
amdgpu_atom_execute_table+0x70/0xb0 [amdgpu]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517530] [c00800000de6c0e0]
amdgpu_atombios_crtc_enable+0x48/0x70 [amdgpu]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517560] [c00800000df24014]
dce_v11_0_crtc_dpms+0x18c/0x1b0 [amdgpu]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f85175a0] [c00800000df28e50]
dce_v11_0_crtc_disable+0x38/0x2e0 [amdgpu]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517670] [c00800000d48050c]
__drm_helper_disable_unused_functions+0xc4/0x160 [drm_kms_helper]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f85176b0] [c00800000d481ad0]
drm_crtc_helper_set_config+0x978/0xb70 [drm_kms_helper]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f85177c0] [c00800000de7f958]
amdgpu_display_crtc_set_config+0x70/0x1c0 [amdgpu]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517800] [c00800000d0ff274]
__drm_mode_set_config_internal+0xac/0x1a0 [drm]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517850] [c00800000d0ff450]
drm_crtc_force_disable+0x88/0xa0 [drm]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f85178a0] [c00800000d0ff4e4]
drm_crtc_force_disable_all+0x7c/0x100 [drm]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f85178e0] [c00800000e0553f4]
amdgpu_device_fini+0xa0/0x628 [amdgpu]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517990] [c00800000de67b04]
amdgpu_driver_unload_kms+0x6c/0x100 [amdgpu]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f85179c0] [c00800000d0fa978]
drm_dev_unregister+0x80/0x170 [drm]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517a00] [c00800000de6055c]
amdgpu_pci_remove+0x34/0x80 [amdgpu]
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517a30] [c0000000006d91dc]
pci_device_remove+0x6c/0x120
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517a70] [c000000000790410]
device_release_driver_internal+0x290/0x370
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517ac0] [c0000000006cc718]
pci_stop_bus_device+0xb8/0x110
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517b00] [c0000000006cc918]
pci_stop_and_remove_bus_device+0x28/0x40
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517b30] [c000000000066ac0]
pci_hp_remove_devices+0x90/0x130
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517bc0] [c000000000045f40]
eeh_reset_device+0xa0/0x1f4
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517c50] [c0000000000455c8]
eeh_handle_normal_event+0x2b8/0x650
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517d10] [c000000000046710]
eeh_event_handler+0x1c0/0x1e0
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517dc0] [c00000000014900c]
kthread+0x1ac/0x1c0
lis 02 11:54:13 talos.danny.cz kernel: [c0000007f8517e30] [c00000000000bdd4]
ret_from_kernel_thread+0x5c/0x68
lis 02 11:54:15 talos.danny.cz kernel: [drm:atom_op_jump [amdgpu]] *ERROR*
atombios stuck in loop for more than 5secs aborting
lis 02 11:54:15 talos.danny.cz kernel: [drm:amdgpu_atom_execute_table_locked
[amdgpu]] *ERROR* atombios stuck executing C254 (len 62, WS 0, PS 0) @ 0xC270
lis 02 11:57:13 talos.danny.cz kernel: alsactl[2517]: segfault (11) at 28 nip
122708cfc lr 122708db0 code 1 in alsactl[1226f0000+20000]
lis 02 11:57:13 talos.danny.cz kernel: alsactl[2517]: code: 4bfee1f5 e8410018
00000000 01000000 00000280 3c4c0003 3842f220 7c0802a6 
lis 02 11:57:13 talos.danny.cz kernel: alsactl[2517]: code: fbc1fff0 f8010010
f821ffc1 7c7e1b78 <81240000> 2f890000 409d0048 fba10028 
lis 02 12:01:43 talos.danny.cz kernel: opal-power: Poweroff requested
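
The `EEH: PCI device/vendor:` values in the log above pack two 16-bit IDs into one 32-bit word: in PCI configuration space the vendor ID sits in the low 16 bits of the first dword and the device ID in the high 16 bits. A minimal Python sketch for splitting them is shown here; the function name `split_device_vendor` is hypothetical and only illustrates the bit layout, it is not part of any kernel tooling.

```python
def split_device_vendor(word_hex):
    """Split a 32-bit PCI device/vendor word (hex string) into (device, vendor)."""
    word = int(word_hex, 16)
    vendor = word & 0xFFFF          # low 16 bits: vendor ID
    device = (word >> 16) & 0xFFFF  # high 16 bits: device ID
    return device, vendor


# Values copied from the two "PCI device/vendor" lines in the log above.
for raw in ("aae01002", "67e31002"):
    device, vendor = split_device_vendor(raw)
    print(f"{raw}: vendor=0x{vendor:04x} device=0x{device:04x}")
```

Both functions of the card (0000:01:00.0 and 0000:01:00.1) decode to vendor 0x1002, the PCI vendor ID assigned to AMD/ATI, with different device IDs for each function.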

-- 
You are receiving this mail because:
You are the assignee for the bug.

[-- Attachment #1.2: Type: text/html, Size: 27656 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [Bug 108644] driver/card crashes with latest polaris11 firmware
  2018-11-03 15:32 [Bug 108644] driver/card crashes with latest polaris11 firmware bugzilla-daemon
@ 2018-11-05 16:31 ` bugzilla-daemon
  2018-11-05 16:45 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2018-11-05 16:31 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 234 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108644

--- Comment #1 from Dan Horák <dan@danny.cz> ---
For the record, this is with a Radeon Pro WX 4100.


* [Bug 108644] driver/card crashes with latest polaris11 firmware
  2018-11-03 15:32 [Bug 108644] driver/card crashes with latest polaris11 firmware bugzilla-daemon
  2018-11-05 16:31 ` bugzilla-daemon
@ 2018-11-05 16:45 ` bugzilla-daemon
  2018-11-05 17:26 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2018-11-05 16:45 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 567 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108644

--- Comment #2 from Alex Deucher <alexdeucher@gmail.com> ---
Does this patch help?
https://patchwork.freedesktop.org/patch/259364/

Can you bisect which firmware commit
(https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git)
caused the regression and then narrow down which firmware update causes the
issue?  I'd start with the smc firmware and then try the rlc, followed by the
CP (mec, me, pfp, ce).
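
The testing order suggested above can be sketched as a small Python helper that sorts candidate firmware files by component. This is only an illustration: the helper names are hypothetical, and the file names assume a `polaris11_<component>.bin` naming pattern, which should be checked against the actual linux-firmware tree before use.

```python
# Order suggested in the comment above: smc first, then rlc, then the
# CP blocks (mec, me, pfp, ce).
PRIORITY = ["smc", "rlc", "mec", "me", "pfp", "ce"]


def component(filename):
    """Return the PRIORITY entry a firmware file name belongs to, or None."""
    stem = filename[:-len(".bin")] if filename.endswith(".bin") else filename
    for token in stem.split("_"):
        tag = token.rstrip("0123456789")  # e.g. "mec2" -> "mec"
        if tag in PRIORITY:
            return tag
    return None


def suggested_order(filenames):
    """Keep only files with a known component, sorted by the suggested order."""
    known = [f for f in filenames if component(f) is not None]
    return sorted(known, key=lambda f: (PRIORITY.index(component(f)), f))


# Assumed file names, for illustration only.
candidates = [
    "polaris11_ce.bin",
    "polaris11_sdma.bin",
    "polaris11_me.bin",
    "polaris11_smc.bin",
    "polaris11_pfp.bin",
    "polaris11_rlc.bin",
    "polaris11_mec.bin",
]
for name in suggested_order(candidates):
    print(name)
```

Files whose component is not in the suggested list (here the assumed `polaris11_sdma.bin`) are left out, so each remaining file can be swapped back to its older version one at a time in the printed order.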


* [Bug 108644] driver/card crashes with latest polaris11 firmware
  2018-11-03 15:32 [Bug 108644] driver/card crashes with latest polaris11 firmware bugzilla-daemon
  2018-11-05 16:31 ` bugzilla-daemon
  2018-11-05 16:45 ` bugzilla-daemon
@ 2018-11-05 17:26 ` bugzilla-daemon
  2018-11-15 12:35 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2018-11-05 17:26 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 358 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108644

--- Comment #3 from Dan Horák <dan@danny.cz> ---
OK, will try both.

It has crashed twice since last Thursday, when I updated the firmware, so it
might take me some time to get better information. There isn't a clear
reproducer.


* [Bug 108644] driver/card crashes with latest polaris11 firmware
  2018-11-03 15:32 [Bug 108644] driver/card crashes with latest polaris11 firmware bugzilla-daemon
                   ` (2 preceding siblings ...)
  2018-11-05 17:26 ` bugzilla-daemon
@ 2018-11-15 12:35 ` bugzilla-daemon
  2018-11-15 12:37 ` bugzilla-daemon
  2019-11-19  9:01 ` bugzilla-daemon
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2018-11-15 12:35 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 250 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108644

--- Comment #4 from Dan Horák <dan@danny.cz> ---
Testing with 4.20-pre kernels is not possible due to bug 108754 :-(


* [Bug 108644] driver/card crashes with latest polaris11 firmware
  2018-11-03 15:32 [Bug 108644] driver/card crashes with latest polaris11 firmware bugzilla-daemon
                   ` (3 preceding siblings ...)
  2018-11-15 12:35 ` bugzilla-daemon
@ 2018-11-15 12:37 ` bugzilla-daemon
  2019-11-19  9:01 ` bugzilla-daemon
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2018-11-15 12:37 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 297 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108644

--- Comment #5 from Dan Horák <dan@danny.cz> ---
I got a new crash today, after ~10 days without an issue. Again, it happened
while I was scrolling a page in Firefox.


* [Bug 108644] driver/card crashes with latest polaris11 firmware
  2018-11-03 15:32 [Bug 108644] driver/card crashes with latest polaris11 firmware bugzilla-daemon
                   ` (4 preceding siblings ...)
  2018-11-15 12:37 ` bugzilla-daemon
@ 2019-11-19  9:01 ` bugzilla-daemon
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2019-11-19  9:01 UTC (permalink / raw)
  To: dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 805 bytes --]

https://bugs.freedesktop.org/show_bug.cgi?id=108644

Martin Peres <martin.peres@free.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |MOVED
             Status|NEW                         |RESOLVED

--- Comment #6 from Martin Peres <martin.peres@free.fr> ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been
closed to further activity.

You can subscribe to and participate in the new bug through this link
to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/588.

