dri-devel.lists.freedesktop.org archive mirror
* [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
@ 2020-03-26 19:51 bugzilla-daemon
  2020-03-26 19:54 ` [Bug 206987] " bugzilla-daemon
                   ` (47 more replies)
  0 siblings, 48 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-03-26 19:51 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

            Bug ID: 206987
           Summary: [drm] [amdgpu] Whole system crashes when the driver is
                    in mode_support_and_system_configuration
           Product: Drivers
           Version: 2.5
    Kernel Version: 5.5.11
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: blocking
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@kernel-bugs.osdl.org
          Reporter: evvke@hotmail.com
        Regression: No

The whole system crashes with this error message: simd exception: 0000 [#1]
PREEMPT SMP NOPTI

Only a REISUB (Magic SysRq) reboot gets the machine back.

The cause is the amdgpu driver.

---

Mar 26 20:47:13 shodan kernel: simd exception: 0000 [#1] PREEMPT SMP NOPTI
Mar 26 20:47:13 shodan kernel: CPU: 7 PID: 1344 Comm: Xorg Tainted: G        W 
OE     5.5.11-arch1-1 #1
Mar 26 20:47:13 shodan kernel: Hardware name: Micro-Star International Co.,
Ltd. MS-7B78/X470 GAMING PRO CARBON (MS-7B78), BIOS 2.80 03/06/2019
Mar 26 20:47:13 shodan kernel: RIP:
0010:mode_support_and_system_configuration+0x30a3/0x4d90 [amdgpu]
Mar 26 20:47:13 shodan kernel: Code: 00 0f 28 c3 e8 7e c9 ff ff f3 41 0f 11 87
40 19 00 00 e9 12 fd ff ff 41 83 be a8 00 00 00 06 75 93 f3 41 0f 10 86 40 1b
00 00 <f3> 41 0f 5e 86 f8 17 00 00 e8 4f c9 ff ff 41 8b 87 80 04 00 00 f3
Mar 26 20:47:13 shodan kernel: RSP: 0018:ffffb216c1f3b978 EFLAGS: 00010246
Mar 26 20:47:13 shodan kernel: RAX: 0000000000000006 RBX: ffff9c120bbfadc4 RCX:
0000000000000004
Mar 26 20:47:13 shodan kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI:
ffff9c120bbfb008
Mar 26 20:47:13 shodan kernel: RBP: ffff9c120bbfadc4 R08: ffff9c120bbfc164 R09:
0000000000000120
Mar 26 20:47:13 shodan kernel: R10: ffff9c120bbfaee4 R11: ffff9c120bbf0248 R12:
ffff9c120bbfc63c
Mar 26 20:47:13 shodan kernel: R13: 0000000000000000 R14: ffff9c120bbfaf5c R15:
ffff9c120bbfadc4
Mar 26 20:47:13 shodan kernel: FS:  00007f1c9f336dc0(0000)
GS:ffff9c19009c0000(0000) knlGS:0000000000000000
Mar 26 20:47:13 shodan kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Mar 26 20:47:13 shodan kernel: CR2: 00001f82bfec7fe0 CR3: 00000007cbe4a000 CR4:
00000000003406e0
Mar 26 20:47:13 shodan kernel: Call Trace:
Mar 26 20:47:13 shodan kernel:  dcn_validate_bandwidth+0xfe5/0x1f20 [amdgpu]
Mar 26 20:47:13 shodan kernel:  dc_validate_global_state+0x28a/0x310 [amdgpu]
Mar 26 20:47:13 shodan kernel:  amdgpu_dm_atomic_check+0x5d8/0x870 [amdgpu]
Mar 26 20:47:13 shodan kernel:  drm_atomic_check_only+0x578/0x800 [drm]
Mar 26 20:47:13 shodan kernel:  ? dm_crtc_duplicate_state+0x6b/0x1f0 [amdgpu]
Mar 26 20:47:13 shodan kernel:  drm_atomic_commit+0x13/0x50 [drm]
Mar 26 20:47:13 shodan kernel:  drm_atomic_helper_legacy_gamma_set+0x123/0x180
[drm_kms_helper]
Mar 26 20:47:13 shodan kernel:  drm_mode_gamma_set_ioctl+0x171/0x220 [drm]
Mar 26 20:47:13 shodan kernel:  ? drm_mode_crtc_set_gamma_size+0xa0/0xa0 [drm]
Mar 26 20:47:13 shodan kernel:  drm_ioctl_kernel+0xb2/0x100 [drm]
Mar 26 20:47:13 shodan kernel:  drm_ioctl+0x209/0x360 [drm]
Mar 26 20:47:13 shodan kernel:  ? drm_mode_crtc_set_gamma_size+0xa0/0xa0 [drm]
Mar 26 20:47:13 shodan kernel:  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
Mar 26 20:47:13 shodan kernel:  do_vfs_ioctl+0x4b7/0x730
Mar 26 20:47:13 shodan kernel:  ksys_ioctl+0x5e/0x90
Mar 26 20:47:13 shodan kernel:  __x64_sys_ioctl+0x16/0x20
Mar 26 20:47:13 shodan kernel:  do_syscall_64+0x4e/0x150
Mar 26 20:47:13 shodan kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Mar 26 20:47:13 shodan kernel: RIP: 0033:0x7f1ca01892eb
Mar 26 20:47:13 shodan kernel: Code: 0f 1e fa 48 8b 05 a5 8b 0c 00 64 c7 00 26
00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00
0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 8b 0c 00 f7 d8 64 89 01 48
Mar 26 20:47:13 shodan kernel: RSP: 002b:00007ffc60ff5648 EFLAGS: 00000206
ORIG_RAX: 0000000000000010
Mar 26 20:47:13 shodan kernel: RAX: ffffffffffffffda RBX: 0000000000000001 RCX:
00007f1ca01892eb
Mar 26 20:47:13 shodan kernel: RDX: 00007ffc60ff5700 RSI: 00000000c02064a5 RDI:
000000000000000a
Mar 26 20:47:13 shodan kernel: RBP: 00007ffc60ff5680 R08: 0000562bb635c080 R09:
0000562bb635c280
Mar 26 20:47:13 shodan kernel: R10: 0000562bb635be80 R11: 0000000000000206 R12:
0000000000000100
Mar 26 20:47:13 shodan kernel: R13: 0000562bb6ab4f70 R14: 0000562bb635b9c0 R15:
0000000000000100
Mar 26 20:47:13 shodan kernel: Modules linked in: snd_seq_dummy snd_seq
bluetooth ecdh_generic rfkill ecc veth fuse iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi tun ip6table_mangle xt_MASQUERADE iptable_nat nf_nat
xt_connmark iptable_mangle xt_helper xt_NFLOG xt_limit xt_conntrack xt_tcpudp
nf_conntrack_ftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_irc
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 vboxnetadp(OE) vboxnetflt(OE)
vboxdrv(OE) pktcdvd nfnetlink_log nfnetlink ip6table_filter nct6775 ip6_tables
hwmon_vid iptable_filter edac_mce_amd kvm_amd ccp ext4 rng_core kvm
snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi
crc16 mbcache irqbypass mxm_wmi jbd2 snd_hda_intel wmi_bmof snd_intel_dspcfg
snd_hda_codec snd_usb_audio crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
snd_hda_core uvcvideo snd_usbmidi_lib snd_rawmidi videobuf2_vmalloc
videobuf2_memops snd_seq_device videobuf2_v4l2 aesni_intel snd_hwdep
videobuf2_common crypto_simd snd_pcm mousedev cryptd glue_helper
Mar 26 20:47:13 shodan kernel:  input_leds sp5100_tco snd_timer igb k10temp
pcspkr i2c_piix4 snd soundcore dca wmi evdev mac_hid gpio_amdpt pinctrl_amd
acpi_cpufreq xt_mark v4l2loopback(OE) videodev mc usbmon nbd msr vhba(OE)
sr_mod cdrom sg br_netfilter bridge stp llc ip_tables x_tables dm_mod btrfs
blake2b_generic libcrc32c crc32c_generic xor raid6_pq sd_mod hid_generic usbhid
hid crc32c_intel ahci libahci libata xhci_pci xhci_hcd scsi_mod amdgpu
gpu_sched i2c_algo_bit ttm drm_kms_helper serio_raw syscopyarea sysfillrect
sysimgblt fb_sys_fops drm agpgart i8042 atkbd libps2 serio
Mar 26 20:47:13 shodan kernel: ---[ end trace e34593e526e29a3d ]---
Mar 26 20:47:13 shodan kernel: RIP:
0010:mode_support_and_system_configuration+0x30a3/0x4d90 [amdgpu]
Mar 26 20:47:13 shodan kernel: Code: 00 0f 28 c3 e8 7e c9 ff ff f3 41 0f 11 87
40 19 00 00 e9 12 fd ff ff 41 83 be a8 00 00 00 06 75 93 f3 41 0f 10 86 40 1b
00 00 <f3> 41 0f 5e 86 f8 17 00 00 e8 4f c9 ff ff 41 8b 87 80 04 00 00 f3
Mar 26 20:47:13 shodan kernel: RSP: 0018:ffffb216c1f3b978 EFLAGS: 00010246
Mar 26 20:47:13 shodan kernel: RAX: 0000000000000006 RBX: ffff9c120bbfadc4 RCX:
0000000000000004
Mar 26 20:47:13 shodan kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI:
ffff9c120bbfb008
Mar 26 20:47:13 shodan kernel: RBP: ffff9c120bbfadc4 R08: ffff9c120bbfc164 R09:
0000000000000120
Mar 26 20:47:13 shodan kernel: R10: ffff9c120bbfaee4 R11: ffff9c120bbf0248 R12:
ffff9c120bbfc63c
Mar 26 20:47:13 shodan kernel: R13: 0000000000000000 R14: ffff9c120bbfaf5c R15:
ffff9c120bbfadc4
Mar 26 20:47:13 shodan kernel: FS:  00007f1c9f336dc0(0000)
GS:ffff9c19009c0000(0000) knlGS:0000000000000000
Mar 26 20:47:13 shodan kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Mar 26 20:47:13 shodan kernel: CR2: 00001f82bfec7fe0 CR3: 00000007cbe4a000 CR4:
00000000003406e0
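
Note for readers of the trace above: the user-space register dump shows
ORIG_RAX = 0x10 (the ioctl syscall on x86-64) and RSI = 0xc02064a5, which is
the ioctl request number. The following is a minimal Python sketch, not part
of the original report, that splits that number using the generic Linux ioctl
bit layout (8-bit nr, 8-bit type, 14-bit size, 2-bit direction). The function
name decode_ioctl is made up for illustration. The result (type 'd', nr 0xa5,
32-byte argument, read|write) appears consistent with the DRM set-gamma ioctl
that the call trace passes through (drm_mode_gamma_set_ioctl).

```python
def decode_ioctl(request):
    """Split a Linux ioctl request number using the asm-generic bit layout.

    Hypothetical helper for illustration; layout assumed:
    bits 0-7 nr, bits 8-15 type, bits 16-29 size, bits 30-31 direction.
    """
    nr = request & 0xFF
    type_char = chr((request >> 8) & 0xFF)
    size = (request >> 16) & 0x3FFF
    direction = (request >> 30) & 0x3
    dir_names = {0: "none", 1: "write", 2: "read", 3: "read|write"}
    return {"dir": dir_names[direction], "type": type_char, "nr": nr, "size": size}


# RSI value taken from the user-space register dump in the report.
fields = decode_ioctl(0xC02064A5)
print(fields)
```

This only decodes the number; it does not say anything about why the
bandwidth validation that follows the ioctl faults.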

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
@ 2020-03-26 19:54 ` bugzilla-daemon
  2020-03-26 21:36 ` bugzilla-daemon
                   ` (46 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-03-26 19:54 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

Alex Deucher (alexdeucher@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexdeucher@gmail.com

--- Comment #1 from Alex Deucher (alexdeucher@gmail.com) ---
Please attach your full dmesg output.  What version of gcc are you using?


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
  2020-03-26 19:54 ` [Bug 206987] " bugzilla-daemon
@ 2020-03-26 21:36 ` bugzilla-daemon
  2020-03-26 21:37 ` bugzilla-daemon
                   ` (45 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-03-26 21:36 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #2 from Cyrax (evvke@hotmail.com) ---
Created attachment 288079
  --> https://bugzilla.kernel.org/attachment.cgi?id=288079&action=edit
dmesg output


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
  2020-03-26 19:54 ` [Bug 206987] " bugzilla-daemon
  2020-03-26 21:36 ` bugzilla-daemon
@ 2020-03-26 21:37 ` bugzilla-daemon
  2020-04-04  7:40 ` bugzilla-daemon
                   ` (44 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-03-26 21:37 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #3 from Cyrax (evvke@hotmail.com) ---
GCC is "gcc (Arch Linux 9.3.0-1) 9.3.0"


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (2 preceding siblings ...)
  2020-03-26 21:37 ` bugzilla-daemon
@ 2020-04-04  7:40 ` bugzilla-daemon
  2020-04-04  7:41 ` bugzilla-daemon
                   ` (43 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-04-04  7:40 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #4 from Cyrax (evvke@hotmail.com) ---
Created attachment 288203
  --> https://bugzilla.kernel.org/attachment.cgi?id=288203&action=edit
dmesg output 2

The crash happened again. In the meantime I had used VLC, played a game
(GZDoom), and tried to listen to a YouTube playlist using a combination of
youtube-dl, ffmpeg and mpv.

I also updated the motherboard's BIOS/firmware to the latest version.


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (3 preceding siblings ...)
  2020-04-04  7:40 ` bugzilla-daemon
@ 2020-04-04  7:41 ` bugzilla-daemon
  2020-04-04  7:42 ` bugzilla-daemon
                   ` (42 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-04-04  7:41 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

Cyrax (evvke@hotmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Kernel Version|5.5.11                      |5.5.13
         Regression|No                          |Yes


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (4 preceding siblings ...)
  2020-04-04  7:41 ` bugzilla-daemon
@ 2020-04-04  7:42 ` bugzilla-daemon
  2020-04-18 13:15 ` bugzilla-daemon
                   ` (41 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-04-04  7:42 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #5 from Cyrax (evvke@hotmail.com) ---
Oh, and the kernel is at version 5.5.13.


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (5 preceding siblings ...)
  2020-04-04  7:42 ` bugzilla-daemon
@ 2020-04-18 13:15 ` bugzilla-daemon
  2020-04-18 13:19 ` bugzilla-daemon
                   ` (40 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-04-18 13:15 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

Cyrax (evvke@hotmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Kernel Version|5.5.13                      |5.6.4


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (6 preceding siblings ...)
  2020-04-18 13:15 ` bugzilla-daemon
@ 2020-04-18 13:19 ` bugzilla-daemon
  2020-04-19 11:42 ` bugzilla-daemon
                   ` (39 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-04-18 13:19 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #6 from Cyrax (evvke@hotmail.com) ---
Created attachment 288595
  --> https://bugzilla.kernel.org/attachment.cgi?id=288595&action=edit
dmesg output

And another one. It seems that switching between virtual consoles triggers
this bug.


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (7 preceding siblings ...)
  2020-04-18 13:19 ` bugzilla-daemon
@ 2020-04-19 11:42 ` bugzilla-daemon
  2020-04-19 11:43 ` bugzilla-daemon
                   ` (38 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-04-19 11:42 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

farmboy0@googlemail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |farmboy0@googlemail.com

--- Comment #7 from farmboy0@googlemail.com ---
I sometimes hit the same problem when starting or exiting SteamVR.
I have observed it with the 5.6 kernels.
My card is a Navi RX 5700XT.


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (8 preceding siblings ...)
  2020-04-19 11:42 ` bugzilla-daemon
@ 2020-04-19 11:43 ` bugzilla-daemon
  2020-04-23  5:15 ` bugzilla-daemon
                   ` (37 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-04-19 11:43 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #8 from farmboy0@googlemail.com ---
Created attachment 288615
  --> https://bugzilla.kernel.org/attachment.cgi?id=288615&action=edit
dmesg output


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (9 preceding siblings ...)
  2020-04-19 11:43 ` bugzilla-daemon
@ 2020-04-23  5:15 ` bugzilla-daemon
  2020-04-23  5:15 ` bugzilla-daemon
                   ` (36 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-04-23  5:15 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #9 from Cyrax (evvke@hotmail.com) ---
Created attachment 288679
  --> https://bugzilla.kernel.org/attachment.cgi?id=288679&action=edit
dmesg output

And again.


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (10 preceding siblings ...)
  2020-04-23  5:15 ` bugzilla-daemon
@ 2020-04-23  5:15 ` bugzilla-daemon
  2020-04-25  8:44 ` bugzilla-daemon
                   ` (35 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-04-23  5:15 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

Cyrax (evvke@hotmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Kernel Version|5.6.4                       |5.6.5


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (11 preceding siblings ...)
  2020-04-23  5:15 ` bugzilla-daemon
@ 2020-04-25  8:44 ` bugzilla-daemon
  2020-04-25  8:44 ` bugzilla-daemon
                   ` (34 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-04-25  8:44 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #10 from Cyrax (evvke@hotmail.com) ---
Created attachment 288719
  --> https://bugzilla.kernel.org/attachment.cgi?id=288719&action=edit
gdb disassembler dump around mode_support_and_system_configuration

And it happened again. It looks like something goes wrong after a while when
the computer monitor is turned on.
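
Note for readers who do not have the disassembly at hand: in the Code: line of
the original report, the byte marked <f3> is where RIP pointed when the trap
fired. The sketch below is a minimal, hand-written Python decoder added for
illustration only. It understands exactly one encoding (F3 [REX] 0F 5E with a
disp32 memory operand and no SIB byte), and the helper names faulting_bytes
and decode_divss are made up here. Under those assumptions the faulting
instruction decodes as a scalar single-precision divide, which fits the
"simd exception" label the kernel uses for the x86 SIMD floating-point trap.

```python
# Code: line copied from the original report (kernel 5.5.11).
CODE_LINE = (
    "00 0f 28 c3 e8 7e c9 ff ff f3 41 0f 11 87 40 19 00 00 e9 12 fd ff ff "
    "41 83 be a8 00 00 00 06 75 93 f3 41 0f 10 86 40 1b 00 00 "
    "<f3> 41 0f 5e 86 f8 17 00 00 e8 4f c9 ff ff 41 8b 87 80 04 00 00 f3"
)

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def faulting_bytes(code_line):
    """Return the bytes starting at the <xx> marker, i.e. at the faulting RIP."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:])


def decode_divss(code):
    """Decode only `F3 [REX] 0F 5E /r` with mod=10 (disp32) and no SIB byte.

    Returns None for anything else; this is not a general disassembler.
    """
    if len(code) < 8 or code[0] != 0xF3:
        return None
    i, rex = 1, 0
    if 0x40 <= code[i] <= 0x4F:          # optional REX prefix
        rex = code[i]
        i += 1
    if code[i:i + 2] != b"\x0f\x5e":     # DIVSS opcode
        return None
    modrm = code[i + 2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 2 or rm == 4:              # only [base + disp32], no SIB
        return None
    reg |= (rex & 0x4) << 1              # REX.R extends the xmm register
    rm |= (rex & 0x1) << 3               # REX.B extends the base register
    disp = int.from_bytes(code[i + 3:i + 7], "little", signed=True)
    return f"divss xmm{reg}, dword ptr [{REG64[rm]}{disp:+#x}]"


insn = decode_divss(faulting_bytes(CODE_LINE))
print(insn)
```

Which operand value raised the exception (for example a zero divisor) cannot
be read from the instruction bytes alone; that needs the memory contents at
the decoded address, such as from a crash dump.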


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (12 preceding siblings ...)
  2020-04-25  8:44 ` bugzilla-daemon
@ 2020-04-25  8:44 ` bugzilla-daemon
  2020-04-27 19:20 ` bugzilla-daemon
                   ` (33 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-04-25  8:44 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

Cyrax (evvke@hotmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Kernel Version|5.6.5                       |5.6.7


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (13 preceding siblings ...)
  2020-04-25  8:44 ` bugzilla-daemon
@ 2020-04-27 19:20 ` bugzilla-daemon
  2020-04-27 19:20 ` bugzilla-daemon
                   ` (32 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-04-27 19:20 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #11 from Cyrax (evvke@hotmail.com) ---
Created attachment 288781
  --> https://bugzilla.kernel.org/attachment.cgi?id=288781&action=edit
dmesg output from Linux 5.7-rc3

This is becoming a real problem; I can't do anything remotely productive.
A crash happens roughly 12 hours (give or take) after the system is rebooted
from the previous one.

I'm running four LXC containers, which I have set up to run GUI programs on
the host system by following this guide:
https://wiki.archlinux.org/index.php/Linux_Containers#Xorg_program_considerations_(optional)

I also have VirtualBox running, but its VMs aren't accessing the host's 3D
functions at all.


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (14 preceding siblings ...)
  2020-04-27 19:20 ` bugzilla-daemon
@ 2020-04-27 19:20 ` bugzilla-daemon
  2020-05-02 14:18 ` bugzilla-daemon
                   ` (31 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-04-27 19:20 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

Cyrax (evvke@hotmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Kernel Version|5.6.7                       |5.7.0-rc3


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (15 preceding siblings ...)
  2020-04-27 19:20 ` bugzilla-daemon
@ 2020-05-02 14:18 ` bugzilla-daemon
  2020-05-23  1:52 ` bugzilla-daemon
                   ` (30 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-05-02 14:18 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #12 from Cyrax (evvke@hotmail.com) ---
Created attachment 288873
  --> https://bugzilla.kernel.org/attachment.cgi?id=288873&action=edit
dmesg from 5.6.8

Additionally, the dmesg output shows this line: note: kworker/0:3[2251663]
exited with preempt_count 1

It seems that this bug occurs when the monitor is turned off and then on
repeatedly with a short delay in between.
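
For anyone trying to reproduce this, a hypothetical Python sketch of an
off/on loop follows. It is not from the reporter. It assumes that forcing
DPMS off and on with `xset dpms force off` / `xset dpms force on` is close
enough to what was done with the monitor, which may not hold if the monitor
was switched with its physical power button (that can cause hotplug events
rather than DPMS changes). The cycle count and delay are arbitrary. By default
the script only prints the commands; running them for real on an affected
machine may hang it.

```python
import subprocess
import time


def dpms_toggle_commands(cycles):
    """Build the xset command sequence: off, on, off, on, ..."""
    cmds = []
    for _ in range(cycles):
        cmds.append(["xset", "dpms", "force", "off"])
        cmds.append(["xset", "dpms", "force", "on"])
    return cmds


def run(cycles=5, delay_s=2.0, dry_run=True):
    """Print the commands (dry run) or execute them with a short delay."""
    for cmd in dpms_toggle_commands(cycles):
        if dry_run:
            print(" ".join(cmd))
        else:
            # Assumption: an X session is reachable through $DISPLAY.
            subprocess.run(cmd, check=True)
            time.sleep(delay_s)


if __name__ == "__main__":
    run()  # dry run by default: prints the commands without executing them
```

Passing dry_run=False executes the sequence; watching `dmesg -w` over ssh in
parallel would show whether the same trace appears.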


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (16 preceding siblings ...)
  2020-05-02 14:18 ` bugzilla-daemon
@ 2020-05-23  1:52 ` bugzilla-daemon
  2020-05-23  1:56 ` bugzilla-daemon
                   ` (29 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-05-23  1:52 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #13 from Cyrax (evvke@hotmail.com) ---
Created attachment 289237
  --> https://bugzilla.kernel.org/attachment.cgi?id=289237&action=edit
kernel log dumped from crash dump by using crash utility


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (17 preceding siblings ...)
  2020-05-23  1:52 ` bugzilla-daemon
@ 2020-05-23  1:56 ` bugzilla-daemon
  2020-05-23  1:58 ` bugzilla-daemon
                   ` (28 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-05-23  1:56 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #14 from Cyrax (evvke@hotmail.com) ---
Created attachment 289239
  --> https://bugzilla.kernel.org/attachment.cgi?id=289239&action=edit
backtrace created by executing bt -f command in crash utility


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (18 preceding siblings ...)
  2020-05-23  1:56 ` bugzilla-daemon
@ 2020-05-23  1:58 ` bugzilla-daemon
  2020-05-28 14:17 ` bugzilla-daemon
                   ` (27 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-05-23  1:58 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #15 from Cyrax (evvke@hotmail.com) ---
Created attachment 289241
  --> https://bugzilla.kernel.org/attachment.cgi?id=289241&action=edit
dump of struct dcn_bw_internal_vars


* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (19 preceding siblings ...)
  2020-05-23  1:58 ` bugzilla-daemon
@ 2020-05-28 14:17 ` bugzilla-daemon
  2020-05-28 16:05 ` bugzilla-daemon
                   ` (26 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-05-28 14:17 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

Petteri Aimonen (jpa@kernelbug.mail.kapsi.fi) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jpa@kernelbug.mail.kapsi.fi

--- Comment #16 from Petteri Aimonen (jpa@kernelbug.mail.kapsi.fi) ---
I hit the same issue on Ubuntu 20.04. It happened when switching windows to
Firefox. For me it only crashed Xorg; ssh to the machine still worked fine.
Killing Xorg didn't work, and `shutdown -r now` hung somewhere.

Here is a bug report on the Ubuntu package:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1881134

Here is the call trace decoded with the debug symbols:

--

[455834.385061] Call Trace:
[455834.385120] mode_support_and_system_configuration (/build/linux-FFoizL/linux-5.4.0/drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calc_auto.c:176) amdgpu
[455834.385174] ? calculate_inits_and_adj_vp (/build/linux-FFoizL/linux-5.4.0/drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_resource.c:950 (discriminator 12)) amdgpu
[455834.385230] dcn_validate_bandwidth (/build/linux-FFoizL/linux-5.4.0/drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:1034) amdgpu
[455834.385283] dc_validate_global_state (/build/linux-FFoizL/linux-5.4.0/drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_resource.c:2093) amdgpu
[455834.385338] amdgpu_dm_atomic_check (/build/linux-FFoizL/linux-5.4.0/drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7413) amdgpu
[455834.385351] drm_atomic_check_only (/build/linux-FFoizL/linux-5.4.0/drivers/gpu/drm/drm_atomic.c:1179) drm
[455834.385361] drm_atomic_commit (/build/linux-FFoizL/linux-5.4.0/drivers/gpu/drm/drm_atomic.c:1220) drm
[455834.385370] drm_mode_obj_set_property_ioctl (/build/linux-FFoizL/linux-5.4.0/drivers/gpu/drm/drm_mode_object.c:496 /build/linux-FFoizL/linux-5.4.0/drivers/gpu/drm/drm_mode_object.c:533) drm
[455834.385379] ? drm_mode_obj_find_prop_id (/build/linux-FFoizL/linux-5.4.0/drivers/gpu/drm/drm_mode_object.c:512) drm
[455834.385386] drm_ioctl_kernel (/build/linux-FFoizL/linux-5.4.0/drivers/gpu/drm/drm_ioctl.c:793) drm
[455834.385394] drm_ioctl (/build/linux-FFoizL/linux-5.4.0/include/linux/thread_info.h:119 /build/linux-FFoizL/linux-5.4.0/include/linux/thread_info.h:152 /build/linux-FFoizL/linux-5.4.0/include/linux/uaccess.h:151 /build/linux-FFoizL/linux-5.4.0/drivers/gpu/drm/drm_ioctl.c:888) drm
[455834.385402] ? drm_mode_obj_find_prop_id (/build/linux-FFoizL/linux-5.4.0/drivers/gpu/drm/drm_mode_object.c:512) drm
[455834.385406] ? recalc_sigpending (/build/linux-FFoizL/linux-5.4.0/kernel/signal.c:184)
[455834.385440] amdgpu_drm_ioctl (/build/linux-FFoizL/linux-5.4.0/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:1293) amdgpu
[455834.385443] do_vfs_ioctl (/build/linux-FFoizL/linux-5.4.0/fs/ioctl.c:47 /build/linux-FFoizL/linux-5.4.0/fs/ioctl.c:510 /build/linux-FFoizL/linux-5.4.0/fs/ioctl.c:697)
[455834.385444] ? recalc_sigpending (/build/linux-FFoizL/linux-5.4.0/kernel/signal.c:184)
[455834.385446] ? _copy_from_user (/build/linux-FFoizL/linux-5.4.0/arch/x86/include/asm/uaccess_64.h:46 /build/linux-FFoizL/linux-5.4.0/arch/x86/include/asm/uaccess_64.h:71 /build/linux-FFoizL/linux-5.4.0/lib/usercopy.c:14)
[455834.385448] ksys_ioctl (/build/linux-FFoizL/linux-5.4.0/include/linux/file.h:43 /build/linux-FFoizL/linux-5.4.0/fs/ioctl.c:715)
[455834.385449] __x64_sys_ioctl (/build/linux-FFoizL/linux-5.4.0/fs/ioctl.c:719)
[455834.385451] do_syscall_64 (/build/linux-FFoizL/linux-5.4.0/arch/x86/entry/common.c:290)
[455834.385455] entry_SYSCALL_64_after_hwframe (/build/linux-FFoizL/linux-5.4.0/arch/x86/entry/entry_64.S:184)
[455834.385456] RIP: 0033:0x7faf3181837b


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (20 preceding siblings ...)
  2020-05-28 14:17 ` bugzilla-daemon
@ 2020-05-28 16:05 ` bugzilla-daemon
  2020-05-28 16:24 ` bugzilla-daemon
                   ` (25 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-05-28 16:05 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #17 from Petteri Aimonen (jpa@kernelbug.mail.kapsi.fi) ---
Created attachment 289381
  --> https://bugzilla.kernel.org/attachment.cgi?id=289381&action=edit
dmesg from kernel 5.4.0-31


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (21 preceding siblings ...)
  2020-05-28 16:05 ` bugzilla-daemon
@ 2020-05-28 16:24 ` bugzilla-daemon
  2020-05-28 18:56 ` bugzilla-daemon
                   ` (24 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-05-28 16:24 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #18 from Petteri Aimonen (jpa@kernelbug.mail.kapsi.fi) ---
As best as I can tell, the crash seems to be caused by some floating point
exception (such as underflow/overflow) in this function call in dcn_calc_auto.c
line 176:

dcn_bw_ceil2(v->byte_per_pixel_in_dety[k], 1.0)

In dcn_bw_ceil2() the exception occurs in this instruction:

addsd  0x0(%rip),%xmm3

which is performing the addition flr + 0.00001.
At this point %xmm3 is ((int)(v->byte_per_pixel_in_dety[k] / 1.0)) * 1.0
The variable byte_per_pixel_in_dety is only assigned constant values 1.0, 2.0,
4.0, 8.0 so I don't see any reason for addsd to cause a simd exception. I'm not
sure if the exception is precise or if it could be delayed from some prior
instruction, but AFAIK it should be precise because in usermode the exception
handler would attempt a recovery.
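
To make the arithmetic above concrete, here is a minimal user-space sketch, not
the kernel source. The names sketch_floor2(), sketch_ceil2() and
ceil2_sets_inexact_flag() are made up for illustration, the floor/ceil logic is
reconstructed from the description in this comment, and the code assumes an
x86-64 build where float math goes through SSE. It shows that flr + 0.00001
cannot be represented exactly in a double, so the addition sets the MXCSR
precision (inexact) flag even for inputs such as 4.0; whether that flag turns
into a trap depends on the MXCSR mask bits in effect at that moment.

```c
#include <xmmintrin.h> /* _mm_getcsr, _mm_setcsr, _MM_EXCEPT_* (x86 only) */

/* Hypothetical stand-in for the floor helper: round arg down to a multiple
 * of significance by truncating the quotient to an int. */
float sketch_floor2(float arg, float significance)
{
	if (significance == 0.0f)
		return 0.0f;
	return ((int)(arg / significance)) * significance;
}

/* Hypothetical stand-in for dcn_bw_ceil2() as described in the comment. */
float sketch_ceil2(float arg, float significance)
{
	float flr = sketch_floor2(arg, significance);

	/* flr is promoted to double here: this is the "flr + 0.00001" addition. */
	return flr + 0.00001 >= arg ? arg : flr + significance;
}

/* Returns nonzero if evaluating sketch_ceil2() set the sticky MXCSR
 * precision (inexact) flag. Assumes the precision exception is masked,
 * which is the normal user-space default. */
int ceil2_sets_inexact_flag(float arg, float significance)
{
	volatile float a = arg, s = significance; /* keep the math at run time */
	volatile float result;

	_mm_setcsr(_mm_getcsr() & ~(unsigned int)_MM_EXCEPT_MASK); /* clear flags */
	result = sketch_ceil2(a, s);
	(void)result;
	return (_mm_getcsr() & _MM_EXCEPT_INEXACT) != 0;
}
```

With the default (masked) MXCSR, ceil2_sets_inexact_flag(4.0f, 1.0f) is
expected to report the flag as set, while sketch_ceil2(4.0f, 1.0f) still
returns 4.0.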

Having XMM3 or MXCSR values would help, but they don't seem to get included in
the dmesg output and I'm not sure if they are available in a crash dump either.

Google search turned up
https://beowulf.beowulf.narkive.com/tAHxVcs0/simd-exception-kernel-panic-on-skylake-ep-triggered-by-openfoam
where the exception was delayed for some reason.

Analyzing the dmesgs attached to this bug report, we have the following crash
locations:

Cyrax    2020-03-26 21:36: divss  xmm0,DWORD PTR [r14+0x17f8]
Cyrax    2020-04-04 07:40: divss  xmm0,DWORD PTR [r14+0x17f8]
Cyrax    2020-04-18 13:19: divss  xmm0,DWORD PTR [r14+0x17f8]
farmboy0 2020-04-19 11:43: not a simd exception
Cyrax    2020-04-23 05:15: divss  xmm0,DWORD PTR [r14+0x17f8]
Cyrax    2020-04-27 19:20: divss  xmm0,DWORD PTR [r14+0x17f8]
Cyrax    2020-05-02 14:18: divss  xmm0,DWORD PTR [r14+0x17f8]
PetteriA 2020-05-28 16:05: addsd  xmm3,QWORD PTR [rip+0x1de967]

So the crash locations appear fairly consistent for Cyrax's machine, but no two
machines have the same location.
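
For anyone who wants to repeat this kind of comparison: in an oops, the "Code:"
line marks the byte at the faulting RIP with angle brackets, and the
instruction can be disassembled starting from that byte (for example, an
f3 0f 5e sequence is a divss and f2 0f 58 is an addsd). The following is only
a sketch of locating that marked byte; find_faulting_byte() is a hypothetical
helper written for this illustration, not part of any kernel tool.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Scan an oops "Code:" line for the opcode byte wrapped in "<..>".
 * Returns the zero-based index of that byte among the listed bytes and
 * stores its value in *byte_out, or returns -1 if no marked byte is found
 * or the line is malformed.
 */
int find_faulting_byte(const char *code_line, unsigned int *byte_out)
{
	const char *p = strstr(code_line, "Code:");
	int index = 0;

	/* Start after the "Code:" label if present, else at the beginning. */
	p = p ? p + strlen("Code:") : code_line;

	while (*p) {
		char *end;
		unsigned long value;
		int marked;

		while (*p == ' ')
			p++;
		if (!*p)
			break;

		marked = (*p == '<');
		if (marked)
			p++;

		value = strtoul(p, &end, 16);
		if (end == p)
			return -1; /* not a hex byte */

		if (marked) {
			if (*end != '>')
				return -1; /* unterminated marker */
			*byte_out = (unsigned int)value;
			return index;
		}

		p = end;
		index++;
	}
	return -1;
}
```

The returned index gives the offset into the byte dump at which a disassembler
should start decoding the faulting instruction.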

For other users affected by this problem, it could be helpful if you install
kernel debugging symbols and use decode_stacktrace.sh to convert the raw stack
trace to code locations.

Also reported on freedesktop amd bugtracker:
https://gitlab.freedesktop.org/drm/amd/-/issues/1154


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (22 preceding siblings ...)
  2020-05-28 16:24 ` bugzilla-daemon
@ 2020-05-28 18:56 ` bugzilla-daemon
  2020-06-02  3:50 ` bugzilla-daemon
                   ` (23 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-05-28 18:56 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #19 from Alex Deucher (alexdeucher@gmail.com) ---
Do these patches help?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=59dfb0c64d3853d20dc84f4561f28d4f5a2ddc7d
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5aa82e35cacfdff7278b7eeffd9575e9c386289e


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (23 preceding siblings ...)
  2020-05-28 18:56 ` bugzilla-daemon
@ 2020-06-02  3:50 ` bugzilla-daemon
  2020-06-03  1:34 ` bugzilla-daemon
                   ` (22 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-06-02  3:50 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

yaomtc@protonmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |yaomtc@protonmail.com

--- Comment #20 from yaomtc@protonmail.com ---
So far so good, Alex. I'm using the RX 5700 XT as well. Previously, running
SteamVR could crash my system pretty quickly (even before launching a game), but
since I rebuilt linux-mainline from the AUR, SteamVR hasn't crashed my system.
Fingers crossed that this continues.

Though Half-Life: Alyx is causing a system crash, which can even happen on
Windows with Vulkan apparently! Wow. At least that's not an AMD or Linux
specific issue. https://github.com/ValveSoftware/SteamVR-for-Linux/issues/356


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (24 preceding siblings ...)
  2020-06-02  3:50 ` bugzilla-daemon
@ 2020-06-03  1:34 ` bugzilla-daemon
  2020-06-03  1:35 ` bugzilla-daemon
                   ` (21 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-06-03  1:34 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #21 from Cyrax (evvke@hotmail.com) ---
Created attachment 289479
  --> https://bugzilla.kernel.org/attachment.cgi?id=289479&action=edit
dmesg output kernel 5.7.0


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (25 preceding siblings ...)
  2020-06-03  1:34 ` bugzilla-daemon
@ 2020-06-03  1:35 ` bugzilla-daemon
  2020-06-03  1:36 ` bugzilla-daemon
                   ` (20 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-06-03  1:35 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #22 from Cyrax (evvke@hotmail.com) ---
Created attachment 289481
  --> https://bugzilla.kernel.org/attachment.cgi?id=289481&action=edit
config file used to build kernel 5.7.0 with KASAN etc


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (26 preceding siblings ...)
  2020-06-03  1:35 ` bugzilla-daemon
@ 2020-06-03  1:36 ` bugzilla-daemon
  2020-06-03  2:00 ` bugzilla-daemon
                   ` (19 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-06-03  1:36 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

Cyrax (evvke@hotmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Kernel Version|5.7.0-rc3                   |5.7.0


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (27 preceding siblings ...)
  2020-06-03  1:36 ` bugzilla-daemon
@ 2020-06-03  2:00 ` bugzilla-daemon
  2020-06-03  2:28 ` bugzilla-daemon
                   ` (18 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-06-03  2:00 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #23 from Cyrax (evvke@hotmail.com) ---
Created attachment 289483
  --> https://bugzilla.kernel.org/attachment.cgi?id=289483&action=edit
previous dmesg log decoded with decode_stacktrace.sh


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (28 preceding siblings ...)
  2020-06-03  2:00 ` bugzilla-daemon
@ 2020-06-03  2:28 ` bugzilla-daemon
  2020-06-03  5:14 ` bugzilla-daemon
                   ` (17 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-06-03  2:28 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #24 from Cyrax (evvke@hotmail.com) ---
(In reply to Petteri Aimonen from comment #16)
> I hit the same issue, using Ubuntu 20.04. It happened when switching window
> to Firefox. For me it only crashed Xorg, ssh to the machine still worked ok.
> Killing Xorg didn't work and `shutdown -r now` hung up somewhere.
> 
> Here is a bug report on the Ubuntu package:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1881134
> 
> Here is call trace decoded with the debug symbols:
> 
[clip]

Yeah, it happens when switching windows and/or switching to a different
workspace. And yes, it crashes only Xorg; other things continue to work as
usual, but issuing a reboot command via SSH won't - well - reboot it. Only
REISUB brings the machine back to a usable state.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (29 preceding siblings ...)
  2020-06-03  2:28 ` bugzilla-daemon
@ 2020-06-03  5:14 ` bugzilla-daemon
  2020-06-03 11:05 ` bugzilla-daemon
                   ` (16 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-06-03  5:14 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #25 from Petteri Aimonen (jpa@kernelbug.mail.kapsi.fi) ---
Looks like there are two kinds of crash bugs here. Many of the amdgpu crashes
have been fixed in 5.7.0, but the specific one that gives "simd exception" in
dmesg is not.

@Cyrax There is an experimental patch in
https://bugzilla.kernel.org/show_bug.cgi?id=207979 if you want to try.

Out of interest, are you possibly running a 32-bit operating system under
virtualization on 64-bit host? That's what triggers the bug for me.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (30 preceding siblings ...)
  2020-06-03  5:14 ` bugzilla-daemon
@ 2020-06-03 11:05 ` bugzilla-daemon
  2020-06-06  1:29 ` bugzilla-daemon
                   ` (15 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-06-03 11:05 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #26 from Cyrax (evvke@hotmail.com) ---
(In reply to Petteri Aimonen from comment #25)
> Looks like there are two kinds of crash bugs here. Many of the amdgpu
> crashes have been fixed in 5.7.0, but the specific one that gives "simd
> exception" in dmesg is not.
> 
> @Cyrax There is an experimental patch in
> https://bugzilla.kernel.org/show_bug.cgi?id=207979 if you want to try.
> 
> Out of interest, are you possibly running a 32-bit operating system under
> virtualization on 64-bit host? That's what triggers the bug for me.

I'm running one 32-bit LXC container (Arch Linux, https://archlinux32.org/) and
three 64-bit LXC containers (Arch Linux). Additionally, I'm running three
VirtualBox guests: Windows, Arch Linux and an old version of the LEDE (OpenWRT)
router OS, all of them 64-bit.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (31 preceding siblings ...)
  2020-06-03 11:05 ` bugzilla-daemon
@ 2020-06-06  1:29 ` bugzilla-daemon
  2020-06-06  6:42 ` bugzilla-daemon
                   ` (14 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-06-06  1:29 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #27 from yaomtc@protonmail.com ---
Created attachment 289535
  --> https://bugzilla.kernel.org/attachment.cgi?id=289535&action=edit
systemd journal from crash

Update: got a whole system crash again when I was starting up SteamVR. So I
guess the issue wasn't resolved for me. It could have reduced the likelihood
maybe, or it was luck?

Not sure what else to attach here, but I copied journal entries from the time
of the crash (which happens at 21:09:31 near the end). Let me know if there's
something else I should attach the next time this happens, if more data would
be helpful.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (32 preceding siblings ...)
  2020-06-06  1:29 ` bugzilla-daemon
@ 2020-06-06  6:42 ` bugzilla-daemon
  2020-07-03 22:22 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-06-06  6:42 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #28 from Petteri Aimonen (jpa@kernelbug.mail.kapsi.fi) ---
@yaomtc Your bug seems to be some separate issue, as the log does not have the
"simd exception" or "mode_support_and_system_configuration" entries in it. It
looks more similar to this bug here:
https://gitlab.freedesktop.org/drm/amd/-/issues/1149


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (33 preceding siblings ...)
  2020-06-06  6:42 ` bugzilla-daemon
@ 2020-07-03 22:22 ` bugzilla-daemon
  2020-07-15 16:07 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-07-03 22:22 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

Alexander Kernozhitsky (sh200105@mail.ru) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sh200105@mail.ru

--- Comment #29 from Alexander Kernozhitsky (sh200105@mail.ru) ---
I encountered this bug today. When running specific graphical applications, the
machine hangs and the kernel log reports a simd exception.

It started to occur after the upgrade to the 5.7.6 kernel.

I tried to apply the patch mentioned in
https://bugzilla.kernel.org/show_bug.cgi?id=207979, and the patch resolves the
issue for me.

Using AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (34 preceding siblings ...)
  2020-07-03 22:22 ` bugzilla-daemon
@ 2020-07-15 16:07 ` bugzilla-daemon
  2020-07-15 16:12 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-07-15 16:07 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

Cyrax (evvke@hotmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Kernel Version|5.7.0                       |5.7.6


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (35 preceding siblings ...)
  2020-07-15 16:07 ` bugzilla-daemon
@ 2020-07-15 16:12 ` bugzilla-daemon
  2020-07-17  4:40 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-07-15 16:12 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #30 from Cyrax (evvke@hotmail.com) ---
The patch in https://bugzilla.kernel.org/show_bug.cgi?id=207979 works
beautifully.
19 days of heavy usage without a system crash on a patched 5.7.6 kernel.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (36 preceding siblings ...)
  2020-07-15 16:12 ` bugzilla-daemon
@ 2020-07-17  4:40 ` bugzilla-daemon
  2020-07-23  1:47 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-07-17  4:40 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #31 from Alex Deucher (alexdeucher@gmail.com) ---
Duplicate of bug 207979.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (37 preceding siblings ...)
  2020-07-17  4:40 ` bugzilla-daemon
@ 2020-07-23  1:47 ` bugzilla-daemon
  2020-08-19  6:37 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-07-23  1:47 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

Cyrax (evvke@hotmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #32 from Cyrax (evvke@hotmail.com) ---
The fix is in the stable 5.7.10 kernel.

*** This bug has been marked as a duplicate of bug 207979 ***


^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (38 preceding siblings ...)
  2020-07-23  1:47 ` bugzilla-daemon
@ 2020-08-19  6:37 ` bugzilla-daemon
  2020-08-19  6:51 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-08-19  6:37 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

krakopo@protonmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |krakopo@protonmail.com

--- Comment #33 from krakopo@protonmail.com ---
I'm seeing this on an AMD Ryzen 4500U laptop running 5.8.1 (Arch Linux
5.8.1-arch1-1). I can repro fairly consistently when running a 64-bit KVM
virtual machine.

The kernel I'm running has the commit which should resolve this:
7ad816762f9b ("x86/fpu: Reset MXCSR to default in kernel_fpu_begin()")

Confirmed patch is in my kernel:
https://git.archlinux.org/linux.git/tree/arch/x86/kernel/fpu/core.c?h=v5.8.1-arch1#n106

Here is what I see in dmesg:

Aug 18 20:25:49 archpad kernel: simd exception: 0000 [#1] PREEMPT SMP NOPTI
Aug 18 20:25:49 archpad kernel: CPU: 0 PID: 509 Comm: Xorg Not tainted
5.8.1-arch1-1 #1
Aug 18 20:25:49 archpad kernel: Hardware name: LENOVO 81W4/LNVNB161216, BIOS
DZCN19WW 04/13/2020
Aug 18 20:25:49 archpad kernel: RIP: 0010:dcn_bw_ceil2+0x35/0x60 [amdgpu]
Aug 18 20:25:49 archpad kernel: Code: cd 7b 3e 0f 28 d0 66 0f ef db 66 0f ef e4
f3 0f 5e d1 f3 0f 5a e0 f3 0f 2c c2 66 0f ef d2 f3 0f 2a d0 f3 0f 59 d1 f3 0f
5a da <f2> 0f 58 1d 5b 19 2e 00 66 0f 2f dc 72 01 c3 f3 0f 58 ca 0f 28 c1
Aug 18 20:25:49 archpad kernel: RSP: 0018:ffffb8fac07035f8 EFLAGS: 00010202
Aug 18 20:25:49 archpad kernel: RAX: 0000000000000004 RBX: 0000000000000000
RCX: 0000000000000780
Aug 18 20:25:49 archpad kernel: RDX: ffff97ebd0a63080 RSI: ffff97ebd0a69560
RDI: 0000000044444440
Aug 18 20:25:49 archpad kernel: RBP: ffff97ebd0a631c0 R08: ffff97ebd0a633b4
R09: 0000000000000000
Aug 18 20:25:49 archpad kernel: R10: 0000000000000000 R11: 0000000000000000
R12: ffff97ebd0a63360
Aug 18 20:25:49 archpad kernel: R13: 0000000000000001 R14: ffff97ebd0a62188
R15: ffff97ebd0a62028
Aug 18 20:25:49 archpad kernel: FS:  00007f8787a65940(0000)
GS:ffff97ec47400000(0000) knlGS:0000000000000000
Aug 18 20:25:49 archpad kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Aug 18 20:25:49 archpad kernel: CR2: 0000000800880000 CR3: 00000001f9040000
CR4: 0000000000340ef0
Aug 18 20:25:49 archpad kernel: Call Trace:
Aug 18 20:25:49 archpad kernel: 
dml21_ModeSupportAndSystemConfigurationFull+0x437/0x5cf0 [amdgpu]
Aug 18 20:25:49 archpad kernel:  ? sysvec_apic_timer_interrupt+0x46/0xe0
Aug 18 20:25:49 archpad kernel:  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
Aug 18 20:25:49 archpad kernel:  ? sched_clock+0x5/0x10
Aug 18 20:25:49 archpad kernel:  ? sched_clock_local+0x12/0x80
Aug 18 20:25:49 archpad kernel:  ? amdgpu_sa_bo_new+0xbc/0x550 [amdgpu]
Aug 18 20:25:49 archpad kernel:  ? sched_clock_cpu+0xae/0xd0
Aug 18 20:25:49 archpad kernel:  ? kmem_cache_alloc_trace+0x17c/0x220
Aug 18 20:25:49 archpad kernel:  ? amdgpu_sa_bo_new+0xbc/0x550 [amdgpu]
Aug 18 20:25:49 archpad kernel:  ? _raw_spin_unlock+0x16/0x30
Aug 18 20:25:49 archpad kernel:  ? preempt_count_add+0x49/0xa0
Aug 18 20:25:49 archpad kernel:  ? kernel_init_free_pages+0x6d/0x90
Aug 18 20:25:49 archpad kernel:  ? prep_new_page+0xa2/0xb0
Aug 18 20:25:49 archpad kernel:  ? get_page_from_freelist+0xfa8/0x1220
Aug 18 20:25:49 archpad kernel:  ? __mod_zone_page_state+0x66/0xa0
Aug 18 20:25:49 archpad kernel:  ? hubbub2_get_dcc_compression_cap+0xa8/0x270
[amdgpu]
Aug 18 20:25:49 archpad kernel:  ? fill_plane_buffer_attributes+0x26f/0x420
[amdgpu]
Aug 18 20:25:49 archpad kernel:  dml_get_voltage_level+0x116/0x1e0 [amdgpu]
Aug 18 20:25:49 archpad kernel:  dcn20_fast_validate_bw+0x359/0x680 [amdgpu]
Aug 18 20:25:49 archpad kernel:  ? resource_build_scaling_params+0xc44/0x11a0
[amdgpu]
Aug 18 20:25:49 archpad kernel:  dcn21_validate_bandwidth+0xcd/0x2a0 [amdgpu]
Aug 18 20:25:49 archpad kernel:  dc_validate_global_state+0x2f2/0x390 [amdgpu]
Aug 18 20:25:49 archpad kernel:  amdgpu_dm_atomic_check+0xefb/0x1010 [amdgpu]
Aug 18 20:25:49 archpad kernel:  drm_atomic_check_only+0x57c/0x7f0 [drm]
Aug 18 20:25:49 archpad kernel:  ?
__drm_atomic_helper_crtc_duplicate_state+0x85/0xd0 [drm_kms_helper]
Aug 18 20:25:49 archpad kernel:  drm_atomic_commit+0x13/0x50 [drm]
Aug 18 20:25:49 archpad kernel:  drm_atomic_helper_legacy_gamma_set+0x123/0x180
[drm_kms_helper]
Aug 18 20:25:49 archpad kernel:  drm_mode_gamma_set_ioctl+0x19a/0x230 [drm]
Aug 18 20:25:49 archpad kernel:  ? drm_color_lut_check+0xa0/0xa0 [drm]
Aug 18 20:25:49 archpad kernel:  drm_ioctl_kernel+0xb2/0x100 [drm]
Aug 18 20:25:49 archpad kernel:  drm_ioctl+0x208/0x360 [drm]
Aug 18 20:25:49 archpad kernel:  ? drm_color_lut_check+0xa0/0xa0 [drm]
Aug 18 20:25:49 archpad kernel:  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
Aug 18 20:25:49 archpad kernel:  ksys_ioctl+0x82/0xc0
Aug 18 20:25:49 archpad kernel:  __x64_sys_ioctl+0x16/0x20
Aug 18 20:25:49 archpad kernel:  do_syscall_64+0x44/0x70
Aug 18 20:25:49 archpad kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Aug 18 20:25:49 archpad kernel: RIP: 0033:0x7f87887888eb
Aug 18 20:25:49 archpad kernel: Code: 0f 1e fa 48 8b 05 a5 95 0c 00 64 c7 00 26
00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00
0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 95 0c 00 f7 d8 64 89 01 48
Aug 18 20:25:49 archpad kernel: RSP: 002b:00007ffc92f3a9a8 EFLAGS: 00000246
ORIG_RAX: 0000000000000010
Aug 18 20:25:49 archpad kernel: RAX: ffffffffffffffda RBX: 00007ffc92f3a9e0
RCX: 00007f87887888eb
Aug 18 20:25:49 archpad kernel: RDX: 00007ffc92f3a9e0 RSI: 00000000c02064a5
RDI: 000000000000000a
Aug 18 20:25:49 archpad kernel: RBP: 00000000c02064a5 R08: 00005627eb36eb10
R09: 00005627eb36ed10
Aug 18 20:25:49 archpad kernel: R10: 00005627eb36e910 R11: 0000000000000246
R12: 0000000000000100
Aug 18 20:25:49 archpad kernel: R13: 000000000000000a R14: 0000000000000100
R15: 0000000000000100
Aug 18 20:25:49 archpad kernel: Modules linked in: xt_CHECKSUM xt_MASQUERADE
xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat
iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
libcrc32c tun bridge hid_multitouch hid_generic 8021q garp mrp stp llc
ebtable_filter ebtables snd_acp3x_rn snd_soc_dmic snd_acp3x_pdm_dma snd_soc_c>
Aug 18 20:25:49 archpad kernel:  drm_kms_helper btintel snd_hwdep i2c_hid hid
videobuf2_common cec nls_iso8859_1 snd_pcm rc_core nls_cp437 bluetooth cfg80211
snd_timer syscopyarea videodev ideapad_laptop snd_rn_pci_acp3x sysfillrect vfat
ecdh_generic snd sysimgblt tpm_crb snd_pci_acp3x sparse_keymap fat ecc mc
fb_sys_fops tpm_tis soundcore ccp rfkill libarc4 wmi battery tpm_tis>
Aug 18 20:25:49 archpad kernel: ---[ end trace 76f111d732bc1b57 ]---
Aug 18 20:25:49 archpad kernel: RIP: 0010:dcn_bw_ceil2+0x35/0x60 [amdgpu]
Aug 18 20:25:49 archpad kernel: Code: cd 7b 3e 0f 28 d0 66 0f ef db 66 0f ef e4
f3 0f 5e d1 f3 0f 5a e0 f3 0f 2c c2 66 0f ef d2 f3 0f 2a d0 f3 0f 59 d1 f3 0f
5a da <f2> 0f 58 1d 5b 19 2e 00 66 0f 2f dc 72 01 c3 f3 0f 58 ca 0f 28 c1
Aug 18 20:25:49 archpad kernel: RSP: 0018:ffffb8fac07035f8 EFLAGS: 00010202
Aug 18 20:25:49 archpad kernel: RAX: 0000000000000004 RBX: 0000000000000000
RCX: 0000000000000780
Aug 18 20:25:49 archpad kernel: RDX: ffff97ebd0a63080 RSI: ffff97ebd0a69560
RDI: 0000000044444440
Aug 18 20:25:49 archpad kernel: RBP: ffff97ebd0a631c0 R08: ffff97ebd0a633b4
R09: 0000000000000000
Aug 18 20:25:49 archpad kernel: R10: 0000000000000000 R11: 0000000000000000
R12: ffff97ebd0a63360
Aug 18 20:25:49 archpad kernel: R13: 0000000000000001 R14: ffff97ebd0a62188
R15: ffff97ebd0a62028
Aug 18 20:25:49 archpad kernel: FS:  00007f8787a65940(0000)
GS:ffff97ec47400000(0000) knlGS:0000000000000000
Aug 18 20:25:49 archpad kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Aug 18 20:25:49 archpad kernel: CR2: 0000000800880000 CR3: 00000001f9040000
CR4: 0000000000340ef0

$ objdump -d amdgpu.ko
...
00000000001b83c0 <dcn_bw_ceil2>:
  1b83c0:       e8 00 00 00 00          callq  1b83c5 <dcn_bw_ceil2+0x5>
  1b83c5:       66 0f ef ed             pxor   %xmm5,%xmm5
  1b83c9:       0f 2e cd                ucomiss %xmm5,%xmm1
  1b83cc:       7b 3e                   jnp    1b840c <dcn_bw_ceil2+0x4c>
  1b83ce:       0f 28 d0                movaps %xmm0,%xmm2
  1b83d1:       66 0f ef db             pxor   %xmm3,%xmm3
  1b83d5:       66 0f ef e4             pxor   %xmm4,%xmm4
  1b83d9:       f3 0f 5e d1             divss  %xmm1,%xmm2
  1b83dd:       f3 0f 5a e0             cvtss2sd %xmm0,%xmm4
  1b83e1:       f3 0f 2c c2             cvttss2si %xmm2,%eax
  1b83e5:       66 0f ef d2             pxor   %xmm2,%xmm2
  1b83e9:       f3 0f 2a d0             cvtsi2ss %eax,%xmm2
  1b83ed:       f3 0f 59 d1             mulss  %xmm1,%xmm2
  1b83f1:       f3 0f 5a da             cvtss2sd %xmm2,%xmm3
  1b83f5:       f2 0f 58 1d 00 00 00    addsd  0x0(%rip),%xmm3        # 1b83fd <dcn_bw_ceil2+0x3d>
  1b83fc:       00
  1b83fd:       66 0f 2f dc             comisd %xmm4,%xmm3
  1b8401:       72 01                   jb     1b8404 <dcn_bw_ceil2+0x44>
  1b8403:       c3                      retq   
  1b8404:       f3 0f 58 ca             addss  %xmm2,%xmm1
  1b8408:       0f 28 c1                movaps %xmm1,%xmm0
  1b840b:       c3                      retq   
  1b840c:       75 c0                   jne    1b83ce <dcn_bw_ceil2+0xe>
  1b840e:       66 0f ef c0             pxor   %xmm0,%xmm0
  1b8412:       c3                      retq   
  1b8413:       66 66 2e 0f 1f 84 00    data16 nopw %cs:0x0(%rax,%rax,1)
  1b841a:       00 00 00 00 
  1b841e:       66 90                   xchg   %ax,%ax
...

Instruction at RIP: 0010:dcn_bw_ceil2+0x35:

>>> hex(0x00000000001b83c0 + 0x35)
'0x1b83f5'

  1b83f5:       f2 0f 58 1d 00 00 00    addsd  0x0(%rip),%xmm3        # 1b83fd <dcn_bw_ceil2+0x3d>

Same addsd instruction that was mentioned above.
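
For readers following along, the address arithmetic above can be sketched as a
small Python helper. This is only an illustration: the function name and the
regular expression are assumptions made here, not part of any kernel tooling,
and the symbol start address (0x1b83c0) is the one taken from the objdump
output in this comment.

```python
import re

# Hypothetical helper (not a kernel tool): matches the "symbol+0xOFF/0xLEN"
# form that appears on an oops RIP line.
RIP_RE = re.compile(
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def faulting_address(rip_line, symbol_start):
    """Return (symbol, address) for the instruction named on an oops RIP line.

    symbol_start is the address objdump prints for the symbol, e.g. the
    0x1b83c0 shown for <dcn_bw_ceil2> above.
    """
    match = RIP_RE.search(rip_line)
    if match is None:
        raise ValueError("no symbol+offset/size found in line")
    offset = int(match.group("off"), 16)
    size = int(match.group("size"), 16)
    if offset >= size:
        raise ValueError("offset lies outside the symbol")
    return match.group("sym"), symbol_start + offset


rip = "RIP: 0010:dcn_bw_ceil2+0x35/0x60 [amdgpu]"
sym, addr = faulting_address(rip, 0x1B83C0)
print(sym, hex(addr))
```

The resulting address is the one to look up in the objdump listing.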

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (39 preceding siblings ...)
  2020-08-19  6:37 ` bugzilla-daemon
@ 2020-08-19  6:51 ` bugzilla-daemon
  2020-08-20  3:30 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-08-19  6:51 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #34 from Petteri Aimonen (jpa@kernelbug.mail.kapsi.fi) ---
@krakopo Can you apply the debug info patch from here?
https://bugzilla.kernel.org/attachment.cgi?id=289421&action=diff

What kernel are you running inside the KVM virtual machine? I wonder whether
the virtual machine has the MXCSR problem; perhaps it could be leaking to the
host somehow.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (40 preceding siblings ...)
  2020-08-19  6:51 ` bugzilla-daemon
@ 2020-08-20  3:30 ` bugzilla-daemon
  2020-08-20  4:11 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-08-20  3:30 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #35 from krakopo@protonmail.com ---
@Petteri

I'm running DragonFly BSD 5.8.1 in my KVM virtual machine.

Here is the dmesg output with the debug info patch applied:

Aug 19 23:18:03 archpad kernel: MXCSR: 00000020 XMM3: 4010000000000000
Aug 19 23:18:03 archpad kernel: simd exception: 0000 [#1] PREEMPT SMP NOPTI
Aug 19 23:18:03 archpad kernel: CPU: 5 PID: 518 Comm: Xorg Not tainted
5.8.1-arch1206987 #1
Aug 19 23:18:03 archpad kernel: Hardware name: LENOVO 81W4/LNVNB161216, BIOS
DZCN19WW 04/13/2020
Aug 19 23:18:03 archpad kernel: RIP: 0010:dcn_bw_ceil2+0x35/0x60 [amdgpu]
Aug 19 23:18:03 archpad kernel: Code: cd 7b 3e 0f 28 d0 66 0f ef db 66 0f ef e4
f3 0f 5e d1 f3 0f 5a e0 f3 0f 2c c2 66 0f ef d2 f3 0f 2a d0 f3 0f 59 d1 f3 0f
5a da <f2> 0f 58 1d 5b 19 2e 00 66 0f 2f dc 72 01 c3 f3 0f 58 ca 0f 28 c1
Aug 19 23:18:03 archpad kernel: RSP: 0018:ffff9e24c10775f8 EFLAGS: 00010202
Aug 19 23:18:03 archpad kernel: RAX: 0000000000000004 RBX: 0000000000000000
RCX: 0000000000000780
Aug 19 23:18:03 archpad kernel: RDX: ffff93d6d0683080 RSI: ffff93d6d0689560
RDI: 0000000044444440
Aug 19 23:18:03 archpad kernel: RBP: ffff93d6d06831c0 R08: ffff93d6d06833b4
R09: 0000000000000000
Aug 19 23:18:03 archpad kernel: R10: 0000000000000000 R11: 0000000000000000
R12: ffff93d6d0683360
Aug 19 23:18:03 archpad kernel: R13: 0000000000000001 R14: ffff93d6d0682188
R15: ffff93d6d0682028
Aug 19 23:18:03 archpad kernel: FS:  00007f222278d940(0000)
GS:ffff93d707740000(0000) knlGS:0000000000000000
Aug 19 23:18:03 archpad kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Aug 19 23:18:03 archpad kernel: CR2: 0000000800ea8030 CR3: 00000002021ca000
CR4: 0000000000340ee0
Aug 19 23:18:03 archpad kernel: Call Trace:
Aug 19 23:18:03 archpad kernel: 
dml21_ModeSupportAndSystemConfigurationFull+0x437/0x5cf0 [amdgpu]
Aug 19 23:18:03 archpad kernel:  ? cpufreq_this_cpu_can_update+0xe/0x50
Aug 19 23:18:03 archpad kernel:  ? sugov_update_single+0x58/0x210
Aug 19 23:18:03 archpad kernel:  ? sugov_get_util+0xf0/0xf0
Aug 19 23:18:03 archpad kernel:  ? update_blocked_averages+0x539/0x620
Aug 19 23:18:03 archpad kernel:  ? update_group_capacity+0x25/0x1c0
Aug 19 23:18:03 archpad kernel:  ? cpumask_next_and+0x19/0x20
Aug 19 23:18:03 archpad kernel:  ? update_sd_lb_stats.constprop.0+0x799/0x8f0
Aug 19 23:18:03 archpad kernel:  ? cpufreq_this_cpu_can_update+0xe/0x50
Aug 19 23:18:03 archpad kernel:  ? sugov_update_single+0x143/0x210
Aug 19 23:18:03 archpad kernel:  ? sugov_get_util+0xf0/0xf0
Aug 19 23:18:03 archpad kernel:  ? update_load_avg+0x63a/0x660
Aug 19 23:18:03 archpad kernel:  ? update_curr+0x73/0x1f0
Aug 19 23:18:03 archpad kernel:  ? enqueue_entity+0x14e/0x750
Aug 19 23:18:03 archpad kernel:  ? resched_curr+0x20/0xc0
Aug 19 23:18:03 archpad kernel:  ? check_preempt_wakeup+0x13b/0x250
Aug 19 23:18:03 archpad kernel:  ? check_preempt_curr+0x67/0x90
Aug 19 23:18:03 archpad kernel:  ? _raw_spin_unlock+0x16/0x30
Aug 19 23:18:03 archpad kernel:  dml_get_voltage_level+0x116/0x1e0 [amdgpu]
Aug 19 23:18:03 archpad kernel:  dcn20_fast_validate_bw+0x359/0x680 [amdgpu]
Aug 19 23:18:03 archpad kernel:  ? resource_build_scaling_params+0xc44/0x11a0
[amdgpu]
Aug 19 23:18:03 archpad kernel:  dcn21_validate_bandwidth+0xcd/0x2a0 [amdgpu]
Aug 19 23:18:03 archpad kernel:  dc_validate_global_state+0x2f2/0x390 [amdgpu]
Aug 19 23:18:03 archpad kernel:  amdgpu_dm_atomic_check+0xefb/0x1010 [amdgpu]
Aug 19 23:18:03 archpad kernel:  ? free_one_page+0x57/0xd0
Aug 19 23:18:03 archpad kernel:  drm_atomic_check_only+0x57c/0x7f0 [drm]
Aug 19 23:18:03 archpad kernel:  ?
__drm_atomic_helper_crtc_duplicate_state+0x85/0xd0 [drm_kms_helper]
Aug 19 23:18:03 archpad kernel:  drm_atomic_commit+0x13/0x50 [drm]
Aug 19 23:18:03 archpad kernel:  drm_atomic_helper_legacy_gamma_set+0x123/0x180
[drm_kms_helper]
Aug 19 23:18:03 archpad kernel:  drm_mode_gamma_set_ioctl+0x19a/0x230 [drm]
Aug 19 23:18:03 archpad kernel:  ? drm_color_lut_check+0xa0/0xa0 [drm]
Aug 19 23:18:03 archpad kernel:  drm_ioctl_kernel+0xb2/0x100 [drm]
Aug 19 23:18:03 archpad kernel:  drm_ioctl+0x208/0x360 [drm]
Aug 19 23:18:03 archpad kernel:  ? drm_color_lut_check+0xa0/0xa0 [drm]
Aug 19 23:18:03 archpad kernel:  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
Aug 19 23:18:03 archpad kernel:  ksys_ioctl+0x82/0xc0
Aug 19 23:18:03 archpad kernel:  __x64_sys_ioctl+0x16/0x20
Aug 19 23:18:03 archpad kernel:  do_syscall_64+0x44/0x70
Aug 19 23:18:03 archpad kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Aug 19 23:18:03 archpad kernel: RIP: 0033:0x7f22234b08eb
Aug 19 23:18:03 archpad kernel: Code: 0f 1e fa 48 8b 05 a5 95 0c 00 64 c7 00 26
00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00
0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 95 0c 00 f7 d8 64 89 01 48
Aug 19 23:18:03 archpad kernel: RSP: 002b:00007ffee6662f48 EFLAGS: 00000246
ORIG_RAX: 0000000000000010
Aug 19 23:18:03 archpad kernel: RAX: ffffffffffffffda RBX: 00007ffee6662f80
RCX: 00007f22234b08eb
Aug 19 23:18:03 archpad kernel: RDX: 00007ffee6662f80 RSI: 00000000c02064a5
RDI: 000000000000000a
Aug 19 23:18:03 archpad kernel: RBP: 00000000c02064a5 R08: 000055b14cc95f10
R09: 000055b14cc96110
Aug 19 23:18:03 archpad kernel: R10: 000055b14cc95d10 R11: 0000000000000246
R12: 0000000000000100
Aug 19 23:18:03 archpad kernel: R13: 000000000000000a R14: 0000000000000100
R15: 0000000000000100
Aug 19 23:18:03 archpad kernel: Modules linked in: xt_CHECKSUM xt_MASQUERADE
xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat
iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
libcrc32c tun bridge hid_multitouch hid_generic 8021q garp mrp stp llc amdgpu
ath10k_pci edac_mce_amd ath10k_core kvm_amd snd_acp3x_rn kvm snd_acp3x_pdm_dma
ebtable_filter ebtables ip6table_filter snd_soc_dmic ip6_tables snd_soc_core
ath irqbypass iptable_filter crct10dif_pclmul crc32_pclmul mac80211
ghash_clmulni_intel joydev snd_compress ac97_bus snd_pcm_dmaengine mousedev
wmi_bmof aesni_intel crypto_simd ccm cryptd glue_helper algif_aead
snd_hda_codec_generic btusb rapl snd_hda_codec_hdmi ledtrig_audio des_generic
input_leds gpu_sched pcspkr libdes snd_hda_intel btrtl i2c_algo_bit
snd_intel_dspcfg btbcm ttm arc4 snd_hda_codec cbc btintel ecb snd_hda_core
uvcvideo algif_skcipher bluetooth drm_kms_helper k10temp sp5100_tco snd_hwdep
i2c_piix4 snd_pcm cmac md4 videobuf2_vmalloc cec
Aug 19 23:18:03 archpad kernel:  cfg80211 videobuf2_memops algif_hash af_alg
videobuf2_v4l2 rc_core tpm_crb videobuf2_common snd_timer nls_iso8859_1
syscopyarea videodev ideapad_laptop sysfillrect nls_cp437 tpm_tis snd ccp
ecdh_generic tpm_tis_core snd_rn_pci_acp3x ecc vfat sparse_keymap sysimgblt fat
soundcore tpm snd_pci_acp3x mc rfkill fb_sys_fops i2c_hid hid libarc4 wmi evdev
pinctrl_amd battery mac_hid elants_i2c acpi_cpufreq rng_core ac drm agpgart
pkcs8_key_parser ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2
serio_raw xhci_pci atkbd xhci_pci_renesas libps2 xhci_hcd crc32c_intel i8042
serio
Aug 19 23:18:03 archpad kernel: ---[ end trace a01eac408369453d ]---
Aug 19 23:18:03 archpad kernel: RIP: 0010:dcn_bw_ceil2+0x35/0x60 [amdgpu]
Aug 19 23:18:03 archpad kernel: Code: cd 7b 3e 0f 28 d0 66 0f ef db 66 0f ef e4
f3 0f 5e d1 f3 0f 5a e0 f3 0f 2c c2 66 0f ef d2 f3 0f 2a d0 f3 0f 59 d1 f3 0f
5a da <f2> 0f 58 1d 5b 19 2e 00 66 0f 2f dc 72 01 c3 f3 0f 58 ca 0f 28 c1
Aug 19 23:18:03 archpad kernel: RSP: 0018:ffff9e24c10775f8 EFLAGS: 00010202
Aug 19 23:18:03 archpad kernel: RAX: 0000000000000004 RBX: 0000000000000000
RCX: 0000000000000780
Aug 19 23:18:03 archpad kernel: RDX: ffff93d6d0683080 RSI: ffff93d6d0689560
RDI: 0000000044444440
Aug 19 23:18:03 archpad kernel: RBP: ffff93d6d06831c0 R08: ffff93d6d06833b4
R09: 0000000000000000
Aug 19 23:18:03 archpad kernel: R10: 0000000000000000 R11: 0000000000000000
R12: ffff93d6d0683360
Aug 19 23:18:03 archpad kernel: R13: 0000000000000001 R14: ffff93d6d0682188
R15: ffff93d6d0682028
Aug 19 23:18:03 archpad kernel: FS:  00007f222278d940(0000)
GS:ffff93d707640000(0000) knlGS:0000000000000000
Aug 19 23:18:03 archpad kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Aug 19 23:18:03 archpad kernel: CR2: 0000000800cca010 CR3: 00000002021ca000
CR4: 0000000000340ee0

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (41 preceding siblings ...)
  2020-08-20  3:30 ` bugzilla-daemon
@ 2020-08-20  4:11 ` bugzilla-daemon
  2020-08-20  4:21 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-08-20  4:11 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #36 from Petteri Aimonen (jpa@kernelbug.mail.kapsi.fi) ---
@krakopo The MXCSR value of 00000020 is exactly what I saw before the bug fix.
So something is definitely clearing MXCSR after kernel_fpu_begin() should have
set it to 0x1F80.

Can you disassemble kernel_fpu_begin() to verify that the ldmxcsr instruction
is present close to its end? Also, check that the flags line in /proc/cpuinfo
contains "sse", though I'm not sure how that could possibly be missing.
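
To make the difference between the two MXCSR values concrete, here is a
minimal Python sketch that decodes the exception flag and mask bits. The bit
positions follow the architectural MXCSR layout; the function names are made
up for this illustration, and the XMM3 decoding assumes the debug patch prints
the low 64 bits of the register as a plain hex number.

```python
import struct

# MXCSR layout: bits 0-5 are sticky exception flags, bits 7-12 are the
# corresponding exception masks (a set mask bit suppresses the exception).
FLAG_BITS = {"IE": 0, "DE": 1, "ZE": 2, "OE": 3, "UE": 4, "PE": 5}
MASK_BITS = {"IM": 7, "DM": 8, "ZM": 9, "OM": 10, "UM": 11, "PM": 12}


def decode_mxcsr(value):
    """Split an MXCSR value into raised flags, masked and unmasked exceptions."""
    flags = [name for name, bit in FLAG_BITS.items() if (value >> bit) & 1]
    masked = [name for name, bit in MASK_BITS.items() if (value >> bit) & 1]
    unmasked = [name for name in MASK_BITS if name not in masked]
    return {"flags": flags, "masked": masked, "unmasked": unmasked}


def xmm_low_as_double(hex_qword):
    """Interpret a 64-bit hex string as an IEEE 754 double (assumed format)."""
    return struct.unpack("<d", int(hex_qword, 16).to_bytes(8, "little"))[0]


print(decode_mxcsr(0x1F80))  # the value kernel_fpu_begin() is expected to load
print(decode_mxcsr(0x0020))  # the value reported in the debug output
print(xmm_low_as_double("4010000000000000"))
```

With 0x1F80 every exception is masked. With 0x20 none is masked and the
precision (inexact) flag is set, which would fit an inexact addsd result
raising a SIMD exception.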

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (42 preceding siblings ...)
  2020-08-20  4:11 ` bugzilla-daemon
@ 2020-08-20  4:21 ` bugzilla-daemon
  2020-08-20  4:24 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-08-20  4:21 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #37 from krakopo@protonmail.com ---
I do see ldmxcsr in the disassembly:

ffffffff81038870 <kernel_fpu_begin>:
ffffffff81038870:       e8 9b 07 03 00          callq  ffffffff81069010 <__fentry__>
ffffffff81038875:       48 83 ec 10             sub    $0x10,%rsp
ffffffff81038879:       bf 01 00 00 00          mov    $0x1,%edi
ffffffff8103887e:       65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
ffffffff81038885:       00 00
ffffffff81038887:       48 89 44 24 08          mov    %rax,0x8(%rsp)
ffffffff8103888c:       31 c0                   xor    %eax,%eax
ffffffff8103888e:       c7 44 24 04 00 00 00    movl   $0x0,0x4(%rsp)
ffffffff81038895:       00
ffffffff81038896:       e8 35 ae 08 00          callq  ffffffff810c36d0 <preempt_count_add>
ffffffff8103889b:       e8 80 fd ff ff          callq  ffffffff81038620 <irq_fpu_usable>
ffffffff810388a0:       65 8a 05 b1 f2 fd 7e    mov    %gs:0x7efdf2b1(%rip),%al        # 17b58 <in_kernel_fpu>
ffffffff810388a7:       65 c6 05 a9 f2 fd 7e    movb   $0x1,%gs:0x7efdf2a9(%rip)        # 17b58 <in_kernel_fpu>
ffffffff810388ae:       01
ffffffff810388af:       65 48 8b 3c 25 c0 7b    mov    %gs:0x17bc0,%rdi
ffffffff810388b6:       01 00
ffffffff810388b8:       f6 47 26 20             testb  $0x20,0x26(%rdi)
ffffffff810388bc:       74 3c                   je     ffffffff810388fa <kernel_fpu_begin+0x8a>
ffffffff810388be:       48 c7 c7 57 43 40 82    mov    $0xffffffff82404357,%rdi
ffffffff810388c5:       e8 46 41 9c 00          callq  ffffffff819fca10 <__this_cpu_preempt_check>
ffffffff810388ca:       c7 44 24 04 80 1f 00    movl   $0x1f80,0x4(%rsp)
ffffffff810388d1:       00
ffffffff810388d2:       65 48 c7 05 82 f2 fd    movq   $0x0,%gs:0x7efdf282(%rip)        # 17b60 <fpu_fpregs_owner_ctx>
ffffffff810388d9:       7e 00 00 00 00
ffffffff810388de:       0f ae 54 24 04          ldmxcsr 0x4(%rsp)
ffffffff810388e3:       db e3                   fninit
ffffffff810388e5:       48 8b 44 24 08          mov    0x8(%rsp),%rax
ffffffff810388ea:       65 48 2b 04 25 28 00    sub    %gs:0x28,%rax
ffffffff810388f1:       00 00
ffffffff810388f3:       75 20                   jne    ffffffff81038915 <kernel_fpu_begin+0xa5>
ffffffff810388f5:       48 83 c4 10             add    $0x10,%rsp
ffffffff810388f9:       c3                      retq
ffffffff810388fa:       48 8b 07                mov    (%rdi),%rax
ffffffff810388fd:       f6 c4 40                test   $0x40,%ah
ffffffff81038900:       75 bc                   jne    ffffffff810388be <kernel_fpu_begin+0x4e>
ffffffff81038902:       f0 80 4f 01 40          lock orb $0x40,0x1(%rdi)
ffffffff81038907:       48 81 c7 00 1b 00 00    add    $0x1b00,%rdi
ffffffff8103890e:       e8 5d fd ff ff          callq  ffffffff81038670 <copy_fpregs_to_fpstate>
ffffffff81038913:       eb a9                   jmp    ffffffff810388be <kernel_fpu_begin+0x4e>
ffffffff81038915:       e8 36 3c 9c 00          callq  ffffffff819fc550 <__stack_chk_fail>
ffffffff8103891a:       66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)


And yes I do have the "sse" flag in /proc/cpuinfo.
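
As an aside, the flag check can be done on whole tokens rather than by
substring, since a plain search for "sse" would also match "sse2" or "ssse3".
A minimal Python sketch follows; the function name and the sample text are
illustrative assumptions, not output from this machine.

```python
def cpu_has_flag(cpuinfo_text, flag):
    """Return True if `flag` appears as a whole token on a 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return flag in value.split()
    return False


# Made-up sample in the shape of /proc/cpuinfo on x86.
sample = "processor\t: 0\nflags\t\t: fpu vme sse sse2 ssse3\n"
print(cpu_has_flag(sample, "sse"))
```

On a real system the text would come from reading /proc/cpuinfo.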

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (43 preceding siblings ...)
  2020-08-20  4:21 ` bugzilla-daemon
@ 2020-08-20  4:24 ` bugzilla-daemon
  2021-02-11  7:48 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2020-08-20  4:24 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #38 from Petteri Aimonen (jpa@kernelbug.mail.kapsi.fi) ---
@krakopo I must say I don't have any idea what could be happening on your
machine. It could be explained if the kernel thread were being pre-empted, but
pre-emption is disabled by kernel_fpu_begin().

It may help to ask in bug 207979 as well; it has some of the long-time x86
maintainers on CC.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (44 preceding siblings ...)
  2020-08-20  4:24 ` bugzilla-daemon
@ 2021-02-11  7:48 ` bugzilla-daemon
  2021-02-11 14:51 ` bugzilla-daemon
  2021-02-11 18:36 ` bugzilla-daemon
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2021-02-11  7:48 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

Jan Kokemüller (jan.kokemueller@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jan.kokemueller@gmail.com

--- Comment #39 from Jan Kokemüller (jan.kokemueller@gmail.com) ---
Created attachment 295225
  --> https://bugzilla.kernel.org/attachment.cgi?id=295225&action=edit
Call DC_FP_START() / DC_FP_END() in dcn21_validate_bandwidth

Could it be that DC_FP_START()/DC_FP_END() aka
kernel_fpu_begin()/kernel_fpu_end() are not called in the *_validate_bandwidth
code path on AMD Renoir systems? To my untrained eye it looks like it is
missing, while it _is_ there for dcn20.

I've been running the attached patch for 2 days now with some KVM VMs open and
the system seems stable. Previously, I had crashes/backtraces similar to the
ones @krakopo described.

I'm happy to help test any patches. I'm running a Thinkpad T14 with an AMD
Ryzen 7 PRO 4750U (Renoir).

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (45 preceding siblings ...)
  2021-02-11  7:48 ` bugzilla-daemon
@ 2021-02-11 14:51 ` bugzilla-daemon
  2021-02-11 18:36 ` bugzilla-daemon
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2021-02-11 14:51 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #40 from Alex Deucher (alexdeucher@gmail.com) ---
(In reply to Jan Kokemüller from comment #39)
> Created attachment 295225 [details]
> Call DC_FP_START() / DC_FP_END() in dcn21_validate_bandwidth
> 
> Could it be that DC_FP_START()/DC_FP_END() aka
> kernel_fpu_begin()/kernel_fpu_end() are not called in the
> *_validate_bandwidth code path on AMD Renoir systems? To my untrained eye it
> looks like it is missing, while it _is_ there for dcn20.
> 
> I've been running the attached patch for 2 days now with some KVM VMs open
> and the system seems stable. Previously, I had crashes/backtraces similar
> to the ones @krakopo described.
> 
> I'm happy to help test any patches. I'm running a Thinkpad T14 with an AMD
> Ryzen 7 PRO 4750U (Renoir).

Looks correct.  Care to send out a proper git patch?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
  2020-03-26 19:51 [Bug 206987] New: [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration bugzilla-daemon
                   ` (46 preceding siblings ...)
  2021-02-11 14:51 ` bugzilla-daemon
@ 2021-02-11 18:36 ` bugzilla-daemon
  47 siblings, 0 replies; 49+ messages in thread
From: bugzilla-daemon @ 2021-02-11 18:36 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #41 from Jan Kokemüller (jan.kokemueller@gmail.com) ---
> Looks correct.  Care to send out a proper git patch?

Thank you for having a look at the patch! I've sent it to the amd-gfx list.

^ permalink raw reply	[flat|nested] 49+ messages in thread
