* [Bug 204181] New: NULL pointer dereference regression in amdgpu
@ 2019-07-15 10:11 bugzilla-daemon
  2019-07-15 13:07 ` [Bug 204181] " bugzilla-daemon
                   ` (68 more replies)
  0 siblings, 69 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-15 10:11 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

            Bug ID: 204181
           Summary: NULL pointer dereference regression in amdgpu
           Product: Drivers
           Version: 2.5
    Kernel Version: 5.2.1
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@kernel-bugs.osdl.org
          Reporter: virtuousfox@gmail.com
        Regression: No

Created attachment 283693
  --> https://bugzilla.kernel.org/attachment.cgi?id=283693&action=edit
dmesg

After updating from 5.1 to 5.2.1, I now get a complete lock-up of video output
within about 5-10 minutes of watching a YouTube video in Firefox, and I can't
shut down using the power button. Using the "magic keys" lets me reboot and get
the kernel log via `journalctl -b -1 -k`; here is the relevant part:
BUG: kernel NULL pointer dereference, address: 00000000000002b4
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0 
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 2 PID: 8200 Comm: kworker/u16:1 Tainted: G          IO     
5.2.1-1383.gd5bbc26-HSF #1 openSUSE Tumbleweed (unreleased)
Hardware name: Gigabyte Technology Co., Ltd. GA-990XA-UD3/GA-990XA-UD3, BIOS
F14e 09/09/2014
Workqueue: events_unbound commit_work
RIP: 0010:dc_stream_log+0x6/0xb0 [amdgpu]
Code: 04 00 00 49 8b bc 02 80 02 00 00 48 8b 07 48 8b 40 50 e8 ed 88 a8 d6 b8
01 00 00 00 c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 53 <8b> 86 b4 02 00 00 48 89
f3 48 89 f2 8b 8e 10 01 00 00 bf 04 00 00
RSP: 0018:ffffa5568b1b7c00 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000002
RDX: ffffffffc07fbd50 RSI: 0000000000000000 RDI: ffff8e9ee9500000
RBP: ffff8e9d90618000 R08: 0000000000000001 R09: 0000000000000000
R10: ffffa5568b1b7c30 R11: 0000000000000000 R12: ffff8e9ee9500000
R13: ffff8e9ededb4448 R14: ffff8e9e47e10c00 R15: ffff8e9ededa0000
FS:  0000000000000000(0000) GS:ffff8e9eee000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000002b4 CR3: 00000003bcf46000 CR4: 00000000000406e0
Call Trace:
 dc_commit_state+0x79/0xb0 [amdgpu]
 amdgpu_dm_atomic_commit_tail+0x3c0/0xdb0 [amdgpu]
 ? finish_task_switch+0x74/0x300
 ? __switch_to+0x152/0x4e0
 ? __switch_to_asm+0x34/0x70
 ? __lock_acquire+0x3c8/0x7a0
 ? find_held_lock+0x32/0x90
 ? find_held_lock+0x32/0x90
 ? sched_clock+0x5/0x10
 ? mark_held_locks+0x2d/0x80
 ? preempt_count_sub+0x98/0xe0
 ? _raw_spin_unlock_irq+0x3a/0x50
 ? wait_for_completion_timeout+0xe9/0x110
 ? commit_tail+0x3c/0x70
 commit_tail+0x3c/0x70
 process_one_work+0x271/0x5f0
 worker_thread+0x4a/0x3d0
 ? process_one_work+0x5f0/0x5f0
 kthread+0x118/0x140
 ? kthread_create_worker_on_cpu+0x70/0x70
 ret_from_fork+0x27/0x50
Modules linked in: af_packet ts_bm xt_pkttype xt_string nf_nat_ftp
nf_conntrack_ftp xt_tcpudp ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack
ebtable_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security
iptable_nat nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 ip_set nfnetlink ebtable_filter ebtables
scsi_transport_iscsi ip6table_filter ip6_tables iptable_filter ip_tables
x_tables bpfilter snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq zram
snd_pcm_oss rfcomm snd_mixer_oss it87 hwmon_vid bnep msr rc_avermedia
tuner_simple tuner_types amd64_edac_mod tuner tda7432 edac_mce_amd btusb
tvaudio kvm_amd ath9k btrtl btbcm msp3400 ath9k_common btintel bluetooth
ath9k_hw kvm irqbypass ath bttv tea575x joydev tveeprom videobuf_dma_sg
snd_usb_audio videobuf_core snd_usbmidi_lib rc_core snd_rawmidi
snd_hda_codec_realtek mac80211 snd_hda_codec_generic ledtrig_audio
snd_hda_codec_hdmi v4l2_common snd_seq_device
 snd_hda_intel videodev sp5100_tco pcspkr snd_hda_codec wmi_bmof mxm_wmi amdgpu
fam15h_power k10temp media i2c_piix4 cfg80211 r8169 snd_hda_core gpu_sched
realtek snd_hwdep libphy ttm rfkill snd_pcm mac_hid hid_generic usbhid uas
usb_storage ohci_pci serio_raw sd_mod ehci_pci ohci_hcd xhci_pci ehci_hcd
xhci_hcd wmi exfat(O) l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel
pppox ppp_generic slhc vhba(O) uinput sg nbd dm_multipath scsi_dh_rdac
scsi_dh_emc scsi_dh_alua ecryptfs
CR2: 00000000000002b4
---[ end trace 0633d97cb3f2d2d6 ]---
RIP: 0010:dc_stream_log+0x6/0xb0 [amdgpu]
Code: 04 00 00 49 8b bc 02 80 02 00 00 48 8b 07 48 8b 40 50 e8 ed 88 a8 d6 b8
01 00 00 00 c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 53 <8b> 86 b4 02 00 00 48 89
f3 48 89 f2 8b 8e 10 01 00 00 bf 04 00 00
RSP: 0018:ffffa5568b1b7c00 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000002
RDX: ffffffffc07fbd50 RSI: 0000000000000000 RDI: ffff8e9ee9500000
RBP: ffff8e9d90618000 R08: 0000000000000001 R09: 0000000000000000
R10: ffffa5568b1b7c30 R11: 0000000000000000 R12: ffff8e9ee9500000
R13: ffff8e9ededb4448 R14: ffff8e9e47e10c00 R15: ffff8e9ededa0000
FS:  0000000000000000(0000) GS:ffff8e9eee000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000002b4 CR3: 00000003bcf46000 CR4: 00000000000406e0
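
A note on reading the oops above: the byte in angle brackets on the Code: line
marks the start of the faulting instruction. The minimal Python sketch below
decodes it by hand, assuming the standard x86-64 encoding of opcode 8b
(MOV r32, r/m32) with a base register and a 32-bit displacement, and checks the
result against CR2. The helper name decode_mov_disp32 is made up for
illustration and handles only this one instruction form.

```python
# Registers selected by a 3-bit ModRM field when no REX prefix is present.
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_disp32(code: bytes):
    """Decode `8b /r` with mod=10 (base register + disp32). Hypothetical helper."""
    if code[0] != 0x8B:
        raise ValueError("not a MOV r32, r/m32 opcode")
    modrm = code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        raise ValueError("only base+disp32 without a SIB byte is handled")
    disp = int.from_bytes(code[2:6], "little", signed=True)
    return REGS32[reg], REGS64[rm], disp


# Bytes starting at the <8b> marker on the Code: line of the oops.
faulting = bytes.fromhex("8b86b4020000")
dest, base, disp = decode_mov_disp32(faulting)

rsi = 0x0    # RSI value from the register dump
cr2 = 0x2B4  # faulting address reported in CR2

print(f"mov {disp:#x}(%{base}),%{dest}")
print("fault address matches:", rsi + disp == cr2)
```

Under those assumptions the instruction reads 4 bytes at offset 0x2b4 from RSI.
With RSI = 0 the access lands exactly on the CR2 address, which is consistent
with a NULL pointer being passed as the second argument of dc_stream_log.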

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
@ 2019-07-15 13:07 ` bugzilla-daemon
  2019-07-15 15:43 ` bugzilla-daemon
                   ` (67 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-15 13:07 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

Nicholas Kazlauskas (nicholas.kazlauskas@amd.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nicholas.kazlauskas@amd.com

--- Comment #1 from Nicholas Kazlauskas (nicholas.kazlauskas@amd.com) ---
Do you mind posting a dmesg log with drm=debug=4 as part of your boot
parameters?

An xorg log would be good too if applicable.

I'm curious to know what the actual sequence / system setup is for reproducing
this as this isn't really a typical sequence. I think you'd run into other NULL
pointer dereferences even if this one is guarded.

I think the stream itself is NULL and it shouldn't be in the context.


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
  2019-07-15 13:07 ` [Bug 204181] " bugzilla-daemon
@ 2019-07-15 15:43 ` bugzilla-daemon
  2019-07-15 15:43 ` bugzilla-daemon
                   ` (66 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-15 15:43 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #2 from Sergey Kondakov (virtuousfox@gmail.com) ---
Created attachment 283695
  --> https://bugzilla.kernel.org/attachment.cgi?id=283695&action=edit
dmesg with "drm=debug=4"


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
  2019-07-15 13:07 ` [Bug 204181] " bugzilla-daemon
  2019-07-15 15:43 ` bugzilla-daemon
@ 2019-07-15 15:43 ` bugzilla-daemon
  2019-07-15 15:45 ` bugzilla-daemon
                   ` (65 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-15 15:43 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #3 from Sergey Kondakov (virtuousfox@gmail.com) ---
Created attachment 283697
  --> https://bugzilla.kernel.org/attachment.cgi?id=283697&action=edit
kernel build config


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (2 preceding siblings ...)
  2019-07-15 15:43 ` bugzilla-daemon
@ 2019-07-15 15:45 ` bugzilla-daemon
  2019-07-15 15:48 ` bugzilla-daemon
                   ` (64 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-15 15:45 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #4 from Sergey Kondakov (virtuousfox@gmail.com) ---
Created attachment 283699
  --> https://bugzilla.kernel.org/attachment.cgi?id=283699&action=edit
amdgpu parameters

These don't seem to change anything about the hang. With larger scheduling
limits (max_num_of_queues_per_device, sched_hw_submission, sched_jobs) the hang
may happen sooner, but I'm not sure.


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (3 preceding siblings ...)
  2019-07-15 15:45 ` bugzilla-daemon
@ 2019-07-15 15:48 ` bugzilla-daemon
  2019-07-15 15:50 ` bugzilla-daemon
                   ` (63 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-15 15:48 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #5 from Sergey Kondakov (virtuousfox@gmail.com) ---
Created attachment 283701
  --> https://bugzilla.kernel.org/attachment.cgi?id=283701&action=edit
X.log

amdgpu has TearFree and VariableRefresh enabled (the LCDs don't support the
latter, though). Dual-screen setup with two 60 fps 1080p LCDs (one VA, one TN),
recently overclocked to ~73 and ~72 fps via CVT-1.2 lines on both Linux and
Windows.


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (4 preceding siblings ...)
  2019-07-15 15:48 ` bugzilla-daemon
@ 2019-07-15 15:50 ` bugzilla-daemon
  2019-07-15 15:50 ` bugzilla-daemon
                   ` (62 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-15 15:50 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #6 from Sergey Kondakov (virtuousfox@gmail.com) ---
Created attachment 283703
  --> https://bugzilla.kernel.org/attachment.cgi?id=283703&action=edit
lsmem


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (5 preceding siblings ...)
  2019-07-15 15:50 ` bugzilla-daemon
@ 2019-07-15 15:50 ` bugzilla-daemon
  2019-07-15 15:53 ` bugzilla-daemon
                   ` (61 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-15 15:50 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #7 from Sergey Kondakov (virtuousfox@gmail.com) ---
Created attachment 283705
  --> https://bugzilla.kernel.org/attachment.cgi?id=283705&action=edit
lspci -vv


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (6 preceding siblings ...)
  2019-07-15 15:50 ` bugzilla-daemon
@ 2019-07-15 15:53 ` bugzilla-daemon
  2019-07-15 15:56 ` bugzilla-daemon
                   ` (60 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-15 15:53 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #8 from Sergey Kondakov (virtuousfox@gmail.com) ---
Created attachment 283707
  --> https://bugzilla.kernel.org/attachment.cgi?id=283707&action=edit
lspci -t -PP -q -k -v


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (7 preceding siblings ...)
  2019-07-15 15:53 ` bugzilla-daemon
@ 2019-07-15 15:56 ` bugzilla-daemon
  2019-07-15 15:58 ` bugzilla-daemon
                   ` (59 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-15 15:56 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #9 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Nicholas Kazlauskas from comment #1)
> Do you mind posting a dmesg log with drm=debug=4 as part of your boot
> parameters?
> 
> An xorg log would be good too if applicable.
> 
> I'm curious to know what the actual sequence / system setup is for
> reproducing this as this isn't really a typical sequence. I think you'd run
> into other NULL pointer dereferences even if this one is guarded.
> 
> I think the stream itself is NULL and it shouldn't be in the context.

I don't think that putting 'drm=debug=4' into the boot command line has changed
anything, but here's some more data. I also recently stumbled into another
baffling regression (bug #203703, from 5.0 to 5.1) concerning network packet
scheduling (the fq_codel qdisc) that halts the affected Ethernet device. It
also gives a repeatable kernel trace on random network activity unless the
qdisc is changed to the dumb "pfifo_fast" early on, similar to how this one
gives the same repeatable amdgpu trace on some random GPU activity. Weird.


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (8 preceding siblings ...)
  2019-07-15 15:56 ` bugzilla-daemon
@ 2019-07-15 15:58 ` bugzilla-daemon
  2019-07-15 15:59 ` bugzilla-daemon
                   ` (58 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-15 15:58 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #10 from Nicholas Kazlauskas (nicholas.kazlauskas@amd.com) ---
Thanks for all the logs.

I meant drm.debug=4 actually, the drm=debug=4 was a typo on my part - sorry!
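
For readers puzzled by the difference: module parameters on the kernel command
line are written as module.parameter=value, so drm.debug=4 sets the "debug"
parameter of drm, whereas drm=debug=4 is read as a parameter literally named
"drm" with the value "debug=4". A rough Python sketch of that split follows.
The function split_param is hypothetical, and it ignores quoting and the
dash/underscore equivalence that the kernel applies to parameter names.

```python
def split_param(token: str):
    """Split one command-line token into (module, parameter, value)."""
    name, _, value = token.partition("=")      # split at the first '=' only
    module, dot, param = name.partition(".")   # optional "module." prefix
    if not dot:
        # No dot in the name: a plain, non-module parameter.
        return None, name, value
    return module, param, value


for token in ("drm.debug=4", "drm=debug=4"):
    print(token, "->", split_param(token))
```

The same split suggests why 'pci=x=y' is accepted: "pci" is a parameter in its
own right, and everything after the first '=' is handed to it as one value.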


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (9 preceding siblings ...)
  2019-07-15 15:58 ` bugzilla-daemon
@ 2019-07-15 15:59 ` bugzilla-daemon
  2019-07-16 15:29 ` bugzilla-daemon
                   ` (57 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-15 15:59 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #11 from Sergey Kondakov (virtuousfox@gmail.com) ---
Created attachment 283709
  --> https://bugzilla.kernel.org/attachment.cgi?id=283709&action=edit
/proc/interrupts


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (10 preceding siblings ...)
  2019-07-15 15:59 ` bugzilla-daemon
@ 2019-07-16 15:29 ` bugzilla-daemon
  2019-07-16 16:36 ` bugzilla-daemon
                   ` (56 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-16 15:29 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

Sergey Kondakov (virtuousfox@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #283695|0                           |1
        is obsolete|                            |

--- Comment #12 from Sergey Kondakov (virtuousfox@gmail.com) ---
Created attachment 283741
  --> https://bugzilla.kernel.org/attachment.cgi?id=283741&action=edit
dmesg with "drm.debug=4"

Here's the actual debug dmesg. The pci subsystem uses 'pci=x=y' syntax, so I
had no reason to think that wouldn't be valid for drm.

Just as I was about to upload the first debug dump, from a hang that happened
after >16 hours of uptime and >30 minutes of video, it crashed again before
Firefox even had a chance to render a single page. That page happened to be the
same YouTube page everything hung on, because Firefox starts at the last opened
page. So after taking >30 minutes the first time, it took less than a second to
hang again. This dump is from that second hang.

I haven't tried launching a local video player or a 3D app. Without opening
YouTube or another video in Firefox, ordinary 2D non-accelerated desktop use
doesn't seem to trigger it.


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (11 preceding siblings ...)
  2019-07-16 15:29 ` bugzilla-daemon
@ 2019-07-16 16:36 ` bugzilla-daemon
  2019-07-16 16:52 ` bugzilla-daemon
                   ` (55 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-16 16:36 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #13 from Sergey Kondakov (virtuousfox@gmail.com) ---
Created attachment 283745
  --> https://bugzilla.kernel.org/attachment.cgi?id=283745&action=edit
tail -n 2000 from dmesg with "drm.debug=5"

drm.debug=4 seems to produce only one new relevant line:
"[drm:dc_commit_state [amdgpu]] dc_commit_state: 2 streams"
so I tried increasing it. debug=5 creates a horrible stream that bogs down the
system with I/O load from journald, but it sure did write some more at the
moment of the hang. I'm not going any further than that, though.
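
As background on the two values tried here: drm.debug is a bitmask of message
categories, not a verbosity level. The Python sketch below decodes a mask using
the bit assignments given in the drm module's "debug" parameter description.
The table is an assumption for kernels of this era and lists only the low six
bits, so check your own tree before relying on it.

```python
# Category bits as listed in the drm "debug" parameter description
# (assumed here for kernels around 5.2; not an exhaustive table).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask: int):
    """Return the names of the categories enabled by a drm.debug mask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


for mask in (4, 5):
    print(f"drm.debug={mask}: {', '.join(decode_drm_debug(mask))}")
```

Read this way, 5 is 4 plus the CORE category rather than "one step more
verbose", which would account for the much heavier log traffic.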


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (12 preceding siblings ...)
  2019-07-16 16:36 ` bugzilla-daemon
@ 2019-07-16 16:52 ` bugzilla-daemon
  2019-07-16 16:55 ` bugzilla-daemon
                   ` (54 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-16 16:52 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #14 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Nicholas Kazlauskas from comment #10)
> Thanks for all the logs.
> 
> I meant drm.debug=4 actually, the drm=debug=4 was a typo on my part - sorry!

So, I've got all I could on this.

Could this be related to my recent LCD overclock? I haven't tried going back
to 60 fps yet.
The cvt executable and modes/xf86cvt.c in the X server haven't been updated for
years and can't even produce CVT-1.2 modes or any useful "reduced blanking"
modes, so I had to go for things like:
https://github.com/kevinlekiller/cvt_modeline_calculator_12 and
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=899066
On Windows I had to use
https://www.monitortests.com/forum/Thread-Custom-Resolution-Utility-CRU because
the AMD driver refuses to use custom modes that it generates itself, nagging
with "unsupported" (yeah, right…) "errors".
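
Since the overclock comes down to custom modelines, it may help to see how the
refresh rate follows from one: it is the pixel clock divided by the total
(active plus blanking) pixels per frame. The sketch below assumes the usual
xorg.conf Modeline field order. The refresh_rate helper and the sample timing
are illustrative only and are not taken from this report.

```python
def refresh_rate(modeline: str) -> float:
    """Vertical refresh in Hz = pixel clock / (htotal * vtotal)."""
    fields = modeline.split()
    # Assumed layout:
    #   Modeline "name" clock hdisp hsyncstart hsyncend htotal
    #                         vdisp vsyncstart vsyncend vtotal [flags...]
    clock_mhz = float(fields[2])
    htotal = int(fields[6])
    vtotal = int(fields[10])
    return clock_mhz * 1_000_000 / (htotal * vtotal)


# Illustrative reduced-blanking 1080p timing, NOT the reporter's modeline.
example = ('Modeline "1920x1080R" 138.50 1920 1968 2000 2080 '
           '1080 1083 1088 1111 +hsync -vsync')
print(f"{refresh_rate(example):.2f} Hz")
```

Raising the refresh rate at a fixed resolution therefore means raising the
pixel clock, shrinking the blanking intervals, or both, which is why usable
reduced-blanking modes matter for this kind of overclock.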


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (13 preceding siblings ...)
  2019-07-16 16:52 ` bugzilla-daemon
@ 2019-07-16 16:55 ` bugzilla-daemon
  2019-07-24 18:33 ` bugzilla-daemon
                   ` (53 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-16 16:55 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #15 from Nicholas Kazlauskas (nicholas.kazlauskas@amd.com) ---
Thanks for the logs. I don't think this is related to your overclock.

Since this behavior wasn't previously observed during our 5.2 testing I think
that either a patch got lost or changed during the submission process, or
something from 5.3 was backported into 5.2 that shouldn't have been.

I don't think it's necessarily setup specific.


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (14 preceding siblings ...)
  2019-07-16 16:55 ` bugzilla-daemon
@ 2019-07-24 18:33 ` bugzilla-daemon
  2019-07-25 10:52 ` bugzilla-daemon
                   ` (52 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-24 18:33 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #16 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Nicholas Kazlauskas from comment #15)
> Thanks for the logs. I don't think this is related to your overclock.
> 
> Since this behavior wasn't previously observed during our 5.2 testing I
> think that either a patch got lost or changed during the submission process,
> or something from 5.3 was backported into 5.2 that shouldn't have been.
> 
> I don't think it's necessarily setup specific.

Does that mean that you were able to reproduce it? If so, is there any known
workaround or ETA on the fix? Is rc1 of 5.3 affected? Any plans to backport to
5.2.x?


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (15 preceding siblings ...)
  2019-07-24 18:33 ` bugzilla-daemon
@ 2019-07-25 10:52 ` bugzilla-daemon
  2019-07-25 14:21 ` bugzilla-daemon
                   ` (51 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-25 10:52 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

Yann HN (accs@21xayah.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |accs@21xayah.com

--- Comment #17 from Yann HN (accs@21xayah.com) ---
I was facing the same issue: video output stopped completely and the X server
process became unresponsive.

I had switched hardware the day before:
GPU: PNY GTX 1060 -> Asus Vega 56
Mainboard: Asus Z370P -> MSI Z390A Pro

A friend suggested that I install some packages to improve GPU support; one
of them was "xf86-video-amdgpu".

It seems that package was responsible for the issue.
Removing it fixed the issue without any other (notable) effects.

Some more info for context:
X: X.Org X Server 1.20.5
Desktop: plasmashell 5.16.3
Kernel: 5.2.2-arch1-1-ARCH #1 SMP PREEMPT Sun Jul 21 19:18:34 UTC 2019 x86_64
GNU/Linux


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (16 preceding siblings ...)
  2019-07-25 10:52 ` bugzilla-daemon
@ 2019-07-25 14:21 ` bugzilla-daemon
  2019-07-25 15:42 ` bugzilla-daemon
                   ` (50 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-25 14:21 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #18 from Michel Dänzer (michel@daenzer.net) ---
(In reply to Yann HN from comment #17)
> A friend suggested that I install some packages to improve GPU support;
> one of them was "xf86-video-amdgpu".
> 
> It seems that package was responsible for the issue.
> Removing it fixed the issue without any other (notable) effects.

Did you get the same amdgpu_dm_atomic_commit_tail => dc_commit_state =>
dc_stream_log NULL pointer dereference as reported here?

If yes, this is a kernel driver bug, xf86-video-amdgpu just triggers it / the
Xorg modesetting driver avoids it somehow.

If not, please file your own report at
https://bugs.freedesktop.org/enter_bug.cgi?product=xorg&component=Driver/AMDgpu
and attach the corresponding Xorg log file and output of dmesg.


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (17 preceding siblings ...)
  2019-07-25 14:21 ` bugzilla-daemon
@ 2019-07-25 15:42 ` bugzilla-daemon
  2019-07-25 15:50 ` bugzilla-daemon
                   ` (49 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-25 15:42 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #19 from Yann HN (accs@21xayah.com) ---
(In reply to Michel Dänzer from comment #18)
> (In reply to Yann HN from comment #17)
> > A friend suggested that I install some packages to improve GPU support;
> > one of them was "xf86-video-amdgpu".
> > 
> > It seems that package was responsible for the issue.
> > Removing it fixed the issue without any other (notable) effects.
> 
> Did you get the same amdgpu_dm_atomic_commit_tail => dc_commit_state =>
> dc_stream_log NULL pointer dereference as reported here?
> 
> If yes, this is a kernel driver bug, xf86-video-amdgpu just triggers it /
> the Xorg modesetting driver avoids it somehow.
> 
> If not, please file your own report at
> https://bugs.freedesktop.org/enter_bug.cgi?product=xorg&component=Driver/
> AMDgpu and attach the corresponding Xorg log file and output of dmesg.

Yes, I reinstalled the package and was able to reproduce the error pretty
quickly. Here is the whole stack trace (the package is confirmed as the source
of the issue):

Jul 25 17:38:12 arch-workstation kernel: BUG: kernel NULL pointer dereference,
address: 00000000000002b4
Jul 25 17:38:12 arch-workstation kernel: #PF: supervisor read access in kernel
mode
Jul 25 17:38:12 arch-workstation kernel: #PF: error_code(0x0000) - not-present
page
Jul 25 17:38:12 arch-workstation kernel: PGD 0 P4D 0 
Jul 25 17:38:12 arch-workstation kernel: Oops: 0000 [#1] PREEMPT SMP PTI
Jul 25 17:38:12 arch-workstation kernel: CPU: 3 PID: 296 Comm: kworker/u24:4
Not tainted 5.2.2-arch1-1-ARCH #1
Jul 25 17:38:12 arch-workstation kernel: Hardware name: Micro-Star
International Co., Ltd. MS-7B98/Z390-A PRO (MS-7B98), BIOS 1.60 03/21/2019
Jul 25 17:38:12 arch-workstation kernel: Workqueue: events_unbound commit_work
[drm_kms_helper]
Jul 25 17:38:12 arch-workstation kernel: RIP: 0010:dc_stream_log+0x6/0xb0
[amdgpu]
Jul 25 17:38:12 arch-workstation kernel: Code: 04 00 00 49 8b bc 02 80 02 00 00
48 8b 07 48 8b 40 50 e8 1d 35 f7 cd b8 01 00 00 00 c3 0f 1f 80 00 00 00 00 0f
1f 44 00 00 53 <8b> 86 b4 02 00 00 48 89 f3 48 89 f2 8b 8e 10 01 00 00 bf 04
00>
Jul 25 17:38:12 arch-workstation kernel: RSP: 0018:ffff9ced83f5faf0 EFLAGS:
00010202
Jul 25 17:38:12 arch-workstation kernel: RAX: 0000000000000000 RBX:
ffff8b9687199000 RCX: 0000000000000002
Jul 25 17:38:12 arch-workstation kernel: RDX: ffffffffc1112710 RSI:
0000000000000000 RDI: ffff8b9687199000
Jul 25 17:38:12 arch-workstation kernel: RBP: ffff8b95c7868000 R08:
ffff8b95c7868000 R09: 0000000000000000
Jul 25 17:38:12 arch-workstation kernel: R10: ffff8b95c7868000 R11:
0000000000000018 R12: 0000000000000001
Jul 25 17:38:12 arch-workstation kernel: R13: ffff9ced83f5fd58 R14:
ffff8b967420cff0 R15: 0000000000000000
Jul 25 17:38:12 arch-workstation kernel: FS:  0000000000000000(0000)
GS:ffff8b968d8c0000(0000) knlGS:0000000000000000
Jul 25 17:38:12 arch-workstation kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Jul 25 17:38:12 arch-workstation kernel: CR2: 00000000000002b4 CR3:
000000080c284006 CR4: 00000000003606e0
Jul 25 17:38:12 arch-workstation kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Jul 25 17:38:12 arch-workstation kernel: DR3: 0000000000000000 DR6:
00000000fffe0ff0 DR7: 0000000000000400
Jul 25 17:38:12 arch-workstation kernel: Call Trace:
Jul 25 17:38:12 arch-workstation kernel:  dc_commit_state+0x9a/0x5a0 [amdgpu]
Jul 25 17:38:12 arch-workstation kernel:  ?
dm_plane_helper_cleanup_fb+0xa3/0x120 [amdgpu]
Jul 25 17:38:12 arch-workstation kernel: 
amdgpu_dm_atomic_commit_tail+0xc5d/0x1a10 [amdgpu]
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x34/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x40/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x34/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x40/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x34/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x40/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x34/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x40/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x34/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x34/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x34/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x34/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x40/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x34/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x34/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x40/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x34/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x40/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x34/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x40/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x34/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? __switch_to_asm+0x40/0x70
Jul 25 17:38:12 arch-workstation kernel:  ? _raw_spin_unlock_irq+0x1d/0x30
Jul 25 17:38:12 arch-workstation kernel:  ? finish_task_switch+0x84/0x2d0
Jul 25 17:38:12 arch-workstation kernel:  ? preempt_schedule_common+0x32/0x80
Jul 25 17:38:12 arch-workstation kernel:  ? commit_tail+0x3c/0x70
[drm_kms_helper]
Jul 25 17:38:12 arch-workstation kernel:  commit_tail+0x3c/0x70
[drm_kms_helper]
Jul 25 17:38:12 arch-workstation kernel:  process_one_work+0x1d1/0x3e0
Jul 25 17:38:12 arch-workstation kernel:  worker_thread+0x4a/0x3d0
Jul 25 17:38:12 arch-workstation kernel:  kthread+0xfb/0x130
Jul 25 17:38:12 arch-workstation kernel:  ? process_one_work+0x3e0/0x3e0
Jul 25 17:38:12 arch-workstation kernel:  ? kthread_park+0x90/0x90
Jul 25 17:38:12 arch-workstation kernel:  ret_from_fork+0x35/0x40
Jul 25 17:38:12 arch-workstation kernel: Modules linked in: fuse xt_nat
xt_tcpudp veth xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo
iptable_nat xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack
nf_defrag_ipv6 nf_defra>
Jul 25 17:38:12 arch-workstation kernel:  snd_usbmidi_lib ppdev iTCO_wdt
snd_hda_codec iTCO_vendor_support snd_rawmidi snd_seq_device media snd_hda_core
agpgart snd_hwdep syscopyarea snd_pcm aesni_intel sysfillrect snd_timer
aes_x86_64 c>
Jul 25 17:38:12 arch-workstation kernel: CR2: 00000000000002b4
Jul 25 17:38:12 arch-workstation kernel: ---[ end trace 8659bfc7daefd7ef ]---
Jul 25 17:38:12 arch-workstation kernel: RIP: 0010:dc_stream_log+0x6/0xb0
[amdgpu]
Jul 25 17:38:12 arch-workstation kernel: Code: 04 00 00 49 8b bc 02 80 02 00 00
48 8b 07 48 8b 40 50 e8 1d 35 f7 cd b8 01 00 00 00 c3 0f 1f 80 00 00 00 00 0f
1f 44 00 00 53 <8b> 86 b4 02 00 00 48 89 f3 48 89 f2 8b 8e 10 01 00 00 bf 04
00>
Jul 25 17:38:12 arch-workstation kernel: RSP: 0018:ffff9ced83f5faf0 EFLAGS:
00010202
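
As a cross-check of the trace above, the bytes at the <8b> marker on the
"Code:" line can be decoded by hand. The following is a minimal Python sketch,
not a general disassembler: it handles only the single instruction form seen
here, and the helper name decode_mov_r32_disp32 is made up for illustration.
It suggests that the faulting instruction reads 0x2b4 bytes past whatever RSI
points to, and with RSI = 0 that is exactly the address reported in CR2.

```python
# Values copied from the oops above.
RSI = 0x0000000000000000
CR2 = 0x00000000000002B4
# Bytes starting at the <8b> marker on the "Code:" line.
faulting = bytes.fromhex("8b 86 b4 02 00 00")

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_r32_disp32(insn):
    """Decode only 'MOV r32, [base + disp32]' (opcode 8B, ModRM mod=10, no SIB)."""
    opcode, modrm = insn[0], insn[1]
    if opcode != 0x8B:
        raise ValueError("expected opcode 8B (MOV r32, r/m32)")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        raise ValueError("only [base + disp32] without a SIB byte is handled")
    disp = int.from_bytes(insn[2:6], "little", signed=True)
    return reg, rm, disp


reg, rm, disp = decode_mov_r32_disp32(faulting)
print(f"mov {hex(disp)}(%{REG64[rm]}),%{REG32[reg]}")

fault_address = RSI + disp
print(hex(fault_address), fault_address == CR2)
```

Under the x86-64 System V calling convention RSI carries the second integer
argument, so this is consistent with dc_stream_log being called with a NULL
second argument, presumably the stream pointer.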

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (18 preceding siblings ...)
  2019-07-25 15:42 ` bugzilla-daemon
@ 2019-07-25 15:50 ` bugzilla-daemon
  2019-07-26 12:23 ` bugzilla-daemon
                   ` (48 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-25 15:50 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #20 from Nicholas Kazlauskas (nicholas.kazlauskas@amd.com) ---
I haven't been able to reproduce this on my setup yet with xf86-video-amdgpu on
Arch's 5.2.2 kernel. I don't see anything really missing between that and
staging that could affect this issue.

It would probably help to have a dmesg log with drm.debug=0x54 - this will
enable DRM atomic state debug prints.

You'll probably need to increase your log buffer size to get the state relevant
to the crash.

e.g.: " log_buf_len=64M drm.debug=84 " (84 decimal is the same value as 0x54)
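
To see which debug categories that mask selects, the value can be split into
its bits. This is a small Python sketch; the bit-to-name table is written from
memory of include/drm/drm_print.h around Linux 5.2 and should be treated as an
assumption to verify against the header of the kernel actually in use. The
function name decode_drm_debug is hypothetical.

```python
# Assumed category bits (check include/drm/drm_print.h for your kernel).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
    0x40: "STATE",
    0x80: "LEASE",
    0x100: "DP",
}


def decode_drm_debug(mask):
    """Return the names of all category bits set in a drm.debug mask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


# The kernel command line accepts the decimal or the hexadecimal spelling.
mask = int("84")
print(hex(mask), decode_drm_debug(mask))
```

If the table is right, the mask enables the KMS, ATOMIC and STATE categories,
which matches the "atomic state debug prints" described above.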

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (19 preceding siblings ...)
  2019-07-25 15:50 ` bugzilla-daemon
@ 2019-07-26 12:23 ` bugzilla-daemon
  2019-07-26 16:02 ` bugzilla-daemon
                   ` (47 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-26 12:23 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

Frank Steinborn (steinex@nognu.de) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steinex@nognu.de

--- Comment #21 from Frank Steinborn (steinex@nognu.de) ---
Facing the same issue (Vega64). I captured a dmesg (drm.debug=0x54) with lockup
and uploaded it here:

https://nognu.de/p/dmesg_amdgpu.txt

Thanks!

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (20 preceding siblings ...)
  2019-07-26 12:23 ` bugzilla-daemon
@ 2019-07-26 16:02 ` bugzilla-daemon
  2019-07-30 21:41 ` bugzilla-daemon
                   ` (46 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-26 16:02 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #22 from Nicholas Kazlauskas (nicholas.kazlauskas@amd.com) ---
Thanks for the log!

I can reproduce the issue now by emulating the sequence using IGT. It doesn't
seem to show up in desktop usage for me.

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (21 preceding siblings ...)
  2019-07-26 16:02 ` bugzilla-daemon
@ 2019-07-30 21:41 ` bugzilla-daemon
  2019-07-31 16:28 ` bugzilla-daemon
                   ` (45 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-30 21:41 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #23 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Nicholas Kazlauskas from comment #22)
> Thanks for the log!
> 
> I can reproduce the issue now by emulating the sequence using IGT. It
> doesn't seem to show up in desktop usage for me.

Indeed. I tried the modesetting X11 driver and got a bunch of errors in
Xorg.0.log about being unable to do "page flips", so I set `PageFlip false` for
it, and for amdgpu I set `EnablePageFlip false` and removed 'TearFree true'
(why isn't it always on by default?), just in case. No hangs for about 24
hours, even with a lot of YouTube in Firefox and even with amdgpu.

There seem to be a lot of patches for AMD GPUs queued for 5.2.5. Any chance
that the complete fix is among them?

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (22 preceding siblings ...)
  2019-07-30 21:41 ` bugzilla-daemon
@ 2019-07-31 16:28 ` bugzilla-daemon
  2019-08-01  6:13 ` bugzilla-daemon
                   ` (44 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-07-31 16:28 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #24 from Nicholas Kazlauskas (nicholas.kazlauskas@amd.com) ---
This should be fixed with the series linked below:

https://patchwork.freedesktop.org/series/64505/

But it still needs review and backporting to older kernels.

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (23 preceding siblings ...)
  2019-07-31 16:28 ` bugzilla-daemon
@ 2019-08-01  6:13 ` bugzilla-daemon
  2019-08-02  2:21 ` bugzilla-daemon
                   ` (43 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-08-01  6:13 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #25 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Nicholas Kazlauskas from comment #24)
> This should be fixed with the series linked below:
> 
> https://patchwork.freedesktop.org/series/64505/
> 
> But it still needs review and backporting to older kernels.

So, I've patched my 5.2.5 kernel package with that set and re-enabled page
flipping. So far, everything seems fine. When it's merged and released, this
issue may be closed. Thanks !

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (24 preceding siblings ...)
  2019-08-01  6:13 ` bugzilla-daemon
@ 2019-08-02  2:21 ` bugzilla-daemon
  2019-08-04  5:17 ` bugzilla-daemon
                   ` (42 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-08-02  2:21 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #26 from Sergey Kondakov (virtuousfox@gmail.com) ---
Created attachment 284083
  --> https://bugzilla.kernel.org/attachment.cgi?id=284083&action=edit
dmesg_2019-08-02-amdgpu_fail_on_patched_5.2.5

(In reply to Nicholas Kazlauskas from comment #24)
> This should be fixed with the series linked below:
> 
> https://patchwork.freedesktop.org/series/64505/
> 
> But it still needs review and backporting to older kernels.

Celebration might have been premature. Hours later I got another freeze with a
different error in amdgpu. Only this time, the mouse cursor was movable over the
frozen frame right until I tried switching VT. Here's the trace:
BUG: unable to handle page fault for address: 0000000800000184
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0 
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 2 PID: 21044 Comm: kworker/u16:0 Tainted: G        W IO     
5.2.5-1396.g79b6a9c-HSF #1 openSUSE Tumbleweed (unreleased)
Hardware name: Gigabyte Technology Co., Ltd. GA-990XA-UD3/GA-990XA-UD3, BIOS
F14e 09/09/2014
Workqueue: events_unbound commit_work
RIP: 0010:amdgpu_dm_atomic_commit_tail+0x2e6/0xd60 [amdgpu]
Code: ff 48 89 de 48 8b b8 40 43 01 00 e8 94 3b 09 00 49 8b 54 24 08 48 89 9d
30 fe ff ff 8b 82 00 09 00 00 85 c0 0f 85 fb fd ff ff <80> bb 80 01 00 00 01 0f
86 a0 00 00 00 48 b9 00 00 00 00 01 00 00
RSP: 0018:ffff98198b837c30 EFLAGS: 00010202
RAX: 0000000000000023 RBX: 0000000800000004 RCX: ffff8aca7b146f18
RDX: ffff8acc2a2d9000 RSI: ffffffffc0994f00 RDI: 0000000000000002
RBP: ffff98198b837e10 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8aca97bf3540
R13: ffff8acc114b1000 R14: ffff8acc035da000 R15: 0000000000000006
FS:  0000000000000000(0000) GS:ffff8acc2e000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000800000184 CR3: 00000003747c2000 CR4: 00000000000406e0
Call Trace:
 ? mark_held_locks+0x2d/0x80
 ? _raw_spin_unlock_irq+0x3a/0x50
 ? finish_task_switch+0xa2/0x300
 ? __lock_acquire+0x3c3/0x7c0
 ? find_held_lock+0x32/0x90
 ? find_held_lock+0x32/0x90
 ? sched_clock+0x5/0x10
 ? mark_held_locks+0x2d/0x80
 ? preempt_count_sub+0x98/0xe0
 ? _raw_spin_unlock_irq+0x3a/0x50
 ? wait_for_completion_timeout+0xe9/0x110
 ? commit_tail+0x3c/0x70
 commit_tail+0x3c/0x70
 process_one_work+0x271/0x5f0
 worker_thread+0x4a/0x3d0
 ? process_one_work+0x5f0/0x5f0
 kthread+0x118/0x140
 ? kthread_create_worker_on_cpu+0x70/0x70
 ret_from_fork+0x27/0x50
Modules linked in: r8169 binfmt_misc af_packet ts_bm xt_pkttype xt_string
nf_nat_ftp nf_conntrack_ftp xt_tcpudp ip6t_rpfilter ip6t_REJECT ipt_REJECT
xt_conntrack ebtable_nat ip6table_nat ip6table_mangle ip6table_raw
ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw
iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nfnetlink
ebtable_filter ebtables scsi_transport_iscsi ip6table_filter ip6_tables
iptable_filter ip_tables x_tables bpfilter snd_seq_dummy snd_seq_oss
snd_seq_midi_event snd_seq snd_pcm_oss zram snd_mixer_oss bnep it87 hwmon_vid
msr joydev amd64_edac_mod edac_mce_amd btusb btrtl btbcm rc_avermedia btintel
kvm_amd tuner_simple tuner_types bluetooth snd_usb_audio tuner kvm tda7432
snd_usbmidi_lib snd_rawmidi irqbypass tvaudio msp3400 snd_seq_device ath9k bttv
ath9k_common ath9k_hw tea575x tveeprom ath videobuf_dma_sg videobuf_core
rc_core v4l2_common pcspkr wmi_bmof videodev mxm_wmi mac80211 fam15h_power
k10temp sp5100_tco
 media amdgpu i2c_piix4 snd_hda_codec_realtek snd_hda_codec_generic
snd_hda_codec_hdmi ledtrig_audio snd_hda_intel cfg80211 snd_hda_codec
snd_hda_core realtek gpu_sched libphy snd_hwdep ttm rfkill snd_pcm mac_hid
hid_generic usbhid uas usb_storage ohci_pci serio_raw sd_mod ohci_hcd ehci_pci
ehci_hcd xhci_pci xhci_hcd wmi exfat(O) l2tp_ppp l2tp_netlink l2tp_core
ip6_udp_tunnel udp_tunnel pppox ppp_generic slhc vhba(O) uinput sg nbd
dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ecryptfs [last unloaded:
r8169]
CR2: 0000000800000184
---[ end trace 7da703104c8acbc9 ]---
RIP: 0010:amdgpu_dm_atomic_commit_tail+0x2e6/0xd60 [amdgpu]
Code: ff 48 89 de 48 8b b8 40 43 01 00 e8 94 3b 09 00 49 8b 54 24 08 48 89 9d
30 fe ff ff 8b 82 00 09 00 00 85 c0 0f 85 fb fd ff ff <80> bb 80 01 00 00 01 0f
86 a0 00 00 00 48 b9 00 00 00 00 01 00 00
RSP: 0018:ffff98198b837c30 EFLAGS: 00010202
RAX: 0000000000000023 RBX: 0000000800000004 RCX: ffff8aca7b146f18
RDX: ffff8acc2a2d9000 RSI: ffffffffc0994f00 RDI: 0000000000000002
RBP: ffff98198b837e10 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8aca97bf3540
R13: ffff8acc114b1000 R14: ffff8acc035da000 R15: 0000000000000006
FS:  0000000000000000(0000) GS:ffff8acc2e000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000800000184 CR3: 00000003747c2000 CR4: 00000000000406e0
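
This oops is worded differently from the original one ("unable to handle page
fault" rather than "NULL pointer dereference"). A rough Python sketch of why,
assuming the x86 fault handler of this kernel generation labels a kernel-mode
fault as a NULL dereference only when the address lies within the first page.
The function name classify is hypothetical, and the offset 0x180 is read off
the bytes following the <80> marker (80 bb 80 01 00 00 01, which appears to be
a byte compare against 0x180(%rbx)).

```python
PAGE_SIZE = 4096  # base page size on x86-64


def classify(fault_address):
    """Assumed rule: only faults inside the first page count as NULL derefs."""
    if fault_address < PAGE_SIZE:
        return "NULL pointer dereference"
    return "page fault on a non-NULL bad pointer"


# Values copied from the trace above.
RBX = 0x0000000800000004
CR2 = 0x0000000800000184
disp = int.from_bytes(bytes.fromhex("80 01 00 00"), "little")

fault_address = RBX + disp
print(hex(fault_address), fault_address == CR2)
print(classify(fault_address))
print(classify(0x2B4))  # address from the originally reported oops
```

So the base pointer in RBX was not NULL here but a value that does not look
like a kernel address (the valid pointers in the other registers start with
ffff...), which hints at a different failure than the original one.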

How ironic for it to manifest again during discussion video on Youtube about
recent "JoJo" Part 5 finale's "perpetually trapped in the repeating nightmare
of a frozen time" theme…

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (25 preceding siblings ...)
  2019-08-02  2:21 ` bugzilla-daemon
@ 2019-08-04  5:17 ` bugzilla-daemon
  2019-08-07 17:43 ` bugzilla-daemon
                   ` (41 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-08-04  5:17 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #27 from Sergey Kondakov (virtuousfox@gmail.com) ---
Created attachment 284153
  --> https://bugzilla.kernel.org/attachment.cgi?id=284153&action=edit
dmesg_2019-08-04-amdgpu-new_dereference-with-shadowprimary

So, I've been using explicitly disabled "EnablePageFlip" and "TearFree" options
as workaround for the original dereference but then decided to try out
"ShadowPrimary" during fiddling with mvtools' motion-interpolation optimization
in mpv, since page flipping is disabled anyway. But the result was ANOTHER null
pointer dereference mere seconds after login:
BUG: kernel NULL pointer dereference, address: 0000000000000008
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0 
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 3272 Comm: X:cs0 Tainted: G          IO     
5.2.5-1407.g79b6a9c-HSF #1 openSUSE Tumbleweed
Hardware name: Gigabyte Technology Co., Ltd. GA-990XA-UD3/GA-990XA-UD3, BIOS
F14e 09/09/2014
RIP: 0010:amdgpu_vm_update_directories+0xe7/0x260 [amdgpu]
Code: 89 08 48 8d 4a 40 48 89 48 08 48 89 42 40 48 8b 78 f0 c6 40 10 00 4c 8b
a7 80 06 00 00 4d 85 e4 74 08 4d 8b a4 24 40 04 00 00 <4d> 8b 6c 24 08 31 f6 49
8b 95 80 06 00 00 48 85 d2 74 0f 48 8b 92
RSP: 0018:ffffafc2478aba10 EFLAGS: 00010246
RAX: ffff98742e20e670 RBX: ffff98742e20e658 RCX: ffff98744fc66040
RDX: ffff98744fc66000 RSI: ffff98742e20e638 RDI: ffff9873a295f800
RBP: ffff987459e00000 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffffafc2478abb58 R14: ffff98744fc66000 R15: ffffafc2478abb58
FS:  00007f3ee03d7700(0000) GS:ffff98746de00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 00000003f27aa000 CR4: 00000000000406e0
Call Trace:
 amdgpu_cs_vm_handling+0x308/0x440 [amdgpu]
 amdgpu_cs_ioctl+0x154/0xa10 [amdgpu]
 ? amdgpu_cs_vm_handling+0x440/0x440 [amdgpu]
 drm_ioctl_kernel+0xaa/0xf0
 drm_ioctl+0x208/0x385
 ? amdgpu_cs_vm_handling+0x440/0x440 [amdgpu]
 ? _raw_spin_unlock_irqrestore+0x59/0x70
 ? preempt_count_sub+0x98/0xe0
 ? _raw_spin_unlock_irqrestore+0x46/0x70
 amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
 do_vfs_ioctl+0x3ed/0x720
 ? __fget+0xf9/0x1b0
 ksys_ioctl+0x5e/0x90
 __x64_sys_ioctl+0x16/0x20
 do_syscall_64+0x66/0xc0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f3ee641c7c7
Code: 00 00 90 48 8b 05 d1 86 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff
c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01
c3 48 8b 0d a1 86 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007f3ee03d6a08 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f3ee03d6a70 RCX: 00007f3ee641c7c7
RDX: 00007f3ee03d6a70 RSI: 00000000c0186444 RDI: 000000000000000e
RBP: 00000000c0186444 R08: 00007f3ee03d6b80 R09: 0000000000000020
R10: 00007f3ee03d6b80 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000e R14: 000055d55e6f8bf0 R15: 000055d55e6f91a8
Modules linked in: af_packet xt_pkttype xt_string nf_nat_ftp nf_conntrack_ftp
xt_tcpudp ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat
ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat
iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6
nf_defrag_ipv4 ip_set nfnetlink ebtable_filter ebtables scsi_transport_iscsi
ip6table_filter ip6_tables iptable_filter ip_tables x_tables bpfilter
snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss
msr bnep it87 hwmon_vid zram amd64_edac_mod edac_mce_amd kvm_amd kvm
rc_avermedia tuner_simple tuner_types irqbypass tuner tda7432 btusb btrtl btbcm
btintel tvaudio msp3400 bluetooth snd_usb_audio ath9k joydev bttv ath9k_common
snd_usbmidi_lib tea575x ath9k_hw tveeprom snd_rawmidi videobuf_dma_sg mxm_wmi
wmi_bmof pcspkr ath videobuf_core snd_seq_device k10temp fam15h_power rc_core
snd_hda_codec_realtek v4l2_common snd_hda_codec_generic
 sp5100_tco snd_hda_codec_hdmi ledtrig_audio mac80211 amdgpu videodev media
i2c_piix4 snd_hda_intel cfg80211 snd_hda_codec r8169 snd_hda_core realtek
snd_hwdep libphy snd_pcm gpu_sched rfkill ttm mac_hid hid_generic usbhid uas
usb_storage ohci_pci serio_raw sd_mod ehci_pci ohci_hcd ehci_hcd xhci_pci
xhci_hcd wmi exfat(O) l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel
pppox ppp_generic slhc vhba(O) uinput sg nbd dm_multipath scsi_dh_rdac
scsi_dh_emc scsi_dh_alua ecryptfs
CR2: 0000000000000008
---[ end trace a7f0ed14134a76ad ]---
RIP: 0010:amdgpu_vm_update_directories+0xe7/0x260 [amdgpu]
Code: 89 08 48 8d 4a 40 48 89 48 08 48 89 42 40 48 8b 78 f0 c6 40 10 00 4c 8b
a7 80 06 00 00 4d 85 e4 74 08 4d 8b a4 24 40 04 00 00 <4d> 8b 6c 24 08 31 f6 49
8b 95 80 06 00 00 48 85 d2 74 0f 48 8b 92
RSP: 0018:ffffafc2478aba10 EFLAGS: 00010246
RAX: ffff98742e20e670 RBX: ffff98742e20e658 RCX: ffff98744fc66040
RDX: ffff98744fc66000 RSI: ffff98742e20e638 RDI: ffff9873a295f800
RBP: ffff987459e00000 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffffafc2478abb58 R14: ffff98744fc66000 R15: ffffafc2478abb58
FS:  00007f3ee03d7700(0000) GS:ffff98746de00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 00000003f27aa000 CR4: 00000000000406e0
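
When reading a call trace like the one above, entries prefixed with "?" are
addresses the unwinder found on the stack but does not consider reliable, so
the actual call chain is easier to follow with them filtered out. A minimal
Python sketch of that filtering (the regular expression and the function name
reliable_frames are illustrative assumptions, and the trace text is pasted
from this comment):

```python
import re

TRACE = """\
 amdgpu_cs_vm_handling+0x308/0x440 [amdgpu]
 amdgpu_cs_ioctl+0x154/0xa10 [amdgpu]
 ? amdgpu_cs_vm_handling+0x440/0x440 [amdgpu]
 drm_ioctl_kernel+0xaa/0xf0
 drm_ioctl+0x208/0x385
 ? amdgpu_cs_vm_handling+0x440/0x440 [amdgpu]
 ? _raw_spin_unlock_irqrestore+0x59/0x70
 ? preempt_count_sub+0x98/0xe0
 ? _raw_spin_unlock_irqrestore+0x46/0x70
 amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
 do_vfs_ioctl+0x3ed/0x720
 ? __fget+0xf9/0x1b0
 ksys_ioctl+0x5e/0x90
 __x64_sys_ioctl+0x16/0x20
 do_syscall_64+0x66/0xc0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
"""

# Optional "? " marker, then symbol+0xoffset/0xsize.
FRAME = re.compile(r"^\s*(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def reliable_frames(trace):
    """Return symbol names of frames that are not marked with '?'."""
    frames = []
    for line in trace.splitlines():
        match = FRAME.match(line)
        if match and not match.group(1):
            frames.append(match.group(2))
    return frames


# The faulting function comes from the RIP line, not from the call trace.
chain = ["amdgpu_vm_update_directories"] + reliable_frames(TRACE)
print(" <= ".join(chain))
```

The filtered chain shows the fault being reached from an ioctl issued by the X
server's command-submission thread (Comm: X:cs0), not from the atomic commit
worker seen in the earlier traces.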

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (26 preceding siblings ...)
  2019-08-04  5:17 ` bugzilla-daemon
@ 2019-08-07 17:43 ` bugzilla-daemon
  2019-08-14  6:43 ` bugzilla-daemon
                   ` (40 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-08-07 17:43 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

vr00m (vmuppalla@outlook.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmuppalla@outlook.com

--- Comment #28 from vr00m (vmuppalla@outlook.com) ---
I experienced issues after upgrading kernel from 5.1 to 5.2 on my notebook with
2500 U. I tried kernel boot param iommu=soft and that fixed it.

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (27 preceding siblings ...)
  2019-08-07 17:43 ` bugzilla-daemon
@ 2019-08-14  6:43 ` bugzilla-daemon
  2019-08-14 19:06 ` bugzilla-daemon
                   ` (39 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-08-14  6:43 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

chirney1@hotmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |chirney1@hotmail.com

--- Comment #29 from chirney1@hotmail.com ---
(In reply to vr00m from comment #28)
> I experienced issues after upgrading kernel from 5.1 to 5.2 on my notebook
> with 2500 U. I tried kernel boot param iommu=soft and that fixed it.


I've encountered this issue with kernel 5.2 (tried 5.2.8 just now) and also
have a Ryzen 5 2500U notebook (Huawei Matebook D 14" (AMD)). Running Manjaro.
The login screen appears fine, but after that, black screen. I know nothing's
locked up because I was able to launch GZDoom from typing in the dark in the
whisker menu and heard the sounds of Doom, or at least the title screen.

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (28 preceding siblings ...)
  2019-08-14  6:43 ` bugzilla-daemon
@ 2019-08-14 19:06 ` bugzilla-daemon
  2019-08-15 22:05 ` bugzilla-daemon
                   ` (38 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-08-14 19:06 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

Andrey Grodzovsky (andrey.grodzovsky@amd.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrey.grodzovsky@amd.com

--- Comment #30 from Andrey Grodzovsky (andrey.grodzovsky@amd.com) ---
(In reply to Sergey Kondakov from comment #27)
> Created attachment 284153 [details]
> dmesg_2019-08-04-amdgpu-new_dereference-with-shadowprimary
> 
> So, I've been using explicitly disabled "EnablePageFlip" and "TearFree"
> options as workaround for the original dereference but then decided to try
> out "ShadowPrimary" during fiddling with mvtools' motion-interpolation
> optimization in mpv, since page flipping is disabled anyway. But the result
> was ANOTHER null pointer dereference mere seconds after login:
> BUG: kernel NULL pointer dereference, address: 0000000000000008
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0 
> Oops: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 1 PID: 3272 Comm: X:cs0 Tainted: G          IO     
> 5.2.5-1407.g79b6a9c-HSF #1 openSUSE Tumbleweed
> Hardware name: Gigabyte Technology Co., Ltd. GA-990XA-UD3/GA-990XA-UD3, BIOS
> F14e 09/09/2014
> RIP: 0010:amdgpu_vm_update_directories+0xe7/0x260 [amdgpu]
> Code: 89 08 48 8d 4a 40 48 89 48 08 48 89 42 40 48 8b 78 f0 c6 40 10 00 4c
> 8b a7 80 06 00 00 4d 85 e4 74 08 4d 8b a4 24 40 04 00 00 <4d> 8b 6c 24 08 31
> f6 49 8b 95 80 06 00 00 48 85 d2 74 0f 48 8b 92
> RSP: 0018:ffffafc2478aba10 EFLAGS: 00010246
> RAX: ffff98742e20e670 RBX: ffff98742e20e658 RCX: ffff98744fc66040
> RDX: ffff98744fc66000 RSI: ffff98742e20e638 RDI: ffff9873a295f800
> RBP: ffff987459e00000 R08: 0000000000000000 R09: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: ffffafc2478abb58 R14: ffff98744fc66000 R15: ffffafc2478abb58
> FS:  00007f3ee03d7700(0000) GS:ffff98746de00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000008 CR3: 00000003f27aa000 CR4: 00000000000406e0
> Call Trace:
>  amdgpu_cs_vm_handling+0x308/0x440 [amdgpu]
>  amdgpu_cs_ioctl+0x154/0xa10 [amdgpu]
>  ? amdgpu_cs_vm_handling+0x440/0x440 [amdgpu]
>  drm_ioctl_kernel+0xaa/0xf0
>  drm_ioctl+0x208/0x385
>  ? amdgpu_cs_vm_handling+0x440/0x440 [amdgpu]
>  ? _raw_spin_unlock_irqrestore+0x59/0x70
>  ? preempt_count_sub+0x98/0xe0
>  ? _raw_spin_unlock_irqrestore+0x46/0x70
>  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
>  do_vfs_ioctl+0x3ed/0x720
>  ? __fget+0xf9/0x1b0
>  ksys_ioctl+0x5e/0x90
>  __x64_sys_ioctl+0x16/0x20
>  do_syscall_64+0x66/0xc0
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x7f3ee641c7c7
> Code: 00 00 90 48 8b 05 d1 86 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff
> ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff
> 73 01 c3 48 8b 0d a1 86 0c 00 f7 d8 64 89 01 48
> RSP: 002b:00007f3ee03d6a08 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 00007f3ee03d6a70 RCX: 00007f3ee641c7c7
> RDX: 00007f3ee03d6a70 RSI: 00000000c0186444 RDI: 000000000000000e
> RBP: 00000000c0186444 R08: 00007f3ee03d6b80 R09: 0000000000000020
> R10: 00007f3ee03d6b80 R11: 0000000000000246 R12: 0000000000000000
> R13: 000000000000000e R14: 000055d55e6f8bf0 R15: 000055d55e6f91a8
> Modules linked in: af_packet xt_pkttype xt_string nf_nat_ftp
> nf_conntrack_ftp xt_tcpudp ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack
> ebtable_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security
> iptable_nat nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack
> nf_defrag_ipv6 nf_defrag_ipv4 ip_set nfnetlink ebtable_filter ebtables
> scsi_transport_iscsi ip6table_filter ip6_tables iptable_filter ip_tables
> x_tables bpfilter snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq
> snd_pcm_oss snd_mixer_oss msr bnep it87 hwmon_vid zram amd64_edac_mod
> edac_mce_amd kvm_amd kvm rc_avermedia tuner_simple tuner_types irqbypass
> tuner tda7432 btusb btrtl btbcm btintel tvaudio msp3400 bluetooth
> snd_usb_audio ath9k joydev bttv ath9k_common snd_usbmidi_lib tea575x
> ath9k_hw tveeprom snd_rawmidi videobuf_dma_sg mxm_wmi wmi_bmof pcspkr ath
> videobuf_core snd_seq_device k10temp fam15h_power rc_core
> snd_hda_codec_realtek v4l2_common snd_hda_codec_generic
>  sp5100_tco snd_hda_codec_hdmi ledtrig_audio mac80211 amdgpu videodev media
> i2c_piix4 snd_hda_intel cfg80211 snd_hda_codec r8169 snd_hda_core realtek
> snd_hwdep libphy snd_pcm gpu_sched rfkill ttm mac_hid hid_generic usbhid uas
> usb_storage ohci_pci serio_raw sd_mod ehci_pci ohci_hcd ehci_hcd xhci_pci
> xhci_hcd wmi exfat(O) l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel
> udp_tunnel pppox ppp_generic slhc vhba(O) uinput sg nbd dm_multipath
> scsi_dh_rdac scsi_dh_emc scsi_dh_alua ecryptfs
> CR2: 0000000000000008
> ---[ end trace a7f0ed14134a76ad ]---
> RIP: 0010:amdgpu_vm_update_directories+0xe7/0x260 [amdgpu]
> Code: 89 08 48 8d 4a 40 48 89 48 08 48 89 42 40 48 8b 78 f0 c6 40 10 00 4c
> 8b a7 80 06 00 00 4d 85 e4 74 08 4d 8b a4 24 40 04 00 00 <4d> 8b 6c 24 08 31
> f6 49 8b 95 80 06 00 00 48 85 d2 74 0f 48 8b 92
> RSP: 0018:ffffafc2478aba10 EFLAGS: 00010246
> RAX: ffff98742e20e670 RBX: ffff98742e20e658 RCX: ffff98744fc66040
> RDX: ffff98744fc66000 RSI: ffff98742e20e638 RDI: ffff9873a295f800
> RBP: ffff987459e00000 R08: 0000000000000000 R09: 0000000000000001
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: ffffafc2478abb58 R14: ffff98744fc66000 R15: ffffafc2478abb58
> FS:  00007f3ee03d7700(0000) GS:ffff98746de00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000008 CR3: 00000003f27aa000 CR4: 00000000000406e0
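
As a side note on the trace quoted above: the bytes marked <4d 8b 6c 24 08> in the Code: line appear to decode to "mov 0x8(%r12),%r13", and R12 is zero in the register dump, which would match the faulting address in CR2. A minimal Python sketch of that cross-check follows; the decoding is my own reading of the bytes and the helper name parse_registers is hypothetical.

```python
import re

def parse_registers(dump):
    """Collect 'NAME: <16 hex digits>' pairs from an oops register dump."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2}):\s*([0-9a-f]{16})\b", dump)
    return {name: int(value, 16) for name, value in pairs}

# Two lines copied from the register dump in the trace (quote markers dropped).
DUMP = """
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
CR2: 0000000000000008 CR3: 00000003f27aa000 CR4: 00000000000406e0
"""

regs = parse_registers(DUMP)

# Assumption: the marked instruction is mov 0x8(%r12),%r13,
# i.e. a read at base register R12 plus a displacement of 8.
DISPLACEMENT = 0x8
fault_address = regs["R12"] + DISPLACEMENT

print(hex(fault_address), fault_address == regs["CR2"])
```

If that reading is right, the NULL value is the pointer held in R12 and the read is of a field 8 bytes into whatever it should have pointed to. It says nothing yet about why the pointer was NULL.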

Sergey, I tried to reproduce your latest issue on Ellesmere (Polaris 10) with
"ShadowPrimary" enabled and page flipping disabled, and didn't observe any crash.
If you built your own kernel, can you give me the output of the following?

Run gdb on amdgpu.ko:
gdb drivers/gpu/drm/amd/amdgpu/amdgpu.ko

Then, at the gdb prompt, do:
list *(amdgpu_vm_update_directories+0xe7)
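
The two steps above can also be run non-interactively. The sketch below, in Python, only builds the command line from the RIP line of an oops and does not run gdb itself. The helper names parse_rip and gdb_command are hypothetical; "-batch" and "-ex" are standard gdb options, and the module path is the one given above, which assumes a kernel build tree with debug info.

```python
import re
import shlex

def parse_rip(line):
    """Split 'RIP: 0010:symbol+0xoff/0xsize' into (symbol, offset, size)."""
    m = re.search(r"RIP: [0-9a-f]{4}:(\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)", line)
    if m is None:
        raise ValueError("no kernel RIP found in line")
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)

def gdb_command(module_path, symbol, offset):
    """Build a batch-mode gdb invocation that lists the source at symbol+offset."""
    return ["gdb", "-batch", "-ex", f"list *({symbol}+{offset:#x})", module_path]

rip = "RIP: 0010:amdgpu_vm_update_directories+0xe7/0x260 [amdgpu]"
symbol, offset, size = parse_rip(rip)
cmd = gdb_command("drivers/gpu/drm/amd/amdgpu/amdgpu.ko", symbol, offset)

# Print a copy-pasteable shell command; running it needs a module with debug info.
print(shlex.join(cmd))
```

The offset only maps to the right source line if the module being inspected is the same build as the one that crashed.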

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

* [Bug 204181] NULL pointer dereference regression in amdgpu
From: bugzilla-daemon @ 2019-08-15 22:05 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #31 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Andrey Grodzovsky from comment #30)
> (In reply to Sergey Kondakov from comment #27)
> 
> Sergey, I tried to reproduce you latest issue on Ellsmere (Polaris 10) with
> "ShadowPrimary" enabled flip disabled and didn't observe any crash.
> In case you built your own kernel can you give me the output of this command
> -
> 
> Run gdb on amdgpu.ko
> gdb drivers/gpu/drm/amd/amdgpu/amdgpu.ko
> 
> Then do - 
> list *(amdgpu_vm_update_directories+0xe7)

The crash may take a while (hours) to manifest and requires some video-watching
via Firefox and/or mpv (with '--opengl-pbo' option on opengl-hq profile). It
also may or may not need VAAPI to be used ('--hwdec=vaapi-copy' in case of
mpv).

My kernel is built on the OBS build server, so I had to enable debuginfo packaging
and rebuild it; the debuginfo package then used up a mind-boggling 5.1 GB of space,
leaving me with a measly ~400 MB on /! After that I managed to get this:
0x2e127 is in amdgpu_vm_update_directories
(../drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1191).
where line #1191 is:
struct amdgpu_bo *bo = parent->base.bo, *pbo;

But it is a different build of the kernel, so I don't know if this is even
relevant. I'm not going to stick around with this monstrosity. You may check
out the packages at
https://build.opensuse.org/package/binaries/home:X0F:HSF:Kernel/kernel-HSF/standard
- they have pretty much all kernel modules that x86_64 supports, so it should
run anywhere.

* [Bug 204181] NULL pointer dereference regression in amdgpu
From: bugzilla-daemon @ 2019-08-17  5:13 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #32 from Sergey Kondakov (virtuousfox@gmail.com) ---
Just got exactly the same 0010:amdgpu_vm_update_directories+0xe7/0x260
dereference immediately on login even with PageFlip & TearFree disabled and
ShadowPrimary NOT enabled. Even with all the same addresses as before. So, now
I'm not sure about what actually triggers it. However, my setup is as
non-default as it gets:
amdgpu has these parameters: cik_support=1 si_support=1 msi=1 sched_policy=1
compute_multipipe=1 gartsize=1024 vm_fragment_size=9
max_num_of_queues_per_device=65536 sched_hw_submission=32 sched_jobs=1024
job_hang_limit=8000 halt_if_hws_hang=1 vm_fault_stop=0 vm_update_mode=3
vm_size=20 disp_priority=2 deep_color=1 gpu_recovery=1
irqbalance is enabled with interval=1 and rtirq has this:
RTIRQ_NAME_LIST="timer rtc snd drm amdgpu radeon i915 nvidia usb i8042 ahci"
RTIRQ_HIGH_LIST="watchdogd oom_reaper rcu_preempt rcu_sched rcu_bh rcub rcuc
gfx sdma ksoftirqd khugepaged"
RTIRQ_PRIO_HIGH=80
RTIRQ_PRIO_DECR=2
RTIRQ_PRIO_LOW=50
RTIRQ_RESET_ALL=0
to boost amdgpu's processes to the highest RT/FIFO priorities in the hope of
avoiding video stuttering and audio x-runs under full load. Transparent
hugepages are enabled in an attempt to spare the crappy AMD FX's TLB cache and
MMU (hence vm_fragment_size=9).

Maybe it's the non-default vm_update_mode that does it. And a few kernel
versions back, the default GART size of 256MB was triggering some kind of
fault, probably a stall and reset; maybe it still does, but I'm not going to
check. Or maybe it's all irrelevant.
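
For readers unfamiliar with two of the parameters above, here is a minimal Python sketch of how I understand them from the amdgpu module parameter descriptions: vm_fragment_size is a shift applied to a 4 KiB page, and vm_update_mode is a bitmask where bit 0 selects CPU page-table updates for graphics and bit 1 for compute. The function names are hypothetical and the 4 KiB base page is an assumption.

```python
PAGE_SIZE = 4096  # assumed base GPU page size of 4 KiB

def fragment_bytes(vm_fragment_size):
    """Fragment size in bytes for a given vm_fragment_size (log2 of pages)."""
    return PAGE_SIZE << vm_fragment_size

def cpu_update_targets(vm_update_mode):
    """Which kinds of VMs get CPU page-table updates (assumed bit layout)."""
    targets = []
    if vm_update_mode & 0x1:
        targets.append("graphics")
    if vm_update_mode & 0x2:
        targets.append("compute")
    return targets

# Values from the module parameters listed above.
print(fragment_bytes(9) // (1024 * 1024), "MiB fragments")
print("CPU updates for:", cpu_update_targets(3))
```

Under those assumptions, vm_fragment_size=9 gives 2 MiB fragments, matching the x86-64 transparent hugepage size, and vm_update_mode=3 moves both graphics and compute page-table updates onto the CPU. Whether either setting is involved in the crash is not established by this.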

* [Bug 204181] NULL pointer dereference regression in amdgpu
From: bugzilla-daemon @ 2019-08-19 13:39 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #33 from Nicholas Kazlauskas (nicholas.kazlauskas@amd.com) ---
(In reply to Sergey Kondakov from comment #26)
> Created attachment 284083 [details]
> dmesg_2019-08-02-amdgpu_fail_on_patched_5.2.5
> 
> (In reply to Nicholas Kazlauskas from comment #24)
> > This should be fixed with the series linked below:
> > 
> > https://patchwork.freedesktop.org/series/64505/
> > 
> > But it still needs review and backporting to older kernels.
> 
> Celebration might have been premature. Hours later I've got another freeze
> with different error in amdgpu. Only this time, mouse cursor was movable
> over frozen frame right until I tried switching VT. Here's trace:
> BUG: unable to handle page fault for address: 0000000800000184
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0 
> Oops: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 2 PID: 21044 Comm: kworker/u16:0 Tainted: G        W IO     
> 5.2.5-1396.g79b6a9c-HSF #1 openSUSE Tumbleweed (unreleased)
> Hardware name: Gigabyte Technology Co., Ltd. GA-990XA-UD3/GA-990XA-UD3, BIOS
> F14e 09/09/2014
> Workqueue: events_unbound commit_work
> RIP: 0010:amdgpu_dm_atomic_commit_tail+0x2e6/0xd60 [amdgpu]

Are you able to consistently reproduce this issue? Is it the same setup and
same conditions as before? I haven't been able to see it in my testing at
least.

* [Bug 204181] NULL pointer dereference regression in amdgpu
From: bugzilla-daemon @ 2019-08-19 15:11 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #34 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Nicholas Kazlauskas from comment #33)
> I(In reply to Sergey Kondakov from comment #26)
> > Created attachment 284083 [details]
> > dmesg_2019-08-02-amdgpu_fail_on_patched_5.2.5
> > 
> > (In reply to Nicholas Kazlauskas from comment #24)
> > > This should be fixed with the series linked below:
> > > 
> > > https://patchwork.freedesktop.org/series/64505/
> > > 
> > > But it still needs review and backporting to older kernels.
> > 
> > Celebration might have been premature. Hours later I've got another freeze
> > with different error in amdgpu. Only this time, mouse cursor was movable
> > over frozen frame right until I tried switching VT. Here's trace:
> > BUG: unable to handle page fault for address: 0000000800000184
> > #PF: supervisor read access in kernel mode
> > #PF: error_code(0x0000) - not-present page
> > PGD 0 P4D 0 
> > Oops: 0000 [#1] PREEMPT SMP NOPTI
> > CPU: 2 PID: 21044 Comm: kworker/u16:0 Tainted: G        W IO     
> > 5.2.5-1396.g79b6a9c-HSF #1 openSUSE Tumbleweed (unreleased)
> > Hardware name: Gigabyte Technology Co., Ltd. GA-990XA-UD3/GA-990XA-UD3,
> BIOS
> > F14e 09/09/2014
> > Workqueue: events_unbound commit_work
> > RIP: 0010:amdgpu_dm_atomic_commit_tail+0x2e6/0xd60 [amdgpu]
> 
> Are you able to consistently reproduce this issue? Is it the same setup and
> same conditions as before? I haven't been able to see it in my testing at
> least.

Yes, just having PageFlip enabled in amdgpu guarantees it. Changing anything
other than PageFlip doesn't seem to affect it. Forcing TearFree on with
PageFlip disabled may also trigger it, I think. You may try my previously
linked kernel build in your testing, but I doubt that it has anything specific
to it.

It may not be reproducible with the modesetting X driver because it fails to
engage page flipping on init and throws a bunch of errors about it in
Xorg.0.log. For some reason I'm unable to use the modesetting X driver at all:
even with page flipping disabled, it draws only the mouse cursor on a black
background instead of the sddm login screen. So I have to use amdgpu with
PageFlip and TearFree explicitly disabled. But then another, rarer
0010:amdgpu_vm_update_directories+0xe7/0x260 dereference may happen regardless
(which I suspect is connected with the vm_update_mode option, unlike the first
one).

By the way, is there any disadvantage in forcing TearFree to be always on when
it works? Like an additional frame of latency or something like that?

* [Bug 204181] NULL pointer dereference regression in amdgpu
From: bugzilla-daemon @ 2019-08-21 13:38 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #35 from Nicholas Kazlauskas (nicholas.kazlauskas@amd.com) ---
Do you mind posting your compositor settings in plasma? That would certainly
influence flip timing and submission and I haven't been able to reproduce the
issue with the settings I'm using.

* [Bug 204181] NULL pointer dereference regression in amdgpu
From: bugzilla-daemon @ 2019-08-21 14:37 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #36 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Nicholas Kazlauskas from comment #35)
> Do you mind posting your compositor settings in plasma? That would certainly
> influence flip timing and submission and I haven't been able to reproduce
> the issue with the settings I'm using.

Sure. They are also quite funky:
~/.config/kwinrc:
[Compositing]
AnimationSpeed=2
Backend=OpenGL
Enabled=true
GLColorCorrection=true
GLCore=true
GLPlatformInterface=glx
GLPreferBufferSwap=c
GLTextureFilter=2
HiddenPreviews=5
OpenGLIsUnsafe=false
OpenGLIsUnsafe0=false
OpenGLIsUnsafe1=false
UnredirectFullscreen=false
WindowsBlockCompositing=false
XRenderSmoothScale=false

However, I run LXQt with this in the startup script /usr/local/bin/kwin.sh:
export __GL_YIELD=USLEEP
export KWIN_TRIPLE_BUFFER=0
export KWIN_USE_BUFFER_AGE=1
export KWIN_OPENGL_INTERFACE=egl
export KWIN_DIRECT_GL=1
export KWIN_FORCE_LANCZOS=1
export KWIN_PERSISTENT_VBO=1
export KWIN_EFFECTS_FORCE_ANIMATIONS=1
…
if [ -z "$WAYLAND_DISPLAY" ]; then
        export WINDOWMANAGER="env mesa_glthread=true nice -n -5 ionice -c 2 -n 0 -t chrt -v -r 5 kwin_x11 $KWIN_OPTIONS"
        exec /etc/X11/xinit/xinitrc
        return 0
else
        export WINDOWMANAGER="env mesa_glthread=true nice -n -5 ionice -c 2 -n 0 -t chrt -v -r 5 kwin_wayland"
        export QT_QPA_PLATFORM=wayland-egl
        export GDK_BACKEND=wayland
        export CLUTTER_BACKEND=wayland
        export SDL_VIDEODRIVER=wayland
        return 0
fi

X is run by /usr/local/bin/Xhp:
nice -n -10 ionice -c 2 -n 0 -t chrt -v -r 10 X "$@"
It hangs the system, by the way, if the RT limit is not set by sched_rt_runtime_us.

Here's ~/.drirc, just in case:
<driconf>
    <device screen="0" driver="radeonsi">
        <application name="Default">
            <option name="allow_glsl_relaxed_es" value="true" />
            <option name="radeonsi_enable_sisched" value="true" />
            <option name="allow_glsl_builtin_const_expression" value="true" />
            <option name="mesa_glthread" value="true" />
            <option name="radeonsi_enable_nir" value="true" />
            <option name="allow_glsl_extension_directive_midshader" value="true" />
            <option name="allow_rgb10_configs" value="true" />
            <option name="allow_glsl_cross_stage_interpolation_mismatch" value="true" />
            <option name="radeonsi_assume_no_z_fights" value="true" />
            <option name="allow_glsl_builtin_variable_redeclaration" value="true" />
            <option name="allow_glsl_layout_qualifier_on_function_parameters" value="true" />
            <option name="adaptive_sync" value="true" />
            <option name="radeonsi_commutative_blend_add" value="true" />
            <option name="allow_higher_compat_version" value="true" />
        </application>
    </device>
</driconf>

Some things from tuned.conf:
governor=schedutil
transparent_hugepages=always
/sys/kernel/mm/ksm/sleep_millisecs=250
/sys/kernel/mm/transparent_hugepage/shmem_enabled=advise 
/sys/kernel/mm/transparent_hugepage/defrag=defer+madvise
/sys/kernel/mm/transparent_hugepage/khugepaged/defrag=0
/sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan=512
/sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs=1000
/sys/kernel/mm/transparent_hugepage/khugepaged/alloc_sleep_millisecs=10000
dev.hpet.max-user-freq=4096
vm.zone_reclaim_mode=0
kernel.sched_autogroup_enabled=0
kernel.sched_latency_ns=1000000
kernel.sched_min_granularity_ns=100000
kernel.sched_wakeup_granularity_ns=1000
kernel.sched_nr_migrate=256
kernel.sched_migration_cost_ns=125
kernel.sched_cfs_bandwidth_slice_us=100
kernel.sched_tunable_scaling=1
kernel.sched_rt_period_us=1000000
kernel.sched_rt_runtime_us=900000
kernel.sched_rr_timeslice_ms=3
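
To put the last three settings in context, here is a small Python sketch of the real-time throttling arithmetic as I understand it from the kernel's scheduler documentation: RT tasks may run for at most sched_rt_runtime_us out of every sched_rt_period_us, and a runtime of -1 removes the limit. The helper name rt_share is hypothetical.

```python
def rt_share(runtime_us, period_us):
    """Fraction of each period that real-time tasks are allowed to consume."""
    if runtime_us < 0:
        return 1.0  # -1 means RT throttling is disabled
    return runtime_us / period_us

# Values from the tuned.conf excerpt above.
PERIOD_US = 1_000_000
RUNTIME_US = 900_000

share = rt_share(RUNTIME_US, PERIOD_US)
reserve_ms = (PERIOD_US - RUNTIME_US) / 1000

print(f"RT tasks may use {share:.0%} of each period")
print(f"{reserve_ms:.0f} ms per period stays available to non-RT tasks")
```

With these values, a runaway SCHED_RR process such as the X server started via chrt still leaves a slice of every second to ordinary tasks, which would fit the hang seen when the limit is not set.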

Originally the issue manifested with GLPreferBufferSwap=n and without
double-buffering & EGL enforcement; I made those changes in the hope of
compensating for the disabled TearFree and PageFlip.

Please answer the question about TearFree, if you can. I've been trying to
find out since its creation and wasn't able to get even a hint. Can it really
be just this perfect thing that everyone should have all the time, unless
bugged?

* [Bug 204181] NULL pointer dereference regression in amdgpu
From: bugzilla-daemon @ 2019-08-21 15:27 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

Alex Deucher (alexdeucher@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexdeucher@gmail.com

--- Comment #37 from Alex Deucher (alexdeucher@gmail.com) ---
(In reply to Sergey Kondakov from comment #34)
> By the way, is there any disadvantage in forcing TearFree to be always on
> when it works ? Like additional frame of latency or something like that ?

The TearFree option is there to deal with compositors that do not support sync
to vblank.  The ddx allocates another front buffer and then that buffer is
updated synchronized with vblank with the data from the real front buffer.  So
it uses an additional buffer.

* [Bug 204181] NULL pointer dereference regression in amdgpu
From: bugzilla-daemon @ 2019-08-21 18:36 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #38 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Alex Deucher from comment #37)
> (In reply to Sergey Kondakov from comment #34)
> > By the way, is there any disadvantage in forcing TearFree to be always on
> > when it works ? Like additional frame of latency or something like that ?
> 
> The TearFree option is there to deal with compositors that do not support
> sync to vblank. The ddx allocates another front buffer and then that buffer
> is updated synchronized with vblank with the data from the real front
> buffer.  So it uses an additional buffer.

Thanks! It's a shame, I had already begun believing in "The Silver Bullet of
VSync". And it's a completely "software", GPU-agnostic function, so would
alternatives like Wayland have to just reimplement it the same way? Does it
always add a buffer, or can a "smart-enough" compositor opt out? Or is "the
correct fix for latency" with TF to disable vsync everywhere else (such as
kwin's GLPreferBufferSwap=n) and let it handle it?

No matter what I previously tried, nothing other than TearFree guaranteed an
actual lack of tearing at all times in a simple 2x1080p configuration, but
there is an abundance of buffering as it is in apps and the compositor, plus
the latency of LCD displays. I'm sure you're aware of
https://gitlab.freedesktop.org/xorg/xserver/issues/244 too. Strange that "the
magic" of TF isn't done directly in compositors or the kernel, then.

* [Bug 204181] NULL pointer dereference regression in amdgpu
From: bugzilla-daemon @ 2019-08-21 19:28 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #39 from Alex Deucher (alexdeucher@gmail.com) ---
(In reply to Sergey Kondakov from comment #38)
> 
> Thanks ! It's a shame, I've already begun believing in "The Silver Bullet of
> VSync". And it's completely "software" GPU-agnostic function, so
> alternatives like Wayland would have to just reimplement it the same way ?
> It always adds a buffer or "smart-enough" compositor can opt-out ? Or "the
> correct fix for latency" with TF is disabling vsync everywhere (such as
> kwin's GLPreferBufferSwap=n) else and let it handle it ?
> 
> No matter how I previously tried, nothing other than TearFree guaranteed
> actual lack of tearing in all times in simple 2x1080p configuration but
> there is abundance of buffering as it is in apps and a compositor + latency
> of LCD displays. I'm sure, you're aware of
> https://gitlab.freedesktop.org/xorg/xserver/issues/244 too. Strange that
> "the magic" of TF isn't done directly in compositors or kernel then.

Here is your issue: "simple 2x1080p"

Multiple displays are really hard to deal with.  The display timing may be
different, the blanking periods may not align, etc.  X uses a single surface
for each multi-display desktop, so when you are updating multiple displays, if
the timings are not aligned, one display will show older content.  For this to
work smoothly, you really need the compositor to have each display using its
own set of buffers and doing vsynced rendering to each display separately.

* [Bug 204181] NULL pointer dereference regression in amdgpu
From: bugzilla-daemon @ 2019-08-21 21:39 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #40 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Alex Deucher from comment #39)
> (In reply to Sergey Kondakov from comment #38)
> Here is your issue: "simple 2x1080p"
> 
> multiple display are really hard to deal with.  The display timing may be
> different, the blanking periods may not align, etc.  X uses a single surface
> for each multi-display desktopso when you are updating multiple displays, if
> the timings are not aligned, one display will show older content.  For this
> to work smoothly, you really need the compositor to have each display using
> it's own set of buffers and doing vsynced rendering to each display
> separately.

It's a little bit strange to call 2x1080p on AMD's fancy 5-port GPU (+ possible
DP multiplexing) "my issue". If anything is an issue with AMD's modern output
controllers, it's the lack of an analogue signal on the DVI port for my proper
1280x1024@89 CRT monitor with its majestic >10k:1 contrast. The timing on the
two outputs is definitely different, though.

I still cannot fathom how all outputs are still lumped together like that.
Anyway, I was following up on my suspicion about kwin's vsync behaviour and
stumbled on this treat: https://bugs.kde.org/show_bug.cgi?id=395632#c45 - a new
kwin developer is working on that and on multi-threaded per-output vsync
_right now_, and wants testers. Surely, this new version of kwin will "blow up"
the kernel module with this page-flipping bug! And he would really benefit from
your advice. Then we might not even need TearFree anywhere anymore!
https://phabricator.kde.org/T11071 - quite a lot of progress already. It aims
to make double-only per-output mandatory vsync via GLX_OML_sync_control.

Right now `qdbus-qt5 org.kde.KWin /KWin supportInformation` says:
…
maxFpsInterval: 16666666
refreshRate: 0
vBlankTime: 6000000
glStrictBinding: false
glStrictBindingFollowsDriver: true
…
Screens
=======
Multi-Head: no
Active screen follows mouse:  no
Number of Screens: 2

Screen 0:
---------
Name: DVI-D-0
Geometry: 0,0,1920x1080
Scale: 1
Refresh Rate: 72.9249

Screen 1:
---------
Name: HDMI-A-0
Geometry: 1920,0,1920x1080
Scale: 1
Refresh Rate: 71.8263

glxgears shows the proper FPS (~72.923) but, judging by that bug, kwin is either
mistiming updates or "cutting out" some frames. It would not tear if it let
apps render at their own pace and then limited its own output to 60, would it?
And I'm as clueless as those bug reporters on how to check its real rate on
the currently released version.
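
To make the timing mismatch concrete, here is a rough Python sketch using only the numbers from the supportInformation output above. It assumes maxFpsInterval is in nanoseconds and treats each output as an ideal periodic vblank source, which is a simplification.

```python
# Refresh rates and compositor frame interval as reported above.
REFRESH_HZ = {"DVI-D-0": 72.9249, "HDMI-A-0": 71.8263}
MAX_FPS_INTERVAL_NS = 16_666_666  # assumed to be nanoseconds

def period_ms(hz):
    """Duration of one refresh cycle in milliseconds."""
    return 1000.0 / hz

# The relative phase of two periodic vblanks cycles at the difference frequency.
beat_hz = abs(REFRESH_HZ["DVI-D-0"] - REFRESH_HZ["HDMI-A-0"])
realign_s = 1.0 / beat_hz

# Frame rate implied by the compositor's own frame interval.
compositor_fps = 1e9 / MAX_FPS_INTERVAL_NS

for name, hz in REFRESH_HZ.items():
    print(f"{name}: {period_ms(hz):.3f} ms per refresh")
print(f"vblanks drift through a full cycle about every {realign_s:.2f} s")
print(f"compositor interval corresponds to about {compositor_fps:.1f} fps")
```

Under these assumptions the two vblanks are aligned only about once a second, and a compositor paced at roughly 60 fps matches neither output, so a single swap time cannot land in the blanking period of both displays.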

* [Bug 204181] NULL pointer dereference regression in amdgpu
From: bugzilla-daemon @ 2019-08-21 21:51 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #41 from Alex Deucher (alexdeucher@gmail.com) ---
(In reply to Sergey Kondakov from comment #40)

> 
> I little bit strange to call 2x1080p on AMD's fancy 5-port GPU (+ possible
> DP multiplexing) "my issue". 

It's a limitation of desktop environments on Linux in general.  Other OSes
handle this differently, but in general, regardless of OS, it's a hard problem
to solve.  If you have multiple displays running at different refresh rates how
do you update them without tearing and also without non-synchronized content on
some of the displays?

* [Bug 204181] NULL pointer dereference regression in amdgpu
From: bugzilla-daemon @ 2019-08-22 13:14 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #42 from Michel Dänzer (michel@daenzer.net) ---
(In reply to Alex Deucher from comment #37)
> The TearFree option is there to deal with compositors that do not support
> sync to vblank.

Also for cases where a compositor cannot prevent tearing, in particular with
rotation and other transforms via RandR.


> The ddx allocates another front buffer and then that buffer
> is updated synchronized with vblank with the data from the real front
> buffer.  So it uses an additional buffer.

Right, that's one of the main reasons why TearFree isn't enabled in all cases
by default, another one being that it can incur (at least) one refresh cycle
output latency.
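For a sense of scale, one refresh cycle of latency can be worked out directly from the refresh rate. This is a small sketch only; the rates below are illustrative and the function name is an assumption:

```python
def refresh_period_ms(refresh_hz):
    """Duration of one refresh cycle in milliseconds."""
    return 1000.0 / refresh_hz


if __name__ == "__main__":
    for hz in (60, 72, 144):  # illustrative refresh rates
        print(f"{hz} Hz -> {refresh_period_ms(hz):.2f} ms per refresh cycle")
```

So "at least one refresh cycle" means roughly 16.7 ms of extra output latency on a 60 Hz display.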


(In reply to Sergey Kondakov from comment #38)
> It's a shame, I've already begun believing in "The Silver Bullet of VSync". 

If TearFree was a silver bullet, it would be enabled by default in all cases.
:) (It already is in cases where a compositor cannot prevent tearing, per
above)


> And it's completely "software" GPU-agnostic function, so alternatives like
> Wayland would have to just reimplement it the same way ?

More like the other way around actually; I consider TearFree sort of a poor
man's Wayland compositor. The latter generally handle this better, or are at
least in a better position to handle it.


> It always adds a buffer or "smart-enough" compositor can opt-out ?

Currently the former. It would be possible to eliminate the additional buffers
while a compositor / other fullscreen client is using page flipping, but I
never got around to implementing that.


> Or "the correct fix for latency" with TF is disabling vsync everywhere (such
> as kwin's GLPreferBufferSwap=n) else and let it handle it ?

I doubt that'll help for latency, and will waste energy generating frames which
are never visible.


> Strange that "the magic" of TF isn't done directly in compositors or kernel
> then.

Compositors can do so, with some exceptions per above, but in cases where they
can prevent tearing, they're generally preferable to TearFree.

The kernel cannot do this transparently though.


(In reply to Alex Deucher from comment #39)
> Multiple displays are really hard to deal with.  The display timing may be
> different, the blanking periods may not align, etc.  X uses a single surface
> for each multi-display desktop, so when you are updating multiple displays, if
> the timings are not aligned, one display will show older content.  For this
> to work smoothly, you really need the compositor to have each display using
> its own set of buffers and doing vsynced rendering to each display
> separately.

That wouldn't necessarily make much if any visible difference though, as the
displays can still show inconsistent contents sometimes if their timings aren't
synchronized, and tearing within a display can be avoided even with a single
scanout buffer. The main benefit of separate scanout buffers is that the
application can re-use buffers for rendering new frames earlier, but OTOH
there's the overhead cost of compositing (because the client buffers can't be
used directly for page flipping like this). This is pretty much how TearFree
works, BTW (except due to the Xorg architecture, it can't actually allow a
compositor / other fullscreen client to re-use buffers earlier when
sync-to-vblank is enabled for the client).


(In reply to Sergey Kondakov from comment #40)
> Anyway, I was searching on my suspicion about kwin's vsync
> behaviour and stumbled on this treat:
> https://bugs.kde.org/show_bug.cgi?id=395632#c45 - new kwin developer working
> on that and multi-threaded per-output vsync _right now_, wants testers.

That's not possible with the current Xorg architecture. It only allows flipping
all displays (connected to a single GPU) together.

The way forward to solve this is Wayland.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (41 preceding siblings ...)
  2019-08-22 13:14 ` bugzilla-daemon
@ 2019-08-23 21:02 ` bugzilla-daemon
  2019-08-24  9:43 ` bugzilla-daemon
                   ` (25 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-08-23 21:02 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

Tom Seewald (tseewald@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tseewald@gmail.com

--- Comment #43 from Tom Seewald (tseewald@gmail.com) ---
(In reply to Nicholas Kazlauskas from comment #24)
> This should be fixed with the series linked below:
> 
> https://patchwork.freedesktop.org/series/64505/
> 
> But it still needs review and backporting to older kernels.

That patch series (applied to mainline 5.2.x) appears to fix the hangs on my RX
560 while playing video with vaapi acceleration.

It would be great if this could be back-ported.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (42 preceding siblings ...)
  2019-08-23 21:02 ` bugzilla-daemon
@ 2019-08-24  9:43 ` bugzilla-daemon
  2019-08-26  5:32 ` bugzilla-daemon
                   ` (24 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-08-24  9:43 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #44 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Alex Deucher from comment #41)
> (In reply to Sergey Kondakov from comment #40)
> 
> > 
> > It's a little bit strange to call 2x1080p on AMD's fancy 5-port GPU (+ possible
> > DP multiplexing) "my issue". 
> 
> It's a limitation of desktop environments on Linux in general.  Other OSes
> handle this differently, but in general, regardless of OS, it's a hard
> problem to solve.  If you have multiple displays running at different
> refresh rates how do you update them without tearing and also without
> non-synchronized content on some of the displays?

I guess you would always need at least 2 buffers
(currently_rendering/next_shown and previously_rendered/currently_shown)
in an app, 2 per viewport in the compositor, 2 in each of the system's video
output controllers and 2 in each display's controller. For the output
controllers there is the question of what happens if a viewport shares several
outputs or vice versa. (That's the "crtc" on a GPU, but I would prefer a
tendency towards simplification and de-specialization of GPUs by replacing them
with separate general-purpose vector processors, rasterization or BVH ASICs,
FPGA codec accelerators and output controllers, all with a wider, faster common
system bus and RAM instead of GPU-daughterboard-on-CPU's-MB monstrosities.)
The display controllers are especially a problem because of slow scalers
wanting to do their bad scaling and other in-display transformations while
adding unpredictable, unknown latency. And then make them all work
asynchronously, each with its own safe timeframe for flipping.

(In reply to Michel Dänzer from comment #42)
>…
> The way forward to solve this is Wayland.

Thanks for the detailed explanations; stuff like that should be in manuals. As
for Wayland, I even managed to launch Wayland LXQt sessions with kwin via sddm
where most things work. But something made me postpone the transition to it
indefinitely; I don't remember what exactly. Right now, at least 2 reasons
would be: custom display modes (I want my 72-73 "free" fps on 60 fps
almost-trash-level displays, and my CRT has to have its non-standard modes
defined manually) and per-display colour correction (with auto-generated and
custom profiles).

(In reply to Tom Seewald from comment #43)
> (In reply to Nicholas Kazlauskas from comment #24)
> > This should be fixed with the series linked below:
> > 
> > https://patchwork.freedesktop.org/series/64505/
> > 
> > But it still needs review and backporting to older kernels.
> 
> That patch series (applied to mainline 5.2.x) appears to fix the hangs on my
> RX 560 while playing video with vaapi acceleration.
> 
> It would be great if this could be back-ported.

Unfortunately, it didn't fix the page flip-triggered dereference for me. Do you
have page flip related errors in Xorg log on "modesetting" X driver ? With and
without it ?


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (43 preceding siblings ...)
  2019-08-24  9:43 ` bugzilla-daemon
@ 2019-08-26  5:32 ` bugzilla-daemon
  2019-09-04  4:50 ` bugzilla-daemon
                   ` (23 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-08-26  5:32 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #45 from Tom Seewald (tseewald@gmail.com) ---
> Unfortunately, it didn't fix the page flip-triggered dereference for me. Do
> you have page flip related errors in Xorg log on "modesetting" X driver ?
> With and without it ?

I don't believe so. Glancing over my Xorg.0.log and Xorg.0.log.old, I don't see
any errors about page flipping.  I just use the standard modesetting driver for
Xorg.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (44 preceding siblings ...)
  2019-08-26  5:32 ` bugzilla-daemon
@ 2019-09-04  4:50 ` bugzilla-daemon
  2019-09-06 10:37 ` bugzilla-daemon
                   ` (22 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-04  4:50 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #46 from Tom Seewald (tseewald@gmail.com) ---
Will these patches[1] be back ported to 5.2/5.3 or will we need to wait until
this hopefully lands in 5.4?

[1] https://patchwork.freedesktop.org/series/64505/


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (45 preceding siblings ...)
  2019-09-04  4:50 ` bugzilla-daemon
@ 2019-09-06 10:37 ` bugzilla-daemon
  2019-09-06 10:38 ` bugzilla-daemon
                   ` (21 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-06 10:37 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

jamespharvey20@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamespharvey20@gmail.com

--- Comment #47 from jamespharvey20@gmail.com ---
I've been getting this crash about once a week.  Would be nice if something
were done here.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (46 preceding siblings ...)
  2019-09-06 10:37 ` bugzilla-daemon
@ 2019-09-06 10:38 ` bugzilla-daemon
  2019-09-20  1:58 ` bugzilla-daemon
                   ` (20 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-06 10:38 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #48 from jamespharvey20@gmail.com ---
Vega 64.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (47 preceding siblings ...)
  2019-09-06 10:38 ` bugzilla-daemon
@ 2019-09-20  1:58 ` bugzilla-daemon
  2019-09-20 13:19 ` bugzilla-daemon
                   ` (19 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-20  1:58 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

Christopher Snowhill (kode54@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kode54@gmail.com

--- Comment #49 from Christopher Snowhill (kode54@gmail.com) ---
RX 480. Applied patch, haven't had any spurious crashes since. Using patchset
since kernel 5.2.14, now using it on 5.3. Haven't had any suspend/wake crashes
yet, either, but that may be unrelated.

Will continue applying it to successive 5.3 kernels until it is officially
backported, and will report if there are any further crashes.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (48 preceding siblings ...)
  2019-09-20  1:58 ` bugzilla-daemon
@ 2019-09-20 13:19 ` bugzilla-daemon
  2019-09-20 14:04 ` bugzilla-daemon
                   ` (18 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-20 13:19 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #50 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Christopher Snowhill from comment #49)
> RX 480. Applied patch, haven't had any spurious crashes since. Using
> patchset since kernel 5.2.14, now using it on 5.3. Haven't had any
> suspend/wake crashes yet, either, but that may be unrelated.
> 
> Will continue applying it to successive 5.3 kernels until it is officially
> backported, and will report if there are any further crashes.

I also built 5.3 with these patches, almost as soon as it came out:
https://patchwork.freedesktop.org/series/64505/
https://patchwork.freedesktop.org/series/64614/
https://patchwork.freedesktop.org/series/65192/

No failures with X11's amdgpu driver so far, BUT I've changed both the TearFree
and vm_update_mode options back to their defaults (pci=big_root_window, which
makes BAR=VRAM, is still active), so the bug may just be worked around rather
than gone. I will try vm_update_mode=3 later. It would be nice to have some clue
about what the vm_* options actually entail for OpenCL, compute-shader and
general rendering performance. I just set them for whatever; the code in
amdgpu_vm.c goes way over my head.

The modesetting X11 driver behaves weirdly for me: enabling PageFlip in it still
gives me errors, and in both cases it just draws a black screen with a movable
cursor on top of it instead of the sddm greeter. But amdgpu works, so, fine.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (49 preceding siblings ...)
  2019-09-20 13:19 ` bugzilla-daemon
@ 2019-09-20 14:04 ` bugzilla-daemon
  2019-09-21  5:26 ` bugzilla-daemon
                   ` (17 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-20 14:04 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #51 from Alex Deucher (alexdeucher@gmail.com) ---
Patches have been sent to stable and should land soon.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (50 preceding siblings ...)
  2019-09-20 14:04 ` bugzilla-daemon
@ 2019-09-21  5:26 ` bugzilla-daemon
  2019-09-27  3:50 ` bugzilla-daemon
                   ` (16 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-21  5:26 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #52 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Alex Deucher from comment #51)
> Patches have been sent to stable and should land soon.

Thanks !
However, it seems that not all is well after all: using vm_update_mode=3
resulted in an immediate 'RIP: 0010:amdgpu_vm_update_directories+0xe7/0x260'
dereference hang before sddm could draw anything, so the second one is not
fixed yet. I will use vm_update_mode=0 for now to make sure that the offending
code is never touched.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (51 preceding siblings ...)
  2019-09-21  5:26 ` bugzilla-daemon
@ 2019-09-27  3:50 ` bugzilla-daemon
  2019-09-27 12:50 ` bugzilla-daemon
                   ` (15 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-27  3:50 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #53 from Sergey Kondakov (virtuousfox@gmail.com) ---
Created attachment 285209
  --> https://bugzilla.kernel.org/attachment.cgi?id=285209&action=edit
dmesg_2019-09-26-amdgpu-old_dereference_on_patched_5.3.1

After about a day of uptime, my patched 5.3.1 hung during an hours-long YouTube
video with a dereference that is almost identical to the original one:
BUG: unable to handle page fault for address: 00000008000001b4
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0 
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 2 PID: 396 Comm: kworker/u16:2 Tainted: G        W IO     
5.3.1-1482.g27a0123-HSF #1 openSUSE Tumbleweed
Hardware name: Gigabyte Technology Co., Ltd. GA-990XA-UD3/GA-990XA-UD3, BIOS
F14e 09/09/2014
Workqueue: events_unbound commit_work
RIP: 0010:amdgpu_dm_atomic_commit_tail+0x2ee/0xfd0 [amdgpu]
…
Call Trace:
 ? __switch_to_asm+0x34/0x70
 ? __switch_to_asm+0x40/0x70
 ? _raw_spin_unlock_irq+0x29/0x50
 ? trace_hardirqs_on+0x2c/0xf0
 ? _raw_spin_unlock_irq+0x3a/0x50
 ? finish_task_switch+0xa3/0x2e0
 ? finish_task_switch+0x75/0x2e0
 ? __switch_to+0x152/0x4e0
 ? __switch_to_asm+0x34/0x70
 ? __schedule+0x353/0x900
 ? wait_for_completion_timeout+0x31/0x110
 ? _raw_spin_unlock_irq+0x29/0x50
 ? preempt_count_sub+0x9b/0xd0
 ? _raw_spin_unlock_irq+0x3a/0x50
 ? wait_for_completion_timeout+0xe9/0x110
 ? commit_tail+0x3c/0x70
 commit_tail+0x3c/0x70
 process_one_work+0x271/0x5b0
 worker_thread+0x4a/0x3d0
 ? process_one_work+0x5b0/0x5b0
 kthread+0x118/0x140
 ? kthread_create_worker_on_cpu+0x70/0x70
 ret_from_fork+0x27/0x50
…
[drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:47:crtc-0] hw_done or
flip_done timed out

Could this be due to these additional patches ?
https://patchwork.freedesktop.org/series/64614/
https://patchwork.freedesktop.org/series/65192/

Or the fact that I patched kwin-5.16.5 with https://phabricator.kde.org/T11071
and added KWIN_USE_INTEL_SWAP_EVENT=1 & KWIN_USE_BUFFER_AGE=3, so it works with
tighter timings now ?

Or any of these ?
options amdgpu cik_support=1 si_support=1 msi=1 disp_priority=2 dpm=1 runpm=1
sched_policy=1 compute_multipipe=1 vm_fragment_size=9 gartsize=1024
max_num_of_queues_per_device=65536 sched_hw_submission=32 sched_jobs=1024
job_hang_limit=8000 halt_if_hws_hang=1 vm_fault_stop=0 vm_update_mode=0
deep_color=1 gpu_recovery=1 lockup_timeout=2500,5000,8000,1000 ras_enable=1
mcbp=1 queue_preemption_timeout_ms=48 mes=1 hws_gws_support=1 discovery=1


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (52 preceding siblings ...)
  2019-09-27  3:50 ` bugzilla-daemon
@ 2019-09-27 12:50 ` bugzilla-daemon
  2019-09-27 13:19 ` bugzilla-daemon
                   ` (14 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-27 12:50 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #54 from Alex Deucher (alexdeucher@gmail.com) ---
(In reply to Sergey Kondakov from comment #53)
> Or any of these ?
> options amdgpu cik_support=1 si_support=1 msi=1 disp_priority=2 dpm=1
> runpm=1 sched_policy=1 compute_multipipe=1 vm_fragment_size=9 gartsize=1024
> max_num_of_queues_per_device=65536 sched_hw_submission=32 sched_jobs=1024
> job_hang_limit=8000 halt_if_hws_hang=1 vm_fault_stop=0 vm_update_mode=0
> deep_color=1 gpu_recovery=1 lockup_timeout=2500,5000,8000,1000 ras_enable=1
> mcbp=1 queue_preemption_timeout_ms=48 mes=1 hws_gws_support=1 discovery=1

remove all of those.  You should use the defaults unless you are specifically
debugging something.
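A minimal sketch for taking stock of such overrides before removing them, assuming they live in an `options amdgpu ...` line (typically under /etc/modprobe.d/) and that the loaded module exposes its current values under /sys/module/amdgpu/parameters. The helper names are illustrative, and the sample line is a short excerpt of the one quoted above:

```python
from pathlib import Path


def parse_options_line(line):
    """Parse 'options <module> key=value ...' into (module, {key: value})."""
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "options":
        raise ValueError("not a modprobe 'options' line")
    params = {}
    for token in tokens[2:]:
        key, _, value = token.partition("=")
        params[key] = value
    return tokens[1], params


def live_parameters(module, sysfs_root="/sys/module"):
    """Read the values the loaded module currently exposes, if any."""
    param_dir = Path(sysfs_root) / module / "parameters"
    values = {}
    if param_dir.is_dir():
        for entry in sorted(param_dir.iterdir()):
            try:
                values[entry.name] = entry.read_text().strip()
            except OSError:
                continue  # some parameters may not be readable
    return values


if __name__ == "__main__":
    # Excerpt of the quoted options line, shortened for illustration.
    line = "options amdgpu dpm=1 vm_update_mode=0 gpu_recovery=1"
    module, overrides = parse_options_line(line)
    print(f"{len(overrides)} overrides set for {module}: {sorted(overrides)}")
    for name, value in live_parameters(module).items():
        marker = "*" if name in overrides else " "
        print(f"{marker} {name} = {value}")
```

The script only lists what is overridden and what the running module reports; it does not know the driver's defaults, so restoring them still means removing the `options` line and reloading the module or rebooting.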


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (53 preceding siblings ...)
  2019-09-27 12:50 ` bugzilla-daemon
@ 2019-09-27 13:19 ` bugzilla-daemon
  2019-09-27 20:18 ` bugzilla-daemon
                   ` (13 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-27 13:19 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #55 from Tom Seewald (tseewald@gmail.com) ---
(In reply to Sergey Kondakov from comment #53)
> Created attachment 285209 [details]
> dmesg_2019-09-26-amdgpu-old_dereference_on_patched_5.3.1
> 
> After about a day of uptime my patched 5.3.1 hanged during hours-long
> Youtube video with dereference that is almost identical to the original one:

I don't believe the patches[1] have landed in a stable kernel release yet, at
least going by the 5.3.1 change log[2] I don't see any reference to them.

[1] https://patchwork.freedesktop.org/series/64505/
[2] https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.3.1


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (54 preceding siblings ...)
  2019-09-27 13:19 ` bugzilla-daemon
@ 2019-09-27 20:18 ` bugzilla-daemon
  2019-09-28  0:07 ` bugzilla-daemon
                   ` (12 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-27 20:18 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #56 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Alex Deucher from comment #54)
> (In reply to Sergey Kondakov from comment #53)
> > Or any of these ?
> > options amdgpu cik_support=1 si_support=1 msi=1 disp_priority=2 dpm=1
> > runpm=1 sched_policy=1 compute_multipipe=1 vm_fragment_size=9 gartsize=1024
> > max_num_of_queues_per_device=65536 sched_hw_submission=32 sched_jobs=1024
> > job_hang_limit=8000 halt_if_hws_hang=1 vm_fault_stop=0 vm_update_mode=0
> > deep_color=1 gpu_recovery=1 lockup_timeout=2500,5000,8000,1000 ras_enable=1
> > mcbp=1 queue_preemption_timeout_ms=48 mes=1 hws_gws_support=1 discovery=1
> 
> remove all of those.  You should use the defaults unless you are
> specifically debugging something.

Then you may consider that I am "specifically debugging" THIS. Because when I ask
these questions here or on freedesktop.org, I specifically hope for a factual
response from people with actual understanding and experience of how it works
and of what would be a proper way to debug without guesswork, based on knowledge
that would compensate for the lack of meaningful documentation and one of the
highest entry barriers in software (even a corporate monstrosity like Intel still
can't figure out GPUs, a market that is dominated by 2 oligopolists that run it
with impunity however they feel like, after all). This third dereference
would be really hard to debug, though, because there are no clear reproduction
steps, UNLESS you KNOW where and how to look as a developer. Or are you all
just going to ignore the presence of kernel-crashing code because it "may" (or
may not) not be triggered by your defaults ?

So, can you actually tell which code path may result in this or, better yet,
test it yourself so that things like that just do not go into releases ?
The original dereference is triggered by the mere presence of PageFlip, which is
on by default, so blindly running developer defaults (you can see what exactly I
think about them here: https://bugzilla.kernel.org/show_bug.cgi?id=203703#c9
and c11) didn't help anyone much now, did it ?

Or can you at least explain what exactly each of these options does, what the
desired and undesired consequences may be, and how your consensus about defaults
came to be ? A short summary (but not as short as modinfo) or links to mailing
list discussions maybe ? Because my goals (as they are for any desktop user)
are: minimal guaranteed latency (meaning full aggressive preemption, lowest
scheduling granularity and strict RT priorities) of audio/video/input/network
pipelines under stress load, in that specific order of priority, with
working fast fail-over or recovery instead of hangs and reboots.

If I'd be using defaults then I still would be sitting on 3,3Ghz (instead of
4Ghz + 2,4Ghz for MMU & cache) FX CPU, non-ECC RAM ran by literally retarded
AMD FX's MMU (you KNOW the one, the laughing stock of 2011-2017 x86 CPUs !) by
slow default JEDEC timings, ~200W (instead of down-clocked and/or
under-voltaged 90-120W) RX580 GPU (that would, no doubt, fry itself at some
point like my previous 6870 did) with slow memory timings, sluggish non-patched
kwin, 64ms of audio latency (instead of 8-12ms) and whole bunch of random
hangs/drops in audio, video stuttering and input delays/skips due to scheduling
priorities that are all over the place by default. So, no, thank you very
much, on that. And YOU should NOT be testing exclusively on defaults either.

(In reply to Tom Seewald from comment #55)
> (In reply to Sergey Kondakov from comment #53)
> > Created attachment 285209 [details]
> > dmesg_2019-09-26-amdgpu-old_dereference_on_patched_5.3.1
> > 
> > After about a day of uptime my patched 5.3.1 hanged during hours-long
> > Youtube video with dereference that is almost identical to the original
> > one:
> 
> I don't believe the patches[1] have landed in a stable kernel release yet,
> at least going by the 5.3.1 change log[2] I don't see any reference to them.
> 
> [1] https://patchwork.freedesktop.org/series/64505/
> [2] https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.3.1

They seem to be in queue for 5.3.2:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/commit/?id=7f2f9d496c3b8809143f1fc14e8cb093cc981d78
BUT those only address #1 (PageFlip) dereference, NOT #2 (when vm_update_mode
is not 0) and #3 !


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (55 preceding siblings ...)
  2019-09-27 20:18 ` bugzilla-daemon
@ 2019-09-28  0:07 ` bugzilla-daemon
  2019-09-29 18:10 ` bugzilla-daemon
                   ` (11 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-28  0:07 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #57 from Andrey Grodzovsky (andrey.grodzovsky@amd.com) ---
Sergey, instead of throwing tantrums why can't you just do what you are asked ?
You present an extremely convoluted set of driver config params and demand from
us resolving the bug with those parameters in place. This introduces unneeded
complication of the failure scenario which in turn introduces a lot of
unknowns. Alex asks you to simplify the settings so fewer unknowns are in the
system so it's easier for us to try and figure out what goes wrong while we
inspect the code. 
So please, bring the parameters back to default as this is the most well tested
configuration and gives a baseline and also please provide addr2line for
0010:amdgpu_dm_atomic_commit_tail+0x2ee so we can get a better idea where in
code the NULL ptr happened.
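
A minimal sketch of how a `symbol+offset` location such as the one requested above can be pulled out of an oops line before it is handed to a source-line lookup tool. The names `parse_rip` and `lookup_key` are hypothetical helpers invented for this illustration; the resulting `func+offset` string is the form that tools like the kernel tree's `scripts/faddr2line` or gdb's `list *(func+offset)` accept, but running them still requires a module built with debug info.

```python
import re

# Matches the location part of an oops "RIP:" line, for example
#   0010:amdgpu_dm_atomic_commit_tail+0x24c/0x2040 [amdgpu]
RIP_RE = re.compile(
    r"(?:[0-9a-fA-F]{4}:)?"            # optional code-segment selector
    r"(?P<symbol>[A-Za-z_][\w.]*)"     # function name
    r"\+0x(?P<offset>[0-9a-fA-F]+)"    # offset into the function
    r"(?:/0x(?P<size>[0-9a-fA-F]+))?"  # optional function size
    r"(?:\s+\[(?P<module>[\w-]+)\])?"  # optional module name
)


def parse_rip(text):
    """Return symbol, offset, size and module found in an oops RIP string."""
    match = RIP_RE.search(text)
    if match is None:
        raise ValueError("no symbol+offset found in: %r" % text)
    size = match.group("size")
    return {
        "symbol": match.group("symbol"),
        "offset": int(match.group("offset"), 16),
        "size": int(size, 16) if size else None,
        "module": match.group("module"),
    }


def lookup_key(info):
    """Build the func+offset string used for a source-line lookup."""
    return f"{info['symbol']}+{info['offset']:#x}"


if __name__ == "__main__":
    info = parse_rip("0010:amdgpu_dm_atomic_commit_tail+0x2ee")
    print(lookup_key(info))
```

The size and module fields are optional in the pattern because the location quoted in this comment carries neither, while full oops lines usually carry both.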


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (56 preceding siblings ...)
  2019-09-28  0:07 ` bugzilla-daemon
@ 2019-09-29 18:10 ` bugzilla-daemon
  2019-09-29 21:54 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-29 18:10 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

Damian Nowak (spam@nowaker.net) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |spam@nowaker.net

--- Comment #58 from Damian Nowak (spam@nowaker.net) ---
I encounter this error once a week on average on my Radeon 7 (Vega 20). Great
to see you guys actively working on it. When 5.3.2 releases to Arch, I'll keep
using it for a week or two and report back whether I encounter an issue again
or not. Thanks! 

@Sergey You could revert to defaults just for the duration of
testing/debugging. It'll sure make things easier for developers, and you can
still go back to your settings once the issue is fixed. Great settings
nonetheless, do these kernel parameters really improve the power performance of
RX 580, or did you need to do something in addition too? By the way, I used RX
580 on default Arch Linux settings (so most likely kernel defaults) for a year
and it was fine so you probably don't have to worry about frying it. Now I'm
using Radeon 7, while RX 580 is still alive in a different Windows-based
computer.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (57 preceding siblings ...)
  2019-09-29 18:10 ` bugzilla-daemon
@ 2019-09-29 21:54 ` bugzilla-daemon
  2019-09-30  2:07 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-29 21:54 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #59 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Andrey Grodzovsky from comment #57)
> Sergey, instead of throwing tantrums why can't you just do what you are
> asked ? You present an extremely convoluted set of driver config params and
> demand that we resolve the bug with those parameters in place. This
> introduces unneeded complication of the failure scenario which in turn
> introduces a lot of unknowns. Alex asks you to simplify the settings so fewer
> unknowns are in the system and it's easier for us to try and figure out what
> goes wrong while we inspect the code.
> So please, bring the parameters back to default as this is the most well
> tested configuration and gives a baseline and also please provide addr2line
> for 0010:amdgpu_dm_atomic_commit_tail+0x2ee so we can get a better idea
> where in code the NULL ptr happened.

And how about, instead of knowingly pushing untested code with known fatal
errors, you stop taking QA notes from FGLRX in the first place and do your own
full testing ? You do realize that I, like all the others, paid your employer
for that card, right ? And people don't buy your top cards, RX[4-5][7-8]0,
VEGAs and so on, to use them as expensive bare output controllers.

DO NOT SHOOT THE MESSENGER. What you ignore from me, others will run into one
way or another, and most of them would be incapable of even reporting it; they
would just resort to cursing you and selling the hardware, going with an Nvidia
& Intel combo forever instead. Do you have any idea how many times in my life
I've heard the "at least it's without hassle" spiel about all (yes, all) AMD
stuff from "normal people" ?

I don't demand that you resolve this personally for me and whatever I might
configure. But I do demand that you not push untested code, hide it behind
parameters that limit all cards to the bare minimum, and then use that as an
excuse to continue not testing it. And then silently expect me to work as your
QA, as if I were trained to debug kernel-level code and telepathically knew
what might be on your minds. What else, should I be expected to whip out a chip
programmer and write custom asm code for your mystery chips by myself ? I
don't have a laboratory or a dedicated debug station.

_Regarding this notion of "testing on defaults"_. Maybe I was not clear on
that: the #3 dereference happened just once, after about a full day of uptime.
The machine has sometimes run for more than a week straight without issues.
So, defaulting will not show any difference on my end unless I run both configs
for no less than 2 weeks of pure uptime each without shutting down the machine.
And it would still be useless guesswork which will not produce any more
pointers on what exactly goes wrong; at best it will just repeat or not.

However, you, as developers of that code and trained experts, can use what
little data there is to recheck the exact offending code that no one else has a
clue about. You can also fully reproduce my configuration (including the exact
packages of my kernel with debug-info) and work with full data of your own,
since you are not willing to test all your codepaths regularly as a rule.

I will try to figure out what the hell this "addr2line" is but it will probably
include installing gigabytes of debug-symbols on SSD that has no space for
them, so… it will take a while.

By the way, what happened with my answer about #2 ? You know, the `list
*(amdgpu_vm_update_directories+0xe7)` part, which was real time-consuming pain
to get, with:
0x2e127 is in amdgpu_vm_update_directories
(../drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1191).
where line #1191 is:
struct amdgpu_bo *bo = parent->base.bo, *pbo;

Have you even seen it ? Was it the right thing ? Any thoughts on the cause for
this one ? Should I do the same for the #3 ? Will it also go into a void of
silence ?

(In reply to Damian Nowak from comment #58)
> I encounter this error once a week on average on my Radeon 7 (Vega 20).
> Great to see you guys actively working on it. When 5.3.2 releases to Arch,
> I'll keep using it for a week or two and report back whether I encounter an
> issue again or not. Thanks! 
> 
> @Sergey You could revert to defaults just for the duration of
> testing/debugging. It'll sure make things easier for developers, and you can
> still go back to your settings once the issue is fixed. Great settings
> nonetheless, do these kernel parameters really improve the power performance
> of RX 580, or did you need to do something in addition too? By the way, I
> used RX 580 on default Arch Linux settings (so most likely kernel defaults)
> for a year and it was fine so you probably don't have to worry about frying
> it. Now I'm using Radeon 7, while RX 580 is still alive in a different
> Windows-based computer.

Ok, I can. But what's next ? How exactly would that give any more data ?
What exactly should I do after booting the machine ?

Power ? No, the custom hacked GPU BIOS does. Although, after fiddling with
voltages, I just left them on auto-defaults, where the driver/firmware uses the
built-in per-card "chip quality" as a multiplier for defaults, and limited the
frequency to 1300. Power draw increases exponentially with frequency and after
1300 it increases ridiculously on RX580's 14nm chips. I also made the fans
never stop and act more aggressively but not to the point of out-noising the
case and CPU fans. And I tightened memory timings too. 90-120W are numbers from
MSI Afterburner, mostly about 90W and rarely 120W in some specific loads.

Pre-RX cards, the whole 2008-2015 generation of AMD GPU chips (and chipsets,
for that matter), especially mobile ones, are well known to be
self-destructive. And not long ago my 6870 joined them. Ironically, default
firmware settings on commercial GPUs are not safe, at least not on those
generations. They are balanced by the manufacturers to barely survive warranty
periods. That's why pre-overclocked cards, or any chips, are not a product that
anyone should be excited about. AMD chips are known as "the stoves" for a
reason, but device manufacturers take them from "inefficient" to "half-dead".
Price's good though.

With the software parameters I mainly try to balance latency and CPU time,
remove sources of stuttering, do proper prioritization during CPU & I/O
contention, and enable features that can be safely enabled, so when I run my
live test/install distro build on unknown hardware, I can test and/or use it
fully without redoing and customizing the whole damn thing. But it's more
guesswork with the GPU than with everything else. Unfortunately, developers in
general are not big fans of the "multi-task desktop user experience" on
last-gen ("last" being "older than the one in the laboratory") hardware.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (58 preceding siblings ...)
  2019-09-29 21:54 ` bugzilla-daemon
@ 2019-09-30  2:07 ` bugzilla-daemon
  2019-09-30  2:09 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-30  2:07 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #60 from jamespharvey20@gmail.com ---
(In reply to Sergey Kondakov from comment #59)
> 
> And how about, instead of knowingly pushing untested code with known fatal
> errors, you stop taking QA notes from FGLRX in the first place and do your
> own full testing ? You do realize that I, like all the others, paid your
> employer for that card, right ? And people don't buy your top cards,
> RX[4-5][7-8]0, VEGAs and so on, to use them as expensive bare output
> controllers.

This.  If this were just a free project with volunteers giving their time, many
of us who occasionally throw a tantrum towards AMD wouldn't be.  But, some of
us are throwing money at AMD to try to have a stable system again, and keep
getting regressions introduced that are either fixed very slowly, or not at
all.

I'm here, because I was running an R9 390, and kernel 4.19 introduced a
regression that causes a complete boot failure.  Others confirmed the same. 
See https://bugs.freedesktop.org/show_bug.cgi?id=108781  (As I explain way
below, this is still unfixed in 5.3.)

On that bug, I'm asked by an amd.com developer to bisect.  I run into hundreds,
or even a thousand, commits that don't even compile, and only a later commit
fixes that issue.  Fun, thanks for pushing those, guys.  I finally achieve a
bisected commit, where 0d9988910989 causes a boot hang and the one previous to
it doesn't.  Upon being told this shouldn't have to do with the bug I've
posted, I do discover that this bug causes a black screen boot hang, but it's a
different bug!  I then go on to document that I've found between 3 and 5
crashing commits in the new 4.19 commits.

So, how am I supposed to bisect this garbage, when a lot doesn't even compile,
and there are multiple bugs popping in and out of existence causing the same
symptom?  Boot crashes with black screen, and I'm supposed to know to mark that
commit as good because it's a different bug causing the same issue?

I ask the AMD devs to tell me exactly which card they use in testing (if any,
at all) so I can just buy that and be done with this.  No response.

So, I pay AMD more money and buy a RX 580, which is mostly a downgrade from the
R9 390.  Get frequent crashes from that as well.

So, I just decide to buy a Vega 64.  I don't need the extra power, I just want
to run a stable machine.  Since AMD devs aren't saying what card I could use
that they do, in a hope that they might fix crashes before they push them, I
figure the latest and greatest might be getting more attention.

All goes well until this regression is introduced.

I go back to try my R9 390, and guess what?  The same bug introduced in kernel
4.19 is still there in 5.3!  AMD's just ignored it, and hasn't bothered to try
to reproduce it themselves and try to untangle the mess of spaghetti.

Since running a custom kernel with the patchset, I haven't had this crash, but
come on guys!  Couldn't AMD have a bank of 50 computers running different
cards, constantly running the latest unpushed code and going through different
stress tests?  Hey, Jim, monitor #14 and #36 keep crashing, let's look into
it.....


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (59 preceding siblings ...)
  2019-09-30  2:07 ` bugzilla-daemon
@ 2019-09-30  2:09 ` bugzilla-daemon
  2019-11-05 19:38 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-09-30  2:09 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #61 from jamespharvey20@gmail.com ---
It might look like I'm just ranting.  That's not the reason I posted my
comment.  I'm trying to give feedback to AMD about how bad so many customer
experiences are right now, and have been for quite some time, and how there
should be easy and affordable (for AMD) ways to make it better.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (60 preceding siblings ...)
  2019-09-30  2:09 ` bugzilla-daemon
@ 2019-11-05 19:38 ` bugzilla-daemon
  2019-12-12 19:04 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-11-05 19:38 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

ZenAnonX (zenanonx@protonmail.ch) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |zenanonx@protonmail.ch

--- Comment #62 from ZenAnonX (zenanonx@protonmail.ch) ---
Kernel: 5.3.8
OS: Arch Linux x86_64


I was able to eliminate the crash mentioned in

https://bugzilla.kernel.org/show_bug.cgi?id=204181#c27 and
https://bugzilla.kernel.org/show_bug.cgi?id=204181#c34


by removing "amdgpu.vm_update_mode=3" from kernel parameters. This however
reintroduced https://bugs.freedesktop.org/show_bug.cgi?id=102322 as mentioned
on
https://wiki.archlinux.org/index.php/AMDGPU#Freezes_with_%22[drm]_IP_block:gmc_v8_0_is_hung!%22_kernel_error.


"BUG: kernel NULL pointer dereference, address: 0000000000000008" seems to
happen most frequently while browsing the internet using icecat.
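
A minimal sketch of checking which `amdgpu.*` options a kernel command line carries and what it looks like with one of them removed, as was done with `amdgpu.vm_update_mode=3` above. The helper names `module_params` and `without_param` and the sample command line are hypothetical; on a running system the real string would come from `/proc/cmdline`. The sketch splits on whitespace only, so it does not handle quoted values.

```python
def module_params(cmdline, module="amdgpu"):
    """Collect module.name=value options for one module from a command line."""
    prefix = module + "."
    params = {}
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            name, value = token[len(prefix):].split("=", 1)
            params[name] = value
    return params


def without_param(cmdline, name, module="amdgpu"):
    """Return the command line with every module.name=... option dropped."""
    key = f"{module}.{name}="
    return " ".join(t for t in cmdline.split() if not t.startswith(key))


if __name__ == "__main__":
    # Hypothetical command line, used only for illustration.
    sample = "root=/dev/sda2 rw quiet amdgpu.vm_update_mode=3 amdgpu.dc=1"
    print(module_params(sample))
    print(without_param(sample, "vm_update_mode"))
```

Listing the options this way makes it easier to see how far a configuration is from the defaults before reporting a crash.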


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (61 preceding siblings ...)
  2019-11-05 19:38 ` bugzilla-daemon
@ 2019-12-12 19:04 ` bugzilla-daemon
  2020-06-19  3:13 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2019-12-12 19:04 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

Julien Isorce (julien.isorce@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |julien.isorce@gmail.com

--- Comment #63 from Julien Isorce (julien.isorce@gmail.com) ---
Does the crash in comment #0 actually happen in dc_stream_log ?


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (62 preceding siblings ...)
  2019-12-12 19:04 ` bugzilla-daemon
@ 2020-06-19  3:13 ` bugzilla-daemon
  2020-06-19  3:14 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2020-06-19  3:13 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #64 from Christopher Snowhill (kode54@gmail.com) ---
I encountered this crash twice today on a slightly modified Arch 5.7.2 kernel.
Attaching a crash log.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (63 preceding siblings ...)
  2020-06-19  3:13 ` bugzilla-daemon
@ 2020-06-19  3:14 ` bugzilla-daemon
  2020-07-26 22:49 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2020-06-19  3:14 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #65 from Christopher Snowhill (kode54@gmail.com) ---
Oh, there is no way to attach a file here. So I'll just paste the whole thing
in this response:

Jun 18 19:53:40 mrgency kernel: general protection fault, probably for
non-canonical address 0x486df9363c7dd76e: 0000 [#1] SMP NOPTI
Jun 18 19:53:40 mrgency kernel: CPU: 7 PID: 15075 Comm: kworker/u64:1 Not
tainted 5.7.2-6-tkg-pds #1
Jun 18 19:53:40 mrgency kernel: Hardware name: Micro-Star International Co.,
Ltd MS-7C02/B450 TOMAHAWK (MS-7C02), BIOS 1.D0 11/07/2019
Jun 18 19:53:40 mrgency kernel: Workqueue: events_unbound commit_work
[drm_kms_helper]
Jun 18 19:53:40 mrgency kernel: RIP:
0010:amdgpu_dm_atomic_commit_tail+0x24c/0x2040 [amdgpu]
Jun 18 19:53:40 mrgency kernel: Code: 8b 4f 08 8b 81 e0 02 00 00 41 ff c5 44 39
e8 0f 87 4d ff ff ff 48 83 bd 60 fd ff ff 00 0f 84 01 01 00 00 48 8b bd 60 fd
ff ff <80> bf b0 01 00 00 01 0f 86 aa 00 00 00 31 c0 48 b9 00 00 00 00 01
Jun 18 19:53:40 mrgency kernel: RSP: 0018:ffffaa9109057b70 EFLAGS: 00010202
Jun 18 19:53:40 mrgency kernel: RAX: 0000000000000006 RBX: ffff916786f2c800
RCX: ffff916a4c049800
Jun 18 19:53:40 mrgency kernel: RDX: ffff916a4c0ce800 RSI: ffffffffc14dd198
RDI: 486df9363c7dd76e
Jun 18 19:53:40 mrgency kernel: RBP: ffffaa9109057e60 R08: 0000000000000001
R09: 0000000000000001
Jun 18 19:53:40 mrgency kernel: R10: 0000000000000000 R11: 0000000000000000
R12: ffff9169ce131800
Jun 18 19:53:40 mrgency kernel: R13: 0000000000000006 R14: 0000000000000000
R15: ffff91680248b780
Jun 18 19:53:40 mrgency kernel: FS:  0000000000000000(0000)
GS:ffff916a4e9c0000(0000) knlGS:0000000000000000
Jun 18 19:53:40 mrgency kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Jun 18 19:53:40 mrgency kernel: CR2: 000010a7e8949000 CR3: 000000028b26c000
CR4: 00000000003406e0
Jun 18 19:53:40 mrgency kernel: Call Trace:
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x40/0x70
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x34/0x70
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x40/0x70
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x34/0x70
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x40/0x70
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x34/0x70
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x40/0x70
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x34/0x70
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x40/0x70
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x34/0x70
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x40/0x70
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x34/0x70
Jun 18 19:53:40 mrgency kernel:  ? take_other_rq_task+0x9d/0x3e0
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x40/0x70
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x34/0x70
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x40/0x70
Jun 18 19:53:40 mrgency kernel:  ? __switch_to_asm+0x34/0x70
Jun 18 19:53:40 mrgency kernel:  ? timerqueue_add+0x65/0xb0
Jun 18 19:53:40 mrgency kernel:  ? enqueue_hrtimer+0x3c/0x90
Jun 18 19:53:40 mrgency kernel:  ? hrtimer_start_range_ns+0x1a2/0x2f0
Jun 18 19:53:40 mrgency kernel:  ? __schedule+0x202/0x9d0
Jun 18 19:53:40 mrgency kernel:  ? psi_task_change+0x84/0xc0
Jun 18 19:53:40 mrgency kernel:  ? usleep_range+0x80/0x80
Jun 18 19:53:40 mrgency kernel:  ? _cond_resched+0x16/0x40
Jun 18 19:53:40 mrgency kernel:  ? __wait_for_common+0x3b/0x160
Jun 18 19:53:40 mrgency kernel:  commit_tail+0x92/0x120 [drm_kms_helper]
Jun 18 19:53:40 mrgency kernel:  process_one_work+0x1e6/0x3b0
Jun 18 19:53:40 mrgency kernel:  worker_thread+0x50/0x410
Jun 18 19:53:40 mrgency kernel:  ? process_one_work+0x3b0/0x3b0
Jun 18 19:53:40 mrgency kernel:  kthread+0x122/0x140
Jun 18 19:53:40 mrgency kernel:  ? __kthread_bind_mask+0x60/0x60
Jun 18 19:53:40 mrgency kernel:  ret_from_fork+0x22/0x40
Jun 18 19:53:40 mrgency kernel: Modules linked in: fuse rfcomm tun uvcvideo
videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev
cmac algif_hash algif_skcipher af_alg bnep btusb btrtl btbcm btintel bluetooth
ecdh_generic rfkill ecc crc16 hid_steam snd_usb_audio snd_usbmidi_lib
snd_rawmidi snd_seq_device mc mousedev joydev input_leds nls_iso8859_1
nls_cp437 vfat amdgpu hid_generic wmi_bmof fat edac_mce_amd dm_mod
snd_hda_codec_realtek kvm_amd snd_hda_codec_generic snd_hda_codec_hdmi
ledtrig_audio kvm snd_hda_intel gpu_sched snd_intel_dspcfg i2c_algo_bit
irqbypass snd_hda_codec ttm crct10dif_pclmul crc32_pclmul snd_hda_core
ghash_clmulni_intel drm_kms_helper snd_hwdep snd_pcm usbhid hid cec aesni_intel
r8169 snd_timer rc_core sp5100_tco snd crypto_simd syscopyarea realtek
sysfillrect cryptd glue_helper sysimgblt pcspkr ccp libphy i2c_piix4 k10temp
soundcore fb_sys_fops tpm_crb wmi tpm_tis tpm_tis_core tpm pinctrl_amd rng_core
gpio_amdpt evdev mac_hid acpi_cpufreq drm sg crypto_user
Jun 18 19:53:40 mrgency kernel:  agpgart ip_tables x_tables btrfs
blake2b_generic libcrc32c crc32c_generic xor raid6_pq crc32c_intel xhci_pci
sr_mod cdrom xhci_hcd
Jun 18 19:53:40 mrgency kernel: ---[ end trace 28969089457f0e4d ]---
Jun 18 19:53:40 mrgency kernel: RIP:
0010:amdgpu_dm_atomic_commit_tail+0x24c/0x2040 [amdgpu]
Jun 18 19:53:40 mrgency kernel: Code: 8b 4f 08 8b 81 e0 02 00 00 41 ff c5 44 39
e8 0f 87 4d ff ff ff 48 83 bd 60 fd ff ff 00 0f 84 01 01 00 00 48 8b bd 60 fd
ff ff <80> bf b0 01 00 00 01 0f 86 aa 00 00 00 31 c0 48 b9 00 00 00 00 01
Jun 18 19:53:40 mrgency kernel: RSP: 0018:ffffaa9109057b70 EFLAGS: 00010202
Jun 18 19:53:40 mrgency kernel: RAX: 0000000000000006 RBX: ffff916786f2c800
RCX: ffff916a4c049800
Jun 18 19:53:40 mrgency kernel: RDX: ffff916a4c0ce800 RSI: ffffffffc14dd198
RDI: 486df9363c7dd76e
Jun 18 19:53:40 mrgency kernel: RBP: ffffaa9109057e60 R08: 0000000000000001
R09: 0000000000000001
Jun 18 19:53:40 mrgency kernel: R10: 0000000000000000 R11: 0000000000000000
R12: ffff9169ce131800
Jun 18 19:53:40 mrgency kernel: R13: 0000000000000006 R14: 0000000000000000
R15: ffff91680248b780
Jun 18 19:53:40 mrgency kernel: FS:  0000000000000000(0000)
GS:ffff916a4e9c0000(0000) knlGS:0000000000000000
Jun 18 19:53:40 mrgency kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Jun 18 19:53:40 mrgency kernel: CR2: 000010a7e8949000 CR3: 000000028b26c000
CR4: 00000000003406e0
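
The log above reports a general protection fault "probably for non-canonical address 0x486df9363c7dd76e", and the same value appears in RDI in the register dump. A minimal sketch of what "non-canonical" means, assuming 48-bit virtual addresses (4-level paging), where bits 63..47 must all be equal; the function name `is_canonical` is made up for this illustration.

```python
def is_canonical(addr, va_bits=48):
    """Check whether a 64-bit address is canonical for the given VA width.

    An address is canonical when bits 63 down to (va_bits - 1) are either
    all zeros (user half) or all ones (kernel half).
    """
    top = addr >> (va_bits - 1)                # bits 63..47 for va_bits=48
    all_ones = (1 << (64 - va_bits + 1)) - 1   # 17 one-bits for va_bits=48
    return top == 0 or top == all_ones


if __name__ == "__main__":
    for value in (0x486DF9363C7DD76E,   # RDI from the log above
                  0xFFFF916786F2C800,   # RBX from the log above
                  0x00000000000002B4):  # fault address from the original report
        print(f"{value:#018x} canonical: {is_canonical(value)}")
```

Under that assumption the RDI value fails the check while the other two pass, which fits the kernel reporting a general protection fault here and not a NULL pointer dereference: the pointer being used holds garbage, not zero.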


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (64 preceding siblings ...)
  2020-06-19  3:14 ` bugzilla-daemon
@ 2020-07-26 22:49 ` bugzilla-daemon
  2020-07-26 22:50 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2020-07-26 22:49 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

mnrzk@protonmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mnrzk@protonmail.com

--- Comment #66 from mnrzk@protonmail.com ---
Created attachment 290589
  --> https://bugzilla.kernel.org/attachment.cgi?id=290589&action=edit
drm/amd/display: Clear dm_state for fast updates

Alright, the bug patch I mentioned in the last comment seems to be good
after a few hours of testing.

Please try out this patch and see if it fixes the issue for the rest of
you.

In the meantime, I'm doing more extended tests on this patch to confirm it
works well enough before posting it on LKML.

Nicholas, I haven't tested your commit since I was too busy with this. I'll
try it out if this one fails though.

Also, can you please review this patch to confirm that I'm not doing
anything wrong here?


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (65 preceding siblings ...)
  2020-07-26 22:49 ` bugzilla-daemon
@ 2020-07-26 22:50 ` bugzilla-daemon
  2022-11-10  4:02 ` bugzilla-daemon
  2022-12-23  9:17 ` bugzilla-daemon
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2020-07-26 22:50 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #67 from mnrzk@protonmail.com ---
(In reply to mnrzk from comment #66)
> Created attachment 290589 [details]
> drm/amd/display: Clear dm_state for fast updates
> 
> Alright, the bug patch I mentioned in the last comment seems to be good
> after a few hours of testing.
> 
> Please try out this patch and see if it fixes the issue for the rest of
> you.
> 
> In the meantime, I'm doing more extended tests on this patch to confirm it
> works well enough before posting it on LKML.
> 
> Nicholas, I haven't tested your commit since I was too busy with this. I'll
> try it out if this one fails though.
> 
> Also, can you please review this patch to confirm that I'm not doing
> anything wrong here?

Oh my god, I just responded to the wrong thread by accident, so sorry.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (66 preceding siblings ...)
  2020-07-26 22:50 ` bugzilla-daemon
@ 2022-11-10  4:02 ` bugzilla-daemon
  2022-12-23  9:17 ` bugzilla-daemon
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2022-11-10  4:02 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

buro (francesco.burelli@proton.me) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |francesco.burelli@proton.me

--- Comment #68 from buro (francesco.burelli@proton.me) ---
Hello, I get a similar system freeze and I know exactly how to reproduce it (on
my machine): just visit https://www.unrealengine.com/ with Firefox 106.0.3 and
you get the freeze. It also happens on other websites, but more randomly.
If you need more info, I can provide it. Many thanks.

Info about my graphics card:

07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Tahiti XT [Radeon HD 7970/8970 OEM / R9 280X] (prog-if 00 [VGA controller])
        Subsystem: PC Partner Limited / Sapphire Technology Device 3001

uname -a
Linux arch-tower 6.0.6-arch1-1 #1 SMP PREEMPT_DYNAMIC Sat, 29 Oct 2022 14:08:39
+0000 x86_64 GNU/Linux

The log from journalctl:

Nov 09 19:30:29 arch-tower kernel: BUG: kernel NULL pointer dereference,
address: 0000000000000020
Nov 09 19:30:29 arch-tower kernel: #PF: supervisor read access in kernel mode
Nov 09 19:30:29 arch-tower kernel: #PF: error_code(0x0000) - not-present page
Nov 09 19:30:29 arch-tower kernel: PGD 0 P4D 0 
Nov 09 19:30:29 arch-tower kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Nov 09 19:30:29 arch-tower kernel: CPU: 7 PID: 220976 Comm: firefox:cs0 Not
tainted 6.0.6-arch1-1 #1 a46cc4b882cfc11c3bbb09d6a0fab3dcad53b5c2
Nov 09 19:30:29 arch-tower kernel: Hardware name: System manufacturer System
Product Name/PRIME A320M-K, BIOS 5207 08/30/2019
Nov 09 19:30:29 arch-tower kernel: RIP: 0010:amdgpu_sa_bo_free+0x57/0x150
[amdgpu]
Nov 09 19:30:29 arch-tower kernel: Code: 00 00 4c 8b 60 20 48 89 d5 4c 89 e7 e8
22 fd 4b c3 48 85 ed 0f 84 a4 00 00 00 48 8b 45 30 a8 01 0f 85 98 00 00 00 48
8b 45 08 <48> 8b 40 20 48 85 c0 74 0c 48 89 ef e8 48 1e 6c c3 84 c0 75 77 4c
Nov 09 19:30:29 arch-tower kernel: RSP: 0018:ffffb2c98cedfa70 EFLAGS: 00010246
Nov 09 19:30:29 arch-tower kernel: RAX: 0000000000000000 RBX: ffff948784158e30
RCX: 0000000080800078
Nov 09 19:30:29 arch-tower kernel: RDX: 0000000000000001 RSI: ffff948784158e30
RDI: ffff94878b1e62f0
Nov 09 19:30:29 arch-tower kernel: RBP: ffff948784158d98 R08: 0000000000000000
R09: 0000000080800078
Nov 09 19:30:29 arch-tower kernel: R10: 0000000000000008 R11: 0000000010000000
R12: ffff94878b1e62f0
Nov 09 19:30:29 arch-tower kernel: R13: ffff94878b1e9628 R14: 00000000fffffff4
R15: 0000000000000001
Nov 09 19:30:29 arch-tower kernel: FS:  00007f494b1ff6c0(0000)
GS:ffff9488969c0000(0000) knlGS:0000000000000000
Nov 09 19:30:29 arch-tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Nov 09 19:30:29 arch-tower kernel: CR2: 0000000000000020 CR3: 0000000154e50000
CR4: 00000000003506e0
Nov 09 19:30:29 arch-tower kernel: Call Trace:
Nov 09 19:30:29 arch-tower kernel:  <TASK>
Nov 09 19:30:29 arch-tower kernel:  amdgpu_job_free+0x55/0xe0 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:29 arch-tower kernel:  amdgpu_cs_ioctl+0x506/0x1f30 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:29 arch-tower kernel:  ? amdgpu_cs_find_mapping+0x110/0x110
[amdgpu 3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:29 arch-tower kernel:  drm_ioctl_kernel+0xcd/0x170
Nov 09 19:30:29 arch-tower kernel:  drm_ioctl+0x231/0x410
Nov 09 19:30:29 arch-tower kernel:  ? amdgpu_cs_find_mapping+0x110/0x110
[amdgpu 3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:29 arch-tower kernel:  amdgpu_drm_ioctl+0x4e/0x90 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:29 arch-tower kernel:  __x64_sys_ioctl+0x94/0xd0
Nov 09 19:30:29 arch-tower kernel:  do_syscall_64+0x5f/0x90
Nov 09 19:30:29 arch-tower kernel:  ? do_futex+0xde/0x1b0
Nov 09 19:30:29 arch-tower kernel:  ? __x64_sys_futex+0x92/0x1d0
Nov 09 19:30:29 arch-tower kernel:  ? syscall_exit_to_user_mode+0x1b/0x40
Nov 09 19:30:29 arch-tower kernel:  ? do_syscall_64+0x6b/0x90
Nov 09 19:30:29 arch-tower kernel:  ? do_syscall_64+0x6b/0x90
Nov 09 19:30:29 arch-tower kernel:  ? syscall_exit_to_user_mode+0x1b/0x40
Nov 09 19:30:29 arch-tower kernel:  ? do_syscall_64+0x6b/0x90
Nov 09 19:30:29 arch-tower kernel:  ? do_syscall_64+0x6b/0x90
Nov 09 19:30:29 arch-tower kernel:  ? syscall_exit_to_user_mode+0x1b/0x40
Nov 09 19:30:29 arch-tower kernel:  ? do_syscall_64+0x6b/0x90
Nov 09 19:30:29 arch-tower kernel:  entry_SYSCALL_64_after_hwframe+0x63/0xcd
Nov 09 19:30:29 arch-tower kernel: RIP: 0033:0x7f494b515c0f
Nov 09 19:30:29 arch-tower kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60
c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00
00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
Nov 09 19:30:29 arch-tower kernel: RSP: 002b:00007f494b1fe9c0 EFLAGS: 00000246
ORIG_RAX: 0000000000000010
Nov 09 19:30:29 arch-tower kernel: RAX: ffffffffffffffda RBX: 00007f494b1feb38
RCX: 00007f494b515c0f
Nov 09 19:30:29 arch-tower kernel: RDX: 00007f494b1fea80 RSI: 00000000c0186444
RDI: 0000000000000018
Nov 09 19:30:29 arch-tower kernel: RBP: 00007f494b1fea80 R08: 00007f494b1feb80
R09: 00007f494b1fea60
Nov 09 19:30:29 arch-tower kernel: R10: 00007f491f8cbd00 R11: 0000000000000246
R12: 00000000c0186444
Nov 09 19:30:29 arch-tower kernel: R13: 0000000000000018 R14: 00007f494b1feb38
R15: 0000000000000002
Nov 09 19:30:29 arch-tower kernel:  </TASK>
Nov 09 19:30:29 arch-tower kernel: Modules linked in: iptable_nat xt_MASQUERADE
nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter uas
usb_storage ccm dm_crypt cbc encrypted_keys trusted asn1_encoder tee tpm
nls_iso8859_1 vfat fat intel_rapl_msr intel_rapl_common edac_mce_amd eeepc_wmi
snd_hda_codec_realtek asus_wmi kvm_amd snd_hda_codec_generic sparse_keymap
platform_profile ledtrig_audio snd_hda_codec_hdmi video wmi_bmof kvm
snd_hda_intel snd_intel_dspcfg irqbypass snd_intel_sdw_acpi mt7601u
crct10dif_pclmul snd_hda_codec crc32_pclmul polyval_clmulni snd_hda_core
polyval_generic mac80211 snd_hwdep gf128mul r8169 ghash_clmulni_intel snd_pcm
realtek aesni_intel mdio_devres snd_timer crypto_simd ccp cryptd mousedev
libarc4 sp5100_tco snd joydev rapl libphy pcspkr k10temp soundcore i2c_piix4
rng_core gpio_amdpt mac_hid cfg80211 gpio_generic wmi acpi_cpufreq rfkill
dm_multipath dm_mod crypto_user fuse bpf_preload ip_tables x_tables ext4
crc32c_generic crc16 mbcache jbd2 usbhid sr_mod
Nov 09 19:30:29 arch-tower kernel:  crc32c_intel xhci_pci cdrom
xhci_pci_renesas amdgpu drm_ttm_helper ttm gpu_sched drm_buddy
drm_display_helper cec
Nov 09 19:30:29 arch-tower kernel: CR2: 0000000000000020
Nov 09 19:30:29 arch-tower kernel: ---[ end trace 0000000000000000 ]---
Nov 09 19:30:29 arch-tower kernel: RIP: 0010:amdgpu_sa_bo_free+0x57/0x150
[amdgpu]
Nov 09 19:30:29 arch-tower kernel: Code: 00 00 4c 8b 60 20 48 89 d5 4c 89 e7 e8
22 fd 4b c3 48 85 ed 0f 84 a4 00 00 00 48 8b 45 30 a8 01 0f 85 98 00 00 00 48
8b 45 08 <48> 8b 40 20 48 85 c0 74 0c 48 89 ef e8 48 1e 6c c3 84 c0 75 77 4c
Nov 09 19:30:29 arch-tower kernel: RSP: 0018:ffffb2c98cedfa70 EFLAGS: 00010246
Nov 09 19:30:29 arch-tower kernel: RAX: 0000000000000000 RBX: ffff948784158e30
RCX: 0000000080800078
Nov 09 19:30:29 arch-tower kernel: RDX: 0000000000000001 RSI: ffff948784158e30
RDI: ffff94878b1e62f0
Nov 09 19:30:29 arch-tower kernel: RBP: ffff948784158d98 R08: 0000000000000000
R09: 0000000080800078
Nov 09 19:30:29 arch-tower kernel: R10: 0000000000000008 R11: 0000000010000000
R12: ffff94878b1e62f0
Nov 09 19:30:29 arch-tower kernel: R13: ffff94878b1e9628 R14: 00000000fffffff4
R15: 0000000000000001
Nov 09 19:30:29 arch-tower kernel: FS:  00007f494b1ff6c0(0000)
GS:ffff9488969c0000(0000) knlGS:0000000000000000
Nov 09 19:30:29 arch-tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Nov 09 19:30:29 arch-tower kernel: CR2: 0000000000000020 CR3: 0000000154e50000
CR4: 00000000003506e0
Nov 09 19:30:29 arch-tower kernel: note: firefox:cs0[220976] exited with
preempt_count 1
Nov 09 19:30:54 arch-tower kernel: watchdog: BUG: soft lockup - CPU#8 stuck for
27s! [Renderer:202012]
Nov 09 19:30:54 arch-tower kernel: Modules linked in: iptable_nat xt_MASQUERADE
nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter uas
usb_storage ccm dm_crypt cbc encrypted_keys trusted asn1_encoder tee tpm
nls_iso8859_1 vfat fat intel_rapl_msr intel_rapl_common edac_mce_amd eeepc_wmi
snd_hda_codec_realtek asus_wmi kvm_amd snd_hda_codec_generic sparse_keymap
platform_profile ledtrig_audio snd_hda_codec_hdmi video wmi_bmof kvm
snd_hda_intel snd_intel_dspcfg irqbypass snd_intel_sdw_acpi mt7601u
crct10dif_pclmul snd_hda_codec crc32_pclmul polyval_clmulni snd_hda_core
polyval_generic mac80211 snd_hwdep gf128mul r8169 ghash_clmulni_intel snd_pcm
realtek aesni_intel mdio_devres snd_timer crypto_simd ccp cryptd mousedev
libarc4 sp5100_tco snd joydev rapl libphy pcspkr k10temp soundcore i2c_piix4
rng_core gpio_amdpt mac_hid cfg80211 gpio_generic wmi acpi_cpufreq rfkill
dm_multipath dm_mod crypto_user fuse bpf_preload ip_tables x_tables ext4
crc32c_generic crc16 mbcache jbd2 usbhid sr_mod
Nov 09 19:30:54 arch-tower kernel:  crc32c_intel xhci_pci cdrom
xhci_pci_renesas amdgpu drm_ttm_helper ttm gpu_sched drm_buddy
drm_display_helper cec
Nov 09 19:30:54 arch-tower kernel: CPU: 8 PID: 202012 Comm: Renderer Tainted: G
     D            6.0.6-arch1-1 #1 a46cc4b882cfc11c3bbb09d6a0fab3dcad53b5c2
Nov 09 19:30:54 arch-tower kernel: Hardware name: System manufacturer System
Product Name/PRIME A320M-K, BIOS 5207 08/30/2019
Nov 09 19:30:54 arch-tower kernel: RIP:
0010:native_queued_spin_lock_slowpath+0x21f/0x2d0
Nov 09 19:30:54 arch-tower kernel: Code: 41 8d 4d 01 41 c1 e4 10 c1 e1 12 44 09
e1 89 c8 c1 e8 10 66 87 43 02 89 c2 c1 e2 10 81 fa ff ff 00 00 77 3b 31 d2 eb
02 f3 90 <8b> 03 66 85 c0 75 f7 89 c6 66 31 f6 39 f1 0f 84 87 00 00 00 c6 03
Nov 09 19:30:54 arch-tower kernel: RSP: 0000:ffffb2c9a003f7e0 EFLAGS: 00000202
Nov 09 19:30:54 arch-tower kernel: RAX: 0000000000240101 RBX: ffff94878b1e62f0
RCX: 0000000000240000
Nov 09 19:30:54 arch-tower kernel: RDX: 0000000000000000 RSI: 0000000000000101
RDI: ffff94878b1e62f0
Nov 09 19:30:54 arch-tower kernel: RBP: ffff948896a33b80 R08: ffffb2c9a003f7c8
R09: 0000000000000040
Nov 09 19:30:54 arch-tower kernel: R10: 0000000000200000 R11: ffffb2c9a003fb80
R12: 0000000000000000
Nov 09 19:30:54 arch-tower kernel: R13: 0000000000000008 R14: ffff94878b1e6518
R15: ffff94878b1e62f0
Nov 09 19:30:54 arch-tower kernel: FS:  00007efd265fc6c0(0000)
GS:ffff948896a00000(0000) knlGS:0000000000000000
Nov 09 19:30:54 arch-tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Nov 09 19:30:54 arch-tower kernel: CR2: 00007efd04b68400 CR3: 0000000138d60000
CR4: 00000000003506e0
Nov 09 19:30:54 arch-tower kernel: Call Trace:
Nov 09 19:30:54 arch-tower kernel:  <TASK>
Nov 09 19:30:54 arch-tower kernel:  _raw_spin_lock+0x29/0x30
Nov 09 19:30:54 arch-tower kernel:  amdgpu_sa_bo_new+0xd5/0x560 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  ?
update_sd_lb_stats.constprop.0+0x10f/0x910
Nov 09 19:30:54 arch-tower kernel:  ? select_task_rq_fair+0x161/0x1a60
Nov 09 19:30:54 arch-tower kernel:  amdgpu_ib_get+0x43/0x90 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  amdgpu_job_alloc_with_ib+0x5b/0x80 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  amdgpu_copy_buffer+0xc2/0x230 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  amdgpu_ttm_copy_mem_to_mem+0x396/0x770
[amdgpu 3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  amdgpu_bo_move+0x151/0x6d0 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  ttm_bo_handle_move_mem+0xa8/0x170 [ttm
3393e9853c224a250513194a7cd094617e0e2b51]
Nov 09 19:30:54 arch-tower kernel:  ttm_bo_validate+0x10c/0x160 [ttm
3393e9853c224a250513194a7cd094617e0e2b51]
Nov 09 19:30:54 arch-tower kernel:  amdgpu_bo_fault_reserve_notify+0xbf/0x150
[amdgpu 3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  amdgpu_gem_fault+0x89/0x100 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  __do_fault+0x36/0x110
Nov 09 19:30:54 arch-tower kernel:  do_fault+0x2a2/0x420
Nov 09 19:30:54 arch-tower kernel:  __handle_mm_fault+0x668/0xf70
Nov 09 19:30:54 arch-tower kernel:  handle_mm_fault+0xb2/0x290
Nov 09 19:30:54 arch-tower kernel:  do_user_addr_fault+0x1be/0x6a0
Nov 09 19:30:54 arch-tower kernel:  exc_page_fault+0x74/0x170
Nov 09 19:30:54 arch-tower kernel:  asm_exc_page_fault+0x26/0x30
Nov 09 19:30:54 arch-tower kernel: RIP: 0033:0x7efd4a16c7d5
Nov 09 19:30:54 arch-tower kernel: Code: fc ff 0f 1f 00 f3 0f 1e fa 48 89 f8 48
83 fa 20 0f 82 af 00 00 00 c5 fe 6f 06 48 83 fa 40 0f 87 3e 01 00 00 c5 fe 6f
4c 16 e0 <c5> fe 7f 07 c5 fe 7f 4c 17 e0 c5 f8 77 c3 66 66 2e 0f 1f 84 00 00
Nov 09 19:30:54 arch-tower kernel: RSP: 002b:00007efd265f9698 EFLAGS: 00010246
Nov 09 19:30:54 arch-tower kernel: RAX: 00007efd04b68400 RBX: 00007efd21d36908
RCX: 00000000ffffffc0
Nov 09 19:30:54 arch-tower kernel: RDX: 0000000000000040 RSI: 00007efd3ab85c00
RDI: 00007efd04b68400
Nov 09 19:30:54 arch-tower kernel: RBP: 00007efd21d35000 R08: 0000000000040000
R09: 00007efd21d36918
Nov 09 19:30:54 arch-tower kernel: R10: 00007efd16c47c00 R11: 00007efd2396f000
R12: 0000000000000040
Nov 09 19:30:54 arch-tower kernel: R13: 0000000000000400 R14: 0000000000000000
R15: 00007efd21d35000
Nov 09 19:30:54 arch-tower kernel:  </TASK>
Nov 09 19:30:54 arch-tower kernel: watchdog: BUG: soft lockup - CPU#11 stuck
for 27s! [MediaPD~oder #1:283921]
Nov 09 19:30:54 arch-tower kernel: Modules linked in: iptable_nat xt_MASQUERADE
nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter uas
usb_storage ccm dm_crypt cbc encrypted_keys trusted asn1_encoder tee tpm
nls_iso8859_1 vfat fat intel_rapl_msr intel_rapl_common edac_mce_amd eeepc_wmi
snd_hda_codec_realtek asus_wmi kvm_amd snd_hda_codec_generic sparse_keymap
platform_profile ledtrig_audio snd_hda_codec_hdmi video wmi_bmof kvm
snd_hda_intel snd_intel_dspcfg irqbypass snd_intel_sdw_acpi mt7601u
crct10dif_pclmul snd_hda_codec crc32_pclmul polyval_clmulni snd_hda_core
polyval_generic mac80211 snd_hwdep gf128mul r8169 ghash_clmulni_intel snd_pcm
realtek aesni_intel mdio_devres snd_timer crypto_simd ccp cryptd mousedev
libarc4 sp5100_tco snd joydev rapl libphy pcspkr k10temp soundcore i2c_piix4
rng_core gpio_amdpt mac_hid cfg80211 gpio_generic wmi acpi_cpufreq rfkill
dm_multipath dm_mod crypto_user fuse bpf_preload ip_tables x_tables ext4
crc32c_generic crc16 mbcache jbd2 usbhid sr_mod
Nov 09 19:30:54 arch-tower kernel:  crc32c_intel xhci_pci cdrom
xhci_pci_renesas amdgpu drm_ttm_helper ttm gpu_sched drm_buddy
drm_display_helper cec
Nov 09 19:30:54 arch-tower kernel: CPU: 11 PID: 283921 Comm: MediaPD~oder #1
Tainted: G      D      L     6.0.6-arch1-1 #1
a46cc4b882cfc11c3bbb09d6a0fab3dcad53b5c2
Nov 09 19:30:54 arch-tower kernel: Hardware name: System manufacturer System
Product Name/PRIME A320M-K, BIOS 5207 08/30/2019
Nov 09 19:30:54 arch-tower kernel: RIP:
0010:native_queued_spin_lock_slowpath+0x6d/0x2d0
Nov 09 19:30:54 arch-tower kernel: Code: 00 77 7d f0 0f ba 2b 08 0f 92 c2 8b 03
0f b6 d2 c1 e2 08 30 e4 09 d0 3d ff 00 00 00 77 59 85 c0 74 0e 8b 03 84 c0 74
08 f3 90 <8b> 03 84 c0 75 f8 b8 01 00 00 00 66 89 03 65 48 ff 05 c5 24 11 7d
Nov 09 19:30:54 arch-tower kernel: RSP: 0018:ffffb2c9832c37c8 EFLAGS: 00000202
Nov 09 19:30:54 arch-tower kernel: RAX: 0000000000240101 RBX: ffff94878b1e62f0
RCX: 0000000000000001
Nov 09 19:30:54 arch-tower kernel: RDX: 0000000000000000 RSI: 0000000000000001
RDI: ffff94878b1e62f0
Nov 09 19:30:54 arch-tower kernel: RBP: ffff94879ac18a30 R08: ffffb2c9832c37b0
R09: 0000000000000040
Nov 09 19:30:54 arch-tower kernel: R10: 0000000000000000 R11: ffff9488238cb9d8
R12: ffffb2c9832c38d8
Nov 09 19:30:54 arch-tower kernel: R13: 0000000000000100 R14: ffff94878b1e6518
R15: ffff94878b1e62f0
Nov 09 19:30:54 arch-tower kernel: FS:  00007f49237fa6c0(0000)
GS:ffff948896ac0000(0000) knlGS:0000000000000000
Nov 09 19:30:54 arch-tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Nov 09 19:30:54 arch-tower kernel: CR2: 00007efd142bbfe0 CR3: 0000000154e50000
CR4: 00000000003506e0
Nov 09 19:30:54 arch-tower kernel: Call Trace:
Nov 09 19:30:54 arch-tower kernel:  <TASK>
Nov 09 19:30:54 arch-tower kernel:  _raw_spin_lock+0x29/0x30
Nov 09 19:30:54 arch-tower kernel:  amdgpu_sa_bo_new+0xd5/0x560 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  ? __switch_to_asm+0x3e/0x60
Nov 09 19:30:54 arch-tower kernel:  ? finish_task_switch.isra.0+0x90/0x2d0
Nov 09 19:30:54 arch-tower kernel:  ? __schedule+0x34b/0x11c0
Nov 09 19:30:54 arch-tower kernel:  ?
update_sd_lb_stats.constprop.0+0x10f/0x910
Nov 09 19:30:54 arch-tower kernel:  amdgpu_ib_get+0x43/0x90 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  amdgpu_job_alloc_with_ib+0x5b/0x80 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  ? kmem_cache_alloc_trace+0x15d/0x320
Nov 09 19:30:54 arch-tower kernel:  amdgpu_vm_sdma_prepare+0x2b/0x70 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  amdgpu_vm_update_range+0x1c0/0x770 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  amdgpu_vm_bo_update+0x300/0x5a0 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  amdgpu_gem_va_ioctl+0x54f/0x590 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  ? amdgpu_gem_va_map_flags+0x80/0x80 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  drm_ioctl_kernel+0xcd/0x170
Nov 09 19:30:54 arch-tower kernel:  drm_ioctl+0x231/0x410
Nov 09 19:30:54 arch-tower kernel:  ? amdgpu_gem_va_map_flags+0x80/0x80 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  amdgpu_drm_ioctl+0x4e/0x90 [amdgpu
3b0071ba2e7e576c543138f03ed9b8249042cca2]
Nov 09 19:30:54 arch-tower kernel:  __x64_sys_ioctl+0x94/0xd0
Nov 09 19:30:54 arch-tower kernel:  do_syscall_64+0x5f/0x90
Nov 09 19:30:54 arch-tower kernel:  ? syscall_exit_to_user_mode+0x1b/0x40
Nov 09 19:30:54 arch-tower kernel:  ? do_syscall_64+0x6b/0x90
Nov 09 19:30:54 arch-tower kernel:  ? exc_page_fault+0x74/0x170
Nov 09 19:30:54 arch-tower kernel:  entry_SYSCALL_64_after_hwframe+0x63/0xcd
Nov 09 19:30:54 arch-tower kernel: RIP: 0033:0x7f494b515c0f
Nov 09 19:30:54 arch-tower kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60
c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00
00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
Nov 09 19:30:54 arch-tower kernel: RSP: 002b:00007f49237f7f70 EFLAGS: 00000246
ORIG_RAX: 0000000000000010
Nov 09 19:30:54 arch-tower kernel: RAX: ffffffffffffffda RBX: 00007f491e7f23c0
RCX: 00007f494b515c0f
Nov 09 19:30:54 arch-tower kernel: RDX: 00007f49237f8010 RSI: 00000000c0286448
RDI: 0000000000000018
Nov 09 19:30:54 arch-tower kernel: RBP: 00007f49237f8010 R08: 000000010c400000
R09: 000000000000000e
Nov 09 19:30:54 arch-tower kernel: R10: 0000000000000000 R11: 0000000000000246
R12: 00000000c0286448
Nov 09 19:30:54 arch-tower kernel: R13: 0000000000000018 R14: 0000000000330000
R15: 0000000000000005
Nov 09 19:30:54 arch-tower kernel:  </TASK>
-- Boot c2c099679ad94b7190cf2380e047c12b --
Nov 09 19:31:40 arch-tower kernel: Linux version 6.0.6-arch1-1
(linux@archlinux) (gcc (GCC) 12.2.0, GNU ld (GNU Binutils) 2.39.0) #1 SMP
PREEMPT_DYNAMIC Sat, 29 Oct 2022 14:08:39 +0000
Nov 09 19:31:40 arch-tower kernel: Command line: initrd=\amd-ucode.img
initrd=\initramfs-linux.img root=PARTUUID=48a0f342-0f7d-7a4e-a0fd-7ccf9a7950fd
resume=PARTUUID=fd3f4977-4b82-6945-8257-e60d5214141b rw acpi=on
radeon.si_support=0 radeon.cik_support=0 amdgpu.cik_support=1
amdgpu.si_support=1
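
As a cross-check on the traces above: the RSI value in each user-space register
dump of an ioctl syscall is the request number, and it can be decoded by hand.
The following is a minimal Python sketch, assuming the usual Linux x86-64 ioctl
bit layout (8-bit number, 8-bit type, 14-bit size, 2-bit direction) and
DRM_COMMAND_BASE = 0x40 for driver-specific DRM commands. decode_ioctl is a
hypothetical helper written for this note, not part of any tool used in this
report.

```python
DRM_COMMAND_BASE = 0x40  # assumption: start of driver-specific DRM ioctl numbers


def decode_ioctl(request):
    """Split a Linux ioctl request number into its bit fields."""
    return {
        "dir": (request >> 30) & 0x3,      # 1 = write, 2 = read, 3 = both
        "size": (request >> 16) & 0x3FFF,  # size of the argument struct in bytes
        "type": chr((request >> 8) & 0xFF),
        "nr": request & 0xFF,
    }


if __name__ == "__main__":
    # RSI values copied from the two ioctl register dumps in the log above.
    for request in (0xC0186444, 0xC0286448):
        fields = decode_ioctl(request)
        driver_nr = fields["nr"] - DRM_COMMAND_BASE
        print(hex(request), fields, "driver command", hex(driver_nr))
```

Under those assumptions 0xc0186444 decodes to type 'd', driver command 0x04
with a 24-byte argument, and 0xc0286448 to driver command 0x08 with a 40-byte
argument, which appears consistent with the amdgpu_cs_ioctl and
amdgpu_gem_va_ioctl frames in the call traces.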

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 204181] NULL pointer dereference regression in amdgpu
  2019-07-15 10:11 [Bug 204181] New: NULL pointer dereference regression in amdgpu bugzilla-daemon
                   ` (67 preceding siblings ...)
  2022-11-10  4:02 ` bugzilla-daemon
@ 2022-12-23  9:17 ` bugzilla-daemon
  68 siblings, 0 replies; 70+ messages in thread
From: bugzilla-daemon @ 2022-12-23  9:17 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=204181

doesnotcompete@posteo.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |doesnotcompete@posteo.de

--- Comment #69 from doesnotcompete@posteo.de ---
Hello, I'm encountering a similar regression on the 6.1.1 kernel (not present
in this form on 6.0.12, although the system occasionally freezes there as
well). When connecting a Thinkpad USB-C Dock with two monitors to my Ryzen
3500U Thinkpad, the system freezes with a NULL pointer dereference in amdgpu.

Kernel: Linux version 6.1.1-arch1-1 (linux@archlinux) (gcc (GCC) 12.2.0, GNU ld
(GNU Binutils) 2.39.0) #1 SMP PREEMPT_DYNAMIC Wed, 21 Dec 2022 22:27:55 +0000

Graphics controller: Advanced Micro Devices, Inc. [AMD/ATI] Picasso/Raven 2
[Radeon Vega Series / Radeon Vega Mobile Series] (rev d2)

Output from journalctl:
Dec 23 09:42:29 kevin-t495 kernel: usb 2-1.3.3: New USB device found,
idVendor=17ef, idProduct=a395, bcdDevice=60.70
Dec 23 09:42:29 kevin-t495 kernel: usb 2-1.3.3: New USB device strings: Mfr=10,
Product=11, SerialNumber=0
Dec 23 09:42:29 kevin-t495 kernel: usb 2-1.3.3: Product: USB2.0 Hub
Dec 23 09:42:29 kevin-t495 kernel: usb 2-1.3.3: Manufacturer: Lenovo
Dec 23 09:42:29 kevin-t495 kernel: [drm] Downstream port present 1, type 2
Dec 23 09:42:29 kevin-t495 kernel: BUG: kernel NULL pointer dereference,
address: 0000000000000008
Dec 23 09:42:29 kevin-t495 kernel: #PF: supervisor read access in kernel mode
Dec 23 09:42:29 kevin-t495 kernel: #PF: error_code(0x0000) - not-present page
Dec 23 09:42:29 kevin-t495 kernel: PGD 0 P4D 0 
Dec 23 09:42:29 kevin-t495 kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Dec 23 09:42:29 kevin-t495 kernel: CPU: 4 PID: 998 Comm: sway Not tainted
6.1.1-arch1-1 #1 9bd09188b430be630e611f984454e4f3c489be77
Dec 23 09:42:29 kevin-t495 kernel: Hardware name: LENOVO 20NKS01Y00/20NKS01Y00,
BIOS R12ET61W(1.31 ) 07/28/2022
Dec 23 09:42:29 kevin-t495 kernel: RIP:
0010:drm_dp_atomic_find_time_slots+0x61/0x2a0 [drm_display_helper]
Dec 23 09:42:29 kevin-t495 kernel: Code: 00 00 00 48 8b 85 60 05 00 00 48 63 80
88 00 00 00 3b 43 28 0f 8d ce 01 00 00 48 8b 53 30 48 8d 04 80 48 8d 04 c2 48
8b 40 18 <48> 8b 40 08 4d 8d 65 38 8b 88 90 00 00 00 b8 01 00 00 00 d3 e0 41
Dec 23 09:42:29 kevin-t495 kernel: RSP: 0018:ffffa526c0eef780 EFLAGS: 00010293
Dec 23 09:42:29 kevin-t495 kernel: RAX: 0000000000000000 RBX: ffff9555ef407200
RCX: 0000000000000214
Dec 23 09:42:29 kevin-t495 kernel: RDX: ffff9555c4124800 RSI: ffff9555429ba540
RDI: ffff9555ef407200
Dec 23 09:42:29 kevin-t495 kernel: RBP: ffff9555cfc76000 R08: 0000000000000001
R09: ffff9555c4242050
Dec 23 09:42:29 kevin-t495 kernel: R10: ffffa526c0eef8a0 R11: 000000004cb505a0
R12: 026d60dce16e8423
Dec 23 09:42:29 kevin-t495 kernel: R13: ffff95554cb505a0 R14: ffff9555429ba540
R15: 0000000000000214
Dec 23 09:42:29 kevin-t495 kernel: FS:  00007fb56378b980(0000)
GS:ffff9557f0b00000(0000) knlGS:0000000000000000
Dec 23 09:42:29 kevin-t495 kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Dec 23 09:42:29 kevin-t495 kernel: CR2: 0000000000000008 CR3: 000000010ae2a000
CR4: 00000000003506e0
Dec 23 09:42:29 kevin-t495 kernel: Call Trace:
Dec 23 09:42:29 kevin-t495 kernel:  <TASK>
Dec 23 09:42:29 kevin-t495 kernel: 
compute_mst_dsc_configs_for_link+0x31d/0x9d0 [amdgpu
895e2b3772442c7d04dbf61a65c8a3690bb074b6]
Dec 23 09:42:29 kevin-t495 kernel:  ?
cm_helper_translate_curve_to_degamma_hw_format+0x5f0/0x5f0 [amdgpu
895e2b3772442c7d04dbf61a65c8a3690bb074b6]
Dec 23 09:42:29 kevin-t495 kernel:  ? fill_plane_buffer_attributes+0x355/0x530
[amdgpu 895e2b3772442c7d04dbf61a65c8a3690bb074b6]
Dec 23 09:42:29 kevin-t495 kernel: 
compute_mst_dsc_configs_for_state+0x1e1/0x250 [amdgpu
895e2b3772442c7d04dbf61a65c8a3690bb074b6]
Dec 23 09:42:29 kevin-t495 kernel:  amdgpu_dm_atomic_check+0xf81/0x1230 [amdgpu
895e2b3772442c7d04dbf61a65c8a3690bb074b6]
Dec 23 09:42:29 kevin-t495 kernel:  drm_atomic_check_only+0x537/0xba0
Dec 23 09:42:29 kevin-t495 kernel:  drm_mode_atomic_ioctl+0x750/0xbb0
Dec 23 09:42:29 kevin-t495 kernel:  ? drm_property_add_enum+0x180/0x180
Dec 23 09:42:29 kevin-t495 kernel:  ? idr_alloc+0x3a/0x70
Dec 23 09:42:29 kevin-t495 kernel:  ? drm_atomic_set_property+0xbc0/0xbc0
Dec 23 09:42:29 kevin-t495 kernel:  drm_ioctl_kernel+0xcd/0x170
Dec 23 09:42:29 kevin-t495 kernel:  drm_ioctl+0x1eb/0x450
Dec 23 09:42:29 kevin-t495 kernel:  ? drm_atomic_set_property+0xbc0/0xbc0
Dec 23 09:42:29 kevin-t495 kernel:  amdgpu_drm_ioctl+0x4e/0x90 [amdgpu
895e2b3772442c7d04dbf61a65c8a3690bb074b6]
Dec 23 09:42:29 kevin-t495 kernel:  __x64_sys_ioctl+0x94/0xd0
Dec 23 09:42:29 kevin-t495 kernel:  do_syscall_64+0x5f/0x90
Dec 23 09:42:29 kevin-t495 kernel:  ? syscall_exit_to_user_mode+0x1b/0x40
Dec 23 09:42:29 kevin-t495 kernel:  ? do_syscall_64+0x6b/0x90
Dec 23 09:42:29 kevin-t495 kernel:  ? do_syscall_64+0x6b/0x90
Dec 23 09:42:29 kevin-t495 kernel:  ? syscall_exit_to_user_mode+0x1b/0x40
Dec 23 09:42:29 kevin-t495 kernel:  ? do_syscall_64+0x6b/0x90
Dec 23 09:42:29 kevin-t495 kernel:  ? do_syscall_64+0x6b/0x90
Dec 23 09:42:29 kevin-t495 kernel:  entry_SYSCALL_64_after_hwframe+0x63/0xcd
Dec 23 09:42:29 kevin-t495 kernel: RIP: 0033:0x7fb5645dec0f
Dec 23 09:42:29 kevin-t495 kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60
c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00
00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
Dec 23 09:42:29 kevin-t495 kernel: RSP: 002b:00007ffd3850b740 EFLAGS: 00000246
ORIG_RAX: 0000000000000010
Dec 23 09:42:29 kevin-t495 kernel: RAX: ffffffffffffffda RBX: 000055c223c95960
RCX: 00007fb5645dec0f
Dec 23 09:42:29 kevin-t495 kernel: RDX: 00007ffd3850b7e0 RSI: 00000000c03864bc
RDI: 000000000000000d
Dec 23 09:42:29 kevin-t495 kernel: RBP: 00007ffd3850b7e0 R08: 0000000000000003
R09: 0000000000000003
Dec 23 09:42:29 kevin-t495 kernel: R10: 000055c222b65010 R11: 0000000000000246
R12: 00000000c03864bc
Dec 23 09:42:29 kevin-t495 kernel: R13: 000000000000000d R14: 000055c223c6cba0
R15: 000055c223bfcab0
Dec 23 09:42:29 kevin-t495 kernel:  </TASK>
Dec 23 09:42:29 kevin-t495 kernel: Modules linked in: cdc_ether usbnet r8152
mii rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device nft_objref
nf_conntrack_netbios_ns nf_conntrack_broadcast ccm cmac algif_hash
algif_skcipher af_alg nft_fib_inet nft_>
Dec 23 09:42:29 kevin-t495 kernel:  snd_pci_acp6x iwlwifi snd_pci_acp5x
snd_hda_core rapl snd_rn_pci_acp3x vfat think_lmi realtek ecdh_generic
snd_hwdep fat snd_acp_config ucsi_acpi typec_ucsi pcspkr mdio_devres psmouse
snd_soc_acpi firmware_attributes_c>
Dec 23 09:42:29 kevin-t495 kernel: CR2: 0000000000000008
Dec 23 09:42:29 kevin-t495 kernel: ---[ end trace 0000000000000000 ]---
Dec 23 09:42:29 kevin-t495 kernel: RIP:
0010:drm_dp_atomic_find_time_slots+0x61/0x2a0 [drm_display_helper]
Dec 23 09:42:29 kevin-t495 kernel: Code: 00 00 00 48 8b 85 60 05 00 00 48 63 80
88 00 00 00 3b 43 28 0f 8d ce 01 00 00 48 8b 53 30 48 8d 04 80 48 8d 04 c2 48
8b 40 18 <48> 8b 40 08 4d 8d 65 38 8b 88 90 00 00 00 b8 01 00 00 00 d3 e0 41
Dec 23 09:42:29 kevin-t495 kernel: RSP: 0018:ffffa526c0eef780 EFLAGS: 00010293
Dec 23 09:42:29 kevin-t495 kernel: RAX: 0000000000000000 RBX: ffff9555ef407200
RCX: 0000000000000214
Dec 23 09:42:29 kevin-t495 kernel: RDX: ffff9555c4124800 RSI: ffff9555429ba540
RDI: ffff9555ef407200
Dec 23 09:42:29 kevin-t495 kernel: RBP: ffff9555cfc76000 R08: 0000000000000001
R09: ffff9555c4242050
Dec 23 09:42:29 kevin-t495 kernel: R10: ffffa526c0eef8a0 R11: 000000004cb505a0
R12: 026d60dce16e8423
Dec 23 09:42:29 kevin-t495 kernel: R13: ffff95554cb505a0 R14: ffff9555429ba540
R15: 0000000000000214
Dec 23 09:42:29 kevin-t495 kernel: FS:  00007fb56378b980(0000)
GS:ffff9557f0b00000(0000) knlGS:0000000000000000
Dec 23 09:42:29 kevin-t495 kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Dec 23 09:42:29 kevin-t495 kernel: CR2: 0000000000000008 CR3: 000000010ae2a000
CR4: 00000000003506e0
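
The faulting address in this oops can be related to the Code: line, where the
byte in angle brackets marks the start of the instruction at RIP. Below is a
minimal Python sketch, assuming the instruction has the simple x86-64 form
REX.W + 8B /r with an 8-bit displacement and no SIB byte (other encodings are
not handled). The helper names are hypothetical.

```python
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def faulting_bytes(code_text):
    """Return the bytes of a 'Code:' dump starting at the <xx> marker."""
    tokens = code_text.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:]]


def decode_mov_disp8(data):
    """Decode 'mov disp8(%base),%dest' (48 8b /r, mod=01); None otherwise."""
    if len(data) < 4 or data[0] != 0x48 or data[1] != 0x8B:
        return None
    modrm = data[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:  # rm == 4 would mean a SIB byte follows
        return None
    disp = data[3] - 256 if data[3] >= 0x80 else data[3]  # sign-extend disp8
    return REGS[reg], REGS[rm], disp


if __name__ == "__main__":
    # Tail of the Code: line from the oops above.
    tail = "8b 40 18 <48> 8b 40 08 4d 8d 65 38"
    dest, base, disp = decode_mov_disp8(faulting_bytes(tail))
    print(f"load into {dest} from {base} + {disp:#x}")
```

For the Code: line above this yields a load from rax + 0x8. With RAX shown as 0
in the register dump, that matches CR2 = 0x8, i.e. a read at a small field
offset from a NULL base pointer.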


