* Kernel crash/Null pointer dereference on vblank
@ 2017-11-22  7:06 Martin Babutzka
       [not found] ` <1228839336.50367.1511334398694-NM1PAhYDU/Oo9dU1Uvar1Q@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Martin Babutzka @ 2017-11-22  7:06 UTC
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

[-- Attachment #1: Type: text/html, Size: 1730 bytes --]

[-- Attachment #2: crash_vblank.txt --]
[-- Type: text/plain, Size: 9468 bytes --]

bool dce110_vblank_set(
                struct irq_service *irq_service,
                const struct irq_source_info *info,
                bool enable)
{
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
        struct dc_context *dc_ctx = irq_service->ctx;
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
        struct dc *core_dc = irq_service->ctx->dc;
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
        enum dc_irq_source dal_irq_src = dc_interrupt_to_irq_source(
                        irq_service->ctx->dc,
                        info->src_id,
                        info->ext_id);
        uint8_t pipe_offset = dal_irq_src - IRQ_TYPE_VBLANK;
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);

        struct timing_generator *tg =
                        core_dc->current_state->res_ctx.pipe_ctx[pipe_offset].stream_res.tg;
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);

        if (enable) {
                if (!tg->funcs->arm_vert_intr(tg, 2)) {
                        DC_ERROR("Failed to get VBLANK!\n");
                        return false;
                }
        }
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);

        dal_irq_service_set_generic(irq_service, info, enable);
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
        return true;

}


"normal" vblank during boot:
Nov 19 22:33:10 Main-PC kernel: [   17.605100] DEBUG: Passed dce110_vblank_set 208 
Nov 19 22:33:10 Main-PC kernel: [   17.605102] DEBUG: Passed dce110_vblank_set 210 
Nov 19 22:33:10 Main-PC kernel: [   17.605103] DEBUG: Passed dce110_vblank_set 212 
Nov 19 22:33:10 Main-PC kernel: [   17.605104] DEBUG: Passed dce110_vblank_set 218 
Nov 19 22:33:10 Main-PC kernel: [   17.605104] DEBUG: Passed dce110_vblank_set 222 
Nov 19 22:33:10 Main-PC kernel: [   17.605108] DEBUG: Passed dce110_vblank_set 230 
Nov 19 22:33:10 Main-PC kernel: [   17.605110] DEBUG: Passed dce110_vblank_set 233 

vblank on screen lock in kernel.log/syslog:
Nov 19 22:34:10 Main-PC kernel: [   78.664890] DEBUG: Passed dce110_vblank_set 208 
Nov 19 22:34:10 Main-PC kernel: [   78.664892] DEBUG: Passed dce110_vblank_set 210 
Nov 19 22:34:10 Main-PC kernel: [   78.664893] DEBUG: Passed dce110_vblank_set 212 
Nov 19 22:34:10 Main-PC kernel: [   78.664894] DEBUG: Passed dce110_vblank_set 218 
Nov 19 22:34:10 Main-PC kernel: [   78.664894] DEBUG: Passed dce110_vblank_set 222 
Nov 19 22:34:10 Main-PC kernel: [   78.664895] DEBUG: Passed dce110_vblank_set 230 
Nov 19 22:34:10 Main-PC kernel: [   78.664896] DEBUG: Passed dce110_vblank_set 233 
Nov 19 22:34:27 Main-PC kernel: [   96.113426] DEBUG: Passed dce110_vblank_set 208 
Nov 19 22:34:27 Main-PC kernel: [   96.113433] DEBUG: Passed dce110_vblank_set 210 
Nov 19 22:34:27 Main-PC kernel: [   96.113435] DEBUG: Passed dce110_vblank_set 212 
Nov 19 22:34:27 Main-PC kernel: [   96.113438] DEBUG: Passed dce110_vblank_set 218 
Nov 19 22:34:27 Main-PC kernel: [   96.113440] DEBUG: Passed dce110_vblank_set 222 
Nov 19 22:34:27 Main-PC kernel: [   96.113448] BUG: unable to handle kernel NULL pointer dereference at           (null)
Nov 19 22:34:27 Main-PC kernel: [   96.113521] IP: dce110_vblank_set+0xe2/0x160 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.113524] PGD 0 P4D 0 
Nov 19 22:34:27 Main-PC kernel: [   96.113531] Oops: 0000 [#1] SMP
Nov 19 22:34:27 Main-PC kernel: [   96.113535] Modules linked in: rfcomm bnep binfmt_misc snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_pcm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_seq_midi pcbc dm_crypt snd_seq_midi_event aesni_intel snd_rawmidi aes_x86_64 crypto_simd glue_helper snd_seq cryptd snd_seq_device snd_timer intel_cstate intel_rapl_perf snd btusb serio_raw joydev input_leds soundcore btrtl hci_uart mei_me shpchp btbcm mei serdev btqca btintel bluetooth ecdh_generic intel_lpss_acpi intel_lpss acpi_als mac_hid kfifo_buf acpi_pad tpm_infineon industrialio parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic uas usb_storage usbhid amdkfd amd_iommu_v2
Nov 19 22:34:27 Main-PC kernel: [   96.113614]  amdgpu chash i2c_algo_bit ttm drm_kms_helper e1000e syscopyarea sysfillrect sysimgblt fb_sys_fops ptp r8169 pps_core drm ahci mii libahci wmi pinctrl_sunrisepoint video i2c_hid pinctrl_intel hid
Nov 19 22:34:27 Main-PC kernel: [   96.113643] CPU: 2 PID: 1462 Comm: xfwm4 Not tainted 4.14.0+ #3
Nov 19 22:34:27 Main-PC kernel: [   96.113645] Hardware name: Gigabyte Technology Co., Ltd. B250-HD3P/B250-HD3P-CF, BIOS F3 12/07/2016
Nov 19 22:34:27 Main-PC kernel: [   96.113649] task: ffff998d53040000 task.stack: ffffa59103150000
Nov 19 22:34:27 Main-PC kernel: [   96.113710] RIP: 0010:dce110_vblank_set+0xe2/0x160 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.113713] RSP: 0018:ffffa59103153b28 EFLAGS: 00010002
Nov 19 22:34:27 Main-PC kernel: [   96.113717] RAX: 0000000000000024 RBX: ffff998d5c3d4300 RCX: 0000000000000006
Nov 19 22:34:27 Main-PC kernel: [   96.113720] RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffff998d6ec8dc90
Nov 19 22:34:27 Main-PC kernel: [   96.113723] RBP: ffffa59103153b58 R08: 0000000000000000 R09: 00000000000003ff
Nov 19 22:34:27 Main-PC kernel: [   96.113726] R10: 00007ffebd2bebc0 R11: ffffffffa354feed R12: ffffffffc052b3e0
Nov 19 22:34:27 Main-PC kernel: [   96.113728] R13: 0000000000000001 R14: ffff998d51695100 R15: 0000000000000000
Nov 19 22:34:27 Main-PC kernel: [   96.113732] FS:  00007f4e2f002a80(0000) GS:ffff998d6ec80000(0000) knlGS:0000000000000000
Nov 19 22:34:27 Main-PC kernel: [   96.113735] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 19 22:34:27 Main-PC kernel: [   96.113738] CR2: 0000000000000000 CR3: 00000004181e5001 CR4: 00000000003606e0
Nov 19 22:34:27 Main-PC kernel: [   96.113741] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 19 22:34:27 Main-PC kernel: [   96.113744] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Nov 19 22:34:27 Main-PC kernel: [   96.113746] Call Trace:
Nov 19 22:34:27 Main-PC kernel: [   96.113807]  dal_irq_service_set+0x49/0x90 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.113863]  dc_interrupt_set+0x24/0x30 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.113933]  amdgpu_dm_set_crtc_irq_state+0x35/0x60 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.113989]  amdgpu_irq_update+0x58/0xa0 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.114041]  amdgpu_irq_get+0x49/0x60 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.114076]  amdgpu_enable_vblank_kms+0x27/0x30 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.114091]  drm_vblank_enable+0x84/0x100 [drm]
Nov 19 22:34:27 Main-PC kernel: [   96.114104]  drm_vblank_get+0x92/0xb0 [drm]
Nov 19 22:34:27 Main-PC kernel: [   96.114116]  drm_wait_vblank_ioctl+0xb4/0x580 [drm]
Nov 19 22:34:27 Main-PC kernel: [   96.114123]  ? unix_stream_recvmsg+0x51/0x70
Nov 19 22:34:27 Main-PC kernel: [   96.114127]  ? __unix_insert_socket+0x40/0x40
Nov 19 22:34:27 Main-PC kernel: [   96.114140]  ? drm_legacy_modeset_ctl_ioctl+0x100/0x100 [drm]
Nov 19 22:34:27 Main-PC kernel: [   96.114152]  drm_ioctl_kernel+0x5d/0xb0 [drm]
Nov 19 22:34:27 Main-PC kernel: [   96.114163]  drm_ioctl+0x31b/0x3d0 [drm]
Nov 19 22:34:27 Main-PC kernel: [   96.114174]  ? drm_legacy_modeset_ctl_ioctl+0x100/0x100 [drm]
Nov 19 22:34:27 Main-PC kernel: [   96.114180]  ? do_iter_write+0xe1/0x190
Nov 19 22:34:27 Main-PC kernel: [   96.114215]  amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.114222]  do_vfs_ioctl+0xa5/0x610
Nov 19 22:34:27 Main-PC kernel: [   96.114227]  ? __sys_recvmsg+0x51/0x90
Nov 19 22:34:27 Main-PC kernel: [   96.114231]  ? __sys_recvmsg+0x51/0x90
Nov 19 22:34:27 Main-PC kernel: [   96.114237]  SyS_ioctl+0x79/0x90
Nov 19 22:34:27 Main-PC kernel: [   96.114243]  entry_SYSCALL_64_fastpath+0x1e/0xa9
Nov 19 22:34:27 Main-PC kernel: [   96.114247] RIP: 0033:0x7f4e2b64dea7
Nov 19 22:34:27 Main-PC kernel: [   96.114250] RSP: 002b:00007ffebd2bec08 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Nov 19 22:34:27 Main-PC kernel: [   96.114254] RAX: ffffffffffffffda RBX: 0000562e1f5938c0 RCX: 00007f4e2b64dea7
Nov 19 22:34:27 Main-PC kernel: [   96.114257] RDX: 00007ffebd2bec80 RSI: 00000000c018643a RDI: 0000000000000006
Nov 19 22:34:27 Main-PC kernel: [   96.114259] RBP: 0000562e1f620ce0 R08: 00000000006001e5 R09: 0000000000000000
Nov 19 22:34:27 Main-PC kernel: [   96.114262] R10: 00007ffebd2bebc0 R11: 0000000000000246 R12: 0000000000000000
Nov 19 22:34:27 Main-PC kernel: [   96.114264] R13: 0000000000000007 R14: 0000000000000007 R15: 0000562e1f5938c0
Nov 19 22:34:27 Main-PC kernel: [   96.114268] Code: 48 89 d0 48 c1 e0 05 48 01 d0 ba de 00 00 00 48 c1 e0 05 49 03 87 30 01 00 00 4c 8b b8 78 02 00 00 e8 c4 c2 04 e2 45 84 ed 74 38 <49> 8b 07 be 02 00 00 00 4c 89 ff ff 90 e0 00 00 00 84 c0 75 23 
Nov 19 22:34:27 Main-PC kernel: [   96.114392] RIP: dce110_vblank_set+0xe2/0x160 [amdgpu] RSP: ffffa59103153b28
Nov 19 22:34:27 Main-PC kernel: [   96.114394] CR2: 0000000000000000
Nov 19 22:34:27 Main-PC kernel: [   96.114399] ---[ end trace 4160248d2f91cb42 ]---


[-- Attachment #3: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: Kernel crash/Null pointer dereference on vblank
       [not found] ` <1228839336.50367.1511334398694-NM1PAhYDU/Oo9dU1Uvar1Q@public.gmane.org>
@ 2017-11-22 15:07   ` Johannes Hirte
  2017-11-22 22:31     ` Johannes Hirte
  0 siblings, 1 reply; 10+ messages in thread
From: Johannes Hirte @ 2017-11-22 15:07 UTC
  To: Martin Babutzka; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2017 Nov 22, Martin Babutzka wrote:
>Dear AMD Developers,
>First of all, congratulations on the DC code submission to the 4.15 kernel.
>Unfortunately, the major regression which I reported on 29.09., 06.10.,
>02.11. and 05.11. still exists. This time I have additional debugging
>information; maybe it helps to fix it.
>
>Summary: I am running Xubuntu 17.10 with the amd-staging-drm-next
>kernel patched to 4.14.0. The latest build I tested includes all
>commits up to now (including the 2017-11-17 19:51:57 (GMT)
>commit 85d09ce5e5039644487e9508d6359f9f4cf64427).
>
>Some vblank operations make the kernel crash and hang the whole
>system. The error is reproducible by enabling the screen lock or
>suspend mode. The system cannot return to a proper state from either of
>these (although I am not 100% sure it is the same error). Debugging is
>easier with the screen lock. Attached you can find the kernel crash log
>and the dce110_vblank_set function instrumented with some kernel prints.
>It looks like the function is called twice and does not work the second
>time. The whole code around dce110_vblank_set also looks interrupt-ish -
>could this be a race condition or timing problem? Objects being cleared
>from memory and then accessed by dce110_vblank_set?
>
>Bug reports on this issue:
>https://github.com/M-Bab/linux-kernel-amdgpu-binaries/issues/37
>https://github.com/M-Bab/linux-kernel-amdgpu-binaries/issues/29
>
>Many regards,
>Martin (M-bab)

I'm having the same problem on Carrizo. The system crashes when resuming
from S3 with DC enabled. With DC disabled, everything works fine. I was
able to catch some debug info with KASAN:

Nov 22 15:52:19 probook kernel: PM: suspend entry (deep)
Nov 22 15:52:19 probook kernel: PM: Syncing filesystems ... done.
Nov 22 15:52:28 probook kernel: Freezing user space processes ... (elapsed 0.002 seconds) done.
Nov 22 15:52:28 probook kernel: OOM killer disabled.
Nov 22 15:52:28 probook kernel: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
Nov 22 15:52:28 probook kernel: Suspending console(s) (use no_console_suspend to debug)
Nov 22 15:52:28 probook kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
Nov 22 15:52:28 probook kernel: sd 0:0:0:0: [sda] Stopping disk
Nov 22 15:52:28 probook kernel: amdgpu 0000:00:01.0: ffff8803e8075500 unpin not necessary
Nov 22 15:52:28 probook kernel: ACPI: Preparing to enter system sleep state S3
Nov 22 15:52:28 probook kernel: ACPI: EC: event blocked
Nov 22 15:52:28 probook kernel: ACPI: EC: EC stopped
Nov 22 15:52:28 probook kernel: PM: Saving platform NVS memory
Nov 22 15:52:28 probook kernel: Disabling non-boot CPUs ...
Nov 22 15:52:28 probook kernel: smpboot: CPU 1 is now offline
Nov 22 15:52:28 probook kernel: smpboot: CPU 2 is now offline
Nov 22 15:52:28 probook kernel: smpboot: CPU 3 is now offline
Nov 22 15:52:28 probook kernel: ACPI: Low-level resume complete
Nov 22 15:52:28 probook kernel: ACPI: EC: EC started
Nov 22 15:52:28 probook kernel: PM: Restoring platform NVS memory
Nov 22 15:52:28 probook kernel: LVT offset 0 assigned for vector 0x400
Nov 22 15:52:28 probook kernel: Enabling non-boot CPUs ...
Nov 22 15:52:28 probook kernel: x86: Booting SMP configuration:
Nov 22 15:52:28 probook kernel: smpboot: Booting Node 0 Processor 1 APIC 0x11
Nov 22 15:52:28 probook kernel:  cache: parent cpu1 should not be sleeping
Nov 22 15:52:28 probook kernel: CPU1 is up
Nov 22 15:52:28 probook kernel: smpboot: Booting Node 0 Processor 2 APIC 0x12
Nov 22 15:52:28 probook kernel:  cache: parent cpu2 should not be sleeping
Nov 22 15:52:28 probook kernel: CPU2 is up
Nov 22 15:52:28 probook kernel: smpboot: Booting Node 0 Processor 3 APIC 0x13
Nov 22 15:52:28 probook kernel:  cache: parent cpu3 should not be sleeping
Nov 22 15:52:28 probook kernel: CPU3 is up
Nov 22 15:52:28 probook kernel: ACPI: Waking up from system sleep state S3
Nov 22 15:52:28 probook kernel: ACPI: EC: event unblocked
Nov 22 15:52:28 probook kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F400040000).
Nov 22 15:52:28 probook kernel: sd 0:0:0:0: [sda] Starting disk
Nov 22 15:52:28 probook kernel: r8169 0000:01:00.0 enp1s0: link down
Nov 22 15:52:28 probook kernel: ACPI: button: The lid device is not compliant to SW_LID.
Nov 22 15:52:28 probook kernel: usb 3-1.1: reset high-speed USB device number 3 using ehci-pci
Nov 22 15:52:28 probook kernel: [drm:hwss_wait_for_blank_complete] *ERROR* DC: failed to blank crtc!
Nov 22 15:52:28 probook kernel: [drm] ring test on 0 succeeded in 11 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 9 succeeded in 8 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 1 succeeded in 4 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 2 succeeded in 2 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 3 succeeded in 2 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 4 succeeded in 2 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 5 succeeded in 7 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 6 succeeded in 2 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 7 succeeded in 2 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 8 succeeded in 2 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 10 succeeded in 4 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 11 succeeded in 3 usecs
Nov 22 15:52:28 probook kernel: usb 3-1.3: reset high-speed USB device number 4 using ehci-pci
Nov 22 15:52:28 probook kernel: ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Nov 22 15:52:28 probook kernel: ata1.00: ACPI cmd f5/00:00:00:00:00:e0 (SECURITY FREEZE LOCK) filtered out
Nov 22 15:52:28 probook kernel: ata1.00: ACPI cmd b1/c1:00:00:00:00:e0 (DEVICE CONFIGURATION OVERLAY) filtered out
Nov 22 15:52:28 probook kernel: ata1.00: supports DRM functions and may not be fully accessible
Nov 22 15:52:28 probook kernel: ata1.00: disabling queued TRIM support
Nov 22 15:52:28 probook kernel: ata1.00: ACPI cmd f5/00:00:00:00:00:e0 (SECURITY FREEZE LOCK) filtered out
Nov 22 15:52:28 probook kernel: ata1.00: ACPI cmd b1/c1:00:00:00:00:e0 (DEVICE CONFIGURATION OVERLAY) filtered out
Nov 22 15:52:28 probook kernel: ata1.00: supports DRM functions and may not be fully accessible
Nov 22 15:52:28 probook kernel: ata1.00: disabling queued TRIM support
Nov 22 15:52:28 probook kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Nov 22 15:52:28 probook kernel: ata1.00: configured for UDMA/133
Nov 22 15:52:28 probook kernel: ata2.00: configured for UDMA/100
Nov 22 15:52:28 probook kernel: usb 3-1.3.2: reset full-speed USB device number 5 using ehci-pci
Nov 22 15:52:28 probook kernel: [drm] ring test on 12 succeeded in 1 usecs
Nov 22 15:52:28 probook kernel: [drm] UVD initialized successfully.
Nov 22 15:52:28 probook kernel: [drm] ring test on 13 succeeded in 0 usecs
Nov 22 15:52:28 probook kernel: [drm] ring test on 14 succeeded in 8 usecs
Nov 22 15:52:28 probook kernel: [drm] VCE initialized successfully.
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 0 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 1 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 2 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 3 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 4 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 5 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 6 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 7 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 8 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 9 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 10 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 11 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 12 succeeded
Nov 22 15:52:28 probook kernel: [drm] ib test on ring 13 succeeded
Nov 22 15:52:28 probook kernel: [drm] {1920x1080, 2250x1132@152840Khz}
Nov 22 15:52:28 probook kernel: [drm] HBRx2 pass VS=1, PE=0
Nov 22 15:52:28 probook kernel: ------------[ cut here ]------------
Nov 22 15:52:28 probook kernel: Kernel BUG at ffffffffb522a5c9 [verbose debug info unavailable]
Nov 22 15:52:28 probook kernel: ==================================================================
Nov 22 15:52:28 probook kernel: BUG: KASAN: use-after-free in ex_handler_refcount+0x135/0x170
Nov 22 15:52:28 probook kernel: Write of size 4 at addr ffff8803e1be7840 by task kworker/u8:22/2619
Nov 22 15:52:28 probook kernel: 
Nov 22 15:52:28 probook kernel: CPU: 3 PID: 2619 Comm: kworker/u8:22 Not tainted 4.14.0-11095-g0c86a6bd85ff #404
Nov 22 15:52:28 probook kernel: Hardware name: HP HP ProBook 645 G2/80FE, BIOS N77 Ver. 01.09 06/09/2017
Nov 22 15:52:28 probook kernel: Workqueue: events_unbound async_run_entry_fn
Nov 22 15:52:28 probook kernel: Call Trace:
Nov 22 15:52:28 probook kernel:  dump_stack+0x99/0x11e
Nov 22 15:52:28 probook kernel:  ? _atomic_dec_and_lock+0x152/0x152
Nov 22 15:52:28 probook kernel:  print_address_description+0x65/0x270
Nov 22 15:52:28 probook kernel:  kasan_report+0x272/0x360
Nov 22 15:52:28 probook kernel:  ? ex_handler_refcount+0x135/0x170
Nov 22 15:52:28 probook kernel:  ex_handler_refcount+0x135/0x170
Nov 22 15:52:28 probook kernel:  ? ex_handler_clear_fs+0xa0/0xa0
Nov 22 15:52:28 probook kernel:  fixup_exception+0x78/0xb0
Nov 22 15:52:28 probook kernel:  do_trap+0x11c/0x380
Nov 22 15:52:28 probook kernel:  do_error_trap+0x11c/0x350
Nov 22 15:52:28 probook kernel:  ? fixup_bug.part.10+0x80/0x80
Nov 22 15:52:28 probook kernel:  ? csum_partial_copy_generic+0x1309/0x2880
Nov 22 15:52:28 probook kernel:  ? kasan_slab_free+0x87/0xc0
Nov 22 15:52:28 probook kernel:  ? drm_atomic_helper_resume+0xbf/0x120
Nov 22 15:52:28 probook kernel:  invalid_op+0x18/0x20
Nov 22 15:52:28 probook kernel: RIP: 0010:csum_partial_copy_generic+0x1309/0x2880
Nov 22 15:52:28 probook kernel: RSP: 0018:ffff8803d0c2f150 EFLAGS: 00010296
Nov 22 15:52:28 probook kernel: RAX: dffffc0000000000 RBX: ffff8803eecb0000 RCX: ffff8803e1be7840
Nov 22 15:52:28 probook kernel: RDX: 1ffff1007dd973b3 RSI: ffff8803d0c2f0f0 RDI: ffff8803e1be7840
Nov 22 15:52:28 probook kernel: RBP: ffff8803eecb9d98 R08: ffff8803f2db9348 R09: ffffffffb62c0aba
Nov 22 15:52:28 probook kernel: R10: 1ffff1007a185da1 R11: 1ffff1007a185e1f R12: dffffc0000000000
Nov 22 15:52:28 probook kernel: R13: 0000000000000000 R14: ffffed007a185e3b R15: ffff8803f2db9100
Nov 22 15:52:28 probook kernel:  ? amdgpu_dm_update_connector_after_detect+0x650/0x650
Nov 22 15:52:28 probook kernel:  amdgpu_device_resume+0x7d3/0x910
Nov 22 15:52:28 probook kernel:  ? amdgpu_device_suspend+0xa20/0xa20
Nov 22 15:52:28 probook kernel:  ? preempt_count_add+0xb9/0x140
Nov 22 15:52:28 probook kernel:  ? pci_pm_freeze+0x310/0x310
Nov 22 15:52:28 probook kernel:  dpm_run_callback+0xcb/0x460
Nov 22 15:52:28 probook kernel:  ? initcall_debug_report.isra.8+0xe0/0xe0
Nov 22 15:52:28 probook kernel:  ? __wake_up_common+0x650/0x650
Nov 22 15:52:28 probook kernel:  ? _raw_spin_unlock_irqrestore+0xc2/0x130
Nov 22 15:52:28 probook kernel:  device_resume+0x165/0x470
Nov 22 15:52:28 probook kernel:  ? async_run_entry_fn+0x41a/0x690
Nov 22 15:52:28 probook kernel:  ? device_resume+0x470/0x470
Nov 22 15:52:28 probook kernel:  async_resume+0x14/0x40
Nov 22 15:52:28 probook kernel:  async_run_entry_fn+0x16b/0x690
Nov 22 15:52:28 probook kernel:  ? sched_clock_cpu+0x18/0x1e0
Nov 22 15:52:28 probook kernel:  ? sched_clock_cpu+0x18/0x1e0
Nov 22 15:52:28 probook kernel:  ? lowest_in_progress+0x190/0x190
Nov 22 15:52:28 probook kernel:  ? pick_next_entity+0x194/0x400
Nov 22 15:52:28 probook kernel:  ? pwq_dec_nr_in_flight+0x1ab/0x3c0
Nov 22 15:52:28 probook kernel:  ? kthread_create_on_node+0x8b/0xc0
Nov 22 15:52:28 probook kernel:  ? _raw_spin_unlock_irq+0xbe/0x120
Nov 22 15:52:28 probook kernel:  ? _raw_spin_unlock+0x120/0x120
Nov 22 15:52:28 probook kernel:  process_one_work+0x84b/0x1600
Nov 22 15:52:28 probook kernel:  ? tick_nohz_dep_clear_signal+0x20/0x20
Nov 22 15:52:28 probook kernel:  ? _raw_spin_unlock_irq+0xbe/0x120
Nov 22 15:52:28 probook kernel:  ? _raw_spin_unlock+0x120/0x120
Nov 22 15:52:28 probook kernel:  ? pwq_dec_nr_in_flight+0x3c0/0x3c0
Nov 22 15:52:28 probook kernel:  ? arch_vtime_task_switch+0xee/0x190
Nov 22 15:52:28 probook kernel:  ? finish_task_switch+0x27d/0x7f0
Nov 22 15:52:28 probook kernel:  ? wq_worker_waking_up+0xc0/0xc0
Nov 22 15:52:28 probook kernel:  ? copy_overflow+0x20/0x20
Nov 22 15:52:28 probook kernel:  ? pci_mmcfg_check_reserved+0x100/0x100
Nov 22 15:52:28 probook kernel:  ? pointer+0x8d0/0x8d0
Nov 22 15:52:28 probook kernel:  ? remove_wait_queue+0x2b0/0x2b0
Nov 22 15:52:28 probook kernel:  ? _raw_spin_unlock_irqrestore+0xc2/0x130
Nov 22 15:52:28 probook kernel:  ? preempt_count_add+0xb9/0x140
Nov 22 15:52:28 probook kernel:  ? trace_raw_output_tick_stop+0x110/0x110
Nov 22 15:52:28 probook kernel:  ? schedule+0xfb/0x3b0
Nov 22 15:52:28 probook kernel:  ? __schedule+0x19b0/0x19b0
Nov 22 15:52:28 probook kernel:  ? _raw_spin_unlock_irq+0xbe/0x120
Nov 22 15:52:28 probook kernel:  ? _raw_spin_unlock+0x120/0x120
Nov 22 15:52:28 probook kernel:  ? task_change_group_fair+0x7e0/0x7e0
Nov 22 15:52:28 probook kernel:  worker_thread+0x211/0x1790
Nov 22 15:52:28 probook kernel:  ? unwind_next_frame+0x939/0x1e50
Nov 22 15:52:28 probook kernel:  ? trace_event_raw_event_workqueue_work+0x170/0x170
Nov 22 15:52:28 probook kernel:  ? __read_once_size_nocheck.constprop.6+0x10/0x10
Nov 22 15:52:28 probook kernel:  ? tick_nohz_dep_clear_signal+0x20/0x20
Nov 22 15:52:28 probook kernel:  ? _raw_spin_unlock_irq+0xbe/0x120
Nov 22 15:52:28 probook kernel:  ? _raw_spin_unlock+0x120/0x120
Nov 22 15:52:28 probook kernel:  ? compat_start_thread+0x70/0x70
Nov 22 15:52:28 probook kernel:  ? finish_task_switch+0x27d/0x7f0
Nov 22 15:52:28 probook kernel:  ? sched_clock_cpu+0x18/0x1e0
Nov 22 15:52:28 probook kernel:  ? ret_from_fork+0x1f/0x30
Nov 22 15:52:28 probook kernel:  ? pci_mmcfg_check_reserved+0x100/0x100
Nov 22 15:52:28 probook kernel:  ? schedule+0xfb/0x3b0
Nov 22 15:52:28 probook kernel:  ? __schedule+0x19b0/0x19b0
Nov 22 15:52:28 probook kernel:  ? remove_wait_queue+0x2b0/0x2b0
Nov 22 15:52:28 probook kernel:  ? memcg_kmem_get_cache+0x890/0x890
Nov 22 15:52:28 probook kernel:  ? _raw_spin_unlock_irqrestore+0xc2/0x130
Nov 22 15:52:28 probook kernel:  ? _raw_spin_unlock_irq+0x120/0x120
Nov 22 15:52:28 probook kernel:  ? trace_event_raw_event_workqueue_work+0x170/0x170
Nov 22 15:52:28 probook kernel:  kthread+0x2d4/0x390
Nov 22 15:52:28 probook kernel:  ? kthread_create_worker+0xd0/0xd0
Nov 22 15:52:28 probook kernel:  ret_from_fork+0x1f/0x30
Nov 22 15:52:28 probook kernel:
Nov 22 15:52:28 probook kernel: Allocated by task 2607:
Nov 22 15:52:28 probook kernel:  kasan_kmalloc+0xa0/0xd0
Nov 22 15:52:28 probook kernel:  kmem_cache_alloc_trace+0xd1/0x1e0
Nov 22 15:52:28 probook kernel:  dm_atomic_state_alloc+0x39/0x70
Nov 22 15:52:28 probook kernel:  drm_atomic_helper_duplicate_state+0x6f/0x2a0
Nov 22 15:52:28 probook kernel:  drm_atomic_helper_suspend+0x9e/0x130
Nov 22 15:52:28 probook kernel:  dm_suspend+0x8c/0x130
Nov 22 15:52:28 probook kernel:  amdgpu_suspend+0xf0/0x440
Nov 22 15:52:28 probook kernel:  amdgpu_device_suspend+0x51f/0xa20
Nov 22 15:52:28 probook kernel:  pci_pm_suspend+0x220/0x450
Nov 22 15:52:28 probook kernel:  dpm_run_callback+0xcb/0x460
Nov 22 15:52:28 probook kernel:  __device_suspend+0x2e4/0xd40
Nov 22 15:52:28 probook kernel:  async_suspend+0x15/0xd0
Nov 22 15:52:28 probook kernel:  async_run_entry_fn+0x16b/0x690
Nov 22 15:52:28 probook kernel:  process_one_work+0x84b/0x1600
Nov 22 15:52:28 probook kernel:  worker_thread+0x211/0x1790
Nov 22 15:52:28 probook kernel:  kthread+0x2d4/0x390
Nov 22 15:52:28 probook kernel:  ret_from_fork+0x1f/0x30
Nov 22 15:52:28 probook kernel:
Nov 22 15:52:28 probook kernel: Freed by task 2619:
Nov 22 15:52:28 probook kernel:  kasan_slab_free+0x71/0xc0
Nov 22 15:52:28 probook kernel:  kfree+0x88/0x1b0
Nov 22 15:52:28 probook kernel:  drm_atomic_helper_resume+0xbf/0x120
Nov 22 15:52:28 probook kernel:  amdgpu_dm_display_resume+0x6e9/0xa40
Nov 22 15:52:28 probook kernel:  amdgpu_device_resume+0x7d3/0x910
Nov 22 15:52:28 probook kernel:  dpm_run_callback+0xcb/0x460
Nov 22 15:52:28 probook kernel:  device_resume+0x165/0x470
Nov 22 15:52:28 probook kernel:  async_resume+0x14/0x40
Nov 22 15:52:28 probook kernel:  async_run_entry_fn+0x16b/0x690
Nov 22 15:52:28 probook kernel:  process_one_work+0x84b/0x1600
Nov 22 15:52:28 probook kernel:  worker_thread+0x211/0x1790
Nov 22 15:52:28 probook kernel:  kthread+0x2d4/0x390
Nov 22 15:52:28 probook kernel:  ret_from_fork+0x1f/0x30
Nov 22 15:52:28 probook kernel:
Nov 22 15:52:28 probook kernel: The buggy address belongs to the object at ffff8803e1be7840
 which belongs to the cache kmalloc-128 of size 128
Nov 22 15:52:28 probook kernel: The buggy address is located 0 bytes inside of
 128-byte region [ffff8803e1be7840, ffff8803e1be78c0)
Nov 22 15:52:28 probook kernel: The buggy address belongs to the page:
Nov 22 15:52:28 probook kernel: page:ffffea000f86f9c0 count:1 mapcount:0 mapping:          (null) index:0xffff8803e1be7000
Nov 22 15:52:28 probook kernel: flags: 0x2000000000000100(slab)
Nov 22 15:52:28 probook kernel: raw: 2000000000000100 0000000000000000 ffff8803e1be7000 0000000180150010
Nov 22 15:52:28 probook kernel: raw: 0000000000000000 0000000500000001 ffff8803f3403340 0000000000000000
Nov 22 15:52:28 probook kernel: page dumped because: kasan: bad access detected
Nov 22 15:52:28 probook kernel: 
Nov 22 15:52:28 probook kernel: Memory state around the buggy address:
Nov 22 15:52:28 probook kernel:  ffff8803e1be7700: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
Nov 22 15:52:28 probook kernel:  ffff8803e1be7780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Nov 22 15:52:28 probook kernel: >ffff8803e1be7800: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
Nov 22 15:52:28 probook kernel:                                            ^
Nov 22 15:52:28 probook kernel:  ffff8803e1be7880: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
Nov 22 15:52:28 probook kernel:  ffff8803e1be7900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Nov 22 15:52:28 probook kernel: ==================================================================
Nov 22 15:52:28 probook kernel: Disabling lock debugging due to kernel taint

-- 
Regards,
  Johannes


* Re: Kernel crash/Null pointer dereference on vblank
  2017-11-22 15:07   ` Johannes Hirte
@ 2017-11-22 22:31     ` Johannes Hirte
  2017-11-23  2:18       ` Chunming Zhou
  0 siblings, 1 reply; 10+ messages in thread
From: Johannes Hirte @ 2017-11-22 22:31 UTC
  To: Martin Babutzka; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

OK, now I have another use-after-free report, this time without DC. I
don't know if this is related, but until now I had not seen any runtime
errors without DC.

KASAN report:

[22697.845475] ==================================================================
[22697.845495] BUG: KASAN: use-after-free in amdgpu_job_free_cb+0x140/0x150
[22697.845500] Read of size 8 at addr ffff8801c02e91c8 by task kworker/0:2/22547

[22697.845509] CPU: 0 PID: 22547 Comm: kworker/0:2 Not tainted 4.14.0-11095-g0c86a6bd85ff #404
[22697.845513] Hardware name: HP HP ProBook 645 G2/80FE, BIOS N77 Ver. 01.09 06/09/2017
[22697.845520] Workqueue: events amd_sched_job_finish
[22697.845525] Call Trace:
[22697.845534]  dump_stack+0x99/0x11e
[22697.845541]  ? _atomic_dec_and_lock+0x152/0x152
[22697.845548]  print_address_description+0x65/0x270
[22697.845553]  kasan_report+0x272/0x360
[22697.845557]  ? amdgpu_job_free_cb+0x140/0x150
[22697.845562]  amdgpu_job_free_cb+0x140/0x150
[22697.845566]  amd_sched_job_finish+0x288/0x560
[22697.845571]  ? amd_sched_process_job+0x220/0x220
[22697.845576]  ? amdgpu_unpin_work_func+0x266/0x460
[22697.845582]  ? _raw_spin_unlock_irq+0xbe/0x120
[22697.845587]  ? _raw_spin_unlock+0x120/0x120
[22697.845593]  process_one_work+0x84b/0x1600
[22697.845599]  ? tick_nohz_dep_clear_signal+0x20/0x20
[22697.845603]  ? _raw_spin_unlock_irq+0xbe/0x120
[22697.845607]  ? _raw_spin_unlock+0x120/0x120
[22697.845611]  ? pwq_dec_nr_in_flight+0x3c0/0x3c0
[22697.845617]  ? release_thread+0xa0/0xe0
[22697.845621]  ? cyc2ns_read_end+0x20/0x20
[22697.845626]  ? finish_task_switch+0x27d/0x7f0
[22697.845630]  ? wq_worker_waking_up+0xc0/0xc0
[22697.845640]  ? pci_mmcfg_check_reserved+0x100/0x100
[22697.845644]  ? pci_mmcfg_check_reserved+0x100/0x100
[22697.845648]  ? preempt_schedule_irq+0x4e/0xb0
[22697.845653]  ? retint_kernel+0x1b/0x1d
[22697.845659]  ? schedule+0xfb/0x3b0
[22697.845663]  ? __schedule+0x19b0/0x19b0
[22697.845669]  ? _raw_spin_unlock_irq+0xb9/0x120
[22697.845674]  ? _raw_spin_unlock_irq+0xbe/0x120
[22697.845678]  ? _raw_spin_unlock+0x120/0x120
[22697.845683]  worker_thread+0x211/0x1790
[22697.845692]  ? pick_next_task_fair+0x97d/0x10f0
[22697.845697]  ? trace_event_raw_event_workqueue_work+0x170/0x170
[22697.845703]  ? tick_nohz_dep_clear_signal+0x20/0x20
[22697.845708]  ? _raw_spin_unlock_irq+0xbe/0x120
[22697.845713]  ? _raw_spin_unlock+0x120/0x120
[22697.845718]  ? compat_start_thread+0x70/0x70
[22697.845722]  ? finish_task_switch+0x27d/0x7f0
[22697.845727]  ? sched_clock_cpu+0x18/0x1e0
[22697.845733]  ? ret_from_fork+0x1f/0x30
[22697.845739]  ? pci_mmcfg_check_reserved+0x100/0x100
[22697.845744]  ? unix_write_space+0x410/0x410
[22697.845749]  ? cyc2ns_read_end+0x20/0x20
[22697.845755]  ? schedule+0xfb/0x3b0
[22697.845759]  ? __schedule+0x19b0/0x19b0
[22697.845765]  ? remove_wait_queue+0x2b0/0x2b0
[22697.845770]  ? arch_vtime_task_switch+0xee/0x190
[22697.845774]  ? _raw_spin_unlock_irqrestore+0xc2/0x130
[22697.845778]  ? _raw_spin_unlock_irq+0x120/0x120
[22697.845783]  ? trace_event_raw_event_workqueue_work+0x170/0x170
[22697.845788]  kthread+0x2d4/0x390
[22697.845793]  ? kthread_create_worker+0xd0/0xd0
[22697.845797]  ret_from_fork+0x1f/0x30

[22697.845809] Allocated by task 2378:
[22697.845817]  kasan_kmalloc+0xa0/0xd0
[22697.845822]  kmem_cache_alloc_trace+0xd1/0x1e0
[22697.845829]  amdgpu_driver_open_kms+0x12b/0x4d0
[22697.845839]  drm_open+0x7c3/0x1100
[22697.845843]  drm_stub_open+0x2a8/0x400
[22697.845851]  chrdev_open+0x1eb/0x5a0
[22697.845857]  do_dentry_open+0x5a1/0xc50
[22697.845865]  path_openat+0x11d3/0x4e90
[22697.845868]  do_filp_open+0x239/0x3c0
[22697.845872]  do_sys_open+0x402/0x630
[22697.845878]  do_syscall_64+0x220/0x670
[22697.845881]  return_from_SYSCALL_64+0x0/0x65

[22697.845887] Freed by task 24090:
[22697.845892]  kasan_slab_free+0x71/0xc0
[22697.845895]  kfree+0x88/0x1b0
[22697.845900]  amdgpu_driver_postclose_kms+0x469/0x860
[22697.845904]  drm_release+0x8a8/0x1180
[22697.845909]  __fput+0x2ab/0x730
[22697.845913]  task_work_run+0x14b/0x200
[22697.845919]  do_exit+0x7c6/0x13a0
[22697.845922]  do_group_exit+0x121/0x340
[22697.845926]  SyS_exit_group+0x14/0x20
[22697.845929]  do_syscall_64+0x220/0x670
[22697.845932]  return_from_SYSCALL_64+0x0/0x65

[22697.845940] The buggy address belongs to the object at ffff8801c02e9100
[22697.845946] The buggy address is located 200 bytes inside of
[22697.845949] The buggy address belongs to the page:
[22697.845958] page:ffffea000700ba00 count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
[22697.845967] flags: 0x2000000000008100(slab|head)
[22697.845977] raw: 2000000000008100 0000000000000000 0000000000000000 00000001000f000f
[22697.845982] raw: dead000000000100 dead000000000200 ffff8803f3402a80 0000000000000000
[22697.845985] page dumped because: kasan: bad access detected

[22697.845990] Memory state around the buggy address:
[22697.845995]  ffff8801c02e9080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[22697.845999]  ffff8801c02e9100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[22697.846003] >ffff8801c02e9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[22697.846005]                                               ^
[22697.846009]  ffff8801c02e9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[22697.846012]  ffff8801c02e9280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[22697.846015] ==================================================================
[22697.846018] Disabling lock debugging due to kernel taint

-- 
Regards,
  Johannes

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread
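
The numbers in the KASAN report above can be cross-checked by hand. The following is a minimal sketch, assuming generic KASAN (one shadow byte per 8 bytes of memory, 16 shadow bytes per printed row). The helper names are hypothetical and are not kernel functions.

```c
#include <stdint.h>

/* Assumption: generic KASAN, where one shadow byte describes 8 bytes of memory. */
#define SHADOW_GRANULE 8u

/* Byte offset of the faulting address inside the slab object. */
static uint64_t offset_in_object(uint64_t access_addr, uint64_t object_addr)
{
	return access_addr - object_addr;
}

/*
 * Index of the shadow byte, within one printed row of the
 * "Memory state around the buggy address" dump, that covers access_addr.
 * row_addr is the address printed at the start of that row.
 */
static unsigned int shadow_index_in_row(uint64_t access_addr, uint64_t row_addr)
{
	return (unsigned int)((access_addr - row_addr) / SHADOW_GRANULE);
}
```

With the values from the report, offset_in_object(0xffff8801c02e91c8, 0xffff8801c02e9100) gives 200, which matches "located 200 bytes inside of", and shadow_index_in_row(0xffff8801c02e91c8, 0xffff8801c02e9180) gives 9, i.e. the tenth shadow byte of the row marked with ">". That byte is fb, the value KASAN uses for freed slab memory; fc marks a slab redzone.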

* Re: Kernel crash/Null pointer dereference on vblank
  2017-11-22 22:31     ` Johannes Hirte
@ 2017-11-23  2:18       ` Chunming Zhou
       [not found]         ` <58b24f03-b71b-4208-4cb7-4706ba947dea-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Chunming Zhou @ 2017-11-23  2:18 UTC (permalink / raw)
  To: Johannes Hirte, Martin Babutzka; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Which driver are you using?

I guess your driver is a bit old; this issue should already have been fixed.


Regards,

David Zhou


On 2017-11-23 06:31, Johannes Hirte wrote:
> Ok, now I have another use-after-free report, this time without dc. I
> don't know if this is related, but until now I didn't have runtime
> errors without dc.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Kernel crash/Null pointer dereference on vblank
       [not found]         ` <58b24f03-b71b-4208-4cb7-4706ba947dea-5C7GfCeVMHo@public.gmane.org>
@ 2017-11-23  8:31           ` Johannes Hirte
  2017-11-23 10:07             ` Chunming Zhou
  0 siblings, 1 reply; 10+ messages in thread
From: Johannes Hirte @ 2017-11-23  8:31 UTC (permalink / raw)
  To: Chunming Zhou; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Martin Babutzka

On 2017 Nov 23, Chunming Zhou wrote:
> Which driver are you using?
> 
> I guess your driver is a bit old; this issue should already have been fixed.
> 

This was with git master from Linus. But even with the latest changes
from agd5f/drm-next-4.15 both use-after-free bugs still persist. If there
are fixes for this, they are not available upstream.

-- 
Regards,
  Johannes

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Kernel crash/Null pointer dereference on vblank
  2017-11-23  8:31           ` Johannes Hirte
@ 2017-11-23 10:07             ` Chunming Zhou
       [not found]               ` <56b938ac-4597-a354-0f4a-0c2625b10c5a-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Chunming Zhou @ 2017-11-23 10:07 UTC (permalink / raw)
  To: Johannes Hirte; +Cc: Martin Babutzka, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

[-- Attachment #1: Type: text/plain, Size: 661 bytes --]

See the attached emails: both patches fix the same issue, and either one
should fix yours. Your call trace is the same as the one in the second email.

The first patch should already have been pushed some time ago; could you
check whether it is in your branch?

Regards,

David Zhou


On 2017-11-23 16:31, Johannes Hirte wrote:
> On 2017 Nov 23, Chunming Zhou wrote:
>> Which driver are you using?
>>
>> I guess your driver is a bit old; this issue should already have been fixed.
>>
> This was with git master from Linus. But even with the latest changes
> from agd5f/drm-next-4.15 both use-after-free bugs still persist. If there
> are fixes for this, they are not available upstream.
>


[-- Attachment #2: Re: [PATCH 1_3] drm_amdgpu: Avoid accessing job->entity after the job is scheduled..eml --]
[-- Type: message/rfc822, Size: 19575 bytes --]

From: "Christian König" <ckoenig.leichtzumerken-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Andrey Grodzovsky <Andrey.Grodzovsky-5C7GfCeVMHo@public.gmane.org>, <amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>
Cc: <christian.koenig-5C7GfCeVMHo@public.gmane.org>
Subject: Re: [PATCH 1/3] drm/amdgpu: Avoid accessing job->entity after the job is scheduled.
Date: Mon, 23 Oct 2017 14:36:31 +0200
Message-ID: <6b49f609-a083-ddde-ff3b-2abfe39890c3-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

On 20.10.2017 at 15:32, Andrey Grodzovsky wrote:
> Bug: amdgpu_job_free_cb was accessing s_job->s_entity when the allocated
> amdgpu_ctx (and the entity inside it) were already deallocated from
> amdgpu_cs_parser_fini.
>
> Fix: Save job's priority on its creation instead of accessing it from
> s_entity later on.
>
> Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |  3 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c       |  5 ++---
>   drivers/gpu/drm/amd/scheduler/gpu_scheduler.c |  1 +
>   drivers/gpu/drm/amd/scheduler/gpu_scheduler.h | 32 ++++++++++++---------------
>   4 files changed, 18 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index f7fceb6..a760b6e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1192,8 +1192,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>   	job->uf_sequence = seq;
>   
>   	amdgpu_job_free_resources(job);
> -	amdgpu_ring_priority_get(job->ring,
> -				 amd_sched_get_job_priority(&job->base));
> +	amdgpu_ring_priority_get(job->ring, job->base.s_priority);
>   
>   	trace_amdgpu_cs_ioctl(job);
>   	amd_sched_entity_push_job(&job->base);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index 0cfc68d..a58e3c5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -104,7 +104,7 @@ static void amdgpu_job_free_cb(struct amd_sched_job *s_job)
>   {
>   	struct amdgpu_job *job = container_of(s_job, struct amdgpu_job, base);
>   
> -	amdgpu_ring_priority_put(job->ring, amd_sched_get_job_priority(s_job));
> +	amdgpu_ring_priority_put(job->ring, s_job->s_priority);
>   	dma_fence_put(job->fence);
>   	amdgpu_sync_free(&job->sync);
>   	amdgpu_sync_free(&job->dep_sync);
> @@ -141,8 +141,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct amdgpu_ring *ring,
>   	job->fence_ctx = entity->fence_context;
>   	*f = dma_fence_get(&job->base.s_fence->finished);
>   	amdgpu_job_free_resources(job);
> -	amdgpu_ring_priority_get(job->ring,
> -				 amd_sched_get_job_priority(&job->base));
> +	amdgpu_ring_priority_get(job->ring, job->base.s_priority);
>   	amd_sched_entity_push_job(&job->base);
>   
>   	return 0;
> diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
> index e4d3b4e..1bbbce2 100644
> --- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
> @@ -529,6 +529,7 @@ int amd_sched_job_init(struct amd_sched_job *job,
>   {
>   	job->sched = sched;
>   	job->s_entity = entity;
> +	job->s_priority = entity->rq - sched->sched_rq;
>   	job->s_fence = amd_sched_fence_create(entity, owner);
>   	if (!job->s_fence)
>   		return -ENOMEM;
> diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
> index 52c8e54..3f75b45 100644
> --- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
> +++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
> @@ -30,6 +30,19 @@
>   struct amd_gpu_scheduler;
>   struct amd_sched_rq;
>   
> +enum amd_sched_priority {
> +	AMD_SCHED_PRIORITY_MIN,
> +	AMD_SCHED_PRIORITY_LOW = AMD_SCHED_PRIORITY_MIN,
> +	AMD_SCHED_PRIORITY_NORMAL,
> +	AMD_SCHED_PRIORITY_HIGH_SW,
> +	AMD_SCHED_PRIORITY_HIGH_HW,
> +	AMD_SCHED_PRIORITY_KERNEL,
> +	AMD_SCHED_PRIORITY_MAX,
> +	AMD_SCHED_PRIORITY_INVALID = -1,
> +	AMD_SCHED_PRIORITY_UNSET = -2
> +};
> +
> +
>   /**
>    * A scheduler entity is a wrapper around a job queue or a group
>    * of other entities. Entities take turns emitting jobs from their
> @@ -83,6 +96,7 @@ struct amd_sched_job {
>   	struct delayed_work		work_tdr;
>   	uint64_t			id;
>   	atomic_t karma;
> +	enum amd_sched_priority s_priority;
>   };
>   
>   extern const struct dma_fence_ops amd_sched_fence_ops_scheduled;
> @@ -114,18 +128,6 @@ struct amd_sched_backend_ops {
>   	void (*free_job)(struct amd_sched_job *sched_job);
>   };
>   
> -enum amd_sched_priority {
> -	AMD_SCHED_PRIORITY_MIN,
> -	AMD_SCHED_PRIORITY_LOW = AMD_SCHED_PRIORITY_MIN,
> -	AMD_SCHED_PRIORITY_NORMAL,
> -	AMD_SCHED_PRIORITY_HIGH_SW,
> -	AMD_SCHED_PRIORITY_HIGH_HW,
> -	AMD_SCHED_PRIORITY_KERNEL,
> -	AMD_SCHED_PRIORITY_MAX,
> -	AMD_SCHED_PRIORITY_INVALID = -1,
> -	AMD_SCHED_PRIORITY_UNSET = -2
> -};
> -
>   /**
>    * One scheduler is implemented for each hardware ring
>   */
> @@ -176,10 +178,4 @@ bool amd_sched_dependency_optimized(struct dma_fence* fence,
>   				    struct amd_sched_entity *entity);
>   void amd_sched_job_kickout(struct amd_sched_job *s_job);
>   
> -static inline enum amd_sched_priority
> -amd_sched_get_job_priority(struct amd_sched_job *job)
> -{
> -	return (job->s_entity->rq - job->sched->sched_rq);
> -}
> -
>   #endif


[-- Attachment #3: Re: [PATCH] drm_amd_scheduler: fix one used-after-free case for job->s_entity.eml --]
[-- Type: message/rfc822, Size: 32146 bytes --]

From: Chunming Zhou <zhoucm1-5C7GfCeVMHo@public.gmane.org>
To: "Liu, Monk" <Monk.Liu-5C7GfCeVMHo@public.gmane.org>, "Koenig, Christian" <Christian.Koenig-5C7GfCeVMHo@public.gmane.org>, "Zhou, David(ChunMing)" <David1.Zhou-5C7GfCeVMHo@public.gmane.org>, "amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org" <amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org>
Subject: Re: [PATCH] drm/amd/scheduler: fix one used-after-free case for job->s_entity
Date: Tue, 24 Oct 2017 18:10:16 +0800
Message-ID: <8eaffd53-67eb-251d-440b-fb48df5d39a2-5C7GfCeVMHo@public.gmane.org>

Hi Monk,

You can enable KASAN to catch use-after-free cases by setting 'CONFIG_KASAN=y';
the call trace it produces is very clear.

Regards,

David Zhou


On 2017-10-24 18:06, Liu, Monk wrote:
> Christian
>
> Actually, there are more wild-pointer issues with the entity in the scheduler's main routine ....
>
>
> See the message I replied to David
>
> BR Monk
>
> -----Original Message-----
> From: amd-gfx [mailto:amd-gfx-bounces@lists.freedesktop.org] On Behalf Of Christian König
> Sent: 2017-10-24 18:01
> To: Zhou, David(ChunMing) <David1.Zhou@amd.com>; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amd/scheduler: fix one used-after-free case for job->s_entity
>
> Andrey already submitted a fix for this a few days ago.
>
> Christian.
>
> On 24.10.2017 at 11:55, Chunming Zhou wrote:
>> The process that owns s_entity may already have been closed by the time
>> amdgpu_job_free_cb is called, so s_entity is a dangling pointer once it
>> has been freed. See the call trace below:
>>
>> [  355.616964] ==================================================================
>> [  355.617191] BUG: KASAN: use-after-free in amdgpu_job_free_cb+0x2f/0xc0 [amdgpu]
>> [  355.617197] Read of size 8 at addr ffff88039d593c40 by task kworker/9:1/100
>>
>> [  355.617206] CPU: 9 PID: 100 Comm: kworker/9:1 Not tainted 4.13.0-custom #1
>> [  355.617208] Hardware name: Gigabyte Technology Co., Ltd. Default string/X99P-SLI-CF, BIOS F23 07/22/2016
>> [  355.617342] Workqueue: events amd_sched_job_finish [amdgpu]
>> [  355.617344] Call Trace:
>> [  355.617351]  dump_stack+0x63/0x8d
>> [  355.617356]  print_address_description+0x70/0x290
>> [  355.617474]  ? amdgpu_job_free_cb+0x2f/0xc0 [amdgpu]
>> [  355.617477]  kasan_report+0x265/0x350
>> [  355.617479]  __asan_load8+0x54/0x90
>> [  355.617603]  amdgpu_job_free_cb+0x2f/0xc0 [amdgpu]
>> [  355.617721]  amd_sched_job_finish+0x161/0x180 [amdgpu]
>> [  355.617725]  process_one_work+0x2ab/0x700
>> [  355.617727]  worker_thread+0x90/0x720
>> [  355.617731]  kthread+0x18c/0x1e0
>> [  355.617732]  ? process_one_work+0x700/0x700
>> [  355.617735]  ? kthread_create_on_node+0xb0/0xb0
>> [  355.617738]  ret_from_fork+0x25/0x30
>>
>> [  355.617742] Allocated by task 1347:
>> [  355.617747]  save_stack_trace+0x1b/0x20
>> [  355.617749]  save_stack+0x46/0xd0
>> [  355.617751]  kasan_kmalloc+0xad/0xe0
>> [  355.617753]  kmem_cache_alloc_trace+0xef/0x200
>> [  355.617853]  amdgpu_driver_open_kms+0x98/0x290 [amdgpu]
>> [  355.617883]  drm_open+0x38c/0x6e0 [drm]
>> [  355.617908]  drm_stub_open+0x144/0x1b0 [drm]
>> [  355.617911]  chrdev_open+0x180/0x320
>> [  355.617913]  do_dentry_open+0x3a2/0x570
>> [  355.617915]  vfs_open+0x86/0xe0
>> [  355.617918]  path_openat+0x49e/0x1db0
>> [  355.617919]  do_filp_open+0x11c/0x1a0
>> [  355.617921]  do_sys_open+0x16f/0x2a0
>> [  355.617923]  SyS_open+0x1e/0x20
>> [  355.617926]  do_syscall_64+0xea/0x210
>> [  355.617928]  return_from_SYSCALL_64+0x0/0x6a
>>
>> [  355.617931] Freed by task 1347:
>> [  355.617934]  save_stack_trace+0x1b/0x20
>> [  355.617936]  save_stack+0x46/0xd0
>> [  355.617937]  kasan_slab_free+0x70/0xc0
>> [  355.617939]  kfree+0x9d/0x1c0
>> [  355.618038]  amdgpu_driver_postclose_kms+0x1bc/0x3e0 [amdgpu]
>> [  355.618063]  drm_release+0x454/0x610 [drm]
>> [  355.618065]  __fput+0x177/0x350
>> [  355.618066]  ____fput+0xe/0x10
>> [  355.618068]  task_work_run+0xa0/0xc0
>> [  355.618070]  do_exit+0x456/0x1320
>> [  355.618072]  do_group_exit+0x86/0x130
>> [  355.618074]  SyS_exit_group+0x1d/0x20
>> [  355.618076]  do_syscall_64+0xea/0x210
>> [  355.618078]  return_from_SYSCALL_64+0x0/0x6a
>>
>> [  355.618081] The buggy address belongs to the object at ffff88039d593b80
>>                   which belongs to the cache kmalloc-2048 of size 2048
>> [  355.618085] The buggy address is located 192 bytes inside of
>>                   2048-byte region [ffff88039d593b80, ffff88039d594380)
>> [  355.618087] The buggy address belongs to the page:
>> [  355.618091] page:ffffea000e756400 count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
>> [  355.618095] flags: 0x2ffff0000008100(slab|head)
>> [  355.618099] raw: 02ffff0000008100 0000000000000000 0000000000000000 00000001000f000f
>> [  355.618103] raw: ffffea000edb0600 0000000200000002 ffff8803bfc0ea00 0000000000000000
>> [  355.618105] page dumped because: kasan: bad access detected
>>
>> [  355.618108] Memory state around the buggy address:
>> [  355.618110]  ffff88039d593b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> [  355.618113]  ffff88039d593b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> [  355.618116] >ffff88039d593c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> [  355.618117]                                            ^
>> [  355.618120]  ffff88039d593c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> [  355.618122]  ffff88039d593d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>> [  355.618124] ==================================================================
>> [  355.618126] Disabling lock debugging due to kernel taint
>>
>> Change-Id: I8ff7122796b8cd16fc26e9c40e8d4c8153d67e0c
>> Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
>> ---
>>    drivers/gpu/drm/amd/scheduler/gpu_scheduler.c |  1 +
>>    drivers/gpu/drm/amd/scheduler/gpu_scheduler.h | 27 ++++++++++++++-------------
>>    2 files changed, 15 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>> index 007fdbd..8101ed7 100644
>> --- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>> +++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
>> @@ -535,6 +535,7 @@ int amd_sched_job_init(struct amd_sched_job *job,
>>    	if (!job->s_fence)
>>    		return -ENOMEM;
>>    	job->id = atomic64_inc_return(&sched->job_id_count);
>> +	job->priority = job->s_entity->rq - job->sched->sched_rq;
>>    
>>    	INIT_WORK(&job->finish_work, amd_sched_job_finish);
>>    	INIT_LIST_HEAD(&job->node);
>> diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
>> index e21299c..8808eb1 100644
>> --- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
>> +++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
>> @@ -77,6 +77,18 @@ struct amd_sched_fence {
>>    	void                            *owner;
>>    };
>>    
>> +enum amd_sched_priority {
>> +	AMD_SCHED_PRIORITY_MIN,
>> +	AMD_SCHED_PRIORITY_LOW = AMD_SCHED_PRIORITY_MIN,
>> +	AMD_SCHED_PRIORITY_NORMAL,
>> +	AMD_SCHED_PRIORITY_HIGH_SW,
>> +	AMD_SCHED_PRIORITY_HIGH_HW,
>> +	AMD_SCHED_PRIORITY_KERNEL,
>> +	AMD_SCHED_PRIORITY_MAX,
>> +	AMD_SCHED_PRIORITY_INVALID = -1,
>> +	AMD_SCHED_PRIORITY_UNSET = -2
>> +};
>> +
>>    struct amd_sched_job {
>>    	struct amd_gpu_scheduler        *sched;
>>    	struct amd_sched_entity         *s_entity;
>> @@ -87,6 +99,7 @@ struct amd_sched_job {
>>    	struct delayed_work		work_tdr;
>>    	uint64_t			id;
>>    	atomic_t karma;
>> +	enum amd_sched_priority		priority;
>>    };
>>    
>>    extern const struct dma_fence_ops amd_sched_fence_ops_scheduled;
>> @@ -118,18 +131,6 @@ struct amd_sched_backend_ops {
>>    	void (*free_job)(struct amd_sched_job *sched_job);
>>    };
>>    
>> -enum amd_sched_priority {
>> -	AMD_SCHED_PRIORITY_MIN,
>> -	AMD_SCHED_PRIORITY_LOW = AMD_SCHED_PRIORITY_MIN,
>> -	AMD_SCHED_PRIORITY_NORMAL,
>> -	AMD_SCHED_PRIORITY_HIGH_SW,
>> -	AMD_SCHED_PRIORITY_HIGH_HW,
>> -	AMD_SCHED_PRIORITY_KERNEL,
>> -	AMD_SCHED_PRIORITY_MAX,
>> -	AMD_SCHED_PRIORITY_INVALID = -1,
>> -	AMD_SCHED_PRIORITY_UNSET = -2
>> -};
>> -
>>    /**
>>     * One scheduler is implemented for each hardware ring
>>    */
>> @@ -183,7 +184,7 @@ void amd_sched_job_kickout(struct amd_sched_job *s_job);
>>    static inline enum amd_sched_priority
>>    amd_sched_get_job_priority(struct amd_sched_job *job)
>>    {
>> -	return (job->s_entity->rq - job->sched->sched_rq);
>> +	return job->priority;
>>    }
>>    
>>    #endif

^ permalink raw reply	[flat|nested] 10+ messages in thread
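
Both attached patches apply the same idea: compute the job's priority once, while the entity is still guaranteed to be alive, and store it in the job so that the free callback never has to dereference s_entity. A minimal user-space sketch of that pattern follows. The toy_* types and functions are hypothetical stand-ins and not the scheduler's real structures.

```c
#include <stddef.h>
#include <stdlib.h>

#define TOY_NUM_PRIORITIES 5

struct toy_rq {
	int unused;
};

struct toy_sched {
	struct toy_rq sched_rq[TOY_NUM_PRIORITIES]; /* one run queue per priority */
};

struct toy_entity {
	struct toy_rq *rq; /* points into toy_sched.sched_rq[] */
};

struct toy_job {
	struct toy_sched *sched;
	struct toy_entity *entity; /* may be freed before the job is */
	int priority;              /* snapshot taken at init time */
};

/* Record the priority while the entity is still alive. */
static void toy_job_init(struct toy_job *job, struct toy_sched *sched,
			 struct toy_entity *entity)
{
	job->sched = sched;
	job->entity = entity;
	job->priority = (int)(entity->rq - sched->sched_rq);
}

/* Teardown path: uses only the snapshot, never job->entity. */
static int toy_job_free_priority(const struct toy_job *job)
{
	return job->priority;
}

/* Scenario from the thread: the entity's owner goes away before the job is freed. */
static int toy_demo(void)
{
	static struct toy_sched sched; /* zero-initialised */
	struct toy_entity *entity = malloc(sizeof(*entity));
	struct toy_job job;

	if (!entity)
		return -1;
	entity->rq = &sched.sched_rq[2];
	toy_job_init(&job, &sched, entity);

	free(entity);      /* the process closed the device */
	job.entity = NULL; /* would be dangling in real code; cleared here for safety */

	return toy_job_free_priority(&job);
}
```

toy_demo() returns 2, the index of the run queue the entity was attached to, even though the entity no longer exists when the job is torn down. Reading the priority through job->entity at that point is the use-after-free shown in the KASAN traces.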

* Re: Kernel crash/Null pointer dereference on vblank
       [not found]               ` <56b938ac-4597-a354-0f4a-0c2625b10c5a-5C7GfCeVMHo@public.gmane.org>
@ 2017-11-23 12:27                 ` Johannes Hirte
  2017-11-23 15:17                   ` Leo Li
  0 siblings, 1 reply; 10+ messages in thread
From: Johannes Hirte @ 2017-11-23 12:27 UTC (permalink / raw)
  To: Chunming Zhou; +Cc: Martin Babutzka, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On 2017 Nov 23, Chunming Zhou wrote:
> See the attached emails: both patches fix the same issue, and either one
> should fix yours. Your call trace is the same as the one in the second email.
> 
> The first patch should already have been pushed some time ago; could you
> check whether it is in your branch?
>

This patch (series) is not upstream yet. Just tested it, but this doesn't fix the
use-after-free on S3 resume with dc enabled. 

-- 
Regards,
  Johannes

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Kernel crash/Null pointer dereference on vblank
  2017-11-23 12:27                 ` Johannes Hirte
@ 2017-11-23 15:17                   ` Leo Li
       [not found]                     ` <eb1a453e-0705-ede7-bedd-dbc8668cd71d-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Leo Li @ 2017-11-23 15:17 UTC (permalink / raw)
  To: Johannes Hirte, Chunming Zhou
  Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Martin Babutzka

[-- Attachment #1: Type: text/plain, Size: 606 bytes --]

Hi Johannes,

The s3 resume issue looks to be a problem with amdgpu/display. Could you 
give the attached patch a try?

Thanks,
Leo

On 2017-11-23 07:27 AM, Johannes Hirte wrote:
> On 2017 Nov 23, Chunming Zhou wrote:
>> See the attached emails: both patches fix the same issue, and either one
>> should fix yours. Your call trace is the same as the one in the second email.
>>
>> The first patch should already have been pushed some time ago; could you
>> check whether it is in your branch?
>>
> 
> This patch (series) is not upstream yet. Just tested it, but this doesn't fix the
> use-after-free on S3 resume with dc enabled.
> 

[-- Attachment #2: 0001-drm-amdgpu-display-Do-not-put-drm_atomic_state-on-re.patch --]
[-- Type: text/x-patch, Size: 1146 bytes --]

From 8656ef112d53f8c08f6571dd0d093f03d2e6cc30 Mon Sep 17 00:00:00 2001
From: "Leo (Sunpeng) Li" <sunpeng.li@amd.com>
Date: Thu, 16 Nov 2017 15:17:27 -0500
Subject: [PATCH] drm/amdgpu/display: Do not put drm_atomic_state on resume

drm_atomic_helper_resume now puts it for us. See relevant patch here:
https://lists.freedesktop.org/archives/dri-devel/2017-October/154268.html

Change-Id: Ief246492f721a1cf281d48e9d1a7029e5cefc2da
Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 5731167..951ea77 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -688,7 +688,6 @@ int amdgpu_dm_display_resume(struct amdgpu_device *adev)
 
 	ret = drm_atomic_helper_resume(ddev, adev->dm.cached_state);
 
-	drm_atomic_state_put(adev->dm.cached_state);
 	adev->dm.cached_state = NULL;
 
 	amdgpu_dm_irq_resume_late(adev);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread
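
Leo's patch removes the driver's own drm_atomic_state_put() because, according to the commit message, drm_atomic_helper_resume() now drops that reference itself. The toy model below sketches why the extra put turns into a use-after-free. It uses a plain counter instead of the kernel's kref, and all toy_* names are hypothetical.

```c
#include <stdbool.h>

struct toy_state {
	int refcount;
	bool released; /* set when the last reference is dropped */
	int bad_puts;  /* puts that arrived after release */
};

static void toy_state_put(struct toy_state *s)
{
	if (s->released) {
		s->bad_puts++; /* real code would touch freed memory here */
		return;
	}
	if (--s->refcount == 0)
		s->released = true;
}

/* Models a resume helper that consumes the caller's reference. */
static int toy_helper_resume(struct toy_state *s)
{
	/* ... restore the cached state ... */
	toy_state_put(s);
	return 0;
}

/* Old flow: the helper puts, then the caller puts again. */
static int toy_resume_old(struct toy_state *s)
{
	int ret = toy_helper_resume(s);

	toy_state_put(s); /* second put on an already released object */
	return ret;
}

/* Patched flow: only the helper puts. */
static int toy_resume_fixed(struct toy_state *s)
{
	return toy_helper_resume(s);
}
```

Starting from a refcount of 1, toy_resume_old() ends with bad_puts == 1, while toy_resume_fixed() releases the state exactly once and leaves bad_puts at 0. In the real driver the caller then only clears its cached pointer, as the patch does with adev->dm.cached_state = NULL.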

* Re: Kernel crash/Null pointer dereference on vblank
       [not found]                     ` <eb1a453e-0705-ede7-bedd-dbc8668cd71d-5C7GfCeVMHo@public.gmane.org>
@ 2017-11-23 17:28                       ` Johannes Hirte
  0 siblings, 0 replies; 10+ messages in thread
From: Johannes Hirte @ 2017-11-23 17:28 UTC (permalink / raw)
  To: Leo Li
  Cc: Chunming Zhou, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Martin Babutzka

On 2017 Nov 23, Leo Li wrote:
> Hi Johannes,
> 
> The s3 resume issue looks to be a problem with amdgpu/display. Could you 
> give the attached patch a try?
> 
> Thanks,
> Leo
> 
> On 2017-11-23 07:27 AM, Johannes Hirte wrote:
> > On 2017 Nov 23, Chunming Zhou wrote:
> >> See the attached emails: both patches fix the same issue, and either one
> >> should fix yours. Your call trace is the same as the one in the second email.
> >>
> >> The first patch should already have been pushed some time ago; could you
> >> check whether it is in your branch?
> >>
> > 
> > This patch (series) is not upstream yet. Just tested it, but this doesn't fix the
> > use-after-free on S3 resume with dc enabled.
> > 

> From 8656ef112d53f8c08f6571dd0d093f03d2e6cc30 Mon Sep 17 00:00:00 2001
> From: "Leo (Sunpeng) Li" <sunpeng.li@amd.com>
> Date: Thu, 16 Nov 2017 15:17:27 -0500
> Subject: [PATCH] drm/amdgpu/display: Do not put drm_atomic_state on resume
> 
> drm_atomic_helper_resume now puts it for us. See relevant patch here:
> https://lists.freedesktop.org/archives/dri-devel/2017-October/154268.html
> 
> Change-Id: Ief246492f721a1cf281d48e9d1a7029e5cefc2da
> Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 5731167..951ea77 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -688,7 +688,6 @@ int amdgpu_dm_display_resume(struct amdgpu_device *adev)
>  
>  	ret = drm_atomic_helper_resume(ddev, adev->dm.cached_state);
>  
> -	drm_atomic_state_put(adev->dm.cached_state);
>  	adev->dm.cached_state = NULL;
>  
>  	amdgpu_dm_irq_resume_late(adev);
> -- 
> 2.7.4
> 

Looks good. With this patch the use-after-free is gone and S3 resume works as
expected.

You can add my Tested-by.

-- 
Regards,
  Johannes
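
The effect of the extra put can be illustrated with a minimal user-space
sketch. It assumes, as the patch description says, that the resume helper
now drops the caller's reference itself. struct state, state_init(),
state_put(), helper_resume() and the two resume_*() functions are
hypothetical stand-ins for illustration, not the DRM API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted object such as the
 * cached atomic state. "freed" is recorded instead of calling free()
 * so that the sketch can inspect the object afterwards. */
struct state {
	int refcount;
	bool freed;
};

void state_init(struct state *s)
{
	assert(s != NULL);
	s->refcount = 1;	/* the creator holds the only reference */
	s->freed = false;
}

/* Drop one reference. Returns false when called on an object that was
 * already released, which would be a use-after-free in real code. */
bool state_put(struct state *s)
{
	assert(s != NULL);
	if (s->freed)
		return false;
	if (--s->refcount == 0)
		s->freed = true;
	return true;
}

/* Models a resume helper that consumes the caller's reference. */
void helper_resume(struct state *s)
{
	(void)state_put(s);
}

/* Caller before the patch: puts again after the helper already did. */
bool resume_with_extra_put(struct state *s)
{
	helper_resume(s);
	return state_put(s);	/* false: touches a released object */
}

/* Caller after the patch: relies on the helper's put only. */
bool resume_without_extra_put(struct state *s)
{
	helper_resume(s);
	return s->freed && s->refcount == 0;	/* released exactly once */
}
```

In this model resume_with_extra_put() reports a put on an already
released object, while resume_without_extra_put() leaves the count
balanced, which matches the one-line removal in the patch.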


* Kernel crash/Null pointer dereference on vblank
@ 2017-11-19 21:54 Martin Babutzka
  0 siblings, 0 replies; 10+ messages in thread
From: Martin Babutzka @ 2017-11-19 21:54 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

[-- Attachment #1: Type: text/plain, Size: 1395 bytes --]

Dear AMD Developers,

First of all, congratulations on the DC code submission to the 4.15 kernel.
Unfortunately, the major regression which I reported on 29.09., 06.10.,
02.11. and 05.11. still exists. This time I have additional debugging
information; maybe it helps to fix it.

Summary: I am running Xubuntu 17.10 with the amd-staging-drm-next
kernel patched to 4.14.0. The latest build which I tested includes all
commits up to now (including commit
85d09ce5e5039644487e9508d6359f9f4cf64427 from 2017-11-17 19:51:57 (GMT)).

Some vblank operations make the kernel crash and hang the whole
system. The error is reproducible by enabling the screen lock or the
suspend mode. The system cannot return to a proper state from either of
these (although I am not 100% sure it is the same error in both cases).
Debugging is easier with the screen lock. Attached you can find the
kernel crash log and the dce110_vblank_set function instrumented with
some kernel prints. It looks like the function is called twice and does
not work the second time. The code around dce110_vblank_set also looks
interrupt-driven. Could this be a race condition or a timing problem,
with objects being freed from memory and then accessed by
dce110_vblank_set?

Bug reports on this issue:
https://github.com/M-Bab/linux-kernel-amdgpu-binaries/issues/37
https://github.com/M-Bab/linux-kernel-amdgpu-binaries/issues/29

Many regards,
Martin (M-bab)

[-- Attachment #2: crash_vblank.txt --]
[-- Type: text/plain, Size: 9468 bytes --]

bool dce110_vblank_set(
                struct irq_service *irq_service,
                const struct irq_source_info *info,
                bool enable)
{
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
        struct dc_context *dc_ctx = irq_service->ctx;
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
        struct dc *core_dc = irq_service->ctx->dc;
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
        enum dc_irq_source dal_irq_src = dc_interrupt_to_irq_source(
                                                                                irq_service->ctx->dc,
                                                                                info->src_id,
                                                                                info->ext_id);
        uint8_t pipe_offset = dal_irq_src - IRQ_TYPE_VBLANK;
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);

        struct timing_generator *tg =
                        core_dc->current_state->res_ctx.pipe_ctx[pipe_offset].stream_res.tg;
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);

        if (enable) {
                if (!tg->funcs->arm_vert_intr(tg, 2)) {
                        DC_ERROR("Failed to get VBLANK!\n");
                        return false;
                }
        }
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);

        dal_irq_service_set_generic(irq_service, info, enable);
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
        return true;

}


"normal" vblank during boot:
Nov 19 22:33:10 Main-PC kernel: [   17.605100] DEBUG: Passed dce110_vblank_set 208 
Nov 19 22:33:10 Main-PC kernel: [   17.605102] DEBUG: Passed dce110_vblank_set 210 
Nov 19 22:33:10 Main-PC kernel: [   17.605103] DEBUG: Passed dce110_vblank_set 212 
Nov 19 22:33:10 Main-PC kernel: [   17.605104] DEBUG: Passed dce110_vblank_set 218 
Nov 19 22:33:10 Main-PC kernel: [   17.605104] DEBUG: Passed dce110_vblank_set 222 
Nov 19 22:33:10 Main-PC kernel: [   17.605108] DEBUG: Passed dce110_vblank_set 230 
Nov 19 22:33:10 Main-PC kernel: [   17.605110] DEBUG: Passed dce110_vblank_set 233 

vblank on screen lock in kernel.log/syslog:
Nov 19 22:34:10 Main-PC kernel: [   78.664890] DEBUG: Passed dce110_vblank_set 208 
Nov 19 22:34:10 Main-PC kernel: [   78.664892] DEBUG: Passed dce110_vblank_set 210 
Nov 19 22:34:10 Main-PC kernel: [   78.664893] DEBUG: Passed dce110_vblank_set 212 
Nov 19 22:34:10 Main-PC kernel: [   78.664894] DEBUG: Passed dce110_vblank_set 218 
Nov 19 22:34:10 Main-PC kernel: [   78.664894] DEBUG: Passed dce110_vblank_set 222 
Nov 19 22:34:10 Main-PC kernel: [   78.664895] DEBUG: Passed dce110_vblank_set 230 
Nov 19 22:34:10 Main-PC kernel: [   78.664896] DEBUG: Passed dce110_vblank_set 233 
Nov 19 22:34:27 Main-PC kernel: [   96.113426] DEBUG: Passed dce110_vblank_set 208 
Nov 19 22:34:27 Main-PC kernel: [   96.113433] DEBUG: Passed dce110_vblank_set 210 
Nov 19 22:34:27 Main-PC kernel: [   96.113435] DEBUG: Passed dce110_vblank_set 212 
Nov 19 22:34:27 Main-PC kernel: [   96.113438] DEBUG: Passed dce110_vblank_set 218 
Nov 19 22:34:27 Main-PC kernel: [   96.113440] DEBUG: Passed dce110_vblank_set 222 
Nov 19 22:34:27 Main-PC kernel: [   96.113448] BUG: unable to handle kernel NULL pointer dereference at           (null)
Nov 19 22:34:27 Main-PC kernel: [   96.113521] IP: dce110_vblank_set+0xe2/0x160 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.113524] PGD 0 P4D 0 
Nov 19 22:34:27 Main-PC kernel: [   96.113531] Oops: 0000 [#1] SMP
Nov 19 22:34:27 Main-PC kernel: [   96.113535] Modules linked in: rfcomm bnep binfmt_misc snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_pcm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_seq_midi pcbc dm_crypt snd_seq_midi_event aesni_intel snd_rawmidi aes_x86_64 crypto_simd glue_helper snd_seq cryptd snd_seq_device snd_timer intel_cstate intel_rapl_perf snd btusb serio_raw joydev input_leds soundcore btrtl hci_uart mei_me shpchp btbcm mei serdev btqca btintel bluetooth ecdh_generic intel_lpss_acpi intel_lpss acpi_als mac_hid kfifo_buf acpi_pad tpm_infineon industrialio parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic uas usb_storage usbhid amdkfd amd_iommu_v2
Nov 19 22:34:27 Main-PC kernel: [   96.113614]  amdgpu chash i2c_algo_bit ttm drm_kms_helper e1000e syscopyarea sysfillrect sysimgblt fb_sys_fops ptp r8169 pps_core drm ahci mii libahci wmi pinctrl_sunrisepoint video i2c_hid pinctrl_intel hid
Nov 19 22:34:27 Main-PC kernel: [   96.113643] CPU: 2 PID: 1462 Comm: xfwm4 Not tainted 4.14.0+ #3
Nov 19 22:34:27 Main-PC kernel: [   96.113645] Hardware name: Gigabyte Technology Co., Ltd. B250-HD3P/B250-HD3P-CF, BIOS F3 12/07/2016
Nov 19 22:34:27 Main-PC kernel: [   96.113649] task: ffff998d53040000 task.stack: ffffa59103150000
Nov 19 22:34:27 Main-PC kernel: [   96.113710] RIP: 0010:dce110_vblank_set+0xe2/0x160 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.113713] RSP: 0018:ffffa59103153b28 EFLAGS: 00010002
Nov 19 22:34:27 Main-PC kernel: [   96.113717] RAX: 0000000000000024 RBX: ffff998d5c3d4300 RCX: 0000000000000006
Nov 19 22:34:27 Main-PC kernel: [   96.113720] RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffff998d6ec8dc90
Nov 19 22:34:27 Main-PC kernel: [   96.113723] RBP: ffffa59103153b58 R08: 0000000000000000 R09: 00000000000003ff
Nov 19 22:34:27 Main-PC kernel: [   96.113726] R10: 00007ffebd2bebc0 R11: ffffffffa354feed R12: ffffffffc052b3e0
Nov 19 22:34:27 Main-PC kernel: [   96.113728] R13: 0000000000000001 R14: ffff998d51695100 R15: 0000000000000000
Nov 19 22:34:27 Main-PC kernel: [   96.113732] FS:  00007f4e2f002a80(0000) GS:ffff998d6ec80000(0000) knlGS:0000000000000000
Nov 19 22:34:27 Main-PC kernel: [   96.113735] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 19 22:34:27 Main-PC kernel: [   96.113738] CR2: 0000000000000000 CR3: 00000004181e5001 CR4: 00000000003606e0
Nov 19 22:34:27 Main-PC kernel: [   96.113741] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 19 22:34:27 Main-PC kernel: [   96.113744] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Nov 19 22:34:27 Main-PC kernel: [   96.113746] Call Trace:
Nov 19 22:34:27 Main-PC kernel: [   96.113807]  dal_irq_service_set+0x49/0x90 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.113863]  dc_interrupt_set+0x24/0x30 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.113933]  amdgpu_dm_set_crtc_irq_state+0x35/0x60 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.113989]  amdgpu_irq_update+0x58/0xa0 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.114041]  amdgpu_irq_get+0x49/0x60 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.114076]  amdgpu_enable_vblank_kms+0x27/0x30 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.114091]  drm_vblank_enable+0x84/0x100 [drm]
Nov 19 22:34:27 Main-PC kernel: [   96.114104]  drm_vblank_get+0x92/0xb0 [drm]
Nov 19 22:34:27 Main-PC kernel: [   96.114116]  drm_wait_vblank_ioctl+0xb4/0x580 [drm]
Nov 19 22:34:27 Main-PC kernel: [   96.114123]  ? unix_stream_recvmsg+0x51/0x70
Nov 19 22:34:27 Main-PC kernel: [   96.114127]  ? __unix_insert_socket+0x40/0x40
Nov 19 22:34:27 Main-PC kernel: [   96.114140]  ? drm_legacy_modeset_ctl_ioctl+0x100/0x100 [drm]
Nov 19 22:34:27 Main-PC kernel: [   96.114152]  drm_ioctl_kernel+0x5d/0xb0 [drm]
Nov 19 22:34:27 Main-PC kernel: [   96.114163]  drm_ioctl+0x31b/0x3d0 [drm]
Nov 19 22:34:27 Main-PC kernel: [   96.114174]  ? drm_legacy_modeset_ctl_ioctl+0x100/0x100 [drm]
Nov 19 22:34:27 Main-PC kernel: [   96.114180]  ? do_iter_write+0xe1/0x190
Nov 19 22:34:27 Main-PC kernel: [   96.114215]  amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
Nov 19 22:34:27 Main-PC kernel: [   96.114222]  do_vfs_ioctl+0xa5/0x610
Nov 19 22:34:27 Main-PC kernel: [   96.114227]  ? __sys_recvmsg+0x51/0x90
Nov 19 22:34:27 Main-PC kernel: [   96.114231]  ? __sys_recvmsg+0x51/0x90
Nov 19 22:34:27 Main-PC kernel: [   96.114237]  SyS_ioctl+0x79/0x90
Nov 19 22:34:27 Main-PC kernel: [   96.114243]  entry_SYSCALL_64_fastpath+0x1e/0xa9
Nov 19 22:34:27 Main-PC kernel: [   96.114247] RIP: 0033:0x7f4e2b64dea7
Nov 19 22:34:27 Main-PC kernel: [   96.114250] RSP: 002b:00007ffebd2bec08 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Nov 19 22:34:27 Main-PC kernel: [   96.114254] RAX: ffffffffffffffda RBX: 0000562e1f5938c0 RCX: 00007f4e2b64dea7
Nov 19 22:34:27 Main-PC kernel: [   96.114257] RDX: 00007ffebd2bec80 RSI: 00000000c018643a RDI: 0000000000000006
Nov 19 22:34:27 Main-PC kernel: [   96.114259] RBP: 0000562e1f620ce0 R08: 00000000006001e5 R09: 0000000000000000
Nov 19 22:34:27 Main-PC kernel: [   96.114262] R10: 00007ffebd2bebc0 R11: 0000000000000246 R12: 0000000000000000
Nov 19 22:34:27 Main-PC kernel: [   96.114264] R13: 0000000000000007 R14: 0000000000000007 R15: 0000562e1f5938c0
Nov 19 22:34:27 Main-PC kernel: [   96.114268] Code: 48 89 d0 48 c1 e0 05 48 01 d0 ba de 00 00 00 48 c1 e0 05 49 03 87 30 01 00 00 4c 8b b8 78 02 00 00 e8 c4 c2 04 e2 45 84 ed 74 38 <49> 8b 07 be 02 00 00 00 4c 89 ff ff 90 e0 00 00 00 84 c0 75 23 
Nov 19 22:34:27 Main-PC kernel: [   96.114392] RIP: dce110_vblank_set+0xe2/0x160 [amdgpu] RSP: ffffa59103153b28
Nov 19 22:34:27 Main-PC kernel: [   96.114394] CR2: 0000000000000000
Nov 19 22:34:27 Main-PC kernel: [   96.114399] ---[ end trace 4160248d2f91cb42 ]---
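
Reading the oops above: the last debug print reached is line 222, which
follows the tg assignment, and the faulting bytes <49> 8b 07 decode to a
load through R15, which is 0 in the register dump. That is consistent
with tg being NULL when tg->funcs is read. The following is a minimal
user-space sketch of a defensive check at that point, not the actual
driver fix. The struct layouts, MAX_PIPES, vblank_set_checked() and
arm_ok() are hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PIPES 6	/* hypothetical pipe count for this sketch */

struct timing_generator;

struct timing_generator_funcs {
	bool (*arm_vert_intr)(struct timing_generator *tg, unsigned char width);
};

struct timing_generator {
	const struct timing_generator_funcs *funcs;
};

/* Simplified stand-ins for the per-pipe context and the current state. */
struct pipe_ctx {
	struct timing_generator *tg;
};

struct dc_state {
	struct pipe_ctx pipe_ctx[MAX_PIPES];
};

/* Same control flow as the instrumented function, but every pointer on
 * the path to arm_vert_intr is checked before it is dereferenced. */
bool vblank_set_checked(const struct dc_state *state,
			unsigned int pipe_offset, bool enable)
{
	struct timing_generator *tg;

	assert(state != NULL);
	if (pipe_offset >= MAX_PIPES)
		return false;

	tg = state->pipe_ctx[pipe_offset].tg;

	if (enable) {
		/* A pipe without a timing generator cannot arm the interrupt. */
		if (tg == NULL || tg->funcs == NULL ||
		    tg->funcs->arm_vert_intr == NULL)
			return false;
		if (!tg->funcs->arm_vert_intr(tg, 2))
			return false;
	}
	return true;
}

/* Hypothetical hook that always succeeds for a non-zero width. */
bool arm_ok(struct timing_generator *tg, unsigned char width)
{
	(void)tg;
	return width > 0;
}
```

A check like this would only turn the crash into a failed vblank enable;
why the timing generator pointer is NULL for that pipe after the screen
lock would still need to be found.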



end of thread, other threads:[~2017-11-23 17:28 UTC | newest]

Thread overview: 10+ messages
2017-11-22  7:06 Kernel crash/Null pointer dereference on vblank Martin Babutzka
     [not found] ` <1228839336.50367.1511334398694-NM1PAhYDU/Oo9dU1Uvar1Q@public.gmane.org>
2017-11-22 15:07   ` Johannes Hirte
2017-11-22 22:31     ` Johannes Hirte
2017-11-23  2:18       ` Chunming Zhou
     [not found]         ` <58b24f03-b71b-4208-4cb7-4706ba947dea-5C7GfCeVMHo@public.gmane.org>
2017-11-23  8:31           ` Johannes Hirte
2017-11-23 10:07             ` Chunming Zhou
     [not found]               ` <56b938ac-4597-a354-0f4a-0c2625b10c5a-5C7GfCeVMHo@public.gmane.org>
2017-11-23 12:27                 ` Johannes Hirte
2017-11-23 15:17                   ` Leo Li
     [not found]                     ` <eb1a453e-0705-ede7-bedd-dbc8668cd71d-5C7GfCeVMHo@public.gmane.org>
2017-11-23 17:28                       ` Johannes Hirte
  -- strict thread matches above, loose matches on Subject: below --
2017-11-19 21:54 Martin Babutzka
