[Bug 198669] New: Driver crash at radeon_ring_backup+0xd3/0x140 [radeon]

From: bugzilla-daemon @ 2018-02-04 17:39 UTC
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=198669

            Bug ID: 198669
           Summary: Driver crash at radeon_ring_backup+0xd3/0x140 [radeon]
           Product: Drivers
           Version: 2.5
    Kernel Version: 4.13.0-32-generic x86_64
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@kernel-bugs.osdl.org
          Reporter: roger@beardandsandals.co.uk
        Regression: No

This is a resilience bug in the driver. When trying to recover from a GPU
stall, radeon_ring_backup causes a kernel paging fault.

[  488.507091] BUG: unable to handle kernel paging request at ffffb406c1891ffc
[  488.507176] IP: radeon_ring_backup+0xd3/0x140 [radeon]

The GPU stall is caused by a hardware problem triggered by vibration. N.B. This
bug is not about the hardware problem. It is about the driver's resilience when
trying to recover from it.

It is very similar to bug #62721, reported 4 years ago. However, this is
occurring with the driver in the 4.13.0 kernel.

Here is the dmesg output.

[  139.457873] rfkill: input handler disabled
[  468.102340] radeon 0000:02:00.0: ring 0 stalled for more than 10256msec
[  468.102346] radeon 0000:02:00.0: GPU lockup (current fence id
0x0000000000001bdb last fence id 0x0000000000001bdc on ring 0)

... Similar lines removed

[  487.558156] radeon 0000:02:00.0: ring 0 stalled for more than 29712msec
[  487.558161] radeon 0000:02:00.0: GPU lockup (current fence id
0x0000000000001bdb last fence id 0x0000000000001bdc on ring 0)
[  488.070157] radeon 0000:02:00.0: ring 0 stalled for more than 30224msec
[  488.070162] radeon 0000:02:00.0: GPU lockup (current fence id
0x0000000000001bdb last fence id 0x0000000000001bdc on ring 0)
[  488.507091] BUG: unable to handle kernel paging request at ffffb406c1891ffc
[  488.507176] IP: radeon_ring_backup+0xd3/0x140 [radeon]
[  488.507195] PGD 236d37067 
[  488.507196] P4D 236d37067 
[  488.507207] PUD 0 

[  488.507234] Oops: 0000 [#1] SMP PTI
[  488.507248] Modules linked in: rfcomm bnep bonding binfmt_misc btusb btrtl
btbcm btintel intel_powerclamp joydev coretemp kvm_intel kvm input_leds
bluetooth ecdh_generic arc4 ath9k ath9k_common ath9k_hw ath mac80211 irqbypass
snd_seq_midi snd_seq_midi_event intel_cstate snd_hda_codec_realtek
snd_hda_codec_generic snd_hda_codec_hdmi snd_rawmidi cfg80211 snd_hda_intel
snd_hda_codec snd_hda_core snd_hwdep serio_raw snd_pcm snd_seq snd_seq_device
snd_timer snd lpc_ich shpchp i7core_edac mac_hid i5500_temp soundcore
tpm_infineon asus_atk0110 nfsd auth_rpcgss nfs_acl lockd grace sunrpc
parport_pc ppdev lp parport ip_tables x_tables autofs4 amdkfd amd_iommu_v2
radeon i2c_algo_bit ttm drm_kms_helper hid_generic syscopyarea uas sysfillrect
usbhid sysimgblt firewire_ohci fb_sys_fops usb_storage pata_acpi hid
[  488.507500]  psmouse firewire_core r8169 drm crc_itu_t mii
[  488.507523] CPU: 7 PID: 2073 Comm: gnome-shell Tainted: G          I    
4.13.0-32-generic #35-Ubuntu
[  488.507554] Hardware name: System manufacturer System Product Name/P6T SE,
BIOS 0403    05/19/2009
[  488.507584] task: ffff9e0cb6191600 task.stack: ffffb402c3724000
[  488.507619] RIP: 0010:radeon_ring_backup+0xd3/0x140 [radeon]
[  488.507639] RSP: 0018:ffffb402c3727c00 EFLAGS: 00010246
[  488.507658] RAX: ffff9e0c6f300000 RBX: 0000000000037ba1 RCX:
0000000000000000
[  488.507682] RDX: 0000000000000000 RSI: ffffb406c1891ffc RDI:
00000000000dee84
[  488.507707] RBP: ffffb402c3727c28 R08: 00000000000269a8 R09:
00000000000b2c44
[  488.507731] R10: ffffdc5046bd0000 R11: ffff9e0cfffd1d00 R12:
ffffb402c3727c68
[  488.507756] R13: ffff9e0ceafc9538 R14: ffff9e0ceafc9558 R15:
00000000ffffffff
[  488.507780] FS:  00007fdc82a60ac0(0000) GS:ffff9e0cf73c0000(0000)
knlGS:0000000000000000
[  488.507808] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  488.507828] CR2: ffffb406c1891ffc CR3: 00000001f632a000 CR4:
00000000000006e0
[  488.507853] Call Trace:
[  488.507875]  radeon_gpu_reset+0xc0/0x330 [radeon]
[  488.507895]  ? dma_fence_wait_timeout+0x38/0xf0
[  488.507912]  ? reservation_object_wait_timeout_rcu+0x14f/0x2d0
[  488.507946]  radeon_gem_handle_lockup.part.4+0xe/0x20 [radeon]
[  488.507979]  radeon_gem_wait_idle_ioctl+0x9c/0x100 [radeon]
[  488.508012]  ? radeon_gem_busy_ioctl+0x80/0x80 [radeon]
[  488.508040]  drm_ioctl_kernel+0x5d/0xb0 [drm]
[  488.508063]  drm_ioctl+0x31b/0x3d0 [drm]
[  488.508091]  ? radeon_gem_busy_ioctl+0x80/0x80 [radeon]
[  488.508111]  ? futex_wake+0x8f/0x180
[  488.508134]  radeon_drm_ioctl+0x4f/0x90 [radeon]
[  488.508153]  do_vfs_ioctl+0xa5/0x610
[  488.509723]  ? entry_SYSCALL_64_after_hwframe+0x118/0x168
[  488.511292]  ? entry_SYSCALL_64_after_hwframe+0x111/0x168
[  488.512853]  ? entry_SYSCALL_64_after_hwframe+0x10a/0x168
[  488.514405]  ? entry_SYSCALL_64_after_hwframe+0x103/0x168
[  488.515956]  ? entry_SYSCALL_64_after_hwframe+0xfc/0x168
[  488.517499]  ? entry_SYSCALL_64_after_hwframe+0xf5/0x168
[  488.519040]  ? entry_SYSCALL_64_after_hwframe+0xee/0x168
[  488.520572]  ? entry_SYSCALL_64_after_hwframe+0xe7/0x168
[  488.522095]  ? entry_SYSCALL_64_after_hwframe+0xe0/0x168
[  488.523598]  SyS_ioctl+0x79/0x90
[  488.525088]  ? entry_SYSCALL_64_after_hwframe+0xa1/0x168
[  488.526579]  entry_SYSCALL_64_fastpath+0x33/0xa3
[  488.528065] RIP: 0033:0x7fdc7fb65ef7
[  488.529553] RSP: 002b:00007ffd23064578 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[  488.531081] RAX: ffffffffffffffda RBX: 00007ffd230645c0 RCX:
00007fdc7fb65ef7
[  488.532581] RDX: 00007ffd230645c0 RSI: 0000000040086464 RDI:
000000000000000c
[  488.534078] RBP: 00007ffd230645c0 R08: 0000000000000000 R09:
0000000800000000
[  488.535556] R10: 00007ffd230645d0 R11: 0000000000000246 R12:
0000000040086464
[  488.537031] R13: 000000000000000c R14: 00007ffd230646e8 R15:
000055d8a7b62200
[  488.538511] Code: 48 85 c0 49 89 04 24 74 62 8d 53 ff 48 8d 3c 95 04 00 00
00 31 d2 eb 04 49 8b 04 24 49 8b 76 08 41 8d 4f 01 45 89 ff 4a 8d 34 be <8b> 36
89 34 10 41 23 4e 54 48 83 c2 04 48 39 d7 41 89 cf 75 d8 
[  488.541900] RIP: radeon_ring_backup+0xd3/0x140 [radeon] RSP:
ffffb402c3727c00
[  488.543656] CR2: ffffb406c1891ffc
[  488.552481] ---[ end trace e6e07e03d7738a24 ]---
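
As a rough cross-check of the register dump above, the following Python sketch
redoes the address arithmetic by hand. It rests on assumptions that the log
itself does not confirm: that the faulting instruction is a 32-bit load through
RSI, that RSI was computed as a buffer base plus R15 * 4, and that RBX and RDI
hold the dword and byte counts of the copy. These readings come from decoding
the "Code:" bytes manually, so treat them as a guess, not as a statement about
the driver source. The helper name implied_ring_base is made up for this
example.

```python
# Register values copied from the oops above.
CR2 = 0xffffb406c1891ffc   # faulting address reported by the CPU
RSI = 0xffffb406c1891ffc   # address being loaded from
R15 = 0x00000000ffffffff   # ASSUMPTION: index used by the copy loop
RBX = 0x0000000000037ba1   # ASSUMPTION: number of dwords to copy
RDI = 0x00000000000dee84   # ASSUMPTION: number of bytes to copy

DWORD_BYTES = 4
PAGE_BYTES = 4096


def implied_ring_base(fault_addr: int, index: int) -> int:
    """Undo the assumed 'base + index * 4' address computation."""
    return fault_addr - index * DWORD_BYTES


base = implied_ring_base(RSI, R15)

print(f"fault address matches RSI: {CR2 == RSI}")
print(f"implied buffer base:       {base:#x}")
print(f"base is page aligned:      {base % PAGE_BYTES == 0}")
print(f"RDI == RBX * 4:            {RDI == RBX * DWORD_BYTES}")
print(f"fault offset from base:    {RSI - base:#x} bytes")
```

If these assumptions hold, the load lands about 16 GiB past a plausible,
page-aligned buffer, which would fit an index of 0xffffffff being used without
a bounds check or mask. An all-ones value is what a read from an unresponsive
device typically returns, but the log alone cannot confirm where the index came
from.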

Various versions of this crash seem to have been reported over the last few
years, but none of the reports has been successfully closed.

For further diagnostics see
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1746232

Roger
