* [Bug 110887] 5.0 kernel crash , drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-2)
From: bugzilla-daemon @ 2019-06-11  7:07 UTC
  To: dri-devel



https://bugs.freedesktop.org/show_bug.cgi?id=110887

            Bug ID: 110887
           Summary: 5.0 kernel crash , drm:amdgpu_gem_va_ioctl [amdgpu]]
                    *ERROR* Couldn't update BO_VA (-2)
           Product: DRI
           Version: DRI git
          Hardware: ARM
                OS: Linux (All)
            Status: NEW
          Severity: critical
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel@lists.freedesktop.org
          Reporter: wormwang@yahoo.com

Env: kernel 5.0.13, AMD RX 580 GPU (8 GB)


We run about 32 game applications on the GPU concurrently, and at the
same time run a media encoder on VCE through VAAPI.

The kernel crashes after running for 3 to 7 days. We have hit this crash
5 times. kdump is enabled; if you need other kernel dump information, we
can upload it.


Log:

[172936.893428] binder_dkms: binder_deferred_func, binder_index = 12
[172937.052608] pci_generic_config_write32: 138 callbacks suppressed
[172937.052615] pci_bus 000d:30: 2-byte config write to 000d:30:00.0 
offset 0x4 may corrupt adjacent RW1C bits
[172937.052633] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 
offset 0x44 may corrupt adjacent RW1C bits
[172937.054110] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.062690] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.069361] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 
offset 0x78 may corrupt adjacent RW1C bits
[172937.071029] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 
offset 0x80 may corrupt adjacent RW1C bits
[172937.071034] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 
offset 0x8c may corrupt adjacent RW1C bits
[172937.071038] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 
offset 0x98 may corrupt adjacent RW1C bits
[172937.071042] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 
offset 0xa0 may corrupt adjacent RW1C bits
[172937.071083] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 
offset 0x44 may corrupt adjacent RW1C bits
[172937.071091] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.071094] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 
offset 0x44 may corrupt adjacent RW1C bits
[172937.071110] pci_bus 000c:20: 2-byte config write to 000c:20:00.0 
offset 0x4 may corrupt adjacent RW1C bits
[172937.079477] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.087723] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.095955] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.104270] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.112418] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.120490] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.128525] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.136557] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.144446] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.152254] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.160039] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.167779] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.175490] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.176747] binder_dkms: binder_defer_work 12
[172937.183035] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.183075] binder_dkms: binder_deferred_func, binder_index = 12
[172937.355641] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.362028] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.368253] pcieport 0004:48:00.0: can't derive routing for PCI INT A
[172937.368425] megaraid_sas 0004:49:00.0: PCI INT A: no GSI
[172937.368585] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.370828] amdgpu 0005:01:00.0: couldn't schedule ib on ring <gfx>
[172937.375070] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.375167] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.375190] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.381702] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.381732] amdgpu 0005:01:00.0: couldn't schedule ib on ring <gfx>
[172937.387959] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.387990] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.394171] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.394186] amdgpu 0005:01:00.0: couldn't schedule ib on ring <gfx>
[172937.400498] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.400523] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.406755] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.406775] amdgpu 0005:01:00.0: couldn't schedule ib on ring <gfx>
[172937.412977] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.412998] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.419250] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.419273] amdgpu 0005:01:00.0: couldn't schedule ib on ring <gfx>
[172937.422361] [drm] schedsdma0 is not ready, skipping
[172937.422363] [drm] schedsdma1 is not ready, skipping
[172937.422448] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[172937.425437] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.425450] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.431814] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.431837] amdgpu 0005:01:00.0: couldn't schedule ib on ring <gfx>
[172937.438038] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.438054] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.450635] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.454268] pcieport 0002:e8:00.0: can't derive routing for PCI INT B
[172937.454272] ixgbe 0002:e9:00.1: PCI INT B: no GSI
[172937.456923] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.456967] amdgpu 0005:01:00.0: couldn't schedule ib on ring <gfx>
[172937.463122] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.463148] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.469428] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.469455] amdgpu 0005:01:00.0: couldn't schedule ib on ring <gfx>
[172937.475705] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.475722] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.481928] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.481946] amdgpu 0005:01:00.0: couldn't schedule ib on ring <gfx>
[172937.488092] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.488108] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.494407] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.494440] amdgpu 0005:01:00.0: couldn't schedule ib on ring <gfx>
[172937.500663] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.500678] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.506909] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.506930] amdgpu 0005:01:00.0: couldn't schedule ib on ring <gfx>
[172937.511392] [drm] schedsdma0 is not ready, skipping
[172937.511394] [drm] schedsdma1 is not ready, skipping
[172937.511481] [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't 
update BO_VA (-2)
[172937.512346] Unable to handle kernel access to user memory outside 
uaccess routines at virtual address 0000000000000008
[172937.512348] Mem abort info:
[172937.512350]   ESR = 0x96000004
[172937.512352]   Exception class = DABT (current EL), IL = 32 bits
[172937.512353]   SET = 0, FnV = 0
[172937.512354]   EA = 0, S1PTW = 0
[172937.512355] Data abort info:
[172937.512356]   ISV = 0, ISS = 0x00000004
[172937.512357]   CM = 0, WnR = 0
[172937.512359] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000fb340bc6
[172937.512361] [0000000000000008] pgd=000000139dfe9003, 
pud=00000015d583a003, pmd=0000000000000000
[172937.512367] Internal error: Oops: 96000004 [#1] SMP
[172937.512370] Modules linked in: nfnetlink_log veth ipt_REJECT 
nf_reject_ipv4 xt_comment xt_mark xt_nat xt_tcpudp ipt_MASQUERADE 
nf_conntrack_netlink nfnetlink xfrm_user xt_conntrack br_netfilter 
bridge stp llc iptable_filter xt_addrtype iptable_nat nf_nat_ipv4 nf_nat 
bpfilter ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs nf_conntrack nf_defrag_ipv6 
nf_defrag_ipv4 overlay nls_iso8859_1 joydev input_leds snd_hda_intel 
snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer ipmi_ssif snd 
ipmi_si soundcore ipmi_devintf ipmi_msghandler tcp_bbr sch_fq ib_iser 
rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi 
scsi_transport_iscsi binder_dkms(OE) ip_tables x_tables autofs4 btrfs 
zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq 
async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath 
linear hibmc_drm hid_generic usbhid hid ses enclosure marvell aes_ce_blk 
aes_ce_cipher amdgpu chash i2c_algo_bit gpu_sched ttm crct10dif_ce 
drm_kms_helper ghash_ce syscopyarea sha2_ce
[172937.512432]  sysfillrect sysimgblt fb_sys_fops sha256_arm64 sha1_ce
ixgbe drm hisi_sas_v2_hw hisi_sas_main megaraid_sas xfrm_algo libsas 
mdio ehci_platform scsi_transport_sas hns_dsaf hns_enet_drv hns_mdio 
hnae aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64
[172937.512448] Process RenderThread (pid: 1569015, stack limit = 
0x00000000349701c4)
[172937.512451] CPU: 23 PID: 1569015 Comm: RenderThread Kdump: loaded 
Tainted: G           OE     5.0.13-1905061257-generic #appstream
[172937.512453] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.58 
10/24/2018
[172937.512454] pstate: 80400005 (Nzcv daif +PAN -UAO)
[172937.512531] pc : amdgpu_vm_bo_update_mapping+0x120/0x3a0 [amdgpu]
[172937.512603] lr : amdgpu_vm_bo_update+0x2a4/0x6b8 [amdgpu]
[172937.512604] sp : ffff0000c0e4b8c0
[172937.512605] x29: ffff0000c0e4b8c0 x28: ffff801fd3010000
[172937.512608] x27: 0000000000000001 x26: ffff80161d777000
[172937.512610] x25: 0000000000100d1f x24: 0000000000100d00
[172937.512612] x23: 0000000000000000 x22: ffff809533c54f00
[172937.512614] x21: 0000000000000037 x20: ffff0000116cc000
[172937.512616] x19: 000000000000000a x18: 0000000000000000
[172937.512618] x17: 0000000000000000 x16: 0000000000000000
[172937.512620] x15: 0000000000000000 x14: 00000003000000b0
[172937.512623] x13: 0000000600000240 x12: 0000000000000000
[172937.512624] x11: 000000060000018d x10: 0000000000000040
[172937.512627] x9 : 0000000000000000 x8 : ffff0000c0e4b860
[172937.512628] x7 : 0000000000000020 x6 : 000000000000001f
[172937.512631] x5 : 0000000000100d1f x4 : 0000000000100d00
[172937.512633] x3 : 0000000000000000 x2 : 000000000000000b
[172937.512636] x1 : 0000000000000000 x0 : ffff80161d776000
[172937.512638] Call trace:
[172937.512711]  amdgpu_vm_bo_update_mapping+0x120/0x3a0 [amdgpu]
[172937.512784]  amdgpu_vm_bo_update+0x2a4/0x6b8 [amdgpu]
[172937.512857]  amdgpu_cs_ioctl+0xcbc/0x14a8 [amdgpu]
[172937.512882]  drm_ioctl_kernel+0x90/0x100 [drm]
[172937.512904]  drm_ioctl+0x1ec/0x418 [drm]
[172937.512977]  amdgpu_drm_ioctl+0x58/0x90 [amdgpu]
[172937.513055] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.513136]  amdgpu_kms_compat_ioctl+0x40/0x68 [amdgpu]
[172937.513140] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.513220] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.513235] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.513311] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.513323] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.513412] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.513421] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.513497] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.513513] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.513587] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.513603] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.513677] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.513687] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.513761] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.522750] amdgpu 000d:31:00.0: couldn't schedule ib on ring <gfx>
[172937.522822] [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling 
IBs (-22)
[172937.525171]  __arm64_compat_sys_ioctl+0x144/0x410
[172937.525177]  el0_svc_common+0x78/0x120
[172937.525179]  el0_svc_compat_handler+0x30/0x40
[172937.525182]  el0_svc_compat+0x8/0x34
[172937.525187] Code: f9406b41 b966c793 f941e800 71002e7f (f9400421)
[172937.525195] SMP: stopping secondary CPUs
[172937.526945] Starting crashdump kernel...
[172937.526952] Bye!
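
As a reading aid for the log above, here is a minimal Python sketch that
decodes three kinds of numeric values appearing in it: the ESR value from
the "Mem abort info" block, the faulting instruction word shown in
parentheses on the "Code:" line, and the negative error codes in the
amdgpu messages. The helper names (decode_esr, decode_ldr_uimm64,
errno_name) are hypothetical and not part of any kernel tool. The bit
layouts are assumed to follow the Arm ESR_ELx register description and
the A64 "LDR (immediate, unsigned offset)" encoding, and the error codes
are assumed to be negated Linux errno values.

```python
import errno


def decode_esr(esr: int) -> dict:
    """Split an AArch64 ESR_ELx value into the fields the oops prints."""
    return {
        "ec": (esr >> 26) & 0x3F,   # exception class, bits [31:26]
        "il": (esr >> 25) & 0x1,    # instruction length, bit 25 (1 = 32-bit)
        "iss": esr & 0x1FFFFFF,     # instruction-specific syndrome, bits [24:0]
        "dfsc": esr & 0x3F,         # data fault status code, ISS bits [5:0]
        "wnr": (esr >> 6) & 0x1,    # write-not-read, ISS bit 6 (0 = read)
    }


def decode_ldr_uimm64(insn: int):
    """Decode a 64-bit LDR (immediate, unsigned offset) instruction word.

    Returns (rt, rn, byte_offset), or None if the word is not that encoding.
    """
    if insn & 0xFFC00000 != 0xF9400000:
        return None
    imm12 = (insn >> 10) & 0xFFF    # scaled immediate, bits [21:10]
    rn = (insn >> 5) & 0x1F         # base register number, bits [9:5]
    rt = insn & 0x1F                # destination register number, bits [4:0]
    return rt, rn, imm12 * 8        # 64-bit loads scale the immediate by 8


def errno_name(code: int) -> str:
    """Map a kernel-style negative error code to its errno symbol."""
    return errno.errorcode.get(abs(code), "unknown")


if __name__ == "__main__":
    print(decode_esr(0x96000004))         # ESR from "Mem abort info"
    print(decode_ldr_uimm64(0xF9400421))  # word in parentheses on "Code:"
    print(errno_name(-22), errno_name(-2))
```

For the values in this report, the sketch yields EC 0x25 (data abort
without a change in exception level), IL = 1, DFSC 0x04 and WnR = 0,
which agrees with the decoded fields the kernel printed. The instruction
word decodes to a load into x1 from [x1 + 8]; with x1 = 0 in the register
dump, that matches the faulting address 0x8 and is consistent with a NULL
pointer being dereferenced at offset 8, although the dump alone does not
show which structure was NULL. The error codes -22 and -2 map to EINVAL
and ENOENT.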

-- 
You are receiving this mail because:
You are the assignee for the bug.


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [Bug 110887] 5.0 kernel crash , drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-2)
From: bugzilla-daemon @ 2019-11-19  9:31 UTC
  To: dri-devel



https://bugs.freedesktop.org/show_bug.cgi?id=110887

Martin Peres <martin.peres@free.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED

--- Comment #1 from Martin Peres <martin.peres@free.fr> ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been
closed from further activity.

You can subscribe and participate further through the new bug through this link
to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/826.

