From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 107065] "BUG: unable to handle kernel paging request at 0000000000002000" in amdgpu_vm_cpu_set_ptes at amdgpu_vm.c:921
Date: Mon, 02 Jul 2018 19:48:48 +0000
Message-ID: <bug-107065-502-Vh2Jx2iXbm@http.bugs.freedesktop.org/>
In-Reply-To: <bug-107065-502@http.bugs.freedesktop.org/>


https://bugs.freedesktop.org/show_bug.cgi?id=107065

--- Comment #12 from dwagner <jb5sgc1n.nya@20mm.eu> ---
(In reply to Andrey Grodzovsky from comment #10)
> Created attachment 140418 [details] [review]
> drm/amdgpu: Verify root PD is mapped into kernel address space.
> 
> dwagner, please try this patch. Fixes the issue for me and I observed no
> suspend/resume issues.

While I can start X11 with this patch applied to current amd-staging-drm-next,
attempts to resume from S3 fail consistently.

The following related output is emitted right before the suspend:

Jul 02 21:31:32 ryzen kernel: Freezing remaining freezable tasks ... (elapsed
0.000 seconds) done.
Jul 02 21:31:32 ryzen kernel: Suspending console(s) (use no_console_suspend to
debug)
Jul 02 21:31:32 ryzen kernel: sd 9:0:0:0: [sda] Synchronizing SCSI cache
Jul 02 21:31:32 ryzen kernel: [TTM] Buffer eviction failed
Jul 02 21:31:32 ryzen kernel: ACPI: Preparing to enter system sleep state S3
Jul 02 21:31:32 ryzen kernel: PM: Saving platform NVS memory
Jul 02 21:31:32 ryzen kernel: Disabling non-boot CPUs ...

(I wonder whether that "[TTM] Buffer eviction failed" is a bad sign, as I have
seen it at other times in conjunction with heavy use of the amdgpu driver.)


Then, upon resume, the following messages are emitted:

Jul 02 21:31:33 ryzen kernel: ACPI: Low-level resume complete
Jul 02 21:31:33 ryzen kernel: [drm] PCIE GART of 256M enabled (table at
0x000000F400300000).
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               failed to send message 146 ret is 0 
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               failed to send message 148 ret is 0 
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               failed to send message 145 ret is 0 
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               failed to send message 146 ret is 0 
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               failed to send message 189 ret is 0 
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               failed to send message 306 ret is 0 
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               failed to send message 5e ret is 0 
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               failed to send message 18a ret is 0 
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               failed to send message 145 ret is 0 
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               failed to send message 146 ret is 0 
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               failed to send message 148 ret is 0 
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               failed to send message 145 ret is 0 
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               last message was failed ret is 0
Jul 02 21:31:33 ryzen kernel: amdgpu: [powerplay] 
                               failed to send message 146 ret is 0 
Jul 02 21:31:33 ryzen kernel: [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR*
amdgpu: ring 0 test failed (scratch(0xC040)=0xC>
Jul 02 21:31:33 ryzen kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]]
*ERROR* resume of IP block <gfx_v8_0> failed -22
Jul 02 21:31:33 ryzen kernel: [drm:amdgpu_device_resume [amdgpu]] *ERROR*
amdgpu_device_ip_resume failed (-22).
Jul 02 21:31:33 ryzen kernel: dpm_run_callback(): pci_pm_resume+0x0/0xa0
returns -22
Jul 02 21:31:33 ryzen kernel: PM: Device 0000:0a:00.0 failed to resume async:
error -22
Jul 02 21:31:33 ryzen kernel: OOM killer enabled.
Jul 02 21:31:33 ryzen kernel: Restarting tasks ... done.
Jul 02 21:31:33 ryzen kernel: PM: suspend exit
Jul 02 21:31:33 ryzen kernel: BUG: unable to handle kernel paging request at
0000000000001000
Jul 02 21:31:33 ryzen kernel: PGD 0 P4D 0 
Jul 02 21:31:33 ryzen kernel: Oops: 0002 [#1] SMP
Jul 02 21:31:33 ryzen kernel: CPU: 14 PID: 791 Comm: amdgpu_cs:0 Tainted: G    
   W  O      4.18.0-rc1-amd+ #45
Jul 02 21:31:33 ryzen kernel: Hardware name: System manufacturer System Product
Name/PRIME X370-PRO, BIOS 4011 04/19/2018
Jul 02 21:31:33 ryzen kernel: RIP: 0010:gmc_v8_0_set_pte_pde+0x1b/0x30 [amdgpu]
Jul 02 21:31:33 ryzen kernel: Code: 80 d8 00 00 00 e9 25 78 60 e1 0f 1f 44 00
00 0f 1f 44 00 00 48 b8 00 f0 ff ff ff 00 00 0>
Jul 02 21:31:33 ryzen kernel: RSP: 0018:ffffc90003e73898 EFLAGS: 00010202
Jul 02 21:31:33 ryzen kernel: RAX: 000000fffffff000 RBX: 0000000000000001 RCX:
000000000fe004f1
Jul 02 21:31:33 ryzen kernel: RDX: 0000000000001000 RSI: 0000000000001000 RDI:
ffff8807e2f70000
Jul 02 21:31:33 ryzen kernel: RBP: 0000000000001000 R08: 00000000000004f1 R09:
0000000000001000
Jul 02 21:31:33 ryzen kernel: R10: ffffffffa03ac7e0 R11: ffff8807daf78000 R12:
0000000000001000
Jul 02 21:31:33 ryzen kernel: R13: 0000000000000200 R14: ffffc90003e73a18 R15:
000000000fe01000
Jul 02 21:31:33 ryzen kernel: FS:  00007f8b57266700(0000)
GS:ffff88081ef80000(0000) knlGS:0000000000000000
Jul 02 21:31:33 ryzen kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 02 21:31:33 ryzen kernel: CR2: 0000000000001000 CR3: 00000007dbbda000 CR4:
00000000003406e0
Jul 02 21:31:33 ryzen kernel: Call Trace:
Jul 02 21:31:33 ryzen kernel:  amdgpu_vm_cpu_set_ptes+0x76/0xe0 [amdgpu]
Jul 02 21:31:33 ryzen kernel:  amdgpu_vm_update_ptes+0x1d3/0x2e0 [amdgpu]
Jul 02 21:31:33 ryzen kernel:  amdgpu_vm_frag_ptes+0xae/0x130 [amdgpu]
Jul 02 21:31:33 ryzen kernel:  amdgpu_vm_bo_update_mapping+0xed/0x410 [amdgpu]
Jul 02 21:31:33 ryzen kernel:  ? amdgpu_vm_do_copy_ptes+0xa0/0xa0 [amdgpu]
Jul 02 21:31:33 ryzen kernel:  amdgpu_vm_bo_update+0x310/0x680 [amdgpu]
Jul 02 21:31:33 ryzen kernel:  amdgpu_cs_ioctl+0x1092/0x1a50 [amdgpu]
Jul 02 21:31:33 ryzen kernel:  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
Jul 02 21:31:33 ryzen kernel:  drm_ioctl_kernel+0xa7/0xf0 [drm]
Jul 02 21:31:33 ryzen kernel:  drm_ioctl+0x2f1/0x3c0 [drm]
Jul 02 21:31:33 ryzen kernel:  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
Jul 02 21:31:33 ryzen kernel:  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
Jul 02 21:31:33 ryzen kernel:  do_vfs_ioctl+0xa4/0x620
Jul 02 21:31:33 ryzen kernel:  ? __se_sys_futex+0x138/0x180
Jul 02 21:31:33 ryzen kernel:  ksys_ioctl+0x60/0x90
Jul 02 21:31:33 ryzen kernel:  __x64_sys_ioctl+0x16/0x20
Jul 02 21:31:33 ryzen kernel:  do_syscall_64+0x48/0xf0
Jul 02 21:31:33 ryzen kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jul 02 21:31:33 ryzen kernel: RIP: 0033:0x7f8b66c92667
Jul 02 21:31:33 ryzen kernel: Code: 00 00 90 48 8b 05 e9 67 2c 00 64 c7 00 26
00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 8>
Jul 02 21:31:33 ryzen kernel: RSP: 002b:00007f8b57265a98 EFLAGS: 00000246
ORIG_RAX: 0000000000000010
Jul 02 21:31:33 ryzen kernel: RAX: ffffffffffffffda RBX: 00007f8b57265b88 RCX:
00007f8b66c92667
Jul 02 21:31:33 ryzen kernel: RDX: 00007f8b57265b00 RSI: 00000000c0186444 RDI:
000000000000000b
Jul 02 21:31:33 ryzen kernel: RBP: 00007f8b57265b00 R08: 00007f8b57265bb0 R09:
0000000000000010
Jul 02 21:31:33 ryzen kernel: R10: 00007f8b57265bb0 R11: 0000000000000246 R12:
00000000c0186444
Jul 02 21:31:33 ryzen kernel: R13: 000000000000000b R14: 0000000000000002 R15:
0000000000000000
Jul 02 21:31:33 ryzen kernel: Modules linked in: it87(O) joydev mousedev
hid_generic hidp hid ipt_REJECT nf_reject_ipv4 nf_l>
Jul 02 21:31:33 ryzen kernel:  serio_raw crc32_pclmul atkbd ghash_clmulni_intel
libps2 pcbc ahci libahci xhci_pci libata aes>
Jul 02 21:31:33 ryzen kernel: CR2: 0000000000001000
Jul 02 21:31:33 ryzen kernel: ---[ end trace 517a8a72887251f0 ]---
Jul 02 21:31:33 ryzen kernel: RIP: 0010:gmc_v8_0_set_pte_pde+0x1b/0x30 [amdgpu]
Jul 02 21:31:33 ryzen kernel: Code: 80 d8 00 00 00 e9 25 78 60 e1 0f 1f 44 00
00 0f 1f 44 00 00 48 b8 00 f0 ff ff ff 00 00 0>
Jul 02 21:31:33 ryzen kernel: RSP: 0018:ffffc90003e73898 EFLAGS: 00010202
Jul 02 21:31:33 ryzen kernel: RAX: 000000fffffff000 RBX: 0000000000000001 RCX:
000000000fe004f1
Jul 02 21:31:33 ryzen kernel: RDX: 0000000000001000 RSI: 0000000000001000 RDI:
ffff8807e2f70000
Jul 02 21:31:33 ryzen kernel: RBP: 0000000000001000 R08: 00000000000004f1 R09:
0000000000001000
Jul 02 21:31:33 ryzen kernel: R10: ffffffffa03ac7e0 R11: ffff8807daf78000 R12:
0000000000001000
Jul 02 21:31:33 ryzen kernel: R13: 0000000000000200 R14: ffffc90003e73a18 R15:
000000000fe01000
Jul 02 21:31:33 ryzen kernel: FS:  00007f8b57266700(0000)
GS:ffff88081ef80000(0000) knlGS:0000000000000000
Jul 02 21:31:33 ryzen kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 02 21:31:33 ryzen kernel: CR2: 0000000000001000 CR3: 00000007dbbda000 CR4:
00000000003406e0

(At this point, the machine is just dead and does not react to anything.)

So something is still wrong at amdgpu_vm_cpu_set_ptes+0x76.
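
As a reading aid for the log above (not part of the original report), here is
a minimal sketch that decodes the two numeric codes it contains: the x86
page-fault error code from the "Oops: 0002" line and the -22 returned by the
resume path. The bit meanings are the standard x86 page-fault error-code bits;
the helper names decode_oops and errno_name are hypothetical and chosen only
for illustration.

```python
import errno

# Standard x86 page-fault error-code bits, as printed in "Oops: XXXX".
# Each entry: (bit mask, meaning if set, meaning if clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_oops(code_hex):
    """Return human-readable flags for a hex page-fault error code."""
    code = int(code_hex, 16)
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    return flags


def errno_name(ret):
    """Map a negative kernel return value such as -22 to its errno name."""
    return errno.errorcode.get(abs(ret), "unknown")


if __name__ == "__main__":
    print(decode_oops("0002"))
    print(errno_name(-22))
```

For this log, "Oops: 0002" decodes to a kernel-mode write to a page that is not
present, which fits the faulting address 0x1000 shown in CR2, and -22 is
EINVAL.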

