From: Harvey <harv@gmx.de>
To: amd-gfx@lists.freedesktop.org
Subject: Re: Amdgpu kernel oops and freezing on system suspend and hibernate
Date: Tue, 23 Mar 2021 16:26:50 +0100
Message-ID: <ef1b7ff1-b926-8651-3ff6-9c05bd8b234f@gmx.de>
In-Reply-To: <CADnq5_Oe6PHz5rQ9u5T2M3ZKhWE+fuj5CD2ngvXRiZFeZprS=Q@mail.gmail.com>


Alex,

Thanks for the hint, but...

Is this patch intended for kernel 5.11.8?

I applied the patch to 5.11.8 and the system freezes again:


Mär 23 16:18:51 obelix kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Mär 23 16:18:51 obelix kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Mär 23 16:18:51 obelix kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=615, emitted seq=617
Mär 23 16:18:51 obelix kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
Mär 23 16:18:51 obelix kernel: amdgpu 0000:03:00.0: amdgpu: GPU reset begin!
Mär 23 16:18:51 obelix kernel: BUG: kernel NULL pointer dereference, address: 0000000000000029
Mär 23 16:18:51 obelix kernel: #PF: supervisor read access in kernel mode
Mär 23 16:18:51 obelix kernel: #PF: error_code(0x0000) - not-present page
Mär 23 16:18:51 obelix kernel: PGD 0 P4D 0
Mär 23 16:18:51 obelix kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Mär 23 16:18:51 obelix kernel: CPU: 12 PID: 178 Comm: kworker/12:1 Not tainted 5.11.8-arch1-1-custom #1
Mär 23 16:18:51 obelix kernel: Hardware name: Micro-Star International Co., Ltd. Bravo 17 A4DDR/MS-17FK, BIOS E17FKAMS.117 10/29/2020
Mär 23 16:18:51 obelix kernel: Workqueue: events drm_sched_job_timedout [gpu_sched]
Mär 23 16:18:51 obelix kernel: RIP: 0010:kernel_queue_uninit+0xd/0xf0 [amdgpu]
Mär 23 16:18:51 obelix kernel: Code: ee 48 89 c7 e8 a4 f9 ff ff 84 c0 0f 84 e3 d3 1f 00 4c 89 e0 5d 41 5c 41 5d c3 0f 1f 00 0f 1f 44 00 00 55 48 8b 47 10 48 89 fd <8b> 50 28 83 fa 02 74 78 83 fa 03 0f 84 b1 00 00 00 48 8b 7f 08 4c
Mär 23 16:18:51 obelix kernel: RSP: 0018:ffffa35d806dfd40 EFLAGS: 00010246
Mär 23 16:18:51 obelix kernel: RAX: 0000000000000001 RBX: ffff8b044c5ee000 RCX: 000000000080005b
Mär 23 16:18:51 obelix kernel: RDX: 000000000080005c RSI: 0000000000000001 RDI: ffff8b044a877bc0
Mär 23 16:18:51 obelix kernel: RBP: ffff8b044a877bc0 R08: 0000000000000001 R09: 0000000000000000
Mär 23 16:18:51 obelix kernel: R10: 0000000000000000 R11: ffffffffafccba00 R12: ffff8b044c5ee0d0
Mär 23 16:18:51 obelix kernel: R13: ffff8b044bf60000 R14: ffff8b04414a1000 R15: ffff8b04414a10c8
Mär 23 16:18:51 obelix kernel: FS:  0000000000000000(0000) GS:ffff8b075f900000(0000) knlGS:0000000000000000
Mär 23 16:18:51 obelix kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mär 23 16:18:51 obelix kernel: CR2: 0000000000000029 CR3: 00000001ab010000 CR4: 0000000000350ee0
Mär 23 16:18:51 obelix kernel: Call Trace:
Mär 23 16:18:51 obelix kernel:  stop_cpsch+0xa0/0xc0 [amdgpu]
Mär 23 16:18:51 obelix kernel:  kgd2kfd_suspend.part.0+0x2f/0x40 [amdgpu]
Mär 23 16:18:51 obelix kernel:  kgd2kfd_pre_reset+0x3f/0x50 [amdgpu]
Mär 23 16:18:51 obelix kernel:  amdgpu_device_gpu_recover.cold+0x36e/0x95d [amdgpu]
Mär 23 16:18:51 obelix kernel:  amdgpu_job_timedout+0x121/0x140 [amdgpu]
Mär 23 16:18:51 obelix kernel:  drm_sched_job_timedout+0x64/0xe0 [gpu_sched]
Mär 23 16:18:51 obelix kernel:  process_one_work+0x214/0x3e0
Mär 23 16:18:51 obelix kernel:  worker_thread+0x4d/0x3d0
Mär 23 16:18:51 obelix kernel:  ? rescuer_thread+0x3c0/0x3c0
Mär 23 16:18:51 obelix kernel:  kthread+0x133/0x150
Mär 23 16:18:51 obelix kernel:  ? __kthread_bind_mask+0x60/0x60
Mär 23 16:18:51 obelix kernel:  ret_from_fork+0x22/0x30
Mär 23 16:18:51 obelix kernel: Modules linked in: rfcomm snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi cmac algif_hash snd_hda_intel algif_skcipher snd_intel_dspcfg soundwire_intel af_alg soundwire_ge>
Mär 23 16:18:51 obelix kernel:  sr_mod cdrom uas usb_storage dm_crypt cbc encrypted_keys dm_mod trusted tpm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper serio_raw ccp xhc>
Mär 23 16:18:51 obelix kernel: CR2: 0000000000000029
Mär 23 16:18:51 obelix kernel: ---[ end trace 8a72c5e07cbe6b63 ]---
Mär 23 16:18:51 obelix kernel: RIP: 0010:kernel_queue_uninit+0xd/0xf0 [amdgpu]
Mär 23 16:18:51 obelix kernel: Code: ee 48 89 c7 e8 a4 f9 ff ff 84 c0 0f 84 e3 d3 1f 00 4c 89 e0 5d 41 5c 41 5d c3 0f 1f 00 0f 1f 44 00 00 55 48 8b 47 10 48 89 fd <8b> 50 28 83 fa 02 74 78 83 fa 03 0f 84 b1 00 00 00 48 8b 7f 08 4c
Mär 23 16:18:51 obelix kernel: RSP: 0018:ffffa35d806dfd40 EFLAGS: 00010246
Mär 23 16:18:51 obelix kernel: RAX: 0000000000000001 RBX: ffff8b044c5ee000 RCX: 000000000080005b
Mär 23 16:18:51 obelix kernel: RDX: 000000000080005c RSI: 0000000000000001 RDI: ffff8b044a877bc0
Mär 23 16:18:51 obelix kernel: RBP: ffff8b044a877bc0 R08: 0000000000000001 R09: 0000000000000000
Mär 23 16:18:51 obelix kernel: R10: 0000000000000000 R11: ffffffffafccba00 R12: ffff8b044c5ee0d0
Mär 23 16:18:51 obelix kernel: R13: ffff8b044bf60000 R14: ffff8b04414a1000 R15: ffff8b04414a10c8
Mär 23 16:18:51 obelix kernel: FS:  0000000000000000(0000) GS:ffff8b075f900000(0000) knlGS:0000000000000000
Mär 23 16:18:51 obelix kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mär 23 16:18:51 obelix kernel: CR2: 0000000000000029 CR3: 0000000105594000 CR4: 0000000000350ee0
Mär 23 16:19:10 obelix systemd[1]: systemd-hostnamed.service: Deactivated successfully.
Mär 23 16:19:10 obelix audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mär 23 16:19:10 obelix kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
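
For anyone reading the oops above, here is a small cross-check of where the fault address 0x29 comes from. It is only a sketch and rests on assumptions: the instruction decode is a manual reading of the bytes after the `<8b>` marker in the Code: line (taken as `mov edx, [rax+0x28]`), the helper names are made up for illustration, and the excerpt holds only the lines needed for the arithmetic. It does not say which structure or field in kernel_queue_uninit() is involved.

```python
import re

# Excerpt of the oops, with the mail client's line wrapping undone.
OOPS = """\
BUG: kernel NULL pointer dereference, address: 0000000000000029
RIP: 0010:kernel_queue_uninit+0xd/0xf0 [amdgpu]
Code: 0f 1f 44 00 00 55 48 8b 47 10 48 89 fd <8b> 50 28 83 fa 02 74 78
RAX: 0000000000000001 RBX: ffff8b044c5ee000 RCX: 000000000080005b
RDX: 000000000080005c RSI: 0000000000000001 RDI: ffff8b044a877bc0
CR2: 0000000000000029 CR3: 00000001ab010000 CR4: 0000000000350ee0
"""

# x86-64 register numbering used by the ModRM r/m field (no REX.B prefix).
REG64 = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI"]


def parse_registers(text):
    """Collect every 'NAME: <16 hex digits>' pair as {name: int}."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def faulting_bytes(text):
    """Return the instruction bytes starting at the <..> marker."""
    code = re.search(r"Code: (.*)", text).group(1)
    tail = code.split("<", 1)[1].replace(">", "")
    return [int(byte, 16) for byte in tail.split()]


def fault_address(text):
    """Compute base register + disp8 for a 'mov r32, [reg+disp8]' load."""
    regs = parse_registers(text)
    opcode, modrm, disp = faulting_bytes(text)[:3]
    mod, rm = modrm >> 6, modrm & 7
    # Only the one encoding seen here is handled: opcode 0x8b, mod=01
    # (8-bit displacement), and no SIB byte (rm != 4).
    if opcode != 0x8B or mod != 1 or rm == 4:
        raise ValueError("encoding not handled by this sketch")
    if disp >= 0x80:  # disp8 is signed
        disp -= 0x100
    return regs[REG64[rm]] + disp


regs = parse_registers(OOPS)
address = fault_address(OOPS)
print(hex(address), address == regs["CR2"])  # prints: 0x29 True
```

Under this reading, the value loaded just before the fault (RAX = 1) is used as a pointer, and the read at offset 0x28 from it matches the reported CR2.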

Greetings
Harvey

On 22.03.21 at 20:22, Alex Deucher wrote:
> On Thu, Mar 18, 2021 at 8:19 AM Harvey <harv@gmx.de> wrote:
>>
>> Alex,
>>
>> I waited for kernel 5.11.7 to hit our repos yesterday evening and tested
>> again:
>>
>> 1. The suspend issue is gone - suspend and resume now work as expected.
>>
>> 2. System hibernation seems to be a different beast - still freezing
> 
> You need this patch:
> https://gitlab.freedesktop.org/agd5f/linux/-/commit/711c13547aad08f2cfe996e0cddc3d56f1233081
> 
> Alex
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> 

-- 
I am root. If you see me laughing, you'd better have a backup!



