From: Michal Kubecek <mkubecek@suse.cz>
To: amd-gfx@lists.freedesktop.org
Cc: "Christian König" <christian.koenig@amd.com>,
	"Felix Kuehling" <Felix.Kuehling@amd.com>,
	"Alex Deucher" <alexander.deucher@amd.com>,
	linux-kernel@vger.kernel.org
Subject: (REGRESSION bisected) Re: amdgpu errors (VM fault / GPU fault detected) with 5.19 merge window snapshots
Date: Fri, 27 May 2022 14:44:59 +0200
Message-ID: <20220527124459.mfo4tjdsjohamsvy@lion.mk-sys.cz>
In-Reply-To: <20220527090039.pdrazo5e6mwgo3d3@lion.mk-sys.cz>

[-- Attachment #1: Type: text/plain, Size: 2106 bytes --]

On Fri, May 27, 2022 at 11:00:39AM +0200, Michal Kubecek wrote:
> Hello,
>
> while testing 5.19 merge window snapshots (commits babf0bb978e3 and
> 7e284070abe5), I keep getting errors like below. I have not seen them
> with 5.18 final or older.
>
> ------------------------------------------------------------------------
> [ 247.150333] gmc_v8_0_process_interrupt: 46 callbacks suppressed
> [ 247.150336] amdgpu 0000:0c:00.0: amdgpu: GPU fault detected: 147 0x00020802 for process firefox pid 6101 thread firefox:cs0 pid 6116
> [ 247.150339] amdgpu 0000:0c:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00107800
> [ 247.150340] amdgpu 0000:0c:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0D008002
> [ 247.150341] amdgpu 0000:0c:00.0: amdgpu: VM fault (0x02, vmid 6, pasid 32780) at page 1079296, write from 'TC2' (0x54433200) (8) [...]
> [ 249.925909] amdgpu 0000:0c:00.0: amdgpu: IH ring buffer overflow (0x000844C0, 0x00004A00, 0x000044D0)
> [ 250.434986] [drm] Fence fallback timer expired on ring sdma0
> [ 466.621568] gmc_v8_0_process_interrupt: 122 callbacks suppressed [...]
> ------------------------------------------------------------------------
>
> There does not seem to be any apparent immediate problem with graphics
> but when running commit babf0bb978e3, there seemed to be a noticeable
> lag in some operations, e.g. when moving a window or repainting large
> part of the terminal window in konsole (no idea if it's related).
>
> My GPU is Radeon Pro WX 2100 (1002:6995). What other information should
> I collect to help debugging the issue?

Bisected to commit 5255e146c99a ("drm/amdgpu: rework TLB flushing").
There seem to be later commits depending on it so I did not test
a revert on top of current mainline.

I should also mention that most commits tested as "bad" during the
bisect did behave much worse than current mainline (errors starting as
early as with sddm, visibly damaged screen content, sometimes even
crashes). But all of them issued messages similar to those above into
kernel log.

Michal Kubecek

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
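
Editorial note on reading the quoted fault lines: the log itself pairs two
representations of the same values. The fault address register 0x00107800
is the hexadecimal form of "page 1079296", and the client id 0x54433200
spells 'TC2' when its four bytes are read as ASCII. The sketch below only
reproduces those two relationships as they appear in the log. The helper
names are hypothetical, and the assumption that the client id is big-endian
ASCII padded with NUL bytes is inferred from this one log line, not taken
from the driver source.

```python
def decode_mc_client(mc_client: int) -> str:
    """Read a 32-bit client id as big-endian ASCII, dropping NUL padding.

    Assumption: the encoding is inferred from the log line
    "write from 'TC2' (0x54433200)", not from the amdgpu source.
    """
    return mc_client.to_bytes(4, "big").rstrip(b"\x00").decode("ascii")


def fault_page(addr_reg: str) -> int:
    """Convert the hex string printed for the fault address to decimal."""
    return int(addr_reg, 16)


if __name__ == "__main__":
    # Values copied from the kernel log quoted in the message above.
    print(decode_mc_client(0x54433200))
    print(fault_page("0x00107800"))
```

Run on the values from the report, the two helpers give back the client
name and page number that the driver printed, which is a quick way to check
that a fault line was copied correctly before comparing it with other
reports.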