* Re: The problem "ring gfx timeout" are experienced yet another AMD GPU Vega 8 user
       [not found] ` <CABXGCsMas72q52GYvH+5Po-KDPfqu74XO_YznqJKnp+-+1Mnww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-02-09 16:37   ` Grodzovsky, Andrey
       [not found]     ` <e92cffb7-6627-6f52-70de-e09d9bdbbe0a-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Grodzovsky, Andrey @ 2019-02-09 16:37 UTC (permalink / raw)
  To: Mikhail Gavrilov, Olsak, Marek, Marek Olšák; +Cc: amd-gfx list

+Marek

I can't find the last fence seqno from mmCP_EOP_LAST_FENCE_LO in the gfx
ring dump (that seqno probably wasn't really the last one if the register
was dumped several times before), but since waves were dumped, it could be
a shader issue. Marek, could you please give it a quick look?

Andrey

On 2/9/19 7:53 AM, Mikhail Gavrilov wrote:
> Hi Andrey,
> in our Linux chat, yet another AMD GPU user is complaining about a
> `ring gfx timeout` problem.
> He said the problem happens while he is playing the game "Hearts of Iron 4".
> His config:
> - APU: Ryzen 2200G
> - Kernel: 4.20.6
> - LLVM: 7.0.0
> - MESA: 18.2.8
>
> I attach here all the logs he collected with UMR.
>
> Could you please look at what happened to his GPU?
>
> --
> Best Regards,
> Mike Gavrilov.
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: The problem "ring gfx timeout" are experienced yet another AMD GPU Vega 8 user
       [not found]     ` <e92cffb7-6627-6f52-70de-e09d9bdbbe0a-5C7GfCeVMHo@public.gmane.org>
@ 2019-02-09 17:01       ` Marek Olšák
       [not found]         ` <CABXGCsMJKKu6DxvKvD8U6Ffkmt8KinNS96w4ygZ7xaemvEhocg@mail.gmail.com>
  0 siblings, 1 reply; 8+ messages in thread
From: Marek Olšák @ 2019-02-09 17:01 UTC (permalink / raw)
  To: Grodzovsky, Andrey; +Cc: mikhail, Marek Olšák, amd-gfx list


I don't see any attachments here.

Marek

On Sat, Feb 9, 2019, 11:37 AM Grodzovsky, Andrey <Andrey.Grodzovsky-5C7GfCeVMHo@public.gmane.org> wrote:

> +Marek
>
> I can't find the last fence seqno from mmCP_EOP_LAST_FENCE_LO in the gfx
> ring dump (that seqno probably wasn't really the last one if the register
> was dumped several times before), but since waves were dumped, it could be
> a shader issue. Marek, could you please give it a quick look?
>
> Andrey
>
> On 2/9/19 7:53 AM, Mikhail Gavrilov wrote:
> > Hi Andrey,
> > in our Linux chat, yet another AMD GPU user is complaining about a
> > `ring gfx timeout` problem.
> > He said the problem happens while he is playing the game "Hearts of Iron 4".
> > His config:
> > - APU: Ryzen 2200G
> > - Kernel: 4.20.6
> > - LLVM: 7.0.0
> > - MESA: 18.2.8
> >
> > I attach here all the logs he collected with UMR.
> >
> > Could you please look at what happened to his GPU?
> >
> > --
> > Best Regards,
> > Mike Gavrilov.
>


* Re: The problem "ring gfx timeout" are experienced yet another AMD GPU Vega 8 user
       [not found]           ` <CABXGCsMJKKu6DxvKvD8U6Ffkmt8KinNS96w4ygZ7xaemvEhocg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-02-25 19:40             ` Marek Olšák
       [not found]               ` <CAAxE2A6US6YBpESnwqm-EGsDapOVVfbHXxdrkjG1UddFPO0mOQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Marek Olšák @ 2019-02-25 19:40 UTC (permalink / raw)
  To: Mikhail Gavrilov; +Cc: Grodzovsky, Andrey, amd-gfx list


Some shaders are stuck at "s_load_dwordx4 s[32:35], s[36:37], 0x0", but
that might mean all sorts of things.

Do you also have the dmesg log?

Marek

On Sat, Feb 9, 2019 at 12:20 PM Mikhail Gavrilov <
mikhail.v.gavrilov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> On Sat, 9 Feb 2019 at 22:01, Marek Olšák <maraeo-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >
> > I don't see any attachments here.
> >
> > Marek
>
>
> --
> Best Regards,
> Mike Gavrilov.
>


* Re: The problem "ring gfx timeout" are experienced yet another AMD GPU Vega 8 user
       [not found]               ` <CAAxE2A6US6YBpESnwqm-EGsDapOVVfbHXxdrkjG1UddFPO0mOQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-02-26 12:50                 ` Mikhail Gavrilov
       [not found]                   ` <CABXGCsOWGpop=5_a6nrRoqWSWagkMg_sYCPL8ZuSuByvcDDu+w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Mikhail Gavrilov @ 2019-02-26 12:50 UTC (permalink / raw)
  To: Marek Olšák; +Cc: Grodzovsky, Andrey, amd-gfx list

On Tue, 26 Feb 2019 at 00:40, Marek Olšák <maraeo@gmail.com> wrote:
>
> Some shaders are stuck at "s_load_dwordx4 s[32:35], s[36:37], 0x0", but that might mean all sorts of things.
>
> Do you also have the dmesg log?
>
> Marek

All files together here:
https://mega.nz/#F!c4RwAYDJ!0ds-bVIftIDV4KCQOaDIsw

--
Best Regards,
Mike Gavrilov.

* Re: The problem "ring gfx timeout" are experienced yet another AMD GPU Vega 8 user
       [not found]                   ` <CABXGCsOWGpop=5_a6nrRoqWSWagkMg_sYCPL8ZuSuByvcDDu+w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-02-26 19:56                     ` Marek Olšák
       [not found]                       ` <CAAxE2A4Xdsutk=teU=vi_Gnr2tuwinu6_JSfkoqTGwd1NrSpxQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Marek Olšák @ 2019-02-26 19:56 UTC (permalink / raw)
  To: Mikhail Gavrilov; +Cc: Grodzovsky, Andrey, amd-gfx list


Sadly, the logs don't contain any clue as to why it hangs.

It would be helpful to check if the hang can be reproduced on Vega 56 or 64
as well.

Marek

On Tue, Feb 26, 2019 at 7:51 AM Mikhail Gavrilov <
mikhail.v.gavrilov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> On Tue, 26 Feb 2019 at 00:40, Marek Olšák <maraeo-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >
> > Some shaders are stuck at "s_load_dwordx4 s[32:35], s[36:37], 0x0", but
> > that might mean all sorts of things.
> >
> > Do you also have the dmesg log?
> >
> > Marek
>
> All files together here:
> https://mega.nz/#F!c4RwAYDJ!0ds-bVIftIDV4KCQOaDIsw
>
> --
> Best Regards,
> Mike Gavrilov.
>


* Re: The problem "ring gfx timeout" are experienced yet another AMD GPU Vega 8 user
       [not found]                       ` <CAAxE2A4Xdsutk=teU=vi_Gnr2tuwinu6_JSfkoqTGwd1NrSpxQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-07-02  4:07                         ` Mikhail Gavrilov
       [not found]                           ` <CABXGCsNpD93nBCWiL0VV_x+7jjY+HMYMbWsswxvHWWZXJaAhTQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Mikhail Gavrilov @ 2019-07-02  4:07 UTC (permalink / raw)
  To: Marek Olšák; +Cc: Grodzovsky, Andrey, amd-gfx list

On Wed, 27 Feb 2019 at 00:57, Marek Olšák <maraeo@gmail.com> wrote:
>
> Sadly, the logs don't contain any clue as to why it hangs.
>
> It would be helpful to check if the hang can be reproduced on Vega 56 or 64 as well.
>
> Marek
>

Hi, Marek.

I'm sorry to trouble you, but today the user of the Vega 8 graphics
described above sent me fresh logs.

Current versions: kernel 5.1.15 / DRM 3.30.0 / Mesa 19.0. / LLVM 8.0.0

I uploaded all the logs to Mega cloud storage.
Could you please look at these logs?

https://mega.nz/#F!Mt5mhKiI!8Sv2T5a6yTxBqVknhH1NjA


--
Best Regards,
Mike Gavrilov.

* Re: The problem "ring gfx timeout" are experienced yet another AMD GPU Vega 8 user
       [not found]                           ` <CABXGCsNpD93nBCWiL0VV_x+7jjY+HMYMbWsswxvHWWZXJaAhTQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-07-03 18:56                             ` Marek Olšák
       [not found]                               ` <CAAxE2A4fy3MAMV0HzX2Rszf=2nZ923xE+r2tVhZKp18V88cmVg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Marek Olšák @ 2019-07-03 18:56 UTC (permalink / raw)
  To: Mikhail Gavrilov; +Cc: Grodzovsky, Andrey, amd-gfx list


It looks like memory corruption. You can try to disable IOMMU in the BIOS.

Marek

On Tue, Jul 2, 2019 at 12:07 AM Mikhail Gavrilov <
mikhail.v.gavrilov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> On Wed, 27 Feb 2019 at 00:57, Marek Olšák <maraeo-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> >
> > Sadly, the logs don't contain any clue as to why it hangs.
> >
> > It would be helpful to check if the hang can be reproduced on Vega 56 or
> > 64 as well.
> >
> > Marek
> >
>
> Hi, Marek.
>
> I'm sorry to trouble you, but today the user of the Vega 8 graphics
> described above sent me fresh logs.
>
> Current versions: kernel 5.1.15 / DRM 3.30.0 / Mesa 19.0. / LLVM 8.0.0
>
> I uploaded all the logs to Mega cloud storage.
> Could you please look at these logs?
>
> https://mega.nz/#F!Mt5mhKiI!8Sv2T5a6yTxBqVknhH1NjA
>
>
> --
> Best Regards,
> Mike Gavrilov.
>


* Re: The problem "ring gfx timeout" are experienced yet another AMD GPU Vega 8 user
       [not found]                               ` <CAAxE2A4fy3MAMV0HzX2Rszf=2nZ923xE+r2tVhZKp18V88cmVg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-07-18 20:38                                 ` Mikhail Gavrilov
  0 siblings, 0 replies; 8+ messages in thread
From: Mikhail Gavrilov @ 2019-07-18 20:38 UTC (permalink / raw)
  To: Marek Olšák; +Cc: Grodzovsky, Andrey, amd-gfx list

On Wed, 3 Jul 2019 at 23:57, Marek Olšák <maraeo@gmail.com> wrote:
>
> It looks like memory corruption. You can try to disable IOMMU in the BIOS.
>

We disabled IOMMU in the BIOS [1].
We also ran a memory check with MemTest86, and it did not find any
memory problems [2].

Unfortunately, the previously reported GPU hang still happens:

[17571.578988] amdgpu 0000:08:00.0: [gfxhub] no-retry page fault (src_id:0 ring:158 vmid:7 pasid:32776, for process hoi4 pid 9225 thread hoi4:cs0 pid 9226)
[17571.578992] amdgpu 0000:08:00.0:   in page starting at address 0x0000000044160000 from 27
[17571.578994] amdgpu 0000:08:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0070153C
[17576.635622] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out.
[17581.755948] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out.
[17581.765672] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=1520345, emitted seq=1520347
[17581.765765] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process hoi4 pid 9225 thread hoi4:cs0 pid 9226
[17581.765766] [drm] GPU recovery disabled.
[17586.875783] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out.
[17592.005836] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=1520345, emitted seq=1520347
[17592.005921] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process hoi4 pid 9225 thread hoi4:cs0 pid 9226
[17592.005923] [drm] GPU recovery disabled.
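
The two counters in the `ring gfx timeout` lines can be compared
mechanically. Below is a minimal Python sketch, assuming each log entry
has been joined back onto a single line. The helper name
`parse_timeouts` and the dictionary keys are hypothetical and only serve
the illustration; the sample lines are copied from the excerpt above.

```python
import re

# Sample entries copied from the dmesg excerpt (each on a single line).
LOG = """\
[17581.765672] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=1520345, emitted seq=1520347
[17581.765766] [drm] GPU recovery disabled.
[17592.005836] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=1520345, emitted seq=1520347
"""

# Matches the kernel timestamp, the ring name, and both sequence numbers.
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\].*ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def parse_timeouts(text):
    """Return one dict per 'ring ... timeout' line found in text.

    Hypothetical helper: 'pending' is the number of fences that were
    emitted but had not signaled when the timeout was reported.
    """
    events = []
    for line in text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match is None:
            continue  # not a timeout line, skip it
        signaled = int(match.group("signaled"))
        emitted = int(match.group("emitted"))
        events.append(
            {
                "ts": float(match.group("ts")),
                "ring": match.group("ring"),
                "signaled": signaled,
                "emitted": emitted,
                "pending": emitted - signaled,
            }
        )
    return events


if __name__ == "__main__":
    for event in parse_timeouts(LOG):
        print(event)
```

For this excerpt, both timeout reports show two fences emitted but not
signaled, and the signaled counter does not advance between the two
reports, which are roughly 10 seconds apart.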


Are there any other ideas on how the memory could be corrupted?

Fresh logs uploaded here [3].

Thanks.

[1] https://postimg.cc/RJLYWgH7
[2] https://postimg.cc/Fk4qFM7F
[3] https://mega.nz/#F!8xphjAJL!7HVUz-NyRaICjCSu_x-fFA

--
Best Regards,
Mike Gavrilov.

end of thread, other threads:[~2019-07-18 20:38 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CABXGCsMas72q52GYvH+5Po-KDPfqu74XO_YznqJKnp+-+1Mnww@mail.gmail.com>
     [not found] ` <CABXGCsMas72q52GYvH+5Po-KDPfqu74XO_YznqJKnp+-+1Mnww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-02-09 16:37   ` The problem "ring gfx timeout" are experienced yet another AMD GPU Vega 8 user Grodzovsky, Andrey
     [not found]     ` <e92cffb7-6627-6f52-70de-e09d9bdbbe0a-5C7GfCeVMHo@public.gmane.org>
2019-02-09 17:01       ` Marek Olšák
     [not found]         ` <CABXGCsMJKKu6DxvKvD8U6Ffkmt8KinNS96w4ygZ7xaemvEhocg@mail.gmail.com>
     [not found]           ` <CABXGCsMJKKu6DxvKvD8U6Ffkmt8KinNS96w4ygZ7xaemvEhocg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-02-25 19:40             ` Marek Olšák
     [not found]               ` <CAAxE2A6US6YBpESnwqm-EGsDapOVVfbHXxdrkjG1UddFPO0mOQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-02-26 12:50                 ` Mikhail Gavrilov
     [not found]                   ` <CABXGCsOWGpop=5_a6nrRoqWSWagkMg_sYCPL8ZuSuByvcDDu+w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-02-26 19:56                     ` Marek Olšák
     [not found]                       ` <CAAxE2A4Xdsutk=teU=vi_Gnr2tuwinu6_JSfkoqTGwd1NrSpxQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-07-02  4:07                         ` Mikhail Gavrilov
     [not found]                           ` <CABXGCsNpD93nBCWiL0VV_x+7jjY+HMYMbWsswxvHWWZXJaAhTQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-07-03 18:56                             ` Marek Olšák
     [not found]                               ` <CAAxE2A4fy3MAMV0HzX2Rszf=2nZ923xE+r2tVhZKp18V88cmVg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-07-18 20:38                                 ` Mikhail Gavrilov
