regressions.lists.linux.dev archive mirror
* Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue
       [not found] <46a7eb80-5f09-4f6a-4fd3-9550dafd497c@felixrichter.tech>
@ 2023-05-02 11:44 ` Linux regression tracking (Thorsten Leemhuis)
  2023-05-02 13:13   ` Alex Deucher
  0 siblings, 1 reply; 13+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-05-02 11:44 UTC (permalink / raw)
  To: Felix Richter, amd-gfx, dri-devel; +Cc: Linux kernel regressions list

[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few templates
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 30.04.23 13:44, Felix Richter wrote:
> Hi,
> 
> I am running into an issue with the integrated GPU of the Ryzen 9 7950X. It seems to be a regression from kernel version 6.1 to 6.2. 
> The bug manifests as my monitor blinking, meaning it briefly turns fully white. This happens so often that the system becomes unpleasant to use.
> 
> I am running the Archlinux Kernel:
> The Issue happens on the bleeding edge kernel: 6.2.13
> Switching back to the LTS kernel resolves the issue: 6.1.26
> 
> I have two monitors attached to the system. One 42 inch 4k Display and a 24 inch 1080p Display and am running sway as my desktop.
> 
> Let me know if there is more information I could provide to help narrow down the issue.

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced v6.1..v6.2
#regzbot title drm: amdgpu: system becomes unpleasant to use after
monitor starts blinking and turns full white
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.


* Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue
  2023-05-02 11:44 ` PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue Linux regression tracking (Thorsten Leemhuis)
@ 2023-05-02 13:13   ` Alex Deucher
  2023-05-02 13:34     ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Deucher @ 2023-05-02 13:13 UTC (permalink / raw)
  To: Linux regressions mailing list; +Cc: Felix Richter, amd-gfx, dri-devel

On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
Leemhuis) <regressions@leemhuis.info> wrote:
>
> [CCing the regression list, as it should be in the loop for regressions:
> https://docs.kernel.org/admin-guide/reporting-regressions.html]
>
> [TLDR: I'm adding this report to the list of tracked Linux kernel
> regressions; the text you find below is based on a few templates
> paragraphs you might have encountered already in similar form.
> See link in footer if these mails annoy you.]
>
> On 30.04.23 13:44, Felix Richter wrote:
> > Hi,
> >
> > I am running into an issue with the integrated GPU of the Ryzen 9 7950X. It seems to be a regression from kernel version 6.1 to 6.2.
> > The bug manifests as my monitor blinking, meaning it briefly turns fully white. This happens so often that the system becomes unpleasant to use.
> >
> > I am running the Archlinux Kernel:
> > The Issue happens on the bleeding edge kernel: 6.2.13
> > Switching back to the LTS kernel resolves the issue: 6.1.26
> >
> > I have two monitors attached to the system. One 42 inch 4k Display and a 24 inch 1080p Display and am running sway as my desktop.
> >
> > Let me know if there is more information I could provide to help narrow down the issue.
>
> Thanks for the report. To be sure the issue doesn't fall through the
> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
> tracking bot:
>
> #regzbot ^introduced v6.1..v6.2
> #regzbot title drm: amdgpu: system becomes unpleasant to use after
> monitor starts blinking and turns full white
> #regzbot ignore-activity
>
> This isn't a regression? This issue or a fix for it are already
> discussed somewhere else? It was fixed already? You want to clarify when
> the regression started to happen? Or point out I got the title or
> something else totally wrong? Then just reply and tell me -- ideally
> while also telling regzbot about it, as explained by the page listed in
> the footer of this mail.
>
> Developers: When fixing the issue, remember to add 'Link:' tags pointing
> to the report (the parent of this mail). See page linked in footer for
> details.

This sounds exactly like the issue that was fixed in this patch which
is already on its way to Linus:
https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9

Alex

>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> That page also explains what to do if mails like this annoy you.


* Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue
  2023-05-02 13:13   ` Alex Deucher
@ 2023-05-02 13:34     ` Linux regression tracking (Thorsten Leemhuis)
  2023-05-02 13:39       ` Alex Deucher
  2023-05-02 13:48       ` Felix Richter
  0 siblings, 2 replies; 13+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-05-02 13:34 UTC (permalink / raw)
  To: Alex Deucher, Linux regressions mailing list
  Cc: Felix Richter, amd-gfx, dri-devel

On 02.05.23 15:13, Alex Deucher wrote:
> On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
> Leemhuis) <regressions@leemhuis.info> wrote:
>
>> On 30.04.23 13:44, Felix Richter wrote:
>>> Hi,
>>>
>>> I am running into an issue with the integrated GPU of the Ryzen 9 7950X. It seems to be a regression from kernel version 6.1 to 6.2.
>>> The bug manifests as my monitor blinking, meaning it briefly turns fully white. This happens so often that the system becomes unpleasant to use.
>>>
>>> I am running the Archlinux Kernel:
>>> The Issue happens on the bleeding edge kernel: 6.2.13
>>> Switching back to the LTS kernel resolves the issue: 6.1.26
>>>
>>> I have two monitors attached to the system. One 42 inch 4k Display and a 24 inch 1080p Display and am running sway as my desktop.
>>>
>>> Let me know if there is more information I could provide to help narrow down the issue.
>>
>> Thanks for the report. To be sure the issue doesn't fall through the
>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
>> tracking bot:
>>
>> #regzbot ^introduced v6.1..v6.2
>> #regzbot title drm: amdgpu: system becomes unpleasant to use after
>> monitor starts blinking and turns full white
>> #regzbot ignore-activity
>>
>> This isn't a regression? This issue or a fix for it are already
>> discussed somewhere else? It was fixed already? You want to clarify when
>> the regression started to happen? Or point out I got the title or
>> something else totally wrong? Then just reply and tell me -- ideally
>> while also telling regzbot about it, as explained by the page listed in
>> the footer of this mail.
>>
>> Developers: When fixing the issue, remember to add 'Link:' tags pointing
>> to the report (the parent of this mail). See page linked in footer for
>> details.
> 
> This sounds exactly like the issue that was fixed in this patch which
> is already on its way to Linus:
> https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9

FWIW, you in the flood of emails likely missed that this is the same
thread where you yesterday replied "If the module parameter didn't help
then perhaps you are seeing some other issue.  Can you bisect?". That's
why I decided to add this to the tracking. Or am I missing something
obvious here?

/me looks around again and can't see anything, but that doesn't have to
mean anything...

Felix, btw, this guide might help you with the bisection, even if it's
just for kernel compilation:

https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html

And to indirectly reply to your mail from yesterday[1]. You might want
to ignore the arch linux kernel git repo and just do a bisection between
6.1 and the latest 6.2.y kernel using upstream repos; and if I were you
I'd also try 6.3 or even mainline before that, in case the issue was
fixed already.

[1]
https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279eaac@felixrichter.tech/

Ciao, Thorsten
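
For a sense of how much work the suggested bisection is: each test roughly halves the set of candidate commits, so the number of kernels to build grows with the base-2 logarithm of the range size. The sketch below is only an estimate under that assumption; `estimated_bisect_steps` and the commit counts are illustrative and not taken from this thread, and git's own estimate can differ slightly on merge-heavy history.

```python
import math


def estimated_bisect_steps(candidate_commits: int) -> int:
    """Rough upper bound on build-and-test rounds for a bisection.

    Assumes every round halves the remaining candidates, which holds
    only approximately for a history with many merges.
    """
    if candidate_commits < 1:
        raise ValueError("need at least one candidate commit")
    return math.ceil(math.log2(candidate_commits))


# Hypothetical range sizes, chosen only to show the growth rate.
for n in (1, 2048, 16000):
    print(n, "commits ->", estimated_bisect_steps(n), "rounds")
```

Doubling the range adds only about one more round under this model, so the cost is dominated by how long each kernel build and test takes.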


* Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue
  2023-05-02 13:34     ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-05-02 13:39       ` Alex Deucher
  2023-05-02 13:48       ` Felix Richter
  1 sibling, 0 replies; 13+ messages in thread
From: Alex Deucher @ 2023-05-02 13:39 UTC (permalink / raw)
  To: Linux regressions mailing list; +Cc: Felix Richter, amd-gfx, dri-devel

On Tue, May 2, 2023 at 9:34 AM Linux regression tracking (Thorsten
Leemhuis) <regressions@leemhuis.info> wrote:
>
> On 02.05.23 15:13, Alex Deucher wrote:
> > On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
> > Leemhuis) <regressions@leemhuis.info> wrote:
> >
> >> On 30.04.23 13:44, Felix Richter wrote:
> >>> Hi,
> >>>
> >>> I am running into an issue with the integrated GPU of the Ryzen 9 7950X. It seems to be a regression from kernel version 6.1 to 6.2.
> >>> The bug manifests as my monitor blinking, meaning it briefly turns fully white. This happens so often that the system becomes unpleasant to use.
> >>>
> >>> I am running the Archlinux Kernel:
> >>> The Issue happens on the bleeding edge kernel: 6.2.13
> >>> Switching back to the LTS kernel resolves the issue: 6.1.26
> >>>
> >>> I have two monitors attached to the system. One 42 inch 4k Display and a 24 inch 1080p Display and am running sway as my desktop.
> >>>
> >>> Let me know if there is more information I could provide to help narrow down the issue.
> >>
> >> Thanks for the report. To be sure the issue doesn't fall through the
> >> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
> >> tracking bot:
> >>
> >> #regzbot ^introduced v6.1..v6.2
> >> #regzbot title drm: amdgpu: system becomes unpleasant to use after
> >> monitor starts blinking and turns full white
> >> #regzbot ignore-activity
> >>
> >> This isn't a regression? This issue or a fix for it are already
> >> discussed somewhere else? It was fixed already? You want to clarify when
> >> the regression started to happen? Or point out I got the title or
> >> something else totally wrong? Then just reply and tell me -- ideally
> >> while also telling regzbot about it, as explained by the page listed in
> >> the footer of this mail.
> >>
> >> Developers: When fixing the issue, remember to add 'Link:' tags pointing
> >> to the report (the parent of this mail). See page linked in footer for
> >> details.
> >
> > This sounds exactly like the issue that was fixed in this patch which
> > is already on its way to Linus:
> > https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9
>
> FWIW, you in the flood of emails likely missed that this is the same
> thread where you yesterday replied "If the module parameter didn't help
> then perhaps you are seeing some other issue.  Can you bisect?". That's
> why I decided to add this to the tracking. Or am I missing something
> obvious here?

Sorry, from the wording of the message, it looked like you had missed
the earlier part of the thread.

Alex

>
> /me looks around again and can't see anything, but that doesn't have to
> mean anything...
>
> Felix, btw, this guide might help you with the bisection, even if it's
> just for kernel compilation:
>
> https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html
>
> And to indirectly reply to your mail from yesterday[1]. You might want
> to ignore the arch linux kernel git repo and just do a bisection between
> 6.1 and the latest 6.2.y kernel using upstream repos; and if I were you
> I'd also try 6.3 or even mainline before that, in case the issue was
> fixed already.
>
> [1]
> https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279eaac@felixrichter.tech/
>
> Ciao, Thorsten


* Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue
  2023-05-02 13:34     ` Linux regression tracking (Thorsten Leemhuis)
  2023-05-02 13:39       ` Alex Deucher
@ 2023-05-02 13:48       ` Felix Richter
  2023-05-02 14:12         ` Linux regression tracking (Thorsten Leemhuis)
  1 sibling, 1 reply; 13+ messages in thread
From: Felix Richter @ 2023-05-02 13:48 UTC (permalink / raw)
  To: Linux regressions mailing list, Alex Deucher; +Cc: amd-gfx, dri-devel

On 5/2/23 15:34, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 02.05.23 15:13, Alex Deucher wrote:
>> On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
>> Leemhuis) <regressions@leemhuis.info> wrote:
>>
>>> On 30.04.23 13:44, Felix Richter wrote:
>>>> Hi,
>>>>
>>>> I am running into an issue with the integrated GPU of the Ryzen 9 7950X. It seems to be a regression from kernel version 6.1 to 6.2.
>>>> The bug manifests as my monitor blinking, meaning it briefly turns fully white. This happens so often that the system becomes unpleasant to use.
>>>>
>>>> I am running the Archlinux Kernel:
>>>> The Issue happens on the bleeding edge kernel: 6.2.13
>>>> Switching back to the LTS kernel resolves the issue: 6.1.26
>>>>
>>>> I have two monitors attached to the system. One 42 inch 4k Display and a 24 inch 1080p Display and am running sway as my desktop.
>>>>
>>>> Let me know if there is more information I could provide to help narrow down the issue.
>>> Thanks for the report. To be sure the issue doesn't fall through the
>>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
>>> tracking bot:
>>>
>>> #regzbot ^introduced v6.1..v6.2
>>> #regzbot title drm: amdgpu: system becomes unpleasant to use after
>>> monitor starts blinking and turns full white
>>> #regzbot ignore-activity
>>>
>>> This isn't a regression? This issue or a fix for it are already
>>> discussed somewhere else? It was fixed already? You want to clarify when
>>> the regression started to happen? Or point out I got the title or
>>> something else totally wrong? Then just reply and tell me -- ideally
>>> while also telling regzbot about it, as explained by the page listed in
>>> the footer of this mail.
>>>
>>> Developers: When fixing the issue, remember to add 'Link:' tags pointing
>>> to the report (the parent of this mail). See page linked in footer for
>>> details.
>> This sounds exactly like the issue that was fixed in this patch which
>> is already on its way to Linus:
>> https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9
> FWIW, you in the flood of emails likely missed that this is the same
> thread where you yesterday replied "If the module parameter didn't help
> then perhaps you are seeing some other issue.  Can you bisect?". That's
> why I decided to add this to the tracking. Or am I missing something
> obvious here?
>
> /me looks around again and can't see anything, but that doesn't have to
> mean anything...
>
> Felix, btw, this guide might help you with the bisection, even if it's
> just for kernel compilation:
>
> https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html
>
> And to indirectly reply to your mail from yesterday[1]. You might want
> to ignore the arch linux kernel git repo and just do a bisection between
> 6.1 and the latest 6.2.y kernel using upstream repos; and if I were you
> I'd also try 6.3 or even mainline before that, in case the issue was
> fixed already.
>
> [1]
> https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279eaac@felixrichter.tech/
>
> Ciao, Thorsten
Thanks for the pointers, I'll do a bisection on my desktop from 6.1 to 
the newest commit. That was the part I was mostly unsure about … where 
to start from.

I was planning to use PKGBUILD scripts from arch to achieve the same 
configuration as I would when installing
the package and just rewrite the script to use a local copy of the 
source code instead of the repository.
That way I can just use the bisect command, rebuild the package and test 
again.

But I probably won't be able to finish it this week, since I am on 
vacation starting tomorrow and will not have access to the computer in 
question. I will be back next week, by that time the patch Alex is 
talking about might
already be in mainline. So if that fixes it, I will notice and let you 
know. If not I will do the bisection to figure out what the actual issue is.

Kind regards,
Felix


* Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue
  2023-05-02 13:48       ` Felix Richter
@ 2023-05-02 14:12         ` Linux regression tracking (Thorsten Leemhuis)
  2023-06-03 14:52           ` Felix Richter
  0 siblings, 1 reply; 13+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-05-02 14:12 UTC (permalink / raw)
  To: Felix Richter, Linux regressions mailing list, Alex Deucher
  Cc: amd-gfx, dri-devel

On 02.05.23 15:48, Felix Richter wrote:
> On 5/2/23 15:34, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 02.05.23 15:13, Alex Deucher wrote:
>>> On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
>>> Leemhuis) <regressions@leemhuis.info> wrote:
>>>
>>>> On 30.04.23 13:44, Felix Richter wrote:
>>>>> Hi,
>>>>>
>>>>> I am running into an issue with the integrated GPU of the Ryzen 9
>>>>> 7950X. It seems to be a regression from kernel version 6.1 to 6.2.
>>>>> The bug manifests as my monitor blinking, meaning it briefly turns
>>>>> fully white. This happens so often that the system becomes
>>>>> unpleasant to use.
>>>>>
>>>>> I am running the Archlinux Kernel:
>>>>> The Issue happens on the bleeding edge kernel: 6.2.13
>>>>> Switching back to the LTS kernel resolves the issue: 6.1.26
>>>>>
>>>>> I have two monitors attached to the system. One 42 inch 4k Display
>>>>> and a 24 inch 1080p Display and am running sway as my desktop.
>>>>>
>>>>> Let me know if there is more information I could provide to help
>>>>> narrow down the issue.
>>>> Thanks for the report. To be sure the issue doesn't fall through the
>>>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
>>>> tracking bot:
>>>>
>>>> #regzbot ^introduced v6.1..v6.2
>>>> #regzbot title drm: amdgpu: system becomes unpleasant to use after
>>>> monitor starts blinking and turns full white
>>>> #regzbot ignore-activity
>>>>
>>>> This isn't a regression? This issue or a fix for it are already
>>>> discussed somewhere else? It was fixed already? You want to clarify
>>>> when
>>>> the regression started to happen? Or point out I got the title or
>>>> something else totally wrong? Then just reply and tell me -- ideally
>>>> while also telling regzbot about it, as explained by the page listed in
>>>> the footer of this mail.
>>>>
>>>> Developers: When fixing the issue, remember to add 'Link:' tags
>>>> pointing
>>>> to the report (the parent of this mail). See page linked in footer for
>>>> details.
>>> This sounds exactly like the issue that was fixed in this patch which
>>> is already on its way to Linus:
>>> https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9
>> FWIW, you in the flood of emails likely missed that this is the same
>> thread where you yesterday replied "If the module parameter didn't help
>> then perhaps you are seeing some other issue.  Can you bisect?". That's
>> why I decided to add this to the tracking. Or am I missing something
>> obvious here?
>>
>> /me looks around again and can't see anything, but that doesn't have to
>> mean anything...
>>
>> Felix, btw, this guide might help you with the bisection, even if it's
>> just for kernel compilation:
>>
>> https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html
>>
>> And to indirectly reply to your mail from yesterday[1]. You might want
>> to ignore the arch linux kernel git repo and just do a bisection between
>> 6.1 and the latest 6.2.y kernel using upstream repos; and if I were you
>> I'd also try 6.3 or even mainline before that, in case the issue was
>> fixed already.
>>
>> [1]
>> https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279eaac@felixrichter.tech/
>>
> Thanks for the pointers, I'll do a bisection on my desktop from 6.1 to
> the newest commit.

FWIW, I wonder what you actually mean with "newest commit" here: a
bisection between 6.1 and mainline HEAD might be a waste of time, *if*
this is something that only happens in 6.2.y (say due to a broken or
incomplete backport)

> That was the part I was mostly unsure about … where
> to start from.
> 
> I was planning to use PKGBUILD scripts from arch to achieve the same
> configuration as I would when installing
> the package and just rewrite the script to use a local copy of the
> source code instead of the repository.
> That way I can just use the bisect command, rebuild the package and test
> again.

In my experience trying to deal with Linux distro's package managers
creates more trouble than it's worth.

> But I probably won't be able to finish it this week, since I am on
> vacation starting tomorrow and will not have access to the computer in
> question. I will be back next week, by that time the patch Alex is
> talking about might
> already be in mainline. So if that fixes it, I will notice and let you
> know. If not I will do the bisection to figure out what the actual issue
> is.

Enjoy your vacation!

Ciao, Thorsten


* Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue
  2023-05-02 14:12         ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-06-03 14:52           ` Felix Richter
  2023-06-05 14:11             ` Alex Deucher
  2023-06-05 15:27             ` Hamza Mahfooz
  0 siblings, 2 replies; 13+ messages in thread
From: Felix Richter @ 2023-06-03 14:52 UTC (permalink / raw)
  To: Linux regressions mailing list, Alex Deucher; +Cc: amd-gfx, dri-devel

[-- Attachment #1: Type: text/plain, Size: 5874 bytes --]

Hi Guys,

sorry for the silence from my side. I had a lot of things to take care 
of after returning from vacation. Also I had to wait on the zfs modules 
to be updated to support kernel 6.3 for further testing.

The bad news is that I am still experiencing issues. I have found a
reproducible trigger for the buggy behavior: as soon as I take a
screenshot, or another program such as `wdisplays` accesses the screen
buffer, the screen starts flickering. The only way to reset it is to
reboot the machine or log out of the desktop.

With this I did a bisection to figure out which commit is responsible 
for this. I attached the logs to the mail. The short version is that I 
identified commit 81d0bcf9900932633d270d5bc4a54ff599c6ebdb as the 
culprit. Seems that there are side effects of having more flexible 
buffer placement for the case of the internal GPU. To verify that this 
actually is the cause of the issue I built the current archlinux kernel 
with an extra patch to revert the commit: 
https://github.com/ju6ge/linux/tree/v6.3.5-ju6ge. The result is that the
bug is fixed!

Whether this is the desired long-term fix, I do not know …

Kind regards,
Felix Richter

On 02.05.23 16:12, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 02.05.23 15:48, Felix Richter wrote:
>> On 5/2/23 15:34, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> On 02.05.23 15:13, Alex Deucher wrote:
>>>> On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
>>>> Leemhuis) <regressions@leemhuis.info> wrote:
>>>>
>>>>> On 30.04.23 13:44, Felix Richter wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am running into an issue with the integrated GPU of the Ryzen 9
>>>>>> 7950X. It seems to be a regression from kernel version 6.1 to 6.2.
>>>>>> The bug manifests as my monitor blinking, meaning it briefly turns
>>>>>> fully white. This happens so often that the system becomes
>>>>>> unpleasant to use.
>>>>>>
>>>>>> I am running the Archlinux Kernel:
>>>>>> The Issue happens on the bleeding edge kernel: 6.2.13
>>>>>> Switching back to the LTS kernel resolves the issue: 6.1.26
>>>>>>
>>>>>> I have two monitors attached to the system. One 42 inch 4k Display
>>>>>> and a 24 inch 1080p Display and am running sway as my desktop.
>>>>>>
>>>>>> Let me know if there is more information I could provide to help
>>>>>> narrow down the issue.
>>>>> Thanks for the report. To be sure the issue doesn't fall through the
>>>>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
>>>>> tracking bot:
>>>>>
>>>>> #regzbot ^introduced v6.1..v6.2
>>>>> #regzbot title drm: amdgpu: system becomes unpleasant to use after
>>>>> monitor starts blinking and turns full white
>>>>> #regzbot ignore-activity
>>>>>
>>>>> This isn't a regression? This issue or a fix for it are already
>>>>> discussed somewhere else? It was fixed already? You want to clarify
>>>>> when
>>>>> the regression started to happen? Or point out I got the title or
>>>>> something else totally wrong? Then just reply and tell me -- ideally
>>>>> while also telling regzbot about it, as explained by the page listed in
>>>>> the footer of this mail.
>>>>>
>>>>> Developers: When fixing the issue, remember to add 'Link:' tags
>>>>> pointing
>>>>> to the report (the parent of this mail). See page linked in footer for
>>>>> details.
>>>> This sounds exactly like the issue that was fixed in this patch which
>>>> is already on its way to Linus:
>>>> https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9
>>> FWIW, you in the flood of emails likely missed that this is the same
>>> thread where you yesterday replied "If the module parameter didn't help
>>> then perhaps you are seeing some other issue.  Can you bisect?". That's
>>> why I decided to add this to the tracking. Or am I missing something
>>> obvious here?
>>>
>>> /me looks around again and can't see anything, but that doesn't have to
>>> mean anything...
>>>
>>> Felix, btw, this guide might help you with the bisection, even if it's
>>> just for kernel compilation:
>>>
>>> https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html
>>>
>>> And to indirectly reply to your mail from yesterday[1]. You might want
>>> to ignore the arch linux kernel git repo and just do a bisection between
>>> 6.1 and the latest 6.2.y kernel using upstream repos; and if I were you
>>> I'd also try 6.3 or even mainline before that, in case the issue was
>>> fixed already.
>>>
>>> [1]
>>> https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279eaac@felixrichter.tech/
>>>
>> Thanks for the pointers, I'll do a bisection on my desktop from 6.1 to
>> the newest commit.
> FWIW, I wonder what you actually mean with "newest commit" here: a
> bisection between 6.1 and mainline HEAD might be a waste of time, *if*
> this is something that only happens in 6.2.y (say due to a broken or
> incomplete backport)
>
>> That was the part I was mostly unsure about … where
>> to start from.
>>
>> I was planning to use PKGBUILD scripts from arch to achieve the same
>> configuration as I would when installing
>> the package and just rewrite the script to use a local copy of the
>> source code instead of the repository.
>> That way I can just use the bisect command, rebuild the package and test
>> again.
> In my experience trying to deal with Linux distro's package managers
> creates more trouble than it's worth.
>
>> But I probably won't be able to finish it this week, since I am on
>> vacation starting tomorrow and will not have access to the computer in
>> question. I will be back next week, by that time the patch Alex is
>> talking about might
>> already be in mainline. So if that fixes it, I will notice and let you
>> know. If not I will do the bisection to figure out what the actual issue
>> is.
> Enjoy your vacation!
>
> Ciao, Thorsten

[-- Attachment #2: bisect_final.log --]
[-- Type: text/x-log, Size: 2476 bytes --]

git bisect start
# status: waiting for both good and bad commits
# bad: [55c7d6a91d42ad98cbfb10da077ce8bb7084dc0e] Merge tag 'drm-next-2022-12-23' of git://anongit.freedesktop.org/drm/drm
git bisect bad 55c7d6a91d42ad98cbfb10da077ce8bb7084dc0e
# status: waiting for good commit(s), bad commit known
# good: [1eb206208b0f3f707c67134ef6ba394410effb67] block, bfq: only do counting of pending-request for BFQ_GROUP_IOSCHED
git bisect good 1eb206208b0f3f707c67134ef6ba394410effb67
# good: [dd6f9b17cd7af68b6a5090deedf1f5e84f66f4e6] Merge tag 'tty-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
git bisect good dd6f9b17cd7af68b6a5090deedf1f5e84f66f4e6
# good: [5f6e430f931d245da838db3e10e918681207029b] Merge tag 'powerpc-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
git bisect good 5f6e430f931d245da838db3e10e918681207029b
# good: [609d3bc6230514a8ca79b377775b17e8c3d9ac93] Merge tag 'net-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
git bisect good 609d3bc6230514a8ca79b377775b17e8c3d9ac93
# good: [5461e079009ae2732c833281c4b50dfb58d15ba5] Merge tag 'media/v6.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
git bisect good 5461e079009ae2732c833281c4b50dfb58d15ba5
# good: [9d2f6060fe4c3b49d0cdc1dce1c99296f33379c8] Merge tag 'trace-v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
git bisect good 9d2f6060fe4c3b49d0cdc1dce1c99296f33379c8
# good: [d1ac1a2b14264e98c24db6f8c2bd452e695c7238] Merge tag 'perf-tools-for-v6.2-2-2022-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
git bisect good d1ac1a2b14264e98c24db6f8c2bd452e695c7238
# bad: [d118b18fb1da02b41df2da78cb2794b3638d89cd] drm/amd/pm: avoid large variable on kernel stack
git bisect bad d118b18fb1da02b41df2da78cb2794b3638d89cd
# bad: [3273f11675ef11959d25a56df3279f712bcd41b7] drm/amdgpu: Remove unnecessary domain argument
git bisect bad 3273f11675ef11959d25a56df3279f712bcd41b7
# bad: [f95f51a4c3357eabf74fe14ab7daa5b5c0422b27] drm/amdgpu: Add notifier lock for KFD userptrs
git bisect bad f95f51a4c3357eabf74fe14ab7daa5b5c0422b27
# bad: [47ea20762bb7875a62e10433a3cd5d34e9133f47] drm/amdgpu: Add an extra evict_resource call during device_suspend.
git bisect bad 47ea20762bb7875a62e10433a3cd5d34e9133f47
# bad: [81d0bcf9900932633d270d5bc4a54ff599c6ebdb] drm/amdgpu: make display pinning more flexible (v2)
git bisect bad 81d0bcf9900932633d270d5bc4a54ff599c6ebdb

[-- Attachment #3: bisect_final.result --]
[-- Type: text/plain, Size: 1024 bytes --]

81d0bcf9900932633d270d5bc4a54ff599c6ebdb is the first bad commit
commit 81d0bcf9900932633d270d5bc4a54ff599c6ebdb
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Wed Dec 7 11:08:53 2022 -0500

    drm/amdgpu: make display pinning more flexible (v2)
    
    Only apply the static threshold for Stoney and Carrizo.
    This hardware has certain requirements that don't allow
    mixing of GTT and VRAM.  Newer asics do not have these
    requirements so we should be able to be more flexible
    with where buffers end up.
    
    Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2270
    Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2291
    Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2255
    Acked-by: Luben Tuikov <luben.tuikov@amd.com>
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org

 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
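
The revert test described in this thread can be sketched from the bisect result above. This is only an illustrative sketch: `first_bad_commit` is a hypothetical helper name, and the `v6.3.5` tag in the comments assumes a checkout of the stable tree.

```shell
# Hypothetical helper: read `git bisect` output on stdin and print the hash
# from the "<hash> is the first bad commit" line, if there is one.
first_bad_commit() {
    awk '/ is the first bad commit$/ { print $1; exit }'
}

# Possible use inside a kernel checkout (not run here):
#   culprit=$(first_bad_commit < bisect_final.result)
#   git checkout v6.3.5                 # assumed tag of the kernel under test
#   git revert --no-edit "$culprit"     # then rebuild, boot, and retest
```

If the flicker disappears with the revert in place, that supports the bisection result; it does not by itself say whether a revert is the right long-term fix.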

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue
  2023-06-03 14:52           ` Felix Richter
@ 2023-06-05 14:11             ` Alex Deucher
  2023-06-05 19:01               ` Felix Richter
  2023-06-07  8:42               ` Felix Richter
  2023-06-05 15:27             ` Hamza Mahfooz
  1 sibling, 2 replies; 13+ messages in thread
From: Alex Deucher @ 2023-06-05 14:11 UTC (permalink / raw)
  To: Felix Richter, Mahfooz, Hamza
  Cc: Linux regressions mailing list, amd-gfx, dri-devel

On Sat, Jun 3, 2023 at 10:52 AM Felix Richter <judge@felixrichter.tech> wrote:
>
> Hi Guys,
>
> sorry for the silence from my side. I had a lot of things to take care
> of after returning from vacation. Also I had to wait on the zfs modules
> to be updated to support kernel 6.3 for further testing.
>
> The bad news is that I am still experiencing issues. I have been able to
> get a reproducible trigger for the buggy behavior. The moment I take a
> screenshot or any other program like `wdisplays` accesses the screen
> buffer the screen starts flickering. The only way to reset it is to
> reboot the machine or log out of the desktop.
>
> With this I did a bisection to figure out which commit is responsible
> for this. I attached the logs to the mail. The short version is that I
> identified commit 81d0bcf9900932633d270d5bc4a54ff599c6ebdb as the
> culprit. Seems that there are side effects of having more flexible
> buffer placement for the case of the internal GPU. To verify that this
> actually is the cause of the issue I built the current archlinux kernel
> with an extra patch to revert the commit:
> https://github.com/ju6ge/linux/tree/v6.3.5-ju6ge. The result is that be
> bug is fixed!

+ Hamza

This is a known issue.  You can work around it by setting
amdgpu.sg_display=0.  The issue should be fixed in:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=08da182175db4c7f80850354849d95f2670e8cd9

Alex
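
To check whether the amdgpu.sg_display=0 workaround is actually in effect, something like the following may help. It is a minimal sketch: `sg_display_from_cmdline` is a hypothetical helper, and it assumes the parameter was passed on the kernel command line rather than through a modprobe configuration file.

```shell
# Hypothetical helper: print the value of amdgpu.sg_display found in the
# kernel command line passed as $1, or "unset" if it is absent.
# If the parameter appears more than once, the last occurrence is reported.
sg_display_from_cmdline() {
    value=unset
    for word in $1; do
        case "$word" in
            amdgpu.sg_display=*) value=${word#amdgpu.sg_display=} ;;
        esac
    done
    printf '%s\n' "$value"
}

# On a live system (not run here):
#   sg_display_from_cmdline "$(cat /proc/cmdline)"
#   cat /sys/module/amdgpu/parameters/sg_display   # value seen by the loaded driver
```

Comparing the command-line value with the sysfs value shows whether the driver picked the setting up.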



>
> Now if this is the desired long term fix I do not know …
>
> Kind regards,
> Felix Richter
>
> On 02.05.23 16:12, Linux regression tracking (Thorsten Leemhuis) wrote:
> > On 02.05.23 15:48, Felix Richter wrote:
> >> On 5/2/23 15:34, Linux regression tracking (Thorsten Leemhuis) wrote:
> >>> On 02.05.23 15:13, Alex Deucher wrote:
> >>>> On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
> >>>> Leemhuis)<regressions@leemhuis.info>  wrote:
> >>>>
> >>>>> On 30.04.23 13:44, Felix Richter wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> I am running into an issue with the integrated GPU of the Ryzen 9
> >>>>>> 7950X. It seems to be a regression from kernel version 6.1 to 6.2.
> >>>>>> The bug materializes in from of my monitor blinking, meaning it
> >>>>>> turns full white shortly. This happens very often so that the
> >>>>>> system becomes unpleasant to use.
> >>>>>>
> >>>>>> I am running the Archlinux Kernel:
> >>>>>> The Issue happens on the bleeding edge kernel: 6.2.13
> >>>>>> Switching back to the LTS kernel resolves the issue: 6.1.26
> >>>>>>
> >>>>>> I have two monitors attached to the system. One 42 inch 4k Display
> >>>>>> and a 24 inch 1080p Display and am running sway as my desktop.
> >>>>>>
> >>>>>> Let me know if there is more information I could provide to help
> >>>>>> narrow down the issue.
> >>>>> Thanks for the report. To be sure the issue doesn't fall through the
> >>>>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
> >>>>> tracking bot:
> >>>>>
> >>>>> #regzbot ^introduced v6.1..v6.2
> >>>>> #regzbot title drm: amdgpu: system becomes unpleasant to use after
> >>>>> monitor starts blinking and turns full white
> >>>>> #regzbot ignore-activity
> >>>>>
> >>>>> This isn't a regression? This issue or a fix for it are already
> >>>>> discussed somewhere else? It was fixed already? You want to clarify
> >>>>> when
> >>>>> the regression started to happen? Or point out I got the title or
> >>>>> something else totally wrong? Then just reply and tell me -- ideally
> >>>>> while also telling regzbot about it, as explained by the page listed in
> >>>>> the footer of this mail.
> >>>>>
> >>>>> Developers: When fixing the issue, remember to add 'Link:' tags
> >>>>> pointing
> >>>>> to the report (the parent of this mail). See page linked in footer for
> >>>>> details.
> >>>> This sounds exactly like the issue that was fixed in this patch which
> >>>> is already on it's way to Linus:
> >>>> https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9
> >>> FWIW, you in the flood of emails likely missed that this is the same
> >>> thread where you yesterday replied "If the module parameter didn't help
> >>> then perhaps you are seeing some other issue.  Can you bisect?". That's
> >>> why I decided to add this to the tracking. Or am I missing something
> >>> obvious here?
> >>>
> >>> /me looks around again and can't see anything, but that doesn't have to
> >>> mean anything...
> >>>
> >>> Felix, btw, this guide might help you with the bisection, even if it's
> >>> just for kernel compilation:
> >>>
> >>> https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html
> >>>
> >>> And to indirectly reply to your mail from yesterday[1]. You might want
> >>> to ignore the arch linux kernel git repo and just do a bisection between
> >>> 6.1 and the latest 6.2.y kernel using upstream repos; and if I were you
> >>> I'd also try 6.3 or even mainline before that, in case the issue was
> >>> fixed already.
> >>>
> >>> [1]
> >>> https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279eaac@felixrichter.tech/
> >>>
> >> Thanks for the pointers, I'll do a bisection on my desktop from 6.1 to
> >> the newest commit.
> > FWIW, I wonder what you actually mean with "newest commit" here: a
> > bisection between 6.1 and mainline HEAD might be a waste of time, *if*
> > this is something that only happens in 6.2.y (say due to a broken or
> > incomplete backport)
> >
> >> That was the part I was mostly unsure about … where
> >> to start from.
> >>
> >> I was planning to use PKGBUILD scripts from arch to achieve the same
> >> configuration as I would when installing
> >> the package and just rewrite the script to use a local copy of the
> >> source code instead of the repository.
> >> That way I can just use the bisect command, rebuild the package and test
> >> again.
> > In my experience trying to deal with Linux distro's package managers
> > creates more trouble than it's worth.
> >
> >> But I probably won't be able to finish it this week, since I am on
> >> vacation starting tomorrow and will not have access to the computer in
> >> question. I will be back next week, by that time the patch Alex is
> >> talking about might
> >> already be in mainline. So if that fixes it, I will notice and let you
> >> know. If not I will do the bisection to figure out what the actual issue
> >> is.
> > Enjoy your vacation!
> >
> > Ciao, Thorsten

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue
  2023-06-03 14:52           ` Felix Richter
  2023-06-05 14:11             ` Alex Deucher
@ 2023-06-05 15:27             ` Hamza Mahfooz
  2023-06-05 18:55               ` Felix Richter
  1 sibling, 1 reply; 13+ messages in thread
From: Hamza Mahfooz @ 2023-06-05 15:27 UTC (permalink / raw)
  To: Felix Richter, Linux regressions mailing list, Alex Deucher
  Cc: dri-devel, amd-gfx


On 6/3/23 10:52, Felix Richter wrote:
> Hi Guys,
> 
> sorry for the silence from my side. I had a lot of things to take care 
> of after returning from vacation. Also I had to wait on the zfs modules 
> to be updated to support kernel 6.3 for further testing.
> 
> The bad news is that I am still experiencing issues. I have been able to 
> get a reproducible trigger for the buggy behavior. The moment I take a 
> screenshot or any other program like `wdisplays` accesses the screen 
> buffer the screen starts flickering. The only way to reset it is to 
> reboot the machine or log out of the desktop.
> 
> With this I did a bisection to figure out which commit is responsible 
> for this. I attached the logs to the mail. The short version is that I 
> identified commit 81d0bcf9900932633d270d5bc4a54ff599c6ebdb as the 
> culprit. Seems that there are side effects of having more flexible 
> buffer placement for the case of the internal GPU. To verify that this 
> actually is the cause of the issue I built the current archlinux kernel 
> with an extra patch to revert the commit: 
> https://github.com/ju6ge/linux/tree/v6.3.5-ju6ge. The result is that be 
> bug is fixed!
> 
> Now if this is the desired long term fix I do not know …

Can you provide a dmidecode of your RAM (i.e. # dmidecode --type=memory)?

The current trend seems to suggest that if you have 64 or more gigs of
RAM, you will probably still experience issues with S/G mode enabled
even with my fix applied.
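
A quick way to compare a system against the 64 GB figure is to total the installed modules from the dmidecode output. This is a rough sketch: `installed_gb` is a hypothetical helper, and it only counts modules whose size is reported in GB.

```shell
# Hypothetical helper: read `dmidecode --type=memory` output on stdin and
# print the total of all "Size: N GB" lines. Lines such as "Volatile Size:"
# or "Size: No Module Installed" are skipped because the first field must be
# exactly "Size:" and the unit must be "GB".
installed_gb() {
    awk '$1 == "Size:" && $3 == "GB" { total += $2 } END { print total + 0 }'
}

# On a live system (needs root, not run here):
#   dmidecode --type=memory | installed_gb
```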

> 
> Kind regards,
> Felix Richter
> 
> On 02.05.23 16:12, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 02.05.23 15:48, Felix Richter wrote:
>>> On 5/2/23 15:34, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>> On 02.05.23 15:13, Alex Deucher wrote:
>>>>> On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
>>>>> Leemhuis)<regressions@leemhuis.info>  wrote:
>>>>>
>>>>>> On 30.04.23 13:44, Felix Richter wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am running into an issue with the integrated GPU of the Ryzen 9
>>>>>>> 7950X. It seems to be a regression from kernel version 6.1 to 6.2.
>>>>>>> The bug materializes in from of my monitor blinking, meaning it
>>>>>>> turns full white shortly. This happens very often so that the
>>>>>>> system becomes unpleasant to use.
>>>>>>>
>>>>>>> I am running the Archlinux Kernel:
>>>>>>> The Issue happens on the bleeding edge kernel: 6.2.13
>>>>>>> Switching back to the LTS kernel resolves the issue: 6.1.26
>>>>>>>
>>>>>>> I have two monitors attached to the system. One 42 inch 4k Display
>>>>>>> and a 24 inch 1080p Display and am running sway as my desktop.
>>>>>>>
>>>>>>> Let me know if there is more information I could provide to help
>>>>>>> narrow down the issue.
>>>>>> Thanks for the report. To be sure the issue doesn't fall through the
>>>>>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel 
>>>>>> regression
>>>>>> tracking bot:
>>>>>>
>>>>>> #regzbot ^introduced v6.1..v6.2
>>>>>> #regzbot title drm: amdgpu: system becomes unpleasant to use after
>>>>>> monitor starts blinking and turns full white
>>>>>> #regzbot ignore-activity
>>>>>>
>>>>>> This isn't a regression? This issue or a fix for it are already
>>>>>> discussed somewhere else? It was fixed already? You want to clarify
>>>>>> when
>>>>>> the regression started to happen? Or point out I got the title or
>>>>>> something else totally wrong? Then just reply and tell me -- ideally
>>>>>> while also telling regzbot about it, as explained by the page 
>>>>>> listed in
>>>>>> the footer of this mail.
>>>>>>
>>>>>> Developers: When fixing the issue, remember to add 'Link:' tags
>>>>>> pointing
>>>>>> to the report (the parent of this mail). See page linked in footer 
>>>>>> for
>>>>>> details.
>>>>> This sounds exactly like the issue that was fixed in this patch which
>>>>> is already on it's way to Linus:
>>>>> https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9
>>>> FWIW, you in the flood of emails likely missed that this is the same
>>>> thread where you yesterday replied "If the module parameter didn't help
>>>> then perhaps you are seeing some other issue.  Can you bisect?". That's
>>>> why I decided to add this to the tracking. Or am I missing something
>>>> obvious here?
>>>>
>>>> /me looks around again and can't see anything, but that doesn't have to
>>>> mean anything...
>>>>
>>>> Felix, btw, this guide might help you with the bisection, even if it's
>>>> just for kernel compilation:
>>>>
>>>> https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html
>>>>
>>>> And to indirectly reply to your mail from yesterday[1]. You might want
>>>> to ignore the arch linux kernel git repo and just do a bisection 
>>>> between
>>>> 6.1 and the latest 6.2.y kernel using upstream repos; and if I were you
>>>> I'd also try 6.3 or even mainline before that, in case the issue was
>>>> fixed already.
>>>>
>>>> [1]
>>>> https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279eaac@felixrichter.tech/
>>>>
>>> Thanks for the pointers, I'll do a bisection on my desktop from 6.1 to
>>> the newest commit.
>> FWIW, I wonder what you actually mean with "newest commit" here: a
>> bisection between 6.1 and mainline HEAD might be a waste of time, *if*
>> this is something that only happens in 6.2.y (say due to a broken or
>> incomplete backport)
>>
>>> That was the part I was mostly unsure about … where
>>> to start from.
>>>
>>> I was planning to use PKGBUILD scripts from arch to achieve the same
>>> configuration as I would when installing
>>> the package and just rewrite the script to use a local copy of the
>>> source code instead of the repository.
>>> That way I can just use the bisect command, rebuild the package and test
>>> again.
>> In my experience trying to deal with Linux distro's package managers
>> creates more trouble than it's worth.
>>
>>> But I probably won't be able to finish it this week, since I am on
>>> vacation starting tomorrow and will not have access to the computer in
>>> question. I will be back next week, by that time the patch Alex is
>>> talking about might
>>> already be in mainline. So if that fixes it, I will notice and let you
>>> know. If not I will do the bisection to figure out what the actual issue
>>> is.
>> Enjoy your vacation!
>>
>> Ciao, Thorsten
-- 
Hamza


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue
  2023-06-05 15:27             ` Hamza Mahfooz
@ 2023-06-05 18:55               ` Felix Richter
  0 siblings, 0 replies; 13+ messages in thread
From: Felix Richter @ 2023-06-05 18:55 UTC (permalink / raw)
  To: Hamza Mahfooz, Linux regressions mailing list, Alex Deucher
  Cc: dri-devel, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 6865 bytes --]

Hi,

I can confirm that setting amdgpu.sg_display=0 does not fix the issue 
for me.

I have 64 GB of Kingston memory running with XMP at 5200 MT/s. I attached 
the result of `dmidecode --type=memory` to this email.

Kind regards
Felix Richter

On 05.06.23 17:27, Hamza Mahfooz wrote:
>
> On 6/3/23 10:52, Felix Richter wrote:
>> Hi Guys,
>>
>> sorry for the silence from my side. I had a lot of things to take 
>> care of after returning from vacation. Also I had to wait on the zfs 
>> modules to be updated to support kernel 6.3 for further testing.
>>
>> The bad news is that I am still experiencing issues. I have been able 
>> to get a reproducible trigger for the buggy behavior. The moment I 
>> take a screenshot or any other program like `wdisplays` accesses the 
>> screen buffer the screen starts flickering. The only way to reset it 
>> is to reboot the machine or log out of the desktop.
>>
>> With this I did a bisection to figure out which commit is responsible 
>> for this. I attached the logs to the mail. The short version is that 
>> I identified commit 81d0bcf9900932633d270d5bc4a54ff599c6ebdb as the 
>> culprit. Seems that there are side effects of having more flexible 
>> buffer placement for the case of the internal GPU. To verify that 
>> this actually is the cause of the issue I built the current archlinux 
>> kernel with an extra patch to revert the commit: 
>> https://github.com/ju6ge/linux/tree/v6.3.5-ju6ge. The result is that 
>> be bug is fixed!
>>
>> Now if this is the desired long term fix I do not know …
>
> Can you provide a dmidecode of your RAM (i.e. # dmidecode --type=memory)?
>
> The current trend seems to suggest that if you have 64 or more gigs of
> RAM, you will probably still experience issues with S/G mode enabled
> even with my fix applied.
>
>>
>> Kind regards,
>> Felix Richter
>>
>> On 02.05.23 16:12, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> On 02.05.23 15:48, Felix Richter wrote:
>>>> On 5/2/23 15:34, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>>> On 02.05.23 15:13, Alex Deucher wrote:
>>>>>> On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
>>>>>> Leemhuis)<regressions@leemhuis.info>  wrote:
>>>>>>
>>>>>>> On 30.04.23 13:44, Felix Richter wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am running into an issue with the integrated GPU of the Ryzen 9
>>>>>>>> 7950X. It seems to be a regression from kernel version 6.1 to 6.2.
>>>>>>>> The bug materializes in from of my monitor blinking, meaning it
>>>>>>>> turns full white shortly. This happens very often so that the
>>>>>>>> system becomes unpleasant to use.
>>>>>>>>
>>>>>>>> I am running the Archlinux Kernel:
>>>>>>>> The Issue happens on the bleeding edge kernel: 6.2.13
>>>>>>>> Switching back to the LTS kernel resolves the issue: 6.1.26
>>>>>>>>
>>>>>>>> I have two monitors attached to the system. One 42 inch 4k Display
>>>>>>>> and a 24 inch 1080p Display and am running sway as my desktop.
>>>>>>>>
>>>>>>>> Let me know if there is more information I could provide to help
>>>>>>>> narrow down the issue.
>>>>>>> Thanks for the report. To be sure the issue doesn't fall through 
>>>>>>> the
>>>>>>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel 
>>>>>>> regression
>>>>>>> tracking bot:
>>>>>>>
>>>>>>> #regzbot ^introduced v6.1..v6.2
>>>>>>> #regzbot title drm: amdgpu: system becomes unpleasant to use after
>>>>>>> monitor starts blinking and turns full white
>>>>>>> #regzbot ignore-activity
>>>>>>>
>>>>>>> This isn't a regression? This issue or a fix for it are already
>>>>>>> discussed somewhere else? It was fixed already? You want to clarify
>>>>>>> when
>>>>>>> the regression started to happen? Or point out I got the title or
>>>>>>> something else totally wrong? Then just reply and tell me -- 
>>>>>>> ideally
>>>>>>> while also telling regzbot about it, as explained by the page 
>>>>>>> listed in
>>>>>>> the footer of this mail.
>>>>>>>
>>>>>>> Developers: When fixing the issue, remember to add 'Link:' tags
>>>>>>> pointing
>>>>>>> to the report (the parent of this mail). See page linked in 
>>>>>>> footer for
>>>>>>> details.
>>>>>> This sounds exactly like the issue that was fixed in this patch 
>>>>>> which
>>>>>> is already on it's way to Linus:
>>>>>> https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9 
>>>>>>
>>>>> FWIW, you in the flood of emails likely missed that this is the same
>>>>> thread where you yesterday replied "If the module parameter didn't 
>>>>> help
>>>>> then perhaps you are seeing some other issue.  Can you bisect?". 
>>>>> That's
>>>>> why I decided to add this to the tracking. Or am I missing something
>>>>> obvious here?
>>>>>
>>>>> /me looks around again and can't see anything, but that doesn't 
>>>>> have to
>>>>> mean anything...
>>>>>
>>>>> Felix, btw, this guide might help you with the bisection, even if 
>>>>> it's
>>>>> just for kernel compilation:
>>>>>
>>>>> https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html 
>>>>>
>>>>>
>>>>> And to indirectly reply to your mail from yesterday[1]. You might 
>>>>> want
>>>>> to ignore the arch linux kernel git repo and just do a bisection 
>>>>> between
>>>>> 6.1 and the latest 6.2.y kernel using upstream repos; and if I 
>>>>> were you
>>>>> I'd also try 6.3 or even mainline before that, in case the issue was
>>>>> fixed already.
>>>>>
>>>>> [1]
>>>>> https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279eaac@felixrichter.tech/ 
>>>>>
>>>>>
>>>> Thanks for the pointers, I'll do a bisection on my desktop from 6.1 to
>>>> the newest commit.
>>> FWIW, I wonder what you actually mean with "newest commit" here: a
>>> bisection between 6.1 and mainline HEAD might be a waste of time, *if*
>>> this is something that only happens in 6.2.y (say due to a broken or
>>> incomplete backport)
>>>
>>>> That was the part I was mostly unsure about … where
>>>> to start from.
>>>>
>>>> I was planning to use PKGBUILD scripts from arch to achieve the same
>>>> configuration as I would when installing
>>>> the package and just rewrite the script to use a local copy of the
>>>> source code instead of the repository.
>>>> That way I can just use the bisect command, rebuild the package and 
>>>> test
>>>> again.
>>> In my experience trying to deal with Linux distro's package managers
>>> creates more trouble than it's worth.
>>>
>>>> But I probably won't be able to finish it this week, since I am on
>>>> vacation starting tomorrow and will not have access to the computer in
>>>> question. I will be back next week, by that time the patch Alex is
>>>> talking about might
>>>> already be in mainline. So if that fixes it, I will notice and let you
>>>> know. If not I will do the bisection to figure out what the actual 
>>>> issue
>>>> is.
>>> Enjoy your vacation!
>>>
>>> Ciao, Thorsten

[-- Attachment #2: dmidump.txt --]
[-- Type: text/plain, Size: 2711 bytes --]

# dmidecode 3.5
Getting SMBIOS data from sysfs.
SMBIOS 3.5.0 present.

Handle 0x000F, DMI type 16, 23 bytes
Physical Memory Array
	Location: System Board Or Motherboard
	Use: System Memory
	Error Correction Type: None
	Maximum Capacity: 128 GB
	Error Information Handle: 0x000E
	Number Of Devices: 4

Handle 0x0012, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x000F
	Error Information Handle: 0x0011
	Total Width: Unknown
	Data Width: Unknown
	Size: No Module Installed
	Form Factor: Unknown
	Set: None
	Locator: DIMM 0
	Bank Locator: P0 CHANNEL A
	Type: Unknown
	Type Detail: Unknown

Handle 0x0014, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x000F
	Error Information Handle: 0x0013
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 32 GB
	Form Factor: DIMM
	Set: None
	Locator: DIMM 1
	Bank Locator: P0 CHANNEL A
	Type: DDR5
	Type Detail: Synchronous Unbuffered (Unregistered)
	Speed: 5200 MT/s
	Manufacturer: Kingston
	Serial Number: D10C970D
	Asset Tag: Not Specified
	Part Number: KF552C40-32         
	Rank: 2
	Configured Memory Speed: 5200 MT/s
	Minimum Voltage: 1.1 V
	Maximum Voltage: 1.1 V
	Configured Voltage: 1.1 V
	Memory Technology: DRAM
	Memory Operating Mode Capability: Volatile memory
	Firmware Version: Unknown
	Module Manufacturer ID: Bank 2, Hex 0x98
	Module Product ID: Unknown
	Memory Subsystem Controller Manufacturer ID: Unknown
	Memory Subsystem Controller Product ID: Unknown
	Non-Volatile Size: None
	Volatile Size: 32 GB
	Cache Size: None
	Logical Size: None

Handle 0x0017, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x000F
	Error Information Handle: 0x0016
	Total Width: Unknown
	Data Width: Unknown
	Size: No Module Installed
	Form Factor: Unknown
	Set: None
	Locator: DIMM 0
	Bank Locator: P0 CHANNEL B
	Type: Unknown
	Type Detail: Unknown

Handle 0x0019, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x000F
	Error Information Handle: 0x0018
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 32 GB
	Form Factor: DIMM
	Set: None
	Locator: DIMM 1
	Bank Locator: P0 CHANNEL B
	Type: DDR5
	Type Detail: Synchronous Unbuffered (Unregistered)
	Speed: 5200 MT/s
	Manufacturer: Kingston
	Serial Number: D50C9730
	Asset Tag: Not Specified
	Part Number: KF552C40-32         
	Rank: 2
	Configured Memory Speed: 5200 MT/s
	Minimum Voltage: 1.1 V
	Maximum Voltage: 1.1 V
	Configured Voltage: 1.1 V
	Memory Technology: DRAM
	Memory Operating Mode Capability: Volatile memory
	Firmware Version: Unknown
	Module Manufacturer ID: Bank 2, Hex 0x98
	Module Product ID: Unknown
	Memory Subsystem Controller Manufacturer ID: Unknown
	Memory Subsystem Controller Product ID: Unknown
	Non-Volatile Size: None
	Volatile Size: 32 GB
	Cache Size: None
	Logical Size: None


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue
  2023-06-05 14:11             ` Alex Deucher
@ 2023-06-05 19:01               ` Felix Richter
  2023-06-07  8:42               ` Felix Richter
  1 sibling, 0 replies; 13+ messages in thread
From: Felix Richter @ 2023-06-05 19:01 UTC (permalink / raw)
  To: Alex Deucher, Mahfooz, Hamza
  Cc: Linux regressions mailing list, amd-gfx, dri-devel

I will apply this patch and see if it fixes the issue for me. I will let you 
know when I am done.

Felix

On 05.06.23 16:11, Alex Deucher wrote:
> On Sat, Jun 3, 2023 at 10:52 AM Felix Richter <judge@felixrichter.tech> wrote:
>> Hi Guys,
>>
>> sorry for the silence from my side. I had a lot of things to take care
>> of after returning from vacation. Also I had to wait on the zfs modules
>> to be updated to support kernel 6.3 for further testing.
>>
>> The bad news is that I am still experiencing issues. I have been able to
>> get a reproducible trigger for the buggy behavior. The moment I take a
>> screenshot or any other program like `wdisplays` accesses the screen
>> buffer the screen starts flickering. The only way to reset it is to
>> reboot the machine or log out of the desktop.
>>
>> With this I did a bisection to figure out which commit is responsible
>> for this. I attached the logs to the mail. The short version is that I
>> identified commit 81d0bcf9900932633d270d5bc4a54ff599c6ebdb as the
>> culprit. Seems that there are side effects of having more flexible
>> buffer placement for the case of the internal GPU. To verify that this
>> actually is the cause of the issue I built the current archlinux kernel
>> with an extra patch to revert the commit:
>> https://github.com/ju6ge/linux/tree/v6.3.5-ju6ge. The result is that be
>> bug is fixed!
> + Hamza
>
> This is a known issue.  You can work around it by setting
> amdgpu.sg_display=0.  The issue should be fixed in:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=08da182175db4c7f80850354849d95f2670e8cd9
>
> Alex
>
>
>
>> Now if this is the desired long term fix I do not know …
>>
>> Kind regards,
>> Felix Richter
>>
>> On 02.05.23 16:12, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> On 02.05.23 15:48, Felix Richter wrote:
>>>> On 5/2/23 15:34, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>>> On 02.05.23 15:13, Alex Deucher wrote:
>>>>>> On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
>>>>>> Leemhuis)<regressions@leemhuis.info>  wrote:
>>>>>>
>>>>>>> On 30.04.23 13:44, Felix Richter wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am running into an issue with the integrated GPU of the Ryzen 9
>>>>>>>> 7950X. It seems to be a regression from kernel version 6.1 to 6.2.
>>>>>>>> The bug materializes in from of my monitor blinking, meaning it
>>>>>>>> turns full white shortly. This happens very often so that the
>>>>>>>> system becomes unpleasant to use.
>>>>>>>>
>>>>>>>> I am running the Archlinux Kernel:
>>>>>>>> The Issue happens on the bleeding edge kernel: 6.2.13
>>>>>>>> Switching back to the LTS kernel resolves the issue: 6.1.26
>>>>>>>>
>>>>>>>> I have two monitors attached to the system. One 42 inch 4k Display
>>>>>>>> and a 24 inch 1080p Display and am running sway as my desktop.
>>>>>>>>
>>>>>>>> Let me know if there is more information I could provide to help
>>>>>>>> narrow down the issue.
>>>>>>> Thanks for the report. To be sure the issue doesn't fall through the
>>>>>>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
>>>>>>> tracking bot:
>>>>>>>
>>>>>>> #regzbot ^introduced v6.1..v6.2
>>>>>>> #regzbot title drm: amdgpu: system becomes unpleasant to use after
>>>>>>> monitor starts blinking and turns full white
>>>>>>> #regzbot ignore-activity
>>>>>>>
>>>>>>> This isn't a regression? This issue or a fix for it are already
>>>>>>> discussed somewhere else? It was fixed already? You want to clarify
>>>>>>> when
>>>>>>> the regression started to happen? Or point out I got the title or
>>>>>>> something else totally wrong? Then just reply and tell me -- ideally
>>>>>>> while also telling regzbot about it, as explained by the page listed in
>>>>>>> the footer of this mail.
>>>>>>>
>>>>>>> Developers: When fixing the issue, remember to add 'Link:' tags
>>>>>>> pointing
>>>>>>> to the report (the parent of this mail). See page linked in footer for
>>>>>>> details.
>>>>>> This sounds exactly like the issue that was fixed in this patch which
>>>>>> is already on it's way to Linus:
>>>>>> https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9
>>>>> FWIW, you in the flood of emails likely missed that this is the same
>>>>> thread where you yesterday replied "If the module parameter didn't help
>>>>> then perhaps you are seeing some other issue.  Can you bisect?". That's
>>>>> why I decided to add this to the tracking. Or am I missing something
>>>>> obvious here?
>>>>>
>>>>> /me looks around again and can't see anything, but that doesn't have to
>>>>> mean anything...
>>>>>
>>>>> Felix, btw, this guide might help you with the bisection, even if it's
>>>>> just for kernel compilation:
>>>>>
>>>>> https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html
>>>>>
>>>>> And to indirectly reply to your mail from yesterday[1]. You might want
>>>>> to ignore the arch linux kernel git repo and just do a bisection between
>>>>> 6.1 and the latest 6.2.y kernel using upstream repos; and if I were you
>>>>> I'd also try 6.3 or even mainline before that, in case the issue was
>>>>> fixed already.
>>>>>
>>>>> [1]
>>>>> https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279eaac@felixrichter.tech/
>>>>>
>>>> Thanks for the pointers, I'll do a bisection on my desktop from 6.1 to
>>>> the newest commit.
>>> FWIW, I wonder what you actually mean with "newest commit" here: a
>>> bisection between 6.1 and mainline HEAD might be a waste of time, *if*
>>> this is something that only happens in 6.2.y (say due to a broken or
>>> incomplete backport)
>>>
>>>> That was the part I was mostly unsure about … where
>>>> to start from.
>>>>
>>>> I was planning to use PKGBUILD scripts from arch to achieve the same
>>>> configuration as I would when installing
>>>> the package and just rewrite the script to use a local copy of the
>>>> source code instead of the repository.
>>>> That way I can just use the bisect command, rebuild the package and test
>>>> again.
>>> In my experience trying to deal with Linux distro's package managers
>>> creates more trouble than it's worth.
>>>
>>>> But I probably won't be able to finish it this week, since I am on
>>>> vacation starting tomorrow and will not have access to the computer in
>>>> question. I will be back next week, by that time the patch Alex is
>>>> talking about might
>>>> already be in mainline. So if that fixes it, I will notice and let you
>>>> know. If not I will do the bisection to figure out what the actual issue
>>>> is.
>>> Enjoy your vacation!
>>>
>>> Ciao, Thorsten


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue
  2023-06-05 14:11             ` Alex Deucher
  2023-06-05 19:01               ` Felix Richter
@ 2023-06-07  8:42               ` Felix Richter
  2023-06-07 18:05                 ` Alex Deucher
  1 sibling, 1 reply; 13+ messages in thread
From: Felix Richter @ 2023-06-07  8:42 UTC
  To: Alex Deucher, Mahfooz, Hamza
  Cc: Linux regressions mailing list, amd-gfx, dri-devel

Hi Guys,

so I checked, the kernel I am running has this commit
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=08da182175db4c7f80850354849d95f2670e8cd9)
applied already!

https://github.com/ju6ge/linux/commit/917680e6056aa288cac288d3afd2745d372beb61

And the display flickering persists with or without the
amdgpu.sg_display=0 module parameter set!

Kind regards,
Felix Richter
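
Since the flicker is reported both with and without amdgpu.sg_display=0, it may be worth confirming that the parameter actually reached the driver. Below is a minimal sketch in C. It assumes the amdgpu module exposes the value read-only at /sys/module/amdgpu/parameters/sg_display, which is the usual sysfs location for module parameters. The path and the helper names read_int_param and report_sg_display are assumptions for illustration, not something stated in this thread.

```c
#include <stdio.h>

/* Hypothetical helper: read one integer from a sysfs-style text file.
 * Returns 0 on success and stores the value in *out, -1 on any failure. */
int read_int_param(const char *path, int *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%d", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Assumed path; it only exists while amdgpu is loaded and exports the parameter. */
#define SG_DISPLAY_PARAM "/sys/module/amdgpu/parameters/sg_display"

/* Print the effective value, or a note when it cannot be read. */
void report_sg_display(void)
{
	int value;

	if (read_int_param(SG_DISPLAY_PARAM, &value) == 0)
		printf("amdgpu.sg_display = %d\n", value);
	else
		printf("could not read %s\n", SG_DISPLAY_PARAM);
}
```

Calling report_sg_display() from a small main() after booting with amdgpu.sg_display=0 on the kernel command line should print 0 if the parameter was picked up; any other value would suggest the setting never took effect.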


On 6/5/23 16:11, Alex Deucher wrote:
> + Hamza
> This is a known issue.  You can work around it by setting
> amdgpu.sg_display=0.  The issue should be fixed in:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=08da182175db4c7f80850354849d95f2670e8cd9
>
> Alex


* Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue
  2023-06-07  8:42               ` Felix Richter
@ 2023-06-07 18:05                 ` Alex Deucher
  0 siblings, 0 replies; 13+ messages in thread
From: Alex Deucher @ 2023-06-07 18:05 UTC
  To: Felix Richter
  Cc: Mahfooz, Hamza, Linux regressions mailing list, amd-gfx, dri-devel

[-- Attachment #1: Type: text/plain, Size: 6391 bytes --]

On Wed, Jun 7, 2023 at 4:42 AM Felix Richter <judge@felixrichter.tech> wrote:
>
> Hi Guys,
>
> so I checked, the kernel I am running has this commit
> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=08da182175db4c7f80850354849d95f2670e8cd9)
> applied already!
>
> https://github.com/ju6ge/linux/commit/917680e6056aa288cac288d3afd2745d372beb61
>
> And the display flickering persists with or without the
> amdgpu.sg_display=0 module parameter set!

That is unexpected.  Setting sg_display=0 should be equivalent to
reverting 81d0bcf9900932633d270d5bc4a54ff599c6ebdb.  Does the attached
patch (with sg_display=0 set) make any difference?

Alex



[-- Attachment #2: sg_display_test.diff --]
[-- Type: text/x-patch, Size: 633 bytes --]

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index cc65c334cb64..195b4ff7a287 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1631,10 +1631,7 @@ static int amdgpu_dm_init(struct amdgpu_device *adev)
 		break;
 	}
 	if (init_data.flags.gpu_vm_support &&
-	    (amdgpu_sg_display == 0))
-		init_data.flags.gpu_vm_support = false;
-
-	if (init_data.flags.gpu_vm_support)
+	    (amdgpu_sg_display != 0))
 		adev->mode_info.gpu_vm_support = true;
 
 	if (amdgpu_dc_feature_mask & DC_FBC_MASK)
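
To make the effect of the hunk above easier to see, here is a small stand-alone model in C of the two flags it touches. It is only a sketch under stated assumptions: the struct and function names are hypothetical, asic_vm_support stands for whatever value init_data.flags.gpu_vm_support had before this point, and adev->mode_info.gpu_vm_support is assumed to start out false.

```c
#include <stdbool.h>

/* Simplified stand-in for the two flags touched by the hunk. */
struct vm_flags {
	bool dc_flag;        /* models init_data.flags.gpu_vm_support */
	bool mode_info_flag; /* models adev->mode_info.gpu_vm_support */
};

/* Logic on the "-" side of the diff (current behaviour). */
struct vm_flags before_patch(bool asic_vm_support, int sg_display)
{
	struct vm_flags f = { asic_vm_support, false };

	/* sg_display=0 clears the DC-side flag ... */
	if (f.dc_flag && sg_display == 0)
		f.dc_flag = false;
	/* ... and the mode_info flag simply follows it. */
	if (f.dc_flag)
		f.mode_info_flag = true;
	return f;
}

/* Logic on the "+" side of the diff (test patch). */
struct vm_flags after_patch(bool asic_vm_support, int sg_display)
{
	struct vm_flags f = { asic_vm_support, false };

	/* The DC-side flag is left untouched; only mode_info is gated. */
	if (f.dc_flag && sg_display != 0)
		f.mode_info_flag = true;
	return f;
}
```

In this model, with asic_vm_support true and sg_display set to 0, before_patch() clears both flags, while after_patch() leaves dc_flag set and keeps only mode_info_flag cleared. For any nonzero sg_display the two functions agree, so sg_display=0 is the only case in which the test patch changes behaviour.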

Thread overview: 13+ messages
     [not found] <46a7eb80-5f09-4f6a-4fd3-9550dafd497c@felixrichter.tech>
2023-05-02 11:44 ` PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue Linux regression tracking (Thorsten Leemhuis)
2023-05-02 13:13   ` Alex Deucher
2023-05-02 13:34     ` Linux regression tracking (Thorsten Leemhuis)
2023-05-02 13:39       ` Alex Deucher
2023-05-02 13:48       ` Felix Richter
2023-05-02 14:12         ` Linux regression tracking (Thorsten Leemhuis)
2023-06-03 14:52           ` Felix Richter
2023-06-05 14:11             ` Alex Deucher
2023-06-05 19:01               ` Felix Richter
2023-06-07  8:42               ` Felix Richter
2023-06-07 18:05                 ` Alex Deucher
2023-06-05 15:27             ` Hamza Mahfooz
2023-06-05 18:55               ` Felix Richter
