regressions.lists.linux.dev archive mirror
From: Felix Richter <judge@felixrichter.tech>
To: Hamza Mahfooz <hamza.mahfooz@amd.com>,
	Linux regressions mailing list <regressions@lists.linux.dev>,
	Alex Deucher <alexdeucher@gmail.com>
Cc: dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org
Subject: Re: PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue
Date: Mon, 5 Jun 2023 20:55:42 +0200
Message-ID: <6404ace2-1d74-e287-ccb5-eaab2fe50473@felixrichter.tech>
In-Reply-To: <5a925a7f-d810-275b-c735-29872bf523a3@amd.com>

[-- Attachment #1: Type: text/plain, Size: 6865 bytes --]

Hi,

I can confirm that setting amdgpu.sg_display=0 does not fix the issue 
for me.

I have 64GB of Kingston memory running with XMP at 5200 MT/s. I attached
the result of `dmidecode --type=memory` to this email.

Kind regards
Felix Richter

On 05.06.23 17:27, Hamza Mahfooz wrote:
>
> On 6/3/23 10:52, Felix Richter wrote:
>> Hi Guys,
>>
>> sorry for the silence from my side. I had a lot of things to take 
>> care of after returning from vacation. Also I had to wait on the zfs 
>> modules to be updated to support kernel 6.3 for further testing.
>>
>> The bad news is that I am still experiencing issues. I have been able 
>> to get a reproducible trigger for the buggy behavior. The moment I 
>> take a screenshot or any other program like `wdisplays` accesses the 
>> screen buffer the screen starts flickering. The only way to reset it 
>> is to reboot the machine or log out of the desktop.
>>
>> With this I did a bisection to figure out which commit is responsible 
>> for this. I attached the logs to the mail. The short version is that 
>> I identified commit 81d0bcf9900932633d270d5bc4a54ff599c6ebdb as the 
>> culprit. Seems that there are side effects of having more flexible 
>> buffer placement for the case of the internal GPU. To verify that 
>> this actually is the cause of the issue I built the current archlinux 
>> kernel with an extra patch to revert the commit: 
>> https://github.com/ju6ge/linux/tree/v6.3.5-ju6ge. The result is that 
>> the bug is fixed!
>>
>> Now if this is the desired long term fix I do not know …
>
> Can you provide a dmidecode of your RAM (i.e. # dmidecode --type=memory)?
>
> The current trend seems to suggest that if you have 64 or more gigs of
> RAM, you will probably still experience issues with S/G mode enabled
> even with my fix applied.
>
>>
>> Kind regards,
>> Felix Richter
>>
>> On 02.05.23 16:12, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> On 02.05.23 15:48, Felix Richter wrote:
>>>> On 5/2/23 15:34, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>>> On 02.05.23 15:13, Alex Deucher wrote:
>>>>>> On Tue, May 2, 2023 at 7:45 AM Linux regression tracking (Thorsten
>>>>>> Leemhuis)<regressions@leemhuis.info>  wrote:
>>>>>>
>>>>>>> On 30.04.23 13:44, Felix Richter wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am running into an issue with the integrated GPU of the Ryzen 9
>>>>>>>> 7950X. It seems to be a regression from kernel version 6.1 to 6.2.
>>>>>>>> The bug materializes in the form of my monitor blinking, meaning it
>>>>>>>> turns full white shortly. This happens very often so that the
>>>>>>>> system becomes unpleasant to use.
>>>>>>>>
>>>>>>>> I am running the Archlinux Kernel:
>>>>>>>> The Issue happens on the bleeding edge kernel: 6.2.13
>>>>>>>> Switching back to the LTS kernel resolves the issue: 6.1.26
>>>>>>>>
>>>>>>>> I have two monitors attached to the system. One 42 inch 4k Display
>>>>>>>> and a 24 inch 1080p Display and am running sway as my desktop.
>>>>>>>>
>>>>>>>> Let me know if there is more information I could provide to help
>>>>>>>> narrow down the issue.
>>>>>>> Thanks for the report. To be sure the issue doesn't fall through
>>>>>>> the cracks unnoticed, I'm adding it to regzbot, the Linux kernel
>>>>>>> regression tracking bot:
>>>>>>>
>>>>>>> #regzbot ^introduced v6.1..v6.2
>>>>>>> #regzbot title drm: amdgpu: system becomes unpleasant to use after
>>>>>>> monitor starts blinking and turns full white
>>>>>>> #regzbot ignore-activity
>>>>>>>
>>>>>>> This isn't a regression? This issue or a fix for it are already
>>>>>>> discussed somewhere else? It was fixed already? You want to clarify
>>>>>>> when the regression started to happen? Or point out I got the title
>>>>>>> or something else totally wrong? Then just reply and tell me --
>>>>>>> ideally while also telling regzbot about it, as explained by the
>>>>>>> page listed in the footer of this mail.
>>>>>>>
>>>>>>> Developers: When fixing the issue, remember to add 'Link:' tags
>>>>>>> pointing to the report (the parent of this mail). See page linked
>>>>>>> in footer for details.
>>>>>> This sounds exactly like the issue that was fixed in this patch
>>>>>> which is already on its way to Linus:
>>>>>> https://gitlab.freedesktop.org/agd5f/linux/-/commit/08da182175db4c7f80850354849d95f2670e8cd9
>>>>>>
>>>>> FWIW, you in the flood of emails likely missed that this is the same
>>>>> thread where you yesterday replied "If the module parameter didn't
>>>>> help then perhaps you are seeing some other issue.  Can you bisect?".
>>>>> That's why I decided to add this to the tracking. Or am I missing
>>>>> something obvious here?
>>>>>
>>>>> /me looks around again and can't see anything, but that doesn't
>>>>> have to mean anything...
>>>>>
>>>>> Felix, btw, this guide might help you with the bisection, even if
>>>>> it's just for kernel compilation:
>>>>>
>>>>> https://docs.kernel.org/next/admin-guide/quickly-build-trimmed-linux.html
>>>>>
>>>>> And to indirectly reply to your mail from yesterday[1]. You might
>>>>> want to ignore the arch linux kernel git repo and just do a bisection
>>>>> between 6.1 and the latest 6.2.y kernel using upstream repos; and if
>>>>> I were you I'd also try 6.3 or even mainline before that, in case the
>>>>> issue was fixed already.
>>>>>
>>>>> [1]
>>>>> https://lore.kernel.org/all/04749ee4-0728-92fe-bcb0-a7320279eaac@felixrichter.tech/
>>>>>
>>>> Thanks for the pointers, I'll do a bisection on my desktop from 6.1 to
>>>> the newest commit.
>>> FWIW, I wonder what you actually mean with "newest commit" here: a
>>> bisection between 6.1 and mainline HEAD might be a waste of time, *if*
>>> this is something that only happens in 6.2.y (say due to a broken or
>>> incomplete backport)
>>>
>>>> That was the part I was mostly unsure about … where
>>>> to start from.
>>>>
>>>> I was planning to use PKGBUILD scripts from arch to achieve the same
>>>> configuration as I would when installing the package and just rewrite
>>>> the script to use a local copy of the source code instead of the
>>>> repository. That way I can just use the bisect command, rebuild the
>>>> package and test again.
>>> In my experience trying to deal with Linux distro's package managers
>>> creates more trouble than it's worth.
>>>
>>>> But I probably won't be able to finish it this week, since I am on
>>>> vacation starting tomorrow and will not have access to the computer in
>>>> question. I will be back next week, by that time the patch Alex is
>>>> talking about might already be in mainline. So if that fixes it, I
>>>> will notice and let you know. If not I will do the bisection to figure
>>>> out what the actual issue is.
>>> Enjoy your vacation!
>>>
>>> Ciao, Thorsten

[-- Attachment #2: dmidump.txt --]
[-- Type: text/plain, Size: 2711 bytes --]

# dmidecode 3.5
Getting SMBIOS data from sysfs.
SMBIOS 3.5.0 present.

Handle 0x000F, DMI type 16, 23 bytes
Physical Memory Array
	Location: System Board Or Motherboard
	Use: System Memory
	Error Correction Type: None
	Maximum Capacity: 128 GB
	Error Information Handle: 0x000E
	Number Of Devices: 4

Handle 0x0012, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x000F
	Error Information Handle: 0x0011
	Total Width: Unknown
	Data Width: Unknown
	Size: No Module Installed
	Form Factor: Unknown
	Set: None
	Locator: DIMM 0
	Bank Locator: P0 CHANNEL A
	Type: Unknown
	Type Detail: Unknown

Handle 0x0014, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x000F
	Error Information Handle: 0x0013
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 32 GB
	Form Factor: DIMM
	Set: None
	Locator: DIMM 1
	Bank Locator: P0 CHANNEL A
	Type: DDR5
	Type Detail: Synchronous Unbuffered (Unregistered)
	Speed: 5200 MT/s
	Manufacturer: Kingston
	Serial Number: D10C970D
	Asset Tag: Not Specified
	Part Number: KF552C40-32         
	Rank: 2
	Configured Memory Speed: 5200 MT/s
	Minimum Voltage: 1.1 V
	Maximum Voltage: 1.1 V
	Configured Voltage: 1.1 V
	Memory Technology: DRAM
	Memory Operating Mode Capability: Volatile memory
	Firmware Version: Unknown
	Module Manufacturer ID: Bank 2, Hex 0x98
	Module Product ID: Unknown
	Memory Subsystem Controller Manufacturer ID: Unknown
	Memory Subsystem Controller Product ID: Unknown
	Non-Volatile Size: None
	Volatile Size: 32 GB
	Cache Size: None
	Logical Size: None

Handle 0x0017, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x000F
	Error Information Handle: 0x0016
	Total Width: Unknown
	Data Width: Unknown
	Size: No Module Installed
	Form Factor: Unknown
	Set: None
	Locator: DIMM 0
	Bank Locator: P0 CHANNEL B
	Type: Unknown
	Type Detail: Unknown

Handle 0x0019, DMI type 17, 92 bytes
Memory Device
	Array Handle: 0x000F
	Error Information Handle: 0x0018
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 32 GB
	Form Factor: DIMM
	Set: None
	Locator: DIMM 1
	Bank Locator: P0 CHANNEL B
	Type: DDR5
	Type Detail: Synchronous Unbuffered (Unregistered)
	Speed: 5200 MT/s
	Manufacturer: Kingston
	Serial Number: D50C9730
	Asset Tag: Not Specified
	Part Number: KF552C40-32         
	Rank: 2
	Configured Memory Speed: 5200 MT/s
	Minimum Voltage: 1.1 V
	Maximum Voltage: 1.1 V
	Configured Voltage: 1.1 V
	Memory Technology: DRAM
	Memory Operating Mode Capability: Volatile memory
	Firmware Version: Unknown
	Module Manufacturer ID: Bank 2, Hex 0x98
	Module Product ID: Unknown
	Memory Subsystem Controller Manufacturer ID: Unknown
	Memory Subsystem Controller Product ID: Unknown
	Non-Volatile Size: None
	Volatile Size: 32 GB
	Cache Size: None
	Logical Size: None
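
For readers who want to compare their own system against the "64 or more gigs" threshold mentioned above, here is a minimal sketch that totals the installed memory from a saved `dmidecode --type=memory` dump. It is not part of the original attachment: the function name `installed_memory_gb` and the inline sample text are hypothetical, and it assumes sizes are reported as "N GB" or "N MB" as in the dump above.

```python
import re


def installed_memory_gb(dmidecode_text: str) -> float:
    """Sum the 'Size:' field of every populated Memory Device entry, in GB.

    Hypothetical helper for illustration; assumes sizes appear as
    'Size: <number> GB' or 'Size: <number> MB'.
    """
    total_mb = 0
    for line in dmidecode_text.splitlines():
        # Anchor on a line that starts with "Size:" so that fields such as
        # "Volatile Size:" or "Cache Size:" are not counted a second time.
        # Empty slots ("Size: No Module Installed") do not match the digits.
        match = re.match(r"^\s*Size:\s+(\d+)\s+(MB|GB)\s*$", line)
        if match:
            value = int(match.group(1))
            total_mb += value * 1024 if match.group(2) == "GB" else value
    return total_mb / 1024


# Shortened sample mirroring the layout of the dump above (assumed input).
sample = """\
Memory Device
\tSize: No Module Installed
Memory Device
\tSize: 32 GB
\tVolatile Size: 32 GB
Memory Device
\tSize: 32 GB
\tVolatile Size: 32 GB
"""

print(installed_memory_gb(sample))
```

In practice the text would come from a file holding the command's output rather than from an inline string.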



Thread overview: 13+ messages
     [not found] <46a7eb80-5f09-4f6a-4fd3-9550dafd497c@felixrichter.tech>
2023-05-02 11:44 ` PROBLEM: AMD Ryzen 9 7950X iGPU - Blinking Issue Linux regression tracking (Thorsten Leemhuis)
2023-05-02 13:13   ` Alex Deucher
2023-05-02 13:34     ` Linux regression tracking (Thorsten Leemhuis)
2023-05-02 13:39       ` Alex Deucher
2023-05-02 13:48       ` Felix Richter
2023-05-02 14:12         ` Linux regression tracking (Thorsten Leemhuis)
2023-06-03 14:52           ` Felix Richter
2023-06-05 14:11             ` Alex Deucher
2023-06-05 19:01               ` Felix Richter
2023-06-07  8:42               ` Felix Richter
2023-06-07 18:05                 ` Alex Deucher
2023-06-05 15:27             ` Hamza Mahfooz
2023-06-05 18:55               ` Felix Richter [this message]
