From: Paul Menzel <pmenzel@molgen.mpg.de>
To: Alex Deucher <alexdeucher@gmail.com>
Cc: "Dave Airlie" <airlied@linux.ie>,
"Richard Gong" <richard.gong@amd.com>,
"Xinhui Pan" <xinhui.pan@amd.com>,
LKML <linux-kernel@vger.kernel.org>,
"amd-gfx list" <amd-gfx@lists.freedesktop.org>,
"Maling list - DRI developers" <dri-devel@lists.freedesktop.org>,
"Daniel Vetter" <daniel@ffwll.ch>,
"Alexander Deucher" <alexander.deucher@amd.com>,
"Christian König" <christian.koenig@amd.com>,
"Mario Limonciello" <mario.limonciello@amd.com>
Subject: Re: [PATCHv4] drm/amdgpu: disable ASPM on Intel Alder Lake based systems
Date: Wed, 20 Apr 2022 22:48:31 +0200 [thread overview]
Message-ID: <295e7882-21a2-f50f-6bfa-b0bae1d0fa12@molgen.mpg.de> (raw)
In-Reply-To: <CADnq5_NGTbTjn87tAsTJAMOKZ9niS7Mb=JDmjUH4KJXfDH_p_g@mail.gmail.com>
Dear Alex,
Am 20.04.22 um 22:40 schrieb Alex Deucher:
> On Wed, Apr 20, 2022 at 4:29 PM Paul Menzel <pmenzel@molgen.mpg.de> wrote:
>> Am 19.04.22 um 23:46 schrieb Gong, Richard:
>>
>>> On 4/14/2022 2:52 AM, Paul Menzel wrote:
>>>> [Cc: -kernel test robot <lkp@intel.com>]
>>
>> […]
>>
>>>> Am 13.04.22 um 15:00 schrieb Alex Deucher:
>>>>> On Wed, Apr 13, 2022 at 3:43 AM Paul Menzel wrote:
>>>>
>>>>>> Thank you for sending out v4.
>>>>>>
>>>>>> Am 12.04.22 um 23:50 schrieb Richard Gong:
>>>>>>> Active State Power Management (ASPM) feature is enabled since
>>>>>>> kernel 5.14.
>>>>>>> There are some AMD GFX cards (such as WX3200 and RX640) that won't
>>>>>>> work
>>>>>>> with ASPM-enabled Intel Alder Lake based systems. Using these GFX
>>>>>>> cards as
>>>>>>> video/display output, Intel Alder Lake based systems will hang during
>>>>>>> suspend/resume.
>>
>> [Your email program wraps lines in cited text for some reason, making
>> the citation harder to read.]
>>
>>>>>>
>>>>>> I am still not clear, what “hang during suspend/resume” means. I guess
>>>>>> suspending works fine? During resume (S3 or S0ix?), where does it hang?
>>>>>> The system is functional, but there are only display problems?
>>> System freeze after suspend/resume.
>>
>> But you see certain messages still? At what point does it freeze
>> exactly? In the bug report you posted Linux messages.
>>
>>>>>>> The issue was initially reported on one system (Dell Precision 3660
>>>>>>> with
>>>>>>> BIOS version 0.14.81), but was later confirmed to affect at least 4
>>>>>>> Alder
>>>>>>> Lake based systems.
>>>>>>>
>>>>>>> Add extra check to disable ASPM on Intel Alder Lake based systems.
>>>>>>>
>>>>>>> Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default")
>>>>>>> Link:
>>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Famd%2F-%2Fissues%2F1885&data=04%7C01%7Crichard.gong%40amd.com%7Ce7febed5d6a441c3a58008da1debb99c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637855195670542145%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=7cEnE%2BSM9e5IGFxSLloCLtCOxovBpaPz0Ns0Ta2vVlc%3D&reserved=0
>>
>> Thank you Microsoft Outlook for keeping us safe. :(
>>
>>>>>>>
>>>>>>> Reported-by: kernel test robot <lkp@intel.com>
>>>>>>
>>>>>> This tag is a little confusing. Maybe clarify that it was for an issue
>>>>>> in a previous patch iteration?
>>>
>>> I did describe in change-list version 3 below, which corrected the build
>>> error with W=1 option.
>>>
>>> It is not good idea to add the description for that to the commit
>>> message, this is why I add descriptions on change-list version 3.
>>
>> Do as you wish, but the current style is confusing, and readers of the
>> commit are going to think, the kernel test robot reported the problem
>> with AMD VI ASICs and Intel Alder Lake systems.
>>
>>>>>>
>>>>>>> Signed-off-by: Richard Gong <richard.gong@amd.com>
>>>>>>> ---
>>>>>>> v4: s/CONFIG_X86_64/CONFIG_X86
>>>>>>> enhanced check logic
>>>>>>> v3: s/intel_core_asom_chk/aspm_support_quirk_check
>>>>>>> correct build error with W=1 option
>>>>>>> v2: correct commit description
>>>>>>> move the check from chip family to problematic platform
>>>>>>> ---
>>>>>>> drivers/gpu/drm/amd/amdgpu/vi.c | 17 ++++++++++++++++-
>>>>>>> 1 file changed, 16 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/vi.c
>>>>>>> index 039b90cdc3bc..b33e0a9bee65 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
>>>>>>> @@ -81,6 +81,10 @@
>>>>>>> #include "mxgpu_vi.h"
>>>>>>> #include "amdgpu_dm.h"
>>>>>>>
>>>>>>> +#if IS_ENABLED(CONFIG_X86)
>>>>>>> +#include <asm/intel-family.h>
>>>>>>> +#endif
>>>>>>> +
>>>>>>> #define ixPCIE_LC_L1_PM_SUBSTATE 0x100100C6
>>>>>>> #define PCIE_LC_L1_PM_SUBSTATE__LC_L1_SUBSTATES_OVERRIDE_EN_MASK
>>>>>>> 0x00000001L
>>>>>>> #define PCIE_LC_L1_PM_SUBSTATE__LC_PCI_PM_L1_2_OVERRIDE_MASK
>>>>>>> 0x00000002L
>>>>>>> @@ -1134,13 +1138,24 @@ static void vi_enable_aspm(struct
>>>>>>> amdgpu_device *adev)
>>>>>>> WREG32_PCIE(ixPCIE_LC_CNTL, data);
>>>>>>> }
>>>>>>>
>>>>>>> +static bool aspm_support_quirk_check(void)
>>>>>>> +{
>>>>>>> + if (IS_ENABLED(CONFIG_X86)) {
>>>>>>> + struct cpuinfo_x86 *c = &cpu_data(0);
>>>>>>> +
>>>>>>> + return !(c->x86 == 6 && c->x86_model ==
>>>>>>> INTEL_FAM6_ALDERLAKE);
>>>>>>> + }
>>>>>>> +
>>>>>>> + return true;
>>>>>>> +}
>>>>>>> +
>>>>>>> static void vi_program_aspm(struct amdgpu_device *adev)
>>>>>>> {
>>>>>>> u32 data, data1, orig;
>>>>>>> bool bL1SS = false;
>>>>>>> bool bClkReqSupport = true;
>>>>>>>
>>>>>>> - if (!amdgpu_device_should_use_aspm(adev))
>>>>>>> + if (!amdgpu_device_should_use_aspm(adev) ||
>>>>>>> !aspm_support_quirk_check())
>>>>>>> return;
>>>>>>
>>>>>> Can users still forcefully enable ASPM with the parameter
>>>>>> `amdgpu.aspm`?
>>>>>>
>>> As Mario mentioned in a separate reply, we can't forcefully enable ASPM
>>> with the parameter 'amdgpu.aspm'.
>>
>> That would be a regression on systems where ASPM used to work. Hmm. I
>> guess, you could say, there are no such systems.
>>
>>>>>>>
>>>>>>> if (adev->flags & AMD_IS_APU ||
>>>>>>
>>>>>> If I remember correctly, there were also newer cards, where ASPM worked
>>>>>> with Intel Alder Lake, right? Can only the problematic generations for
>>>>>> WX3200 and RX640 be excluded from ASPM?
>>>>>
>>>>> This patch only disables it for the generatioaon that was problematic.
>>>>
>>>> Could that please be made clear in the commit message summary, and
>>>> message?
>>>
>>> Are you ok with the commit messages below?
>>
>> Please change the commit message summary. Maybe:
>>
>> drm/amdgpu: VI: Disable ASPM on Intel Alder Lake based systems
>>
>>> Active State Power Management (ASPM) feature is enabled since kernel 5.14.
>>>
>>> There are some AMD GFX cards (such as WX3200 and RX640) that won't work
>>> with ASPM-enabled Intel Alder Lake based systems. Using these GFX cards as
>>> video/display output, Intel Alder Lake based systems will freeze after
>>> suspend/resume.
>>
>> Something like:
>>
>> On Intel Alder Lake based systems using ASPM with AMD GFX Volcanic
>> Islands (VI) cards, like WX3200 and RX640, graphics don’t initialize
>> when resuming from S0ix(?).
>>
>>
>>> The issue was initially reported on one system (Dell Precision 3660 with
>>> BIOS version 0.14.81), but was later confirmed to affect at least 4 Alder
>>> Lake based systems.
>>
>> Which ones?
>>
>>> Add extra check to disable ASPM on Intel Alder Lake based systems with
>>> problematic generation GFX cards.
>>
>> … with the problematic Volcanic Islands GFX cards.
>>
>>>>
>>>> Loosely related, is there a public (or internal issue) to analyze how
>>>> to get ASPM working for VI generation devices with Intel Alder Lake?
>>>
>>> As Alex mentioned, we need support from Intel. We don't have any update
>>> on that.
>>
>> It’d be great to get that fixed properly.
>>
>> Last thing, please don’t hate me, does Linux log, that ASPM is disabled?
>
> I'm not sure what gets logged at the platform level with respect to
> ASPM, but whether or not the driver enables ASPM is tied to whether
> ASPM is allowed at the platform level or not so if the platform
> indicates that ASPM is not supported, the driver won't enable it. The
> driver does not log whether ASPM is enabled or not if that is what you
> are asking. As to whether or not it should, it comes down to how much
> stuff is worth indiciating in the log. The driver is already pretty
> chatty by driver standards.
I specifically mean, Linux should log the quirks it applies. (As a
normal user, I’d also expect ASPM to work nowadays, so a message, that
it’s disabled would help a lot.)
Kind regards,
Paul
next prev parent reply other threads:[~2022-04-20 20:48 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-12 21:50 [PATCHv4] drm/amdgpu: disable ASPM on Intel Alder Lake based systems Richard Gong
2022-04-13 4:29 ` Lazar, Lijo
2022-04-13 7:43 ` Paul Menzel
2022-04-13 13:00 ` Alex Deucher
2022-04-13 13:28 ` Limonciello, Mario
2022-04-14 7:52 ` Paul Menzel
2022-04-14 13:11 ` Alex Deucher
[not found] ` <94fd858d-1792-9c05-b5c6-1b028427687d@amd.com>
2022-04-20 20:29 ` Paul Menzel
2022-04-20 20:40 ` Alex Deucher
2022-04-20 20:48 ` Paul Menzel [this message]
2022-04-20 20:56 ` Gong, Richard
2022-04-20 21:02 ` Paul Menzel
2022-04-20 21:12 ` Gong, Richard
2022-04-20 21:15 ` Alex Deucher
2022-04-20 21:13 ` Alex Deucher
2022-04-20 21:16 ` Limonciello, Mario
2022-04-21 1:12 ` Gong, Richard
2022-04-21 5:35 ` Paul Menzel
2022-04-26 13:53 ` Gong, Richard
2022-05-01 7:08 ` Paul Menzel
2022-05-02 14:56 ` Gong, Richard
2022-05-03 12:25 ` Daniel Stone
2022-05-03 12:44 ` Paul Menzel
2022-04-13 15:40 ` Nathan Chancellor
2022-04-19 21:08 ` Gong, Richard
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=295e7882-21a2-f50f-6bfa-b0bae1d0fa12@molgen.mpg.de \
--to=pmenzel@molgen.mpg.de \
--cc=airlied@linux.ie \
--cc=alexander.deucher@amd.com \
--cc=alexdeucher@gmail.com \
--cc=amd-gfx@lists.freedesktop.org \
--cc=christian.koenig@amd.com \
--cc=daniel@ffwll.ch \
--cc=dri-devel@lists.freedesktop.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mario.limonciello@amd.com \
--cc=richard.gong@amd.com \
--cc=xinhui.pan@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).