From: Thorsten Leemhuis <regressions@leemhuis.info>
To: "regressions@lists.linux.dev" <regressions@lists.linux.dev>
Subject: Re: [Bug 215315] New: [REGRESSION BISECTED] amdgpu crashes system suspend - NUC8i7HVKVA #forregzbot
Date: Mon, 10 Jan 2022 13:02:42 +0100
Message-ID: <f0103811-bca2-e439-ca73-2132fd6e9871@leemhuis.info>
In-Reply-To: <8e1abb43-664b-5882-7c02-ef517c14fc94@leemhuis.info>
For the record, the culprit was reverted:
https://git.kernel.org/torvalds/c/df5bc0aa7ff6e2e14cb75182b4eda20253c711d4
#regzbot fixed-by: df5bc0aa7ff6e2e14cb75182b4eda20253c711d4
TWIMC: this mail is primarily sent for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.
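
For anyone scripting around mails like this one, here is a minimal sketch of how the '#regzbot' command lines in a mail body could be picked out. It is not regzbot's actual parser: the function name and the regular expression are assumptions made purely for illustration, and quoted lines (those starting with '>') are deliberately skipped.

```python
import re

# Hypothetical helper for illustration only; not part of regzbot itself.
# Matches lines such as "#regzbot fixed-by: <commit>" or
# "#regzbot ^introduced <commit>"; the colon after the command is optional.
REGZBOT_LINE = re.compile(r"^#regzbot\s+(\^?[\w-]+):?\s*(.*)$")


def extract_regzbot_commands(body: str) -> list[tuple[str, str]]:
    """Return (command, argument) pairs for unquoted '#regzbot' lines."""
    commands = []
    for line in body.splitlines():
        # Quoted lines begin with '>' and therefore never match.
        match = REGZBOT_LINE.match(line.strip())
        if match:
            commands.append((match.group(1), match.group(2).strip()))
    return commands
```

Fed the body of this mail, such a helper would report only the 'fixed-by' command above, since the other commands sit in the quoted text below.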
On 13.12.21 07:04, Thorsten Leemhuis wrote:
> [TLDR: adding this regression to regzbot; most of this mail is compiled
> from a few template paragraphs some of you might have seen already.]
>
> Hi, this is your Linux kernel regression tracker speaking.
>
> Top-posting for once, to make this easily accessible to everyone.
>
> Thanks for the report.
>
> Adding the regression mailing list to the list of recipients, as it
> should be in the loop for all regressions, as explained here:
> https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
>
> To be sure this issue doesn't fall through the cracks unnoticed, I'm
> adding it to regzbot, my Linux kernel regression tracking bot:
>
> #regzbot ^introduced f7d6779df642720e22bffd449e683bb8690bd3bf
> #regzbot title drm: amdgpu: NUC8i7HVKVA crashes during system suspend
> #regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=215315
> #regzbot ignore-activity
>
> Reminder: when fixing the issue, please add a 'Link:' tag with the URL
> to the report (the parent of this mail), then regzbot will automatically
> mark the regression as resolved once the fix lands in the appropriate
> tree. For more details about regzbot see footer.
>
> Sending this to everyone that got the initial report, to make all aware
> of the tracking. I also hope that messages like this motivate people to
> directly get at least the regression mailing list and ideally even
> regzbot involved when dealing with regressions, as messages like this
> wouldn't be needed then.
>
> Don't worry, I'll send further messages wrt this regression just to
> the lists (with a tag in the subject so people can filter them away), as
> long as they are intended just for regzbot. With a bit of luck no such
> messages will be needed anyway.
>
> Ciao, Thorsten (wearing his 'Linux kernel regression tracker' hat).
>
> P.S.: As a Linux kernel regression tracker I'm getting a lot of reports
> on my table. I can only look briefly into most of them. Unfortunately
> therefore I sometimes will get things wrong or miss something important.
> I hope that's not the case here; if you think it is, don't hesitate to
> tell me about it in a public reply. That's in everyone's interest, as
> what I wrote above might be misleading to everyone reading this; any
> suggestion I gave thus might send someone reading this down the wrong
> rabbit hole, which none of us wants.
>
> BTW, I have no personal interest in this issue, which is tracked using
> regzbot, my Linux kernel regression tracking bot
> (https://linux-regtracking.leemhuis.info/regzbot/). I'm only posting
> this mail to get things rolling again and hence don't need to be CCed on
> all further activities wrt this regression.
>
>
> On 13.12.21 00:08, bugzilla-daemon@bugzilla.kernel.org wrote:
>> https://bugzilla.kernel.org/show_bug.cgi?id=215315
>>
>> Bug ID: 215315
>> Summary: [REGRESSION BISECTED] amdgpu crashes system suspend -
>> NUC8i7HVKVA
>> Product: Drivers
>> Version: 2.5
>> Kernel Version: 5.15-rc1, 5.15, 5.16-rc4
>> Hardware: x86-64
>> OS: Linux
>> Tree: Mainline
>> Status: NEW
>> Severity: normal
>> Priority: P1
>> Component: Video(DRI - non Intel)
>> Assignee: drivers_video-dri@kernel-bugs.osdl.org
>> Reporter: lenb@kernel.org
>> Regression: No
>>
>> My Intel NUC8i7HVKVA has an AMD GPU.
>>
>> Until 5.15-rc1, this machine was rock solid in suspend stress testing -- never
>> crashing after hundreds of hours of back-to-back suspend cycles.
>>
>> Until this patch went upstream:
>>
>> commit f7d6779df642720e22bffd449e683bb8690bd3bf (refs/bisect/bad)
>> Author: Guchun Chen <guchun.chen@amd.com>
>> Date: Fri Aug 27 18:31:41 2021 +0800
>>
>> drm/amdgpu: stop scheduler when calling hw_fini (v2)
>>
>> This gurantees no more work on the ring can be submitted
>> to hardware in suspend/resume case, otherwise a potential
>> race will occur and the ring will get no chance to stay
>> empty before suspend.
>>
>> v2: Call drm_sched_resubmit_job before drm_sched_start to
>> restart jobs from the pending list.
>>
>> Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> Suggested-by: Christian König <christian.koenig@amd.com>
>> Signed-off-by: Guchun Chen <guchun.chen@amd.com>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>> Cc: stable@vger.kernel.org
>>
>> My bisection shows that the kernel just before this patch was integrated can
>> handle over 1,000 back-to-back "freeze" system suspend cycles. Yet, when this
>> patch is present, the system may crash before completing even 100 cycles, and
>> lasts a few hundred cycles at most.
>>
>> This crash is present in all following upstream rc's, including 5.15-rc4.
>>
>> When I revert this patch from 5.15-rc4, stability returns.
>>
>> Usually, the crash manifests as a black screen and a system that does not
>> respond to ping; it responds only to a long press of the power button to
>> remove power, followed by a cold reboot.
>>
>> I have witnessed the crash occur, and the "ubuntu color themed" screen enters
>> some sort of reverse video mode. In this weird color mode, I've seen a text
>> window oscillate between scrolling and un-scrolling for a line -- sort of like
>> it is going back in time, but then changes its mind. There is no response to
>> keyboard, mouse, or network input.
>>
>