dri-devel.lists.freedesktop.org archive mirror
From: bugzilla-daemon@bugzilla.kernel.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 206475] amdgpu under load drop signal to monitor until hard reset
Date: Tue, 16 Jun 2020 16:39:56 +0000
Message-ID: <bug-206475-2300-Tb15mie6TM@https.bugzilla.kernel.org/>
In-Reply-To: <bug-206475-2300@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=206475

--- Comment #14 from Andrew Ammerlaan (andrewammerlaan@riseup.net) ---
I sort of worked around this too.

I changed two things:

1) The iGPU is now the primary GPU, and I use DRI_PRIME=1 to offload to the AMD
GPU. This has reduced the number of things that are rendered on the AMD card.
It didn't actually fix anything, but it did remove the need for a hard reboot
when the AMD GPU resets. Now, when the GPU resets, only the applications
rendered on the AMD card stop working; the desktop and everything else stay
functional.

2) I added three fans to my PC. The card's thermal sensor never reported
reaching the critical temperature (it went up to 82 Celsius max; critical is
91 Celsius), but there definitely does seem to be a correlation between high
temperatures and the occurrence of the resets. And more fans are always
better anyway.
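
For anyone who wants to try the offload setup from 1), here is a minimal
sketch of launching a single program with DRI_PRIME=1 set only for that
process. The helper names (prime_env, run_offloaded) are made up for this
example, and it assumes the AMD card is the secondary GPU, as in my setup.

```python
import os
import subprocess


def prime_env(base_env=None, gpu_index="1"):
    """Return a copy of the environment with DRI_PRIME set.

    gpu_index="1" selects the secondary GPU, matching the DRI_PRIME=1
    usage described above. base_env defaults to the current environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DRI_PRIME"] = gpu_index
    return env


def run_offloaded(cmd):
    """Run cmd (a list of arguments) with rendering offloaded via DRI_PRIME."""
    return subprocess.run(cmd, env=prime_env(), check=False)


# Example (assumes glxinfo is installed); the reported renderer should be
# the AMD card if the offload works:
#   run_offloaded(["glxinfo", "-B"])
```

Setting the variable per process, rather than globally, keeps the desktop on
the iGPU, so a reset of the AMD card only takes down the offloaded programs.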

I still experienced some resets after switching the primary GPU to the iGPU,
but only if I really pushed it to its limits. I haven't had a single reset
since I added the fans. (Though admittedly I haven't run a decent stress test
yet, so it is still too early to conclude that the problem is completely gone.)

Since under-clocking the card worked for you, and adding fans seems to work for
me, I have a hunch that even though the thermal sensor doesn't report
problematic temperatures, some parts of the card do reach problematic
temperatures nonetheless, which might cause issues leading to a reset.
I'm not sure where the sensor is physically located, but considering that the
card is quite large, it doesn't seem that far-fetched to me that there could be
quite a large difference in temperature between two points on the card.
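
To compare what the driver reports against the critical threshold, something
like the sketch below could be used. It assumes the amdgpu driver exposes its
sensors under /sys/class/hwmon as temp*_input / temp*_crit files in
millidegrees Celsius, optionally with temp*_label files; which sensors exist
depends on the card and kernel, and the function names are only illustrative.

```python
from pathlib import Path


def read_millideg(path):
    """Read a hwmon temperature file and return degrees Celsius, or None."""
    try:
        return int(Path(path).read_text().strip()) / 1000.0
    except (OSError, ValueError):
        return None


def amdgpu_temps(hwmon_root="/sys/class/hwmon"):
    """Return a list of (label, current_C, critical_C) for amdgpu sensors.

    Assumption: each hwmon device directory has a 'name' file, and amdgpu
    devices report the name 'amdgpu'.
    """
    readings = []
    root = Path(hwmon_root)
    if not root.is_dir():
        return readings
    for dev in sorted(root.iterdir()):
        try:
            if (dev / "name").read_text().strip() != "amdgpu":
                continue
        except OSError:
            continue
        for inp in sorted(dev.glob("temp*_input")):
            prefix = inp.name[: -len("_input")]
            label_file = dev / f"{prefix}_label"
            # Fall back to the file prefix (e.g. "temp1") if there is no label.
            label = label_file.read_text().strip() if label_file.exists() else prefix
            crit = read_millideg(dev / f"{prefix}_crit")
            readings.append((label, read_millideg(inp), crit))
    return readings


def headroom(current, critical):
    """Degrees Celsius left before the critical threshold, or None if unknown."""
    if current is None or critical is None:
        return None
    return critical - current


# Example: print every amdgpu sensor with its margin to critical.
for label, cur, crit in amdgpu_temps():
    print(label, cur, crit, headroom(cur, crit))
```

If the card exposes more than one sensor, logging all of them under load
would show whether one spot runs much hotter than the one I have been
watching.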

Perhaps this card could benefit from a second thermal sensor or earlier and/or
more aggressive thermal throttling.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
