amd-gfx.lists.freedesktop.org archive mirror
* Power limit OD stopped working for navi10 - broken on previously working commit
@ 2020-02-09 21:13 Matt Coffin
  2020-02-10  2:14 ` Matt Coffin
  0 siblings, 1 reply; 2+ messages in thread
From: Matt Coffin @ 2020-02-09 21:13 UTC
  To: amd-gfx

While benchmarking, I noticed poor performance suggesting that my
overdrive settings were not being applied, even though they appear to be
set. hwmon/power1_cap reports the correctly adjusted value after it is
written to, and I confirmed with a quick patch that the updated power
limit value is actually being returned from the SMU after it is set. Yet
the card refuses to go over stock settings (power draw stays within
+/- 3% of stock, even with a 50% increase in power limit).
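
The comparison described above can be sketched roughly as follows. This
is only an illustration, not the quick patch mentioned above: it assumes
the card's hwmon directory exposes the standard power1_cap and
power1_average attributes (both in microwatts), and the helper names and
the 3% default tolerance are made up for this sketch.

```python
from pathlib import Path


def read_microwatts(hwmon_dir, attr):
    """Read one integer hwmon attribute (microwatts) from hwmon_dir."""
    return int((Path(hwmon_dir) / attr).read_text().strip())


def draw_vs_cap(hwmon_dir):
    """Return (cap_watts, average_watts) for the given hwmon directory."""
    cap = read_microwatts(hwmon_dir, "power1_cap")
    avg = read_microwatts(hwmon_dir, "power1_average")
    return cap / 1e6, avg / 1e6


def stuck_at_stock(average_watts, stock_watts, tolerance=0.03):
    """True if measured draw is still within `tolerance` of the stock limit."""
    return abs(average_watts - stock_watts) <= tolerance * stock_watts
```

Calling draw_vs_cap() on the card's hwmon directory under load, after
raising power1_cap, and passing the measured average to stuck_at_stock()
along with the stock limit would show whether the draw follows the new
cap or stays pinned at stock.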

I worked on that code a while back, so I started a bisect from
c39f062e881dcc6ab4c1c1c5835dc774be1bcfd6, a commit I know had working
power limit overdrive.

Strangely, I'm seeing the same behavior on that
previously-known-to-be-working commit!

This happens for both *increased* and *decreased* power limits. sysfs
reflects the change, but I see no change in the actual power draw on the
card, and for the *increased* case, performance reflects a card that is
throttling due to power limits.

Were there any firmware changes, or anything else, that could be causing
this? I don't know where to start, since a previously-working commit is
now somehow broken.

Since the behavior seems to have changed on me, it would also be
incredibly helpful if anyone could confirm whether or not they can
reproduce this problem on either the latest codebase or
c39f062e881dcc6ab4c1c1c5835dc774be1bcfd6.

Any help, testing information, or simple confirm/deny from your side
would go a long way.

Thanks in advance,
Matt
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: Power limit OD stopped working for navi10 - broken on previously working commit
  2020-02-09 21:13 Power limit OD stopped working for navi10 - broken on previously working commit Matt Coffin
@ 2020-02-10  2:14 ` Matt Coffin
  0 siblings, 0 replies; 2+ messages in thread
From: Matt Coffin @ 2020-02-10  2:14 UTC
  To: amd-gfx



Sorry for the followup, but I did finally manage to track this down to a
firmware/driver incompatibility and bisected `linux-firmware` to find
when it broke.

Since the firmware is just binaries, I can't really tell you what is
wrong, but this is the commit where writing to the sysfs interface (and,
more generally, sending the SetPptPowerLimit message to the SMC) stopped
having any effect.

https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/amdgpu?id=af76fd0ed266440ac406d5737218af7ac3cfc750

Let me know what I can do to help get this fixed. For now, I've just
downgraded to the first-released microcode as a stop-gap.
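
For anyone wanting to repeat the firmware bisect, the search amounts to
something like the sketch below. This is a generic binary search, not a
record of the exact steps used: `commits` and `power_limit_works` are
hypothetical placeholders for an oldest-to-newest list of linux-firmware
commits and a manual check that the power limit takes effect with that
firmware installed.

```python
def first_bad_commit(commits, power_limit_works):
    """Binary-search an oldest-to-newest commit list for the first bad entry.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if power_limit_works(commits[mid]):
            good = mid  # breakage was introduced later than mid
        else:
            bad = mid   # breakage was introduced at or before mid
    return commits[bad]
```

Each call to the check means installing that firmware revision,
reloading the driver, and watching the power draw, so halving the range
on every step keeps the number of reboots small.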


