dri-devel.lists.freedesktop.org archive mirror
* [Bug 202043] New: amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting
@ 2018-12-22 12:24 bugzilla-daemon
  2018-12-22 15:31 ` [Bug 202043] " bugzilla-daemon
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: bugzilla-daemon @ 2018-12-22 12:24 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=202043

            Bug ID: 202043
           Summary: amdgpu: Vega 56 SCLK drops to 700 Mhz when
                    undervolting
           Product: Drivers
           Version: 2.5
    Kernel Version: 4.19.8, 4.20.0-rc6
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Video(DRI - non Intel)
          Assignee: drivers_video-dri@kernel-bugs.osdl.org
          Reporter: antifermion@protonmail.com
        Regression: No

When I undervolt my Sapphire Pulse Vega 56 by just 1 mV, SCLK immediately drops
to 700 MHz and pstate 1-2 under load (`gputest /test=fur /width=1920
/height=1080`).

Script to undervolt:
```
echo "s 7 1630 1199" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "c" > /sys/class/drm/card0/device/pp_od_clk_voltage
```

Stock voltage would be 1200 mV on the Vega 64 BIOS.
The same behavior can be observed with the stock Vega 56 BIOS.
Undervolting the memory by 1 mV results in similar behavior.
Overvolting by 1 mV has no discernible effect.

`echo r > pp_od_clk_voltage` does not restore normal behavior.
Instead, I need to write the stock value back with `echo "s 7 1630 1200" > pp_od_clk_voltage`, as above.
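
A minimal sketch of the undervolt and restore steps described above, assuming the `s <state> <clock> <voltage>` line format used in this report. `make_sclk_cmd`, `apply_sclk_cmd`, and the `OD_FILE` variable are hypothetical helpers introduced for illustration, not part of amdgpu or any existing tool.

```shell
# Build the "s <state> <clock_MHz> <voltage_mV>" line for pp_od_clk_voltage.
# Arguments: state, clock in MHz, stock voltage in mV, undervolt offset in mV.
make_sclk_cmd() {
    state=$1; clock_mhz=$2; stock_mv=$3; offset_mv=$4
    echo "s $state $clock_mhz $((stock_mv - offset_mv))"
}

# Write one line to the overdrive file, then commit it with "c".
# OD_FILE is an assumed override; it defaults to the path used in this report.
apply_sclk_cmd() {
    od_file=${OD_FILE:-/sys/class/drm/card0/device/pp_od_clk_voltage}
    echo "$1" > "$od_file" && echo "c" > "$od_file"
}

# Usage (as root):
#   apply_sclk_cmd "$(make_sclk_cmd 7 1630 1200 1)"   # undervolt by 1 mV
#   apply_sclk_cmd "$(make_sclk_cmd 7 1630 1200 0)"   # write the stock value back
```

Keeping the stock voltage and the offset as separate arguments makes the restore step the same call with an offset of 0.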

Without undervolting, SCLK is around 1330 MHz, which matches the behavior on
Windows, where undervolting by around 150 mV is no problem and increases the clock.

With an increased power limit of 300 W, the clocks increase to around 1100 MHz
while the card uses the full 300 W.
It even hits that limit with a significant underclock/undervolt that would
pull around 200 W on Windows.

I tested with current Manjaro (4.19.8-2-MANJARO), as well as Kubuntu 18.10 with
stock (4.18) and 4.20 from
https://github.com/M-Bab/linux-kernel-amdgpu-binaries.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


* [Bug 202043] amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting
  2018-12-22 12:24 [Bug 202043] New: amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting bugzilla-daemon
@ 2018-12-22 15:31 ` bugzilla-daemon
  2018-12-24 13:05 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: bugzilla-daemon @ 2018-12-22 15:31 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=202043

--- Comment #1 from antifermion@protonmail.com ---
The driver source says
> To manually adjust these settings, first select manual using power_dpm_force_performance_level

However, selecting manual does not change anything.
Setting the power state manually via `pp_dpm_sclk` does not work for me either.
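
To check whether a forced level actually took effect, the active level can be read back from `pp_dpm_sclk`. The sketch below assumes each line of that file looks like `<index>: <clock>Mhz`, with a trailing ` *` marking the active level; `active_dpm_level` is a hypothetical helper name.

```shell
# Print the index of the active DPM level from pp_dpm_sclk-style text on stdin.
# Assumed line format: "<index>: <clock>Mhz", with " *" appended to the active level.
active_dpm_level() {
    awk '/\*[ \t]*$/ { sub(/:$/, "", $1); print $1 }'
}

# Usage:
#   active_dpm_level < /sys/class/drm/card0/device/pp_dpm_sclk
```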


* [Bug 202043] amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting
  2018-12-22 12:24 [Bug 202043] New: amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting bugzilla-daemon
  2018-12-22 15:31 ` [Bug 202043] " bugzilla-daemon
@ 2018-12-24 13:05 ` bugzilla-daemon
  2018-12-26 19:39 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: bugzilla-daemon @ 2018-12-24 13:05 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=202043

fin4478@hotmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fin4478@hotmail.com

--- Comment #2 from fin4478@hotmail.com ---
Make sure you have `amdgpu.ppfeaturemask=0xffffffff` on the kernel command line.


* [Bug 202043] amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting
  2018-12-22 12:24 [Bug 202043] New: amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting bugzilla-daemon
  2018-12-22 15:31 ` [Bug 202043] " bugzilla-daemon
  2018-12-24 13:05 ` bugzilla-daemon
@ 2018-12-26 19:39 ` bugzilla-daemon
  2019-03-20  7:40 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: bugzilla-daemon @ 2018-12-26 19:39 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=202043

--- Comment #3 from antifermion@protonmail.com ---
I do have that enabled (my `/proc/cmdline` is
`BOOT_IMAGE=/boot/vmlinuz-4.19-x86_64
root=UUID=2994b8cf-341a-48b0-b49d-771df87dc509 rw quiet
amdgpu.ppfeaturemask=0xffffffff`)
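
A small sketch for confirming the parameter on a running system: it pulls `amdgpu.ppfeaturemask` out of a kernel command line string. `ppfeaturemask_from` is a hypothetical helper name, and the function only inspects the string it is given.

```shell
# Print the value of amdgpu.ppfeaturemask found in a kernel command line
# string, or print nothing if the parameter is absent.
ppfeaturemask_from() {
    for arg in $1; do
        case $arg in
            amdgpu.ppfeaturemask=*) echo "${arg#amdgpu.ppfeaturemask=}" ;;
        esac
    done
}

# Usage:
#   ppfeaturemask_from "$(cat /proc/cmdline)"
```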


* [Bug 202043] amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting
  2018-12-22 12:24 [Bug 202043] New: amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting bugzilla-daemon
                   ` (2 preceding siblings ...)
  2018-12-26 19:39 ` bugzilla-daemon
@ 2019-03-20  7:40 ` bugzilla-daemon
  2019-03-20  7:45 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: bugzilla-daemon @ 2019-03-20  7:40 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=202043

Ivan Avdeev (lists@w23.ru) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lists@w23.ru

--- Comment #4 from Ivan Avdeev (lists@w23.ru) ---
I have the very same issue on Vega 64. Changing any voltage value from stock
even by 1mV leads to severe throttling.

E.g.:
```
DEV=/sys/class/drm/card0/device # assumed path; adjust the card index as needed
echo "manual" > $DEV/power_dpm_force_performance_level
echo "2" > $DEV/pp_sclk_od
echo "2" > $DEV/pp_mclk_od
echo "s 7 1663 1190" > $DEV/pp_od_clk_voltage # stock = 1200
echo "c" > $DEV/pp_od_clk_voltage
```

This results in roughly half the performance, plus stutter.

This doesn't happen on Windows on the same hardware.

Card: MSI Radeon RX Vega 64 Air Boost 8G OC
Kernels tried: 4.20, 5.0-rc2, 5.0.2
Mesa: 18.3.3, 18.3.4
MB: ASUS Prime X399-A, bios 0808
CPU: Threadripper 1950X


* [Bug 202043] amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting
  2018-12-22 12:24 [Bug 202043] New: amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting bugzilla-daemon
                   ` (3 preceding siblings ...)
  2019-03-20  7:40 ` bugzilla-daemon
@ 2019-03-20  7:45 ` bugzilla-daemon
  2019-03-27  0:41 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: bugzilla-daemon @ 2019-03-20  7:45 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=202043

--- Comment #5 from Ivan Avdeev (lists@w23.ru) ---
Created attachment 281917
  --> https://bugzilla.kernel.org/attachment.cgi?id=281917&action=edit
Example of undervolting effect

Example of what happens when undervolting is attempted. Notice how:
- frame time goes from stable ~15ms to stuttering 25-30ms
- core freq drops from ~1.4GHz to 1.1GHz
- memory freq goes from stable ~900MHz to rapidly oscillating between
200-900MHz.

Captured by running Unigine Superposition in game mode on medium settings on
a 2560x1440 screen with
`GALLIUM_HUD=.d.w1920frametime,.d.w1920shader-clock+memory-clock
GALLIUM_HUD_PERIOD=0`


* [Bug 202043] amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting
  2018-12-22 12:24 [Bug 202043] New: amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting bugzilla-daemon
                   ` (4 preceding siblings ...)
  2019-03-20  7:45 ` bugzilla-daemon
@ 2019-03-27  0:41 ` bugzilla-daemon
  2019-03-27  0:45 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: bugzilla-daemon @ 2019-03-27  0:41 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=202043

mistarzy@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mistarzy@gmail.com

--- Comment #6 from mistarzy@gmail.com ---
Same issue here with a Vega 64. From watching
/sys/kernel/debug/dri/0/amdgpu_pm_info, my conclusion is that the driver
applies the maximum voltage even though the tables in
/sys/class/drm/card0/device/pp_od_clk_voltage suggest otherwise. I tested
on Windows 10 with the same pp_od_clk_voltage values applied and saw no issues.
For testing, I applied 1.2 V to all states and got the same result with amdgpu.
Tested on the 5.0.3 and 5.0.4 Ubuntu mainline kernels.
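
One way to watch the applied voltage is to extract it from `amdgpu_pm_info`. This is a sketch under the assumption that the file contains a line of the form `<value> mV (VDDGFX)`; the exact layout may differ between kernel versions, and `vddgfx_mv` is a hypothetical helper name.

```shell
# Print the GPU core voltage in mV from amdgpu_pm_info-style text on stdin.
# Assumed line format: "<value> mV (VDDGFX)", possibly indented.
vddgfx_mv() {
    awk '/\(VDDGFX\)/ { print $1; exit }'
}

# Usage (as root, debugfs mounted):
#   vddgfx_mv < /sys/kernel/debug/dri/0/amdgpu_pm_info
```

Comparing this reading against the voltage listed for the active state in pp_od_clk_voltage shows whether the table value is actually being applied.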


* [Bug 202043] amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting
  2018-12-22 12:24 [Bug 202043] New: amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting bugzilla-daemon
                   ` (5 preceding siblings ...)
  2019-03-27  0:41 ` bugzilla-daemon
@ 2019-03-27  0:45 ` bugzilla-daemon
  2019-09-29  5:04 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: bugzilla-daemon @ 2019-03-27  0:45 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=202043

--- Comment #7 from mistarzy@gmail.com ---
I got the best performance with the following setup:

echo 275000000 >> /sys/class/drm/card0/device/hwmon/hwmon0/power1_cap
echo "m 3 1100 1000" > /sys/class/drm/card0/device/pp_od_clk_voltage

The GPU clock hovered above 1400 MHz and memory stayed in state 3 at 1100 MHz,
even though I did not switch to manual or send a commit. But after 20+ minutes
of benchmarking or gaming, the driver seems to glitch and drops to lower states
even though temperatures stay the same. Let me know if you would like some
tests and logs.
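
The hwmon `power1_cap` file takes its value in microwatts, so 275000000 corresponds to a 275 W limit. A minimal conversion sketch, with `watts_to_microwatts` as a hypothetical helper name:

```shell
# Convert whole watts to the microwatt value expected by hwmon power1_cap.
watts_to_microwatts() {
    echo "$(( $1 * 1000000 ))"
}

# Usage (as root), with the hwmon path from the setup above:
#   watts_to_microwatts 275 > /sys/class/drm/card0/device/hwmon/hwmon0/power1_cap
```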


* [Bug 202043] amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting
  2018-12-22 12:24 [Bug 202043] New: amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting bugzilla-daemon
                   ` (6 preceding siblings ...)
  2019-03-27  0:45 ` bugzilla-daemon
@ 2019-09-29  5:04 ` bugzilla-daemon
  2019-10-31 13:07 ` bugzilla-daemon
  2021-02-08 18:22 ` bugzilla-daemon
  9 siblings, 0 replies; 11+ messages in thread
From: bugzilla-daemon @ 2019-09-29  5:04 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=202043

Martin Böh (mart.b@outlook.de) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mart.b@outlook.de

--- Comment #8 from Martin Böh (mart.b@outlook.de) ---
Same issue on a Strix Vega 64; it seems pretty weird. Is there a known fix?


* [Bug 202043] amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting
  2018-12-22 12:24 [Bug 202043] New: amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting bugzilla-daemon
                   ` (7 preceding siblings ...)
  2019-09-29  5:04 ` bugzilla-daemon
@ 2019-10-31 13:07 ` bugzilla-daemon
  2021-02-08 18:22 ` bugzilla-daemon
  9 siblings, 0 replies; 11+ messages in thread
From: bugzilla-daemon @ 2019-10-31 13:07 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=202043

haro41@gmx.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |haro41@gmx.de

--- Comment #9 from haro41@gmx.de ---
https://bugzilla.kernel.org/show_bug.cgi?id=205277

Should be fixed in the 5.4 release.


* [Bug 202043] amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting
  2018-12-22 12:24 [Bug 202043] New: amdgpu: Vega 56 SCLK drops to 700 Mhz when undervolting bugzilla-daemon
                   ` (8 preceding siblings ...)
  2019-10-31 13:07 ` bugzilla-daemon
@ 2021-02-08 18:22 ` bugzilla-daemon
  9 siblings, 0 replies; 11+ messages in thread
From: bugzilla-daemon @ 2021-02-08 18:22 UTC (permalink / raw)
  To: dri-devel

https://bugzilla.kernel.org/show_bug.cgi?id=202043

Bruno Jacquet (maxijac@free.fr) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |maxijac@free.fr

--- Comment #10 from Bruno Jacquet (maxijac@free.fr) ---
I am seeing the same behavior with a Vega 64 on 5.4.96 (LTS) and also on newer
kernels.

From my testing, it seems that amdgpu is simply not able to properly apply the
od table to the GPU. As soon as a clock or voltage change is sent to the GPU,
it disrupts power management, and this can only be fixed by rebooting.

(In reply to mistarzy from comment #6)
> Same issue here with Vega 64. From watching
> /sys/kernel/debug/dri/0/amdgpu_pm_info my conclusion is that basically
> driver gets max voltage applied even though tables in
> /sys/class/drm/card0/device/pp_od_clk_voltage suggest otherwise.

From my testing, I'd say that the voltages are just broken once a change to od
is sent to the GPU.
After booting, if you monitor amdgpu_pm_info you will see some "uneven" VDD
values (825, 831, 918 mV, etc.), which makes me think some kind of curve is
applied between the voltage values of the initial sclk table.
Once any single change to the od table is sent, the VDD values become coarse
steps, incrementing by about 50 mV up to the maximum value
(1000, 1050, 1100, 1200 mV...).
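
The difference between the two regimes can be checked mechanically on a set of sampled VDD readings. The sketch below is a heuristic, assuming that "coarse" means every sample is a multiple of 50 mV, as in the values listed above; `vdd_step_kind` is a hypothetical helper name.

```shell
# Classify VDD samples (in mV): print "coarse" if every value is a multiple
# of 50 mV, otherwise print "fine". The 50 mV step is taken from the
# observation above and is only a heuristic.
vdd_step_kind() {
    for mv in "$@"; do
        if [ $((mv % 50)) -ne 0 ]; then
            echo fine
            return 0
        fi
    done
    echo coarse
}

# Usage:
#   vdd_step_kind 825 831 918
#   vdd_step_kind 1000 1050 1100 1200
```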

It seems using the voltage curve feature of pp_od_clk_voltage ("vc") _could_
fix the issue, but it is not supported on cards older than Vega 20.
I am not sure whether this is a software limitation in the driver or a GPU limitation.

Unexpectedly, the same effect can be seen when sending a full PP table.


(In reply to haro41 from comment #9)
> https://bugzilla.kernel.org/show_bug.cgi?id=205277
> 
> Should be fixed 5.4 release.

No, it's not fixed.

(In reply to mistarzy from comment #7)
> Best results performance wise had with following setup:
> 
> echo 275000000 >> /sys/class/drm/card0/device/hwmon/hwmon0/power1_cap
> echo "m 3 1100 1000" > /sys/class/drm/card0/device/pp_od_clk_voltage

Caution: if you change only the MCLK without committing (sending "c"), it seems
the change is not actually sent to the GPU, even though all the other tables and
info files report the updated value.

