Bug ID 109650
Summary [amd-staging-drm-next] - Polaris 20 dc - idle power regression 3x [bisected]
Product DRI
Version DRI git
Hardware x86-64 (AMD64)
OS Linux (All)
Status NEW
Severity normal
Priority medium
Component DRM/AMDgpu
Assignee dri-devel@lists.freedesktop.org
Reporter Dieter@nuetzel-hh.de

Polaris 20

Idle power went up from ~32 W to ~96 W.

With broken commits:

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:       +1.20 V  
fan1:         888 RPM  (min =    0 RPM, max = 3200 RPM)
temp1:        +55.0°C  (crit = +94.0°C, hyst = -273.1°C)
power1:       96.04 W  (cap = 175.00 W)

Bisected to:

764c85fef41722db0f21558c6c2fb38bee172d19 is the first bad commit
commit 764c85fef41722db0f21558c6c2fb38bee172d19
Author: Yong Zhao <Yong.Zhao@amd.com>
Date:   Tue Feb 5 15:17:40 2019 -0500

    drm/amdgpu: Fix bugs in setting CP RB/MEC DOORBELL_RANGE registers

    CP_RB_DOORBELL_RANGE_LOWER/UPPER and CP_MEC_DOORBELL_RANGE_LOWER/UPPER
    are used for waking up an idle scheduler and for power gating support.
    Usually the first few doorbells in pci doorbell bar are used for RB
    and all leftover for MEC. This patch fixes the incorrect settings.

    Theoretically, gfx ring doorbells should come before all MEC doorbells
    to be consistent with the design. However, since the doorbell
    allocations are agreed by all and we are not free to change them, also
    considering the kernel MEC ring doorbells which are before gfx ring
    doorbells are not used often, we compromise by leaving the doorbell
    allocations unchanged.

    Change-Id: I402a56ce9a80e6c2ed2f96be431ae71ca88e73a4
    Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
    Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>

:040000 040000 a5747a6be3d388ae851855eebe7ebbf20488ba22 7b516291deb849c593199a4c8df3ad08c5b7a769 M drivers

After reverting both related commits from current
amd-staging-drm-next (256445aee13f)

9affde0e44af (HEAD -> amd-staging-drm-next) Revert "drm/amdgpu: Fix bugs in setting CP RB/MEC DOORBELL_RANGE registers"
8e73059158d8 Revert "drm/amdgpu: Delete user queue doorbell variables"
256445aee13f (origin/amd-staging-drm-next) drm/amdgpu: remove some old unused dpm helpers

I get these numbers again (somewhat higher than Win... as some others pointed
out):

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:       +0.75 V  
fan1:         900 RPM  (min =    0 RPM, max = 3200 RPM)
temp1:        +30.0°C  (crit = +94.0°C, hyst = -273.1°C)
power1:       32.16 W  (cap = 175.00 W)
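
For anyone who wants to repeat the comparison, here is a minimal sketch that pulls the numeric readings out of the two `sensors` dumps above and computes the idle power ratio. The names `parse_sensors`, `BAD_IDLE` and `GOOD_IDLE` are hypothetical helpers made up for this illustration, and the parsing assumes the `label: value unit` layout shown in this report, not any general `sensors` output format.

```python
import re

# Readings copied from this report (bisected-bad tree vs. tree with both reverts).
BAD_IDLE = """\
amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:       +1.20 V
fan1:         888 RPM  (min =    0 RPM, max = 3200 RPM)
temp1:        +55.0°C  (crit = +94.0°C, hyst = -273.1°C)
power1:       96.04 W  (cap = 175.00 W)
"""

GOOD_IDLE = """\
amdgpu-pci-0100
Adapter: PCI adapter
vddgfx:       +0.75 V
fan1:         900 RPM  (min =    0 RPM, max = 3200 RPM)
temp1:        +30.0°C  (crit = +94.0°C, hyst = -273.1°C)
power1:       32.16 W  (cap = 175.00 W)
"""


def parse_sensors(text):
    """Return {label: first numeric value} for 'label:  value unit' lines.

    Hypothetical helper: lines without a number right after the colon
    (such as 'Adapter: PCI adapter') are skipped.
    """
    readings = {}
    for line in text.splitlines():
        m = re.match(r"^(\w+):\s+([+-]?\d+(?:\.\d+)?)", line)
        if m:
            readings[m.group(1)] = float(m.group(2))
    return readings


bad = parse_sensors(BAD_IDLE)
good = parse_sensors(GOOD_IDLE)

# Ratio of idle power draw: bad tree relative to the reverted tree.
ratio = bad["power1"] / good["power1"]
print(f"idle power: {bad['power1']} W vs {good['power1']} W, ratio {ratio:.2f}")
print(f"vddgfx:     {bad['vddgfx']} V vs {good['vddgfx']} V")
```

The ratio comes out just under 3, which is where the "3x" in the summary comes from. The vddgfx reading is the other clear difference: it stays at 1.20 V with the bad commits and drops to 0.75 V after the reverts.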

